RMLL visits at CERN – July 2012
What is it used for?
• Depositing
• Archiving
• Organizing
• Disseminating
• Any type of document
• ~350 GB of PDFs at CERN
• ~20 TB of images and videos
• 1M records
What is Invenio?
‣ Integrated Digital Library / Repository software
‣ A platform of choice for managing documents in HEP
‣ Also adopted in other fields (medium to large repositories)
‣ Web application
‣ Open-source GPL-2 project
‣ LAMP stack: Python (mostly), MySQL and Apache
‣ Based on open standards: MARCXML, OAI-PMH, OpenURL, OpenSearch, etc.
‣ Flexible, scriptable
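
To make the "open standards" and "scriptable" points concrete, here is a minimal sketch of reading fields from a MARCXML record with the Python standard library only. The sample record and the helper name `get_subfields` are invented for illustration; this is not Invenio's own API. It assumes the usual MARC 21 conventions (field 245 $a for the title, 100 $a for the main author).

```python
import xml.etree.ElementTree as ET

# Namespace of the MARC21 "slim" XML schema used by MARCXML documents.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical sample record, invented for this example.
SAMPLE_RECORD = """\
<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="001">12345</controlfield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Doe, J.</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">An example title</subfield>
  </datafield>
</record>
"""


def get_subfields(record_xml, tag, code):
    """Return the text of every subfield `code` inside datafields with `tag`."""
    root = ET.fromstring(record_xml)
    xpath = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [node.text for node in root.findall(xpath, MARC_NS)]


if __name__ == "__main__":
    print("Title: ", get_subfields(SAMPLE_RECORD, "245", "a"))
    print("Author:", get_subfields(SAMPLE_RECORD, "100", "a"))
```

Because the record format is a published standard, the same few lines should work on MARCXML exported from any compliant repository, not only from one particular installation.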
Invenio’s gears
• Lots of Python, with a sprinkle of C and Lisp(!)
• 630K lines of Python code
• MySQL (MyISAM) for storing data
• Native indexing engine
• Apache + mod_wsgi + mod_xsendfile
Invenio’s History
1954 CERN library starts paper dissemination of preprints (early Open Access)
1965 First computers at CERN library to help with cataloging
1990 Electronic distribution of preprints via FTP
1993 CERN Preprint Server, web front-end of electronic preprint catalogue. Institutional repository
1996 CERN Library Server (weblib): added books, periodicals and "other material".
2000 CERN Document Server: multimedia material, internal notes
2002 First public release of the software under the GNU GPL. Worldwide installations and collaborations
Open Access at CERN
• "Consistent with the stated position of the Collaborations and the General Conditions applicable to Experiments at CERN, every effort will be made to publish papers under Open Access conditions, as defined by the SCOAP3 initiative. As at the date of this document, the Creative Commons Attribution ("cc by") license meets these conditions."
• OA at CERN has a long history. The CERN Convention of 1953 states: "...the results of its experimental and theoretical work shall be published or otherwise made generally available".
Our development environment
• Git distributed version control system
• Trac for ticket tracking
• VirtualBox + Vagrant for testing deployment
• We develop on SLC5/6 (based on RHEL5/6), on Ubuntu, on Debian…
Quality Assurance
• Coding standards
  • E.g. PEP8 (Style Guide for Python), etc.
• Documentation
  • "If the code and the comments disagree, then both are probably wrong." – attributed to Norm Schryer
• Test suite
  • ~1,000 unit/regression/web tests
• Security
  • XSS, CSRF, SQL injection, etc.
• Code review
• Kwalitee check: "measuring" quality
  • "It looks like quality, it sounds like quality, but it's not quite quality." – CPAN Testing Service (quoting Michael Schwern)
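
As an illustration of two of the security items above (SQL injection and XSS), here is a minimal sketch using only the standard library. It is not taken from the Invenio code base: `sqlite3` stands in for MySQL so the example stays self-contained, and the table layout and the helper names `find_titles` and `render_title` are hypothetical.

```python
import html
import sqlite3


def find_titles(conn, author):
    """Look up titles by author using a parameterized query.

    The "?" placeholder keeps user input out of the SQL text, so a crafted
    string is treated as a plain value instead of being parsed as SQL.
    """
    rows = conn.execute("SELECT title FROM records WHERE author = ?", (author,))
    return [row[0] for row in rows]


def render_title(title):
    """Escape a title before embedding it in HTML, to limit XSS."""
    return "<li>" + html.escape(title) + "</li>"


# Hypothetical in-memory table, invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (author TEXT, title TEXT)")
conn.execute(
    "INSERT INTO records VALUES (?, ?)",
    ("Doe, J.", "<b>Example</b> title"),
)

if __name__ == "__main__":
    # A classic injection attempt matches nothing when passed as a parameter.
    print(find_titles(conn, "x' OR '1'='1"))
    for title in find_titles(conn, "Doe, J."):
        print(render_title(title))
```

Checks of this kind are easy to turn into regression tests, which is one way the "Test suite" and "Security" items can reinforce each other.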
Our community
• 30 institutions worldwide
• CERN + DESY + Fermilab + SLAC
• EPFL …
• ADS and arXiv joining forces
• Translated so far into 26 languages
• 45 committers (in the last year)
• Free + Paid support
An example installation
• 1 Load balancer (HAProxy + Apache mod_proxy + mod_evasive)
• 5 Worker nodes:
  • 2 VMs for static files
  • 3 Real machines for Python-handled requests
• 2 DB nodes (MySQL master + MySQL replica)
• AFS distributed FS for backups and file storage
• Sustained the load of the recent Higgs announcement (230 requests per second with peaks of 800 req/s)
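
For a rough feel of what those figures mean per machine, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that every request is a Python-handled one and that the load balancer spreads requests evenly over the 3 real machines. The slides do not state the actual split between static and dynamic traffic, so the results are upper bounds, not measurements.

```python
SUSTAINED_RPS = 230   # sustained load quoted on the slide
PEAK_RPS = 800        # peak load quoted on the slide
PYTHON_WORKERS = 3    # real machines handling Python requests


def per_node_rate(total_rps, nodes):
    """Requests per second each node sees under an even split (assumption)."""
    if nodes <= 0:
        raise ValueError("need at least one node")
    return total_rps / nodes


if __name__ == "__main__":
    print(f"sustained: {per_node_rate(SUSTAINED_RPS, PYTHON_WORKERS):.1f} req/s per worker")
    print(f"peak:      {per_node_rate(PEAK_RPS, PYTHON_WORKERS):.1f} req/s per worker")
    # Same peak with one worker out of service (hypothetical failure case).
    print(f"peak, one worker down: {per_node_rate(PEAK_RPS, PYTHON_WORKERS - 1):.1f} req/s per worker")
```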
What’s next?
• Werkzeug/Flask + Jinja2 + WTForms for the web framework
• SQLAlchemy for DB abstraction
• Twitter Bootstrap + jQuery for the style
• Optional Solr indexing
What is Indico?
• Web-based event organization
• Archive of events metadata and related documents (minutes, slides, etc.)
• Booking service and collaboration hub
  • Rooms
  • Videoconference
  • Webcast
What is Indico?
• Started as a European project in 2002
• First used in 2004
• In production at CERN: http://indico.cern.ch
• And in >100 institutions around the world
  • GSI, DESY, Fermilab,…
  • http://indico-software.org/wiki/IndicoWorldWide
• Free and Open Source
Technology
• Python >2.6 + WSGI
  • babel, webassets, pytz, zope.index, zope.interface, simplejson, suds, lxml, zc.queue, python-dateutil, pypdf, pyatom, reportlab, etc.
• Mako 0.4.1+ as template engine
• ZODB as underlying database (http://www.zodb.org/)
• Web frameworks:
  • jQuery
  • Backbone.js
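
For readers unfamiliar with WSGI, the interface named in the first bullet, here is a minimal sketch of a WSGI application and of calling it without a web server. It is not Indico code: the names `application` and `call_app` are made up for the example, and it is written for Python 3, whereas the slide refers to the Python 2 line.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A minimal WSGI application: a callable taking environ and start_response."""
    body = b"Hello from a WSGI application\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app directly, the way a server or test harness would."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], dict(captured["headers"]), body


if __name__ == "__main__":
    status, headers, body = call_app(application)
    print(status, headers["Content-Type"])
    print(body.decode("utf-8"), end="")
```

Since the application is just a callable, the same object can be served by any WSGI-capable server, which is the portability the interface was designed for.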
Compatibility
• Compatible with many browsers: IE8+, FF3.6+, Google Chrome, Safari, etc.
• Working on mobile version
Development Tools
• Git as version control system
• ~ Eclipse + PyDev
• Unit and Selenium tests + Jenkins (continuous integration server)
• Sphinx for documentation
• Trac as project site
• GitHub: http://github.com/indico
• Transifex for i18n: https://www.transifex.com/projects/p/indico/