Avogadro: Open Source Libraries and Application for Computational Chemistry


Upload: marcus-hanwell

Post on 13-Dec-2014

Category:

Science


TRANSCRIPT

Avogadro: Open Source Libraries and Application for Chemistry

Marcus D. Hanwell (1) and Jens Thomas (2)

1: Scientific Computing Group, Kitware, Inc, 28 Corporate Drive, Clifton Park, NY 12065, USA.
2: Institute of Integrative Biology, University of Liverpool, Liverpool, L69 7ZB, UK.

http://openchemistry.org/

Avogadro

In order to tackle upcoming molecular simulation and visualization challenges in key areas of materials science, chemistry and biology, it is necessary to move beyond fixed software applications. The Avogadro project is in the final stages of an ambitious rewrite of its core data structures, algorithms and visualization capabilities. The project began as a grassroots effort to address deficiencies that many of the early contributors observed in existing commercial and open source solutions. Avogadro is now a robust, flexible solution that can tie in to and harness the power of VTK for additional analysis and visualization capabilities.

Figure 1: The Avogadro 2 application displaying several molecule views, each using a different rendering style.

Avogadro is developed as a set of software libraries designed with reuse in mind, and an application that uses these libraries along with plugins loaded at runtime to provide the bulk of the functionality. This lets scientists make rapid use of the available functionality, and it provides a rich set of reusable components under the permissive, open source 3-clause BSD license to encourage extension and reuse.

Editing

The editing capabilities of the application are extremely important; they were one of the major drivers for creating the original Avogadro application. A great deal of work has gone into developing a robust molecule model, along with supporting structures that make undo/redo more efficient. Additionally, a large range of file formats is supported through native readers/writers and integration with the Open Babel program.
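One common way to make undo/redo cheap is to record small reversible commands rather than full copies of the molecule. The sketch below illustrates that idea only; the poster does not describe Avogadro's actual data structures, and the names `Molecule`, `AddAtom` and `UndoStack` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Molecule:
    """Minimal stand-in for a molecule model (hypothetical, not Avogadro's)."""
    atoms: list = field(default_factory=list)  # atomic numbers
    bonds: list = field(default_factory=list)  # (i, j, order) tuples


class AddAtom:
    """Command that appends one atom and knows how to reverse itself."""

    def __init__(self, atomic_number):
        self.atomic_number = atomic_number

    def redo(self, mol):
        mol.atoms.append(self.atomic_number)

    def undo(self, mol):
        mol.atoms.pop()


class UndoStack:
    """Stores small commands instead of snapshots of the whole molecule."""

    def __init__(self, mol):
        self.mol = mol
        self.done = []
        self.undone = []

    def push(self, command):
        command.redo(self.mol)
        self.done.append(command)
        self.undone.clear()  # a new edit invalidates the redo history

    def undo(self):
        if self.done:
            command = self.done.pop()
            command.undo(self.mol)
            self.undone.append(command)

    def redo(self):
        if self.undone:
            command = self.undone.pop()
            command.redo(self.mol)
            self.done.append(command)


mol = Molecule()
stack = UndoStack(mol)
stack.push(AddAtom(6))  # carbon
stack.push(AddAtom(8))  # oxygen
stack.undo()            # takes the oxygen away again
print(mol.atoms)
```

Each command holds only what it needs to reverse itself, so memory use scales with the size of the edit rather than the size of the molecule.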

Figure 2: The bond-centric manipulation tool, and molecular orbital showing periodic table element selection.

Structures can also be loaded from online chemical data repositories and then edited with a number of tools. A simple molecule drawing tool features element/bond order selection, live updates, hydrogen addition/removal and geometry optimization using a range of simple molecular force fields. Further tools provide simple atom position adjustment, bond-centric manipulation and Cartesian coordinate editing.
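To give a feel for what geometry optimization with a simple force field involves, the sketch below relaxes a single bond with a harmonic stretch term using steepest descent. It is a toy illustration in arbitrary units, not the force fields or optimizer used by Avogadro; the function names, the force constant `k`, the equilibrium length `r0` and the step size are all assumptions.

```python
import math


def harmonic_bond(p1, p2, r0, k):
    """Return energy and per-atom gradients for E = 0.5 * k * (r - r0)**2."""
    delta = [a - b for a, b in zip(p1, p2)]
    r = math.sqrt(sum(c * c for c in delta))
    energy = 0.5 * k * (r - r0) ** 2
    # dE/dp1 = k * (r - r0) * (p1 - p2) / r, and dE/dp2 is its negative.
    grad1 = [k * (r - r0) * c / r for c in delta]
    grad2 = [-g for g in grad1]
    return energy, grad1, grad2


def optimize_bond(p1, p2, r0=1.0, k=1.0, step=0.1, tol=1e-8, max_iter=10000):
    """Steepest descent: move each atom against its gradient until it vanishes."""
    p1, p2 = list(p1), list(p2)
    for _ in range(max_iter):
        energy, grad1, grad2 = harmonic_bond(p1, p2, r0, k)
        if max(abs(g) for g in grad1) < tol:
            break
        p1 = [x - step * g for x, g in zip(p1, grad1)]
        p2 = [x - step * g for x, g in zip(p2, grad2)]
    return p1, p2, energy


# A stretched bond (1.5 units long) relaxing towards r0 = 1.0.
start_a, start_b = [0.0, 0.0, 0.0], [1.5, 0.0, 0.0]
end_a, end_b, final_energy = optimize_bond(start_a, start_b)
print(math.dist(end_a, end_b), final_energy)
```

A real force field sums many such terms (angles, torsions, non-bonded interactions), but the loop has the same shape: evaluate the energy gradient, then move the atoms downhill.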

Input Preparation

Once the structure is complete, one of the first steps can be to prepare it for submission to a simulation package that performs a more complete geometry optimization, or calculates the electronic structure of the molecule. This has always been a strong focus for the Avogadro project, with the rewrite featuring a range of packaged input generators such as those shown in Figure 3.

Figure 3: The NWChem input generator (left, Python script), and GAMESS input generator (right, C++ plugin).

There has also been a concerted effort to make the development of new input generators as simple as possible. C++ plugins can be developed as before, but a simpler method is to write a Python script that the application will use to drive input generation.
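The core of such a script can be very small: take a molecule and a handful of user options, and return the text of an input file. The sketch below shows that step only. The poster does not specify how the application and the script exchange data, so the dictionary layout (`atoms`, `element`, `xyz`) and the option names are hypothetical, and the output is an NWChem-style deck that should be checked against the NWChem documentation before use.

```python
def generate_input(molecule, options):
    """Build the text of an NWChem-style input deck from plain dictionaries.

    The structure of `molecule` and `options` is assumed for illustration.
    """
    lines = ['title "{}"'.format(options["title"]), "geometry units angstroms"]
    for atom in molecule["atoms"]:
        x, y, z = atom["xyz"]
        lines.append("  {:<2} {:12.6f} {:12.6f} {:12.6f}".format(
            atom["element"], x, y, z))
    lines += [
        "end",
        "basis",
        "  * library {}".format(options["basis"]),
        "end",
        "task {} {}".format(options["theory"], options["task"]),
    ]
    return "\n".join(lines) + "\n"


# Example molecule: water, coordinates in angstroms (illustrative values).
water = {
    "atoms": [
        {"element": "O", "xyz": [0.000, 0.000, 0.117]},
        {"element": "H", "xyz": [0.000, 0.757, -0.469]},
        {"element": "H", "xyz": [0.000, -0.757, -0.469]},
    ]
}
settings = {"title": "water", "basis": "6-31G*", "theory": "scf", "task": "optimize"}
deck = generate_input(water, settings)
print(deck)
```

Keeping the generator a pure function of the molecule and the options makes it easy to test outside the application, which is a large part of why a script is simpler to develop than a compiled plugin.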

Open Chemistry

The Open Chemistry project is a suite of applications and support libraries to improve the workflow in computational chemistry, biology, materials science and related areas. The project consists of a set of open, connected components that can tackle everything from small problems on the desktop up to large research projects requiring significant computational power. Each component can be used on its own, but the integrated components offer compelling solutions that can significantly improve complex workflows involving local and remote computation, storage, indexing and search of chemical data.

Figure 4: The MoleQueue application (left) showing completed remote jobs, and MongoChem (right) showing molecular data with 2D and 3D views (it also features significant charting capabilities).

Avogadro 2 is one component of this larger effort. MoleQueue and MongoChem are additional components of the Open Chemistry project. MoleQueue has been developed to make it easier for desktop applications (including Avogadro 2) to run external programs locally, and to submit jobs to remote clusters/supercomputers. MongoChem has been developed to make it easier for individuals, groups and organizations to collect and search their small molecule data sets.
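As an illustration of how a desktop application might hand a job to a separate queue manager, the sketch below builds and parses messages in the JSON-RPC 2.0 format. The poster does not describe the protocol between Avogadro 2 and MoleQueue, so the choice of JSON-RPC here, the method name `submitJob` and every parameter name are assumptions made for the example; no connection is opened.

```python
import itertools
import json

_request_ids = itertools.count(1)


def build_submit_request(queue, program, description, input_text):
    """Serialize a job submission as a JSON-RPC 2.0 request (names assumed)."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "submitJob",  # hypothetical method name
        "params": {
            "queue": queue,
            "program": program,
            "description": description,
            "input": input_text,
        },
    }
    return json.dumps(request)


def parse_response(raw):
    """Return the result of a JSON-RPC 2.0 response, or raise on an error."""
    response = json.loads(raw)
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "unknown error"))
    return response["result"]


message = build_submit_request("cluster", "NWChem", "water optimization", "task scf\n")
print(message)
```

Separating job submission into its own process means the editor can be closed while remote jobs continue, and other desktop applications can reuse the same queue.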

Visualization and Analysis

Once the calculations have been performed, it is then necessary to load, visualize and analyze the data. This is another area that has seen significant development with the addition of a new set of scalable input/output routines, improved data structures and rewritten rendering components. This enables the efficient loading, analysis and visualization of systems that were too large for previous versions to work with. Avogadro is now able to make full use of the visualization capabilities of VTK, in addition to its own powerful rendering capabilities. This means that complex visualization, involving techniques such as volume rendering for point data, or streamlines for vector fields, will now become possible.

Figure 5: Visualization of large, porous system (left), ambient occlusion (center) and QTAIM (right).

In addition to significant improvements in these subsystems, the extension of input/output was also reexamined. In a similar vein to the input generators, a simple set of APIs was developed that enables new input and output formats to be added dynamically at runtime using simple Python scripts. The use of a scene graph and advanced impostor rendering techniques makes interacting with large datasets simple while retaining even higher visual quality for small molecules.
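A file format script of this kind essentially needs two functions: one that turns file text into a molecule, and one that does the reverse. The sketch below does this for the plain XYZ format (atom count, comment line, then one `element x y z` line per atom). The dictionary layout and function names are hypothetical and are not the interface Avogadro expects from a format script.

```python
def read_xyz(text):
    """Parse XYZ-format text into a plain dictionary (layout assumed)."""
    lines = text.strip().splitlines()
    count = int(lines[0])
    name = lines[1].strip() if len(lines) > 1 else ""
    atoms = []
    for line in lines[2:2 + count]:
        symbol, x, y, z = line.split()[:4]
        atoms.append({"element": symbol, "xyz": [float(x), float(y), float(z)]})
    if len(atoms) != count:
        raise ValueError("expected {} atoms, found {}".format(count, len(atoms)))
    return {"name": name, "atoms": atoms}


def write_xyz(molecule):
    """Serialize the dictionary produced by read_xyz back to XYZ text."""
    lines = [str(len(molecule["atoms"])), molecule.get("name", "")]
    for atom in molecule["atoms"]:
        x, y, z = atom["xyz"]
        lines.append("{} {:.6f} {:.6f} {:.6f}".format(atom["element"], x, y, z))
    return "\n".join(lines) + "\n"


sample = """3
water
O 0.000 0.000 0.117
H 0.000 0.757 -0.469
H 0.000 -0.757 -0.469
"""
parsed = read_xyz(sample)
print(parsed["name"], len(parsed["atoms"]))
```

Because both functions work on text and plain dictionaries, a new format can be added without recompiling or relinking the application.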

Molecular Application Toolkit

The Avogadro 2 libraries represent a total rethink of how libraries are provided for chemical applications, moving away from one monolithic library containing all functionality to several dedicated libraries with minimal dependencies. This means that projects wishing to take advantage of data structures and input/output can do so using just a C++ compiler, whereas applications that want to provide simple 3D rendering capabilities can do that without being required to use Qt.

Figure 6: Avogadro 2 libraries with dependencies (left), and overview of software process (right).

The project is composed of two separate repositories, with the ‘avogadroapp’ repository offering a full demonstration of how to use the libraries in an end-user application. The ‘avogadrolibs’ repository contains all of the libraries, with the option to only build subsets. The development process uses distributed version control (Git), testing (CTest/CDash), and automated binary generation.