Chemical Database Management with JChem Base and JChem Cartridge: US UGM 2008
DESCRIPTION
JChem Base is a chemical database management toolkit for handling chemical structures and associated data (user-defined or predicted) stored in relational databases. JChem Cartridge provides similar functionality, tightly integrated into Oracle, as well as an Oracle interface to other ChemAxon products. For the latest developments, see: http://www.chemaxon.com/product/jc_base.html
TRANSCRIPT
Chemical Database Management
with JChem Base and Cartridge
Szabolcs Csepregi
Solutions for Cheminformatics
Outline
• ChemAxon chemical database products
• Architecture
• Features
• Example interfaces: JSP and ASP
• Integration with other ChemAxon tools
• The coming Registration System API
• What is coming in JCB/Cartridge 5.1
Chemical database products
• JChem Base
– A library for adding chemical structures to relational database systems. Available for Java, JSP and .NET.
– An open-source example web application is available.
• JChem Cartridge for Oracle
– Extends Oracle SQL with chemical operators and a chemical index.
– SQL interface for ChemAxon functionality
• Instant JChem
– An all-in-one desktop chemical database application.
JChem Base application architectures
Web application
[Diagram: web application architecture. A web browser on the client exchanges queries and results (structures + data) over the Internet / intranet with custom servlets or JSP scripts on the server. The server side uses the JChem class library and a JDBC driver to send SQL to a relational database (Oracle, MySQL, MS SQL Server, DB2, etc.). Labeled flows: query structure, SQL, hits.]
JChem Base application architectures
Rich client application
[Diagram: rich client application architecture. A rich client application using the JChem class library connects over the Internet / intranet through a JDBC driver to a relational database server (Oracle, MySQL, MS SQL Server, DB2, etc.). Labeled flows: query structure, SQL, hits.]
JChem Cartridge architecture
The JChem computation engine can run on a dedicated server to balance the workload.
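[Diagram: JChem Cartridge architecture. A client application or application server sends SQL to the Oracle server, which hosts JChem Cartridge (PL/SQL and Java stored procedures). The cartridge is connected by RMI to a JChem Server containing the JChem Cartridge Adapter, JChem Base (search, update), the JChem core, and caches; a JDBC connection links the JChem Server and the database.]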
Compatibility and integration
Supported chemical file formats:
• SMILES
• MDL MOL/RXN/SDF/RDF (v2000 and v3000)
• CML, MRV
• etc.
Database engines:
• Oracle, MySQL, MS SQL Server, MS Access, PostgreSQL, IBM DB2, Derby, etc.
All operating systems through:
• Java API (JChem Base)
• .NET API (JChem Base + JNBridge) – for Windows
• SQL (Cartridge)
Structure searching: features
• Search types: Substructure, Similarity, Exact, Exact fragment, etc.
• Wide range of query atoms
• Query properties
• R-group queries
• Full SMARTS support
• Coordination compounds
• Link nodes
• Pseudo atoms, Lone pairs
• Relative stereo
• Reaction search features
• Hit coloring ...
www.chemaxon.com/conf/Structural_Search.ppt
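Similarity searching ranks database structures by how close they are to the query. A minimal sketch follows, assuming structures are represented as sets of fingerprint "on" bits and compared with the Tanimoto coefficient, a common choice in cheminformatics. The fingerprints, compound names, and threshold are made up for illustration, and nothing here is JChem's actual implementation.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions."""
    union = len(fp_a | fp_b)
    if union == 0:
        # Both fingerprints are empty; treating them as identical is a
        # convention chosen for this sketch.
        return 1.0
    return len(fp_a & fp_b) / union


def similarity_search(query_fp, database, threshold):
    """Return (name, score) pairs with score >= threshold, most similar first."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in database.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical fingerprints: each set lists the bit positions that are set.
    database = {
        "compound_a": {1, 2, 3, 4},
        "compound_b": {2, 3, 4, 5},
        "compound_c": {7, 8, 9},
    }
    for name, score in similarity_search({1, 2, 3, 4}, database, threshold=0.5):
        print(f"{name}: {score:.2f}")
```

The threshold plays the role of a similarity cutoff: raising it returns fewer, closer analogues.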
Structure searching: options
Some of the structure search options:
– Chemical Terms filter constraint
– Tautomer search
– Stereo on/off
– Ignore charge/isotope/radical/valence/mixture brackets
– Vague bond matching modes: "or aromatic"; ignore bond types
– Inverse hit list
– Maximum search time / number of hits
– SQL SELECT statement for pre-filtering
– Ordering of results
– etc.
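Three of these options (SQL pre-filtering, the inverse hit list, and the maximum number of hits) can be illustrated with a small sketch. It uses an in-memory SQLite table and assumes the structure matching itself has already been done by some engine, so the set of matching ids is simply passed in. The table layout, function names, and parameters are hypothetical and are not the JChem API.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory table of example compounds (SMILES plus molecular weight)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE compounds (id INTEGER PRIMARY KEY, smiles TEXT, mw REAL)")
    conn.executemany(
        "INSERT INTO compounds VALUES (?, ?, ?)",
        [
            (1, "c1ccccc1", 78.11),                      # benzene
            (2, "Oc1ccccc1", 94.11),                     # phenol
            (3, "CC(=O)Oc1ccccc1C(=O)O", 180.16),        # aspirin
            (4, "Cn1cnc2c1c(=O)n(C)c(=O)n2C", 194.19),   # caffeine
            (5, "C1CCCCC1", 84.16),                      # cyclohexane
        ],
    )
    return conn


def filter_hits(conn, structure_matches, prefilter_sql=None, inverse=False, max_hits=None):
    """Combine a structure-match result with pre-filter, inverse, and max-hit options.

    structure_matches: ids that a (hypothetical) structure search engine
    reported as matching the query structure.
    """
    if prefilter_sql is None:
        prefilter_sql = "SELECT id FROM compounds ORDER BY id"
    # Only rows returned by the pre-filter SELECT are considered at all.
    candidates = [row[0] for row in conn.execute(prefilter_sql)]
    matches = set(structure_matches)
    hits = []
    for compound_id in candidates:
        if max_hits is not None and len(hits) >= max_hits:
            break  # stop as soon as enough hits are collected
        # With inverse=True, keep the candidates that do NOT match.
        if (compound_id in matches) != inverse:
            hits.append(compound_id)
    return hits


if __name__ == "__main__":
    conn = make_demo_db()
    benzene_ring_matches = {1, 2, 3}  # benzene, phenol, aspirin contain a benzene ring
    light = "SELECT id FROM compounds WHERE mw < 150 ORDER BY id"
    print(filter_hits(conn, benzene_ring_matches, prefilter_sql=light))
    print(filter_hits(conn, benzene_ring_matches, inverse=True))
```

In a real search engine the pre-filter and the hit limit matter mostly for speed, because they reduce how many structures have to be compared with the query.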
Structure search: performance
Test setup: JChem Base 5.0; Athlon X2 2.6 GHz, 4 GB RAM; Oracle 9.2.0.8.0
Compound registration (elapsed time):
Number of compounds    Duplicates not checked    Duplicates checked
10,000                 22 s                      35 s
100,000                2 min 33 s                4 min 16 s
200,000                4 min 53 s                8 min 19 s
Substructure search in a table of 3 million compounds (query structure drawings are not recoverable from the transcript):
Number of hits    Search time
12                0.219 s
936               0.375 s
4,608             0.734 s
65,208            5.594 s
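The registration timings above can be turned into throughput figures with simple arithmetic. The sketch below only restates the numbers from the slide; the variable and function names are illustrative and not part of any ChemAxon API.

```python
# Registration timings from the slide: number of compounds ->
# (seconds without duplicate checking, seconds with duplicate checking).
REGISTRATION_TIMINGS = {
    10_000: (22, 35),
    100_000: (2 * 60 + 33, 4 * 60 + 16),
    200_000: (4 * 60 + 53, 8 * 60 + 19),
}


def throughput(compounds, seconds):
    """Return registered compounds per second."""
    return compounds / seconds


def duplicate_check_overhead(unchecked_s, checked_s):
    """Return how many times longer registration takes with duplicate checking."""
    return checked_s / unchecked_s


if __name__ == "__main__":
    for n, (unchecked, checked) in REGISTRATION_TIMINGS.items():
        print(
            f"{n:>7,} compounds: "
            f"{throughput(n, unchecked):6.0f}/s unchecked, "
            f"{throughput(n, checked):6.0f}/s checked, "
            f"overhead x{duplicate_check_overhead(unchecked, checked):.2f}"
        )
```

On these figures, duplicate checking makes registration roughly 1.6 to 1.7 times slower, and throughput per second does not drop as the batch grows from 10,000 to 200,000 compounds.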
Table types
Table types control the allowed chemical structures and the available operations:
• Molecule
• Reaction
• Combinatorial Markush
• Query
• Any structure
Example interfaces: JSP, ASP
• Example web applications: open-source JSP and ASP examples
– Marvin applets are used for query drawing and structure visualization
• Demo
Integration
• Integration with other ChemAxon tools:
– Custom, uniform chemical representation (Standardizer; see the separate presentation today)
– Automatically calculated properties by Chemical Terms: calculated columns (Calculator plugins)
– Additional similarity calculations (Screen; JChem Base only)
– Tautomer handling:
• Tautomer search
• Tautomer duplicate filter table/index option
• Custom tautomer transforms or canonical tautomer using Standardizer
– Query drawing and structure visualization (Marvin)
Provides the most consistent interface and back-end.
Integration
Additional Cartridge functionality
– JChem index (for non-JChem tables)
– Communication with Oracle optimizer
– Reaction based enumeration (Reactor)
– Format conversions, including image generation
– Markush enumeration (Calculator plugins)
– Property predictions through Chemical Terms (Calculator plugins)
Registration system
• A new registration system component will be introduced in summer 2008 (API only)
• Main features:
– Customizable business logic
• Multilevel duplication control
• Customizable corporate registration ID
• Handling of salts, batches, lots, samples, and mixtures
– Identification, splitting and registration of salt and solvent structures
– Storage of input structures in their original format
– Mock registration (dry run)
– Pre-registration through a transitory area
– Basic, customizable implementation examples
• Separate examples for chemists and registrars
• Web and Instant JChem interfaces will follow later
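The ideas of duplicate control, a customizable registration ID, and mock registration (dry run) can be sketched in a few lines. This is a toy illustration only: it assumes the caller supplies an already standardized, canonical structure string, it implements a single level of duplicate checking rather than multilevel control, and the class name, ID format, and status strings are hypothetical rather than the announced ChemAxon registration API.

```python
class MockRegistry:
    """Toy compound registry; not the ChemAxon registration API."""

    def __init__(self, prefix="CMP", width=6):
        # Maps canonical structure string -> assigned registration ID.
        self._ids_by_structure = {}
        self._prefix = prefix
        self._width = width

    def _next_id(self):
        """Build the next corporate ID, e.g. CMP-000001 (format is an assumption)."""
        number = len(self._ids_by_structure) + 1
        return f"{self._prefix}-{number:0{self._width}d}"

    def register(self, canonical_structure, dry_run=False):
        """Register a structure, or only report what would happen if dry_run is set."""
        existing = self._ids_by_structure.get(canonical_structure)
        if existing is not None:
            return ("duplicate", existing)
        new_id = self._next_id()
        if dry_run:
            # Mock registration: nothing is stored.
            return ("would_register", new_id)
        self._ids_by_structure[canonical_structure] = new_id
        return ("registered", new_id)


if __name__ == "__main__":
    registry = MockRegistry()
    print(registry.register("c1ccccc1", dry_run=True))
    print(registry.register("c1ccccc1"))
    print(registry.register("c1ccccc1"))
    print(registry.register("Oc1ccccc1"))
```

A dry run returns the ID that would be assigned without changing the registry, which lets a chemist check for duplicates before committing a compound.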
What is coming in JChem 5.1
• Structure searching
– Position variation in Markush structures and queries
– Diastereomer search option (same tetrahedral stereo centers, but possibly different configurations)
– Check sp-hybridization search option (substructure)
• Cartridge installer GUI
What is coming in JChem 5.1.X
(In a few months)
• Web Services interface for JChem Base
• Compound registration system API
Under development
• Further improvements of Markush handling (towards patents)
• Flexible 3D pharmacophore searching
• Integration of further ChemAxon functionality in the Cartridge:
– R-group decomposition
– Custom descriptors & similarity measures
• Include JDBC drivers in installer
• JChem for Excel
Summary
• JChem Base, JChem Cartridge and Instant JChem offer comprehensive and efficient chemical database solutions.
• They are integrated with many other ChemAxon products and are accessible from various interfaces.
• Registration system, JChem for Excel and patent Markush handling are coming.
Thank you for your attention!
For more information, please visit www.chemaxon.com