VTF recommendations
• Model-based indicators
  – Covalent geometry (E&H) outliers
  – Protein backbone (Ramachandran) and sidechain (rotamericity, flips) outliers
  – RNA backbone (atypical suites)
  – Carbohydrate chirality and naming
  – Ligands
    • Features not observed in high-quality small-molecule xtal structures and other instances in PDB
  – Packing
    • Bad vdW clashes
    • Underpacking, voids
    • Unusual contacts
    • Unsatisfied H-bond donors, acceptors
VTF recommendations
• Data-based indicators
  – Wilson plot
  – Data anisotropy plot
  – Twinning (Padilla-Yeates plot)
  – Mislabelling of amplitudes / intensities
  – Translational NCS
  – Missed symmetry
• Data- and model-based indicators
  – R, Rfree
    • Reproducibility and difference
  – Real-space R
    • Per-residue measure of fit with 2FoFc map, normalized per residue type
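The data- and model-based indicators above can be sketched in a few lines. This is a minimal illustration, not the package's implementation: it uses the conventional R-factor and real-space R definitions, and the per-residue-type normalization shown here (a Z-score within the supplied set of residues) is an assumption, since the slides do not say which reference population is used. All function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def r_factor(f_obs, f_calc):
    """Conventional R = sum(||Fo| - |Fc||) / sum(|Fo|).

    Rfree is the same quantity computed only over the held-out test reflections.
    """
    num = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    den = sum(abs(fo) for fo in f_obs)
    return num / den


def real_space_r(rho_obs, rho_calc):
    """RSR = sum(|rho_obs - rho_calc|) / sum(|rho_obs + rho_calc|),
    summed over the map grid points around one residue."""
    num = sum(abs(o - c) for o, c in zip(rho_obs, rho_calc))
    den = sum(abs(o + c) for o, c in zip(rho_obs, rho_calc))
    return num / den


def normalize_by_residue_type(rsr_by_residue):
    """Turn raw RSR values into Z-scores within each residue type.

    rsr_by_residue: list of (residue_id, residue_type, rsr) tuples.
    Assumption: the reference distribution is the input set itself.
    """
    by_type = defaultdict(list)
    for _, res_type, rsr in rsr_by_residue:
        by_type[res_type].append(rsr)
    stats = {t: (mean(v), pstdev(v)) for t, v in by_type.items()}

    z_scores = {}
    for res_id, res_type, rsr in rsr_by_residue:
        mu, sigma = stats[res_type]
        # A type seen once (or with identical values) has no spread; report 0.
        z_scores[res_id] = (rsr - mu) / sigma if sigma > 0 else 0.0
    return z_scores
```

Normalizing per residue type keeps large, flexible sidechains from looking systematically worse than small rigid ones when raw RSR values are compared.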
VTF recommendations
• Percentile scores
  – Per criterion, calculate the percentile rank against the whole set of X-ray entries and also against structures in its resolution bin
  – Update the percentiles periodically
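A percentile score of this kind could be computed roughly as follows. The sketch assumes each entry is a dictionary with a `resolution` key and one key per metric; the bin edges, the tie handling, and all names are illustrative assumptions rather than anything specified by the VTF.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(value, population, lower_is_better=True):
    """Percentage of the population that `value` beats, counting ties as half."""
    ordered = sorted(population)
    n = len(ordered)
    below = bisect_left(ordered, value)        # strictly smaller values
    above = n - bisect_right(ordered, value)   # strictly larger values
    tied = n - below - above
    worse = above if lower_is_better else below
    return 100.0 * (worse + 0.5 * tied) / n


def resolution_bin(resolution, edges=(1.5, 2.0, 2.5, 3.0)):
    """Index of the resolution bin; the edges (in Angstrom) are hypothetical."""
    return bisect_right(edges, resolution)


def percentiles_for_entry(entry, archive, metric, lower_is_better=True):
    """Rank one entry against the whole archive and against its resolution bin."""
    all_values = [e[metric] for e in archive]
    own_bin = resolution_bin(entry["resolution"])
    bin_values = [e[metric] for e in archive
                  if resolution_bin(e["resolution"]) == own_bin]
    return {
        "all": percentile_rank(entry[metric], all_values, lower_is_better),
        # An empty bin has no meaningful rank.
        "resolution_bin": (percentile_rank(entry[metric], bin_values, lower_is_better)
                           if bin_values else None),
    }
```

Because the ranks depend on the archive contents, rerunning `percentiles_for_entry` against a refreshed archive is what "update the percentiles periodically" amounts to in this sketch.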
VTF recommendations
• Presentation of results for various consumers
  – Depositors (and annotators)
  – Reviewers
    • Concise PDF report highlighting any unusual features
  – End-users (experts and non-experts)
    • Web-based frontends with adjustable level of detail
  – Developers
    • Webservices and XML files
VTF recommendations
• Validation package
  – Be open-source and freely distributable
    • wwPDB sites, labs, companies
  – Import/wrap existing 3rd party functionality
    • EDS (Uppsala), Molprobity, CCDC Mogul, WhatIf
    • Phenix, CCP4
    • RosettaHoles, pdb-care, DACA, ProSA
  – Calculate recommended validation metrics and publish XML file per entry
  – Present XML contents in various kinds of reports
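To make the "XML file per entry" idea concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The element and attribute names (`validation`, `entry_metrics`, `metric`, `residue`, and so on) are invented for illustration; the slides do not define a schema.

```python
import xml.etree.ElementTree as ET


def build_validation_xml(entry_id, entry_metrics, residue_metrics):
    """Serialize validation results for one entry to an XML string.

    entry_metrics:   {name: (value, percentile_all, percentile_resolution_bin)}
    residue_metrics: {(chain, number, residue_name): {metric_name: value}}
    The layout is a hypothetical schema, not the wwPDB one.
    """
    root = ET.Element("validation", {"entry": entry_id})

    entry_el = ET.SubElement(root, "entry_metrics")
    for name, (value, pct_all, pct_bin) in sorted(entry_metrics.items()):
        ET.SubElement(entry_el, "metric", {
            "name": name,
            "value": str(value),
            "percentile_all": str(pct_all),
            "percentile_resolution_bin": str(pct_bin),
        })

    residues_el = ET.SubElement(root, "residues")
    for (chain, number, res_name), metrics in residue_metrics.items():
        attrs = {"chain": chain, "number": str(number), "name": res_name}
        # XML attribute values must be strings.
        attrs.update({key: str(val) for key, val in metrics.items()})
        ET.SubElement(residues_el, "residue", attrs)

    return ET.tostring(root, encoding="unicode")
```

Keeping all numbers in one machine-readable file lets the PDF reports, web frontends and webservices be generated from the same source without recomputing anything.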
Prototypes – Validation Viewer
Entry viewer
Residue and maps viewer
Raw data and plots of phi-psi, omega, chi, B-factor, occupancy, RSR, RSCC
New ligand-validation functionality
• Mogul is a chemical mining engine developed by CCDC for small-molecule xtal structures in CSD
  – Splits query molecule into bond, angle, torsion and ring substructures
  – Finds comparable substructures from high-quality small-molecule structures in CSD
• Compares query substructures against CSD distributions
  – Bonds, angles: Z scores can be computed
  – Torsions: Z-score is undefined, but the comparison gives an idea of where a torsion lies w.r.t. the distribution
  – Rings: computes query ring’s torsion RMSD against each comparable CSD ring, finds mean, stdev of tRMSDs to estimate a Z score for ring
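The comparison against CSD distributions can be sketched as below. This does not use the Mogul API; it only illustrates the arithmetic on plain lists of values. The function names are hypothetical, ring torsions are assumed to be listed in matching order for query and CSD rings, and the final step from the tRMSD mean and standard deviation to a ring Z-score is left out because the slides do not give that formula.

```python
from math import sqrt
from statistics import mean, stdev


def z_score(query_value, csd_values):
    """Z-score of a query bond length or angle against comparable CSD values."""
    return (query_value - mean(csd_values)) / stdev(csd_values)


def torsion_difference(a, b):
    """Signed difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def torsion_rmsd(query_ring, csd_ring):
    """RMSD over corresponding ring torsions (tRMSD), in degrees."""
    diffs = [torsion_difference(q, c) for q, c in zip(query_ring, csd_ring)]
    return sqrt(sum(d * d for d in diffs) / len(diffs))


def ring_trmsd_summary(query_ring, csd_rings):
    """Mean and standard deviation of the query's tRMSD to each comparable ring.

    Needs at least two CSD rings, since the sample stdev is otherwise undefined.
    """
    trmsds = [torsion_rmsd(query_ring, ring) for ring in csd_rings]
    return mean(trmsds), stdev(trmsds)
```

The wrap-around in `torsion_difference` matters because torsions of 179° and -179° differ by 2°, not 358°; the same periodicity is one reason a plain Z-score is not meaningful for individual torsions.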
Prototypes – Mogul webservice
Distribution for the angle from Mogul
2D & 3D views of ligand
Bonds, angles, torsions, rings with comparable CSD fragments
Upload or select a ligand
Validation package (installed on each site)
mmCIF under deposition
D&A API
Validation XML file (Data, Percentiles)
Distributions Calculator (runs yearly)
Distributions Oracle Database (time-stamped by year)
Distributions Webservice (if DB only at PDBe)
D&A Webservers
D&A clients
Released Validation XML file
D&A pipeline on all sites
wwPDB sites (PDBe - ?)
Public Access
Example annotations
• Atom-level
  – Clashes
• Residue-level
  – Average B factor, occupancy
  – Phi-psi, Rama outliers
  – Sidechain flips, rotamer outliers
  – RNA backbone and pucker values
• Atom-group-level
  – Covalent bond-length and angle outliers
• Chain-level
  – WhatIf Rama score, average RSR, NCS deviation
• Entry-level
  – Rfree, Clash-score
  – Twinning, tNCS, anisotropy, fit to ideal Wilson plot
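Some of the higher-level annotations are roll-ups of lower-level ones. The sketch below shows two such roll-ups under stated assumptions: residue annotations are plain dictionaries with hypothetical `chain` and `rsr` keys, and the clash-score follows the MolProbity convention of clashes per 1000 atoms. How the package itself stores annotations is not described in the slides.

```python
from collections import defaultdict
from statistics import mean


def chain_average_rsr(residue_annotations):
    """Chain-level average RSR from residue-level annotations.

    residue_annotations: iterable of dicts with "chain" and "rsr" keys
    (a hypothetical layout used only for this example).
    """
    by_chain = defaultdict(list)
    for annotation in residue_annotations:
        by_chain[annotation["chain"]].append(annotation["rsr"])
    return {chain: mean(values) for chain, values in by_chain.items()}


def clash_score(n_clashes, n_atoms):
    """Entry-level clash-score: atom-level clashes per 1000 atoms."""
    return 1000.0 * n_clashes / n_atoms
```

Scaling by atom count is what makes the clash-score comparable between small and large entries.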
Summary
• VTF recommendations will be implemented in a validation package.
• The package will consist of modules which import/wrap 3rd party functionality.
• The package will be open-source and freely distributable.
• A process for periodically updating distributions and validation XMLs will be implemented.