iProClass Protein Knowledgebase


Upload: suzuki

Post on 29-Jan-2016

Page 1: iProClass Protein Knowledgebase

Integration of Protein Family, Function, Structure

Rich Links to >90 Databases

Value-Added Reports for UniProtKB Proteins

iProClass Protein Knowledgebase

Page 2: iProClass Protein Knowledgebase

iProClass Text Search

Search!

Select field

Ways to get to iProClass text search

Add (+)/delete (-) input boxes

Search tips:

1- Use “not null” or “null” to search for entries that “contain” or “do not contain” information in the selected search field, respectively. In the present example, we want to search for proteins that have enzymatic activity corresponding to EC 1.14.16.1 and have a 3D structure (PDB ID not null).

2- Use the and/or/not logical operators.
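The logic of the example query (EC number equals 1.14.16.1 AND PDB ID not null) can be sketched in Python over an in-memory list. This is only an illustration of the filter semantics, not how iProClass is implemented; the record layout, the field names, and the placeholder entries (ENTRY_A, XXXX, and so on) are assumptions made for the example.

```python
from typing import Optional

# Hypothetical records; accessions and PDB IDs are placeholders, not real data.
ENTRIES = [
    {"accession": "ENTRY_A", "ec_number": "1.14.16.1", "pdb_id": "XXXX"},
    {"accession": "ENTRY_B", "ec_number": "1.14.16.1", "pdb_id": None},
    {"accession": "ENTRY_C", "ec_number": "1.14.16.2", "pdb_id": "YYYY"},
]


def is_not_null(value: Optional[str]) -> bool:
    """Mimic the "not null" search term: the field holds some information."""
    return value is not None and value.strip() != ""


def search(entries, ec_number):
    """Keep entries whose EC number matches AND whose PDB ID is not null."""
    return [
        entry
        for entry in entries
        if entry["ec_number"] == ec_number and is_not_null(entry["pdb_id"])
    ]


hits = search(ENTRIES, "1.14.16.1")
print([entry["accession"] for entry in hits])
```

Replacing `is_not_null(...)` with `not is_not_null(...)` would correspond to the “null” search term.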

Page 3: iProClass Protein Knowledgebase

iProClass Text Search Result (I)

Things you can do from the result table:

1. Add search terms or start over


2. Customize the table columns

3. Save your results as table or FASTA format

4. Select entries using check boxes and perform analysis using tool bar options

5. Links to protein records, to protein names (BioThesaurus), to protein families (PIRSF)

Page 4: iProClass Protein Knowledgebase

iProClass Text Search Result (II)

2. How to customize the table columns: Display PDB ID column

a- Select PDB ID in the “Fields not in display” box.

b- Use the > button to add the item to the “Fields in display” box.

c- PDB ID should now be in the “Fields in display” box. Press the Apply button for the changes to take effect.

Page 5: iProClass Protein Knowledgebase

iProClass Text Search Result (III)

3. Save your results as table or FASTA format

a- Select Entries using check boxes in the Protein AC/ID column. To select all, check the box in the column heading.

b- Click on “Save Result As: Table” to store the information in the result table. This file can be opened in Excel.

c- Click on FASTA to save protein sequences.
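For readers who want to post-process saved results, the two output styles can be sketched in Python. The sketch assumes the table is tab-delimited text (a common format that Excel opens, but not confirmed by the slides) and uses placeholder identifiers and a made-up sequence; the column names are hypothetical.

```python
import csv
import io


def to_table(rows, columns):
    """Serialize selected rows as tab-delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        writer.writerow([row.get(column, "") for column in columns])
    return buffer.getvalue()


def to_fasta(records, width=60):
    """Serialize (identifier, sequence) pairs as FASTA, wrapping sequence lines."""
    lines = []
    for identifier, sequence in records:
        lines.append(f">{identifier}")
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"


# Placeholder data, not taken from iProClass.
print(to_table([{"id": "demo", "pdb": "XXXX"}], ["id", "pdb"]))
print(to_fasta([("demo", "ACDEFGHIKL")], width=4))
```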

Page 6: iProClass Protein Knowledgebase

iProClass Text Search Result (IV)

4. Select entries using check boxes and perform analysis using tool bar options

a- Select entries using the check boxes in the Protein AC/ID column. To select all, check the box in the column heading. Then select a tool, e.g., Domain Display.

Domain Display shows the Pfam domains present in the selected proteins.

Page 7: iProClass Protein Knowledgebase

iProClass Text Search Result (V)

5. Links to protein records, to protein names (BioThesaurus), to protein families (PIRSF)

Link to protein reports

Link to protein names

Link to taxonomy

Link to PIRSF report

Link to pre-computed BLAST

Page 8: iProClass Protein Knowledgebase

iProClass Protein Report (I)

Pre-computed BLAST

Rich links & extensive cross-references

Shows ID correspondence to other databases

See protein synonyms

Page 9: iProClass Protein Knowledgebase

iProClass Protein Report (II)

Integrated added-value information from other databases

Page 10: iProClass Protein Knowledgebase

iProClass Protein Report (III)

Links to different protein family classification databases

Interactive Domain and Sequence Display

Page 11: iProClass Protein Knowledgebase

iProClass Text Search Result (VI)

See protein synonyms and the source attribution

Page 12: iProClass Protein Knowledgebase

iProClass Text Search Result (VII)

Related Sequences (pre-computed BLAST) shows proteins similar to the query. Retrieving these pre-computed results is significantly faster than running BLAST in real time, and a low number of related sequences may indicate a tight protein cluster.

Page 13: iProClass Protein Knowledgebase

iProClass Protein Knowledgebase

Page 14: iProClass Protein Knowledgebase

Batch Retrieval in iProClass

If possible, specify the type of ID

397983330413124660393

Due to the diversity of databases and the lack of consistency in protein/gene names and/or identifiers in the literature, it can be difficult to retrieve multiple entries when protein and gene identifiers come from different sources. The batch retrieval tool overcomes this problem and provides high flexibility, allowing the retrieval of multiple entries from the iProClass database by selecting a specific identifier or a combination of them.
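The advice to specify the ID type can be illustrated with a small Python sketch that sorts a mixed list of identifiers before submission. It uses the accession-number pattern published in the UniProt documentation; everything else (the function names, the category labels, and the decision to flag purely numeric IDs as ambiguous) is an assumption for illustration and not part of the batch retrieval tool.

```python
import re

# UniProtKB accession pattern as given in the UniProt documentation.
UNIPROT_AC = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def guess_id_type(identifier):
    """Return a rough, hypothetical category label for one identifier."""
    token = identifier.strip()
    if UNIPROT_AC.fullmatch(token):
        return "uniprotkb_accession"
    if token.isdigit():
        # A bare number could come from several numeric ID schemes,
        # which is why stating the ID type explicitly helps.
        return "numeric_ambiguous"
    return "unknown"


def group_ids(raw_text):
    """Split whitespace-separated input and group the tokens by guessed type."""
    groups = {}
    for token in raw_text.split():
        groups.setdefault(guess_id_type(token), []).append(token)
    return groups


print(group_ids("P04176 12345 foo"))
```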

Page 15: iProClass Protein Knowledgebase

Batch Retrieval Result Page

Choose columns to be displayed

Retrieve more sequences

Links to iProClass and UniProtKB reports

Page 16: iProClass Protein Knowledgebase

iProClass Protein Knowledgebase

Page 17: iProClass Protein Knowledgebase

Search a Pattern in iProClass

A pattern is a formula (regular expression) that represents the conserved region of a group of related proteins.

PROSITE is a database that contains patterns and profiles specific for more than a thousand protein families or domains.

Pattern search at PIR allows:

1- The search of a specific PROSITE or user-defined pattern against one of the following sequence databases:

(i) UniProtKB, the central hub for the collection of functional information on proteins, with accurate, consistent, and rich annotation. It consists of two sections: manually annotated records (UniProtKB/Swiss-Prot) and computationally analyzed records that await full manual annotation (UniProtKB/TrEMBL).

(ii) A subset of UniProtKB entries belonging to a certain organism or taxon group.

(iii) UniRef100, which provides clustered sets of sequences at 100% identity from UniProtKB (including splice variants and isoforms) and selected UniParc records.

Enter pattern: P-D-x(2)-H-[DE]-[LIVF]-[LIVMFY]-G-H-[LIVMC]-[PA]

Enter PROSITE ID
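To make the "pattern as regular expression" idea concrete, here is a minimal Python sketch that translates a PROSITE-style pattern such as the one above into a Python regex and reports where it matches. It covers only the common syntax elements (single residues, x, [...] and {...} sets, (n) and (n,m) repeats, and the < and > anchors) and is not the PIR implementation; the function names and the test sequence are made up for the example.

```python
import re

ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, low, high = match.groups()
        if residue == "x":  # any amino acid
            core = "."
        elif residue.startswith("{"):  # any amino acid except those listed
            core = "[^" + residue[1:-1] + "]"
        else:  # a single residue or a [...] set
            core = residue
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + suffix)
    return "".join(parts)


def find_matches(pattern, sequence):
    """Return (start, end, matched_text) with 1-based, inclusive positions."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.end(), m.group()) for m in regex.finditer(sequence)]


PATTERN = "P-D-x(2)-H-[DE]-[LIVF]-[LIVMFY]-G-H-[LIVMC]-[PA]"
print(prosite_to_regex(PATTERN))
# Synthetic sequence constructed to contain one match.
print(find_matches(PATTERN, "AAPDKKHDLLGHLPAA"))
```

Note that `re.finditer` reports non-overlapping matches only, so overlapping occurrences of a pattern would need a lookahead-based search.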

Page 18: iProClass Protein Knowledgebase

Search a Pattern Result in iProClass

Sequence range where pattern is found

Display the query pattern

Page 19: iProClass Protein Knowledgebase

Search a Pattern in iProClass

Pattern search at PIR allows:

2- The search of PROSITE patterns (note that profiles are excluded) in a query sequence, by entering either the sequence in single-letter amino acid code or its unique ID.

MNDRADFVVPDITTRKNVGLSHDANDFTLPQPLDRYSAEDHATWATLYQRQCKLLPGRACDEFMEGLERLEVD

Enter sequence

Enter ID

Link to PROSITE documentation
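The reverse direction, scanning one query sequence against a collection of patterns, can be sketched as follows. The pattern library here is a hypothetical stand-in: the IDs are invented and the patterns are written directly as Python regular expressions, so this does not reflect the actual PROSITE content or the PIR service.

```python
import re

# Hypothetical pattern library: invented IDs mapped to regular expressions.
PATTERN_LIBRARY = {
    "DEMO_PATTERN_1": r"PD.{2}H[DE][LIVF][LIVMFY]GH[LIVMC][PA]",
    "DEMO_PATTERN_2": r"C.{2}C",
}


def scan_sequence(sequence, library):
    """Return (pattern_id, start, end) hits with 1-based, inclusive positions."""
    hits = []
    for pattern_id, regex in library.items():
        for match in re.finditer(regex, sequence):
            hits.append((pattern_id, match.start() + 1, match.end()))
    # Order hits by position along the sequence, then by pattern ID.
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


# Synthetic query sequence, not a real protein.
print(scan_sequence("AACKKCAA", PATTERN_LIBRARY))
```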

Page 20: iProClass Protein Knowledgebase

iProClass Protein Knowledgebase

Page 21: iProClass Protein Knowledgebase

Protein ID Mapping Service

Maps between UniProtKB and more than 30 other data sources to support data interoperability among disparate data sources and to allow integration and querying of data from heterogeneous molecular biology databases.

Enter IDs

Load file with ID list

Page 22: iProClass Protein Knowledgebase

Protein ID Mapping Service

Example: we want to obtain a list of Entrez Gene IDs for a group of UniProtKB proteins.

Enter IDs

IDs can be cut and pasted if needed or saved as a text file using the "save as" option provided by your web browser.

P04176 P16331 P00439 P17276

Select ID type for source database

Select ID type for target database

Mapping
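Conceptually, the mapping step is a lookup from source IDs to target IDs. The Python sketch below shows that idea with a two-column, tab-delimited table; the table layout, the function names, and all identifiers (SRC_1, TGT_10, and so on) are placeholders chosen for illustration and are not real UniProtKB or Entrez Gene IDs.

```python
def parse_mapping_table(text):
    """Read 'source<TAB>target' lines into a dict of source -> list of targets."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        source, target = line.split("\t")
        table.setdefault(source, []).append(target)
    return table


def map_ids(ids, table):
    """Return (mapped, unmapped): targets per source ID, and IDs with no mapping."""
    mapped, unmapped = {}, []
    for identifier in ids:
        targets = table.get(identifier)
        if targets:
            mapped[identifier] = targets
        else:
            unmapped.append(identifier)
    return mapped, unmapped


# Placeholder mapping table; one source ID may map to several target IDs.
TABLE = parse_mapping_table("SRC_1\tTGT_10\nSRC_1\tTGT_11\nSRC_2\tTGT_20\n")
print(map_ids("SRC_1 SRC_2 SRC_3".split(), TABLE))
```

Returning the unmapped IDs separately makes it easy to retry them with a different source ID type.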

Page 23: iProClass Protein Knowledgebase

iProClass Protein Knowledgebase

Cite iProClass:

The iProClass integrated database for protein functional analysis. Wu CH, Huang H, Nikolskaya A, Hu Z, Yeh LS, Barker WC. Computational Biology and Chemistry, 28: 87-96, 2004.

iProClass Distribution:

iProClass is freely available for academic institutions. Vendors and commercial entities who want to use and/or redistribute iProClass need to contact PIR to request a license ([email protected]).