Protein Feature View: Overview
The Protein Feature View provides graphical summaries of full-length protein sequences from UniProtKB and how they relate to PDB entries. It loads annotations from external databases such as Pfam and Phosphosite, domain annotations from SCOP and SCOPe, and regions for which homology models are available from the Protein Model Portal. Additional tracks display predicted regions of protein disorder (computed with JRONN) and hydrophobic regions (computed using a sliding-window approach). Predicted disordered and hydrophobic regions are shown in red; ordered and hydrophilic regions are shown in blue.
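The sliding-window idea behind the hydrophobicity track can be illustrated with a minimal Python sketch. The page does not state which hydrophobicity scale, window size, or threshold the Protein Feature View uses; the Kyte-Doolittle scale, the window of 9 residues, the threshold of 0, and the function names below are all assumptions chosen for illustration.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the actual scale
# used by the Protein Feature View is not stated on this page).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sliding_window_hydropathy(sequence, window=9):
    """Return the mean hydropathy of every full window along the sequence.

    Only the 20 standard amino acids are supported; any other letter
    raises KeyError.
    """
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def color_for_score(mean_score, threshold=0.0):
    """Map a window mean to the track colors described above.

    The threshold of 0.0 is a hypothetical cutoff, not a documented one.
    """
    return "red" if mean_score > threshold else "blue"
```

A sequence of length n yields n - window + 1 window means, each of which can be assigned to the residue at the center of its window before coloring.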
For human proteins that can be mapped to the human genome, a track describes the projection of the protein structure onto the genome. The Protein Feature View is currently available for all Swiss-Prot entries (including those without PDB structures), as well as the small subset of TrEMBL entries that can be linked to PDB entries.
By default, representative PDB entries are displayed to provide an overview of the UniProtKB sequence regions that are covered by PDB entries. The view can be expanded by pressing the "+" icon or by selecting the "Extended" menu option to show all available PDB entries, which in some cases are numerous.
Protein Feature Header Section
Learn more about the protein
The header section of the Protein Feature View displays information from UniProtKB about the function, catalytic activity and subunit structure (if available) of the sequence. The header section also contains an option to select Protein Feature Views from related organisms with the same gene name.
Protein Feature View for Ribulose bisphosphate carboxylase large chain (P23755). Other organisms with the same gene name can be selected from the menu. The number of available PDB structures is shown in gray circles.
The Action button has an option to map sequence motifs in the Protein Feature View as shown below.
Active site sequence motif Gx[DN]FxKxDE (Ribulose bisphosphate carboxylase large chain active site) mapped onto the Protein Feature View (red box around the mapped region). Note that x matches any amino acid, and [DN] matches either D or N. See the Sequence Motif help page for details.
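As a rough illustration of the motif syntax, the following Python sketch translates a motif such as Gx[DN]FxKxDE into a regular expression and reports where it matches. It covers only the two constructs mentioned above (x for any residue, and bracketed alternatives); the function names, the treatment of both x and X as wildcards, and the test sequence are assumptions made for this example, not the site's actual implementation.

```python
import re


def motif_to_regex(motif):
    """Translate a simple sequence motif into a regular expression string.

    Outside of brackets, 'x' (or 'X') becomes '.', which matches any
    single residue. Bracketed groups such as [DN] are passed through.
    """
    parts = []
    in_brackets = False
    for ch in motif:
        if ch == "[":
            in_brackets = True
            parts.append(ch)
        elif ch == "]":
            in_brackets = False
            parts.append(ch)
        elif ch in "xX" and not in_brackets:
            parts.append(".")
        else:
            parts.append(ch.upper())
    return "".join(parts)


def find_motif(motif, sequence):
    """Return 1-based, inclusive (start, end) positions of each match.

    re.finditer reports non-overlapping matches only.
    """
    pattern = re.compile(motif_to_regex(motif))
    return [(m.start() + 1, m.end()) for m in pattern.finditer(sequence.upper())]


# Made-up sequence for illustration (not the real P23755 sequence).
print(find_motif("Gx[DN]FxKxDE", "MMGADFTKSDEAA"))
```

The returned positions correspond to the region that would be boxed on the sequence track.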
Quality of Protein Structures
On Structure Summary pages, the Protein Feature View shows a track that provides a per-residue perspective of the wwPDB validation report.
The track uses color coding to indicate the number of bond angle outliers, bond length outliers, and clashes for a given residue.
- Green: no outliers
- Yellow: 1 outlier
- Orange: 2 outliers
- Red: 3 or more outliers
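The color scheme above amounts to a simple lookup on the per-residue outlier count. A minimal Python sketch follows; the function name is hypothetical, and how the count itself is tallied from the validation report is not specified here, so the function takes the count as given.

```python
def outlier_color(outlier_count):
    """Map a per-residue outlier count to the validation track color."""
    if outlier_count < 0:
        raise ValueError("outlier_count must not be negative")
    if outlier_count == 0:
        return "green"
    if outlier_count == 1:
        return "yellow"
    if outlier_count == 2:
        return "orange"
    return "red"  # 3 or more outliers


# Example: colors for a short run of residues with hypothetical counts.
print([outlier_color(n) for n in [0, 1, 2, 3, 5]])
```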
The example below shows one of the chains of PDB ID 4HHB, a structure of hemoglobin originally released in 1984. Because modern refinement and validation tools were not available in 1984, the validation track is mostly orange and red, reflecting a large number of geometric outliers.
By comparison, the validation track for one of the chains of the hemoglobin structure PDB ID 2W72 (see below), which was released in 2009, shows far fewer geometric outliers, although several residues fit the electron density poorly (RSRZ > 2).
For more details on wwPDB validation reports, please see the wwPDB website or read the article that describes the recommendations of the X-ray Validation Task Force.
What do all the tracks on the Protein Feature View represent?
The vertical color bar to the left indicates data provenance.