Searches protein and nucleic acid sequences using the MMseqs2 method, which finds similar protein or nucleic acid chains in the PDB. MMseqs2 is similar to the well-known BLAST method, but achieves better performance at comparable levels of sensitivity. See the corresponding publication for more details.
Sequences can be searched in two ways:
- By PDB ID and Chain ID. Type a PDB ID in the PDB ID text box and select a Chain ID from the pull-down menu. This is useful for finding all sequences that are similar to the sequence of the specified chain.
- By sequence. Paste the sequence in one-letter code format in the Sequence text box. Be sure to remove any other information that might be at the top of the pasted sequence (e.g. FASTA headers).
Note: sequences must be at least 20 residues long. For shorter sequences try the Sequence Motif Search.
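The clean-up described above can be sketched in a few lines of Python. This is only an illustration: the function names `clean_sequence` and `is_long_enough` are hypothetical helpers and are not part of the search tool. The sketch assumes that FASTA header lines start with `>` and uses the 20-residue minimum from the note above.

```python
MIN_LENGTH = 20  # minimum query length stated in the note above


def clean_sequence(pasted: str) -> str:
    """Strip FASTA header lines and whitespace from pasted text (hypothetical helper)."""
    residues = []
    for line in pasted.splitlines():
        line = line.strip()
        # Skip blank lines and FASTA header lines, which start with '>'
        if not line or line.startswith(">"):
            continue
        # Drop any internal whitespace and normalise to upper case
        residues.append("".join(line.split()).upper())
    return "".join(residues)


def is_long_enough(sequence: str, minimum: int = MIN_LENGTH) -> bool:
    """Return True if the sequence meets the minimum query length."""
    return len(sequence) >= minimum
```

For example, passing text that contains a header line followed by two sequence lines returns the residues as one continuous string, which can then be checked with `is_long_enough` before it is pasted into the Sequence text box.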
The E value, or Expect value, is the number of hits one can expect to see by chance alone when searching a database of a particular size. For example, an E value of 1 means that about one hit with a similar score is expected purely by chance. Because the scoring takes chain length into consideration, a short sequence can produce an identical match that still has a high E value.
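To see why a short identical match can still have a high E value, the sketch below uses the textbook Karlin-Altschul relation from BLAST-style statistics, E = m * n * 2^(-S), where m is the query length, n is the total database length, and S is the bit score. Whether this search computes its E values in exactly this way is an assumption made for illustration only. The database size and the figure of 2 bits per identical residue are likewise made-up values.

```python
def expected_hits(query_length: int, database_length: int, bit_score: float) -> float:
    """Karlin-Altschul estimate of chance hits: E = m * n * 2**(-S)."""
    return query_length * database_length * 2.0 ** (-bit_score)


# Illustrative assumptions, not values taken from the search service
DATABASE_LENGTH = 100_000_000  # total residues in a hypothetical database
BITS_PER_IDENTITY = 2.0        # assumed score contribution of one identical residue

for length in (20, 50, 100):
    # A perfect (100% identical) match over the full query length
    score = BITS_PER_IDENTITY * length
    e_value = expected_hits(length, DATABASE_LENGTH, score)
    print(f"length={length:4d}  bit score={score:6.1f}  E={e_value:.3g}")
```

Under these assumptions the bit score of a perfect match grows in proportion to its length, so the E value shrinks exponentially as the query gets longer. A perfect match to a short query therefore scores far less significantly than a perfect match to a long one.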
The Sequence Identity Cutoff (%) filter removes entries whose sequence identity to the query is below the cutoff. The cutoff is a percentage between 0 and 100.
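The effect of the two filters can be sketched as follows. The `Hit` record and the `filter_hits` function are hypothetical and do not reflect how the search service stores or filters its results. The sketch assumes that a hit is kept when its identity is at or above the cutoff and its E value is at or below the E value cutoff.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    """Hypothetical search hit; the field names are illustrative only."""
    entry: str
    identity_percent: float
    e_value: float


def filter_hits(hits: List[Hit], identity_cutoff: float, evalue_cutoff: float) -> List[Hit]:
    """Keep hits that pass both the identity and the E value cutoffs."""
    # The identity cutoff is a percentage, so it must lie between 0 and 100
    if not 0 <= identity_cutoff <= 100:
        raise ValueError("identity cutoff must be between 0 and 100")
    return [
        hit for hit in hits
        # Assumed boundary behaviour: values equal to a cutoff are kept
        if hit.identity_percent >= identity_cutoff and hit.e_value <= evalue_cutoff
    ]
```

With an identity cutoff of 90 and an E value cutoff of 1, a hit at 95% identity and a very small E value is kept, while a hit at 35% identity is removed. A 100% identical hit with an E value of 5 is also removed, which is the short-sequence case described above.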