Many of the laboratories that have developed structure comparison algorithms have also undertaken the difficult task of classifying protein structures. The Protein Data Bank, or PDB [Bernstein et al., 1977], is the most obvious source of material, since it is where the majority of experimentally determined structures are deposited and it is freely available to all. There are currently on the order of 5000 crystallographic protein structures and 900 NMR protein structures in the PDB, in addition to a number of nucleic acid, carbohydrate and peptide structures and theoretical models. Pre-processing and pairwise comparison of the proteins in the PDB is a major task. As discussed above, the splitting of multi-domain proteins into their constituent domains is difficult to automate, yet it is an essential part of any classification effort.
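To give a rough sense of why pairwise comparison is a major task, the short sketch below counts the unordered pairs in an all-against-all comparison. It uses only the approximate counts quoted above and assumes, purely for illustration, that each entry is compared as a single unit with no redundancy filtering. The function name `pairwise_comparisons` is hypothetical, and splitting entries into domains would raise the count further.

```python
from math import comb

# Approximate counts quoted in the text (illustrative, not current PDB figures).
N_CRYSTAL = 5000
N_NMR = 900


def pairwise_comparisons(n_structures: int) -> int:
    """Number of unordered pairs among n structures: n * (n - 1) / 2."""
    return comb(n_structures, 2)


n_total = N_CRYSTAL + N_NMR
n_pairs = pairwise_comparisons(n_total)
print(f"{n_total} structures -> {n_pairs:,} pairwise comparisons")
```

Because the count grows quadratically with the number of structures, doubling the size of the data set roughly quadruples the number of comparisons.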