A number of successful methods [Bowie et al., 1990, Bowie et al., 1991, Abagyan et al., 1994, Matsuo & Nishikawa, 1995, Hubbard & Park, 1995, Fischer & Eisenberg, 1996, Defay & Cohen, 1996, Taylor, 1997, Rost et al., 1997, Rice & Eisenberg, 1997, amongst others] encode 3D structural information into strings of symbols or profiles, against which 1D strings derived from the query sequence are aligned. Bowie et al. [1991] defined 18 structural environments on the basis of secondary structure, solvent accessibility and burial by polar atoms. Profiles of scores for each of the 20 amino acids were then calculated for each structural environment class from their observed frequencies in a database of structures. More recently, Abagyan et al. [1994] calculated profiles based on the side-chain modelling energies of alternative amino acid substitutions in the library structure. An earlier attempt at side-chain replacement for fold recognition was too rigid to detect any but very similar proteins [Ponder & Richards, 1987].
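The frequency-based environment profile described above can be illustrated with a minimal sketch. It assumes a log-odds form, ln(P(aa | env) / P(aa)), with a pseudocount for unobserved pairs; the function name `environment_profile`, the pseudocount, and the toy environment labels are illustrative assumptions and are not taken from the cited work.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def environment_profile(observations, pseudocount=1.0, alphabet=AMINO_ACIDS):
    """Build log-odds scores from (environment_class, amino_acid) pairs.

    Returns {environment: {amino_acid: ln(P(aa | env) / P(aa))}}.
    The pseudocount is an assumption added to avoid log(0).
    """
    env_counts = {}
    background = Counter()
    for env, aa in observations:
        env_counts.setdefault(env, Counter())[aa] += 1
        background[aa] += 1

    total = sum(background.values())
    n = len(alphabet)
    scores = {}
    for env, counts in env_counts.items():
        env_total = sum(counts.values())
        scores[env] = {}
        for aa in alphabet:
            p_aa_given_env = (counts[aa] + pseudocount) / (env_total + pseudocount * n)
            p_aa = (background[aa] + pseudocount) / (total + pseudocount * n)
            scores[env][aa] = math.log(p_aa_given_env / p_aa)
    return scores

# Toy data: two hypothetical environment classes instead of the 18 in the text.
toy_observations = (
    [("buried", "L")] * 8 + [("buried", "K")] * 2
    + [("exposed", "K")] * 8 + [("exposed", "L")] * 2
)
toy_scores = environment_profile(toy_observations)
```

With these toy counts, leucine scores positively in the hypothetical "buried" class and lysine negatively, which is the qualitative behaviour such a profile is meant to capture.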
Another algorithm, developed by Bowie et al. [1990], transformed query sequences into strings of characters representing three levels of conserved hydrophobicity. Library structures were likewise converted into strings representing three levels of solvent accessibility. Dynamic programming alignments using a substitution matrix derived from database counts were able to detect remote homologies. More recently, a large number of methods have followed this paradigm [Fischer & Eisenberg, 1996, Defay & Cohen, 1996, Hubbard & Park, 1995, Rice & Eisenberg, 1997, Rost et al., 1997]; most have included secondary structure prediction information (see below) for the query sequence.
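The string-against-string alignment can be sketched as a standard global dynamic programming recurrence. This is a minimal illustration under stated assumptions: the class symbols (`H`, `M`, `P` for query hydrophobicity; `b`, `i`, `e` for library accessibility), the values in `TOY_MATRIX`, and the linear gap penalty are all invented for the example, whereas the original matrix was derived from database counts.

```python
# Hypothetical scores for pairing a query hydrophobicity class
# (H = high, M = medium, P = polar) with a library accessibility
# class (b = buried, i = intermediate, e = exposed).
TOY_MATRIX = {
    ("H", "b"): 2.0, ("H", "i"): 0.0, ("H", "e"): -2.0,
    ("M", "b"): 0.0, ("M", "i"): 1.0, ("M", "e"): 0.0,
    ("P", "b"): -2.0, ("P", "i"): 0.0, ("P", "e"): 2.0,
}

def align_score(query_classes, library_classes, substitution, gap=-2.0):
    """Return the best global alignment score between two class strings.

    Uses a linear gap penalty (an assumption for this sketch).
    """
    rows = len(query_classes) + 1
    cols = len(library_classes) + 1
    dp = [[0.0] * cols for _ in range(rows)]

    # Leading gaps in either string are penalised.
    for i in range(1, rows):
        dp[i][0] = i * gap
    for j in range(1, cols):
        dp[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            pair = (query_classes[i - 1], library_classes[j - 1])
            dp[i][j] = max(
                dp[i - 1][j - 1] + substitution[pair],  # align the two positions
                dp[i - 1][j] + gap,                     # gap in the library string
                dp[i][j - 1] + gap,                     # gap in the query string
            )
    return dp[-1][-1]

compatible = align_score("HHPP", "bbee", TOY_MATRIX)
incompatible = align_score("HHPP", "eebb", TOY_MATRIX)
```

Under the toy matrix, a hydrophobic-then-polar query scores higher against a buried-then-exposed library string than against the reverse pattern. A real application would rank all library structures by such scores.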