Classification of protein structure Threading (protein sequence)
1 Classification of protein structure
1.1 Method
1.2 Comparison with homology modeling
1.3 More about threading
Classification of protein structure
The Structural Classification of Proteins (SCOP) database provides a detailed and comprehensive description of the structural and evolutionary relationships of proteins of known structure. Proteins are classified to reflect both structural and evolutionary relatedness. Many levels exist in the hierarchy, but the principal levels are family, superfamily, and fold, described below.
Family (clear evolutionary relationship): Proteins clustered together into families are clearly evolutionarily related. Generally, this means that pairwise residue identities between the proteins are 30% or greater. However, in some cases similar functions and structures provide definitive evidence of common descent in the absence of high sequence identity; for example, many globins form a family even though some members have sequence identities of only 15%.
Superfamily (probable common evolutionary origin): Proteins that have low sequence identities, but whose structural and functional features suggest that a common evolutionary origin is probable, are placed together in superfamilies. For example, actin, the ATPase domain of the heat shock protein, and hexokinase together form a superfamily.
Fold (major structural similarity): Proteins are defined as having a common fold if they have the same major secondary structures in the same arrangement and with the same topological connections. Different proteins with the same fold often have peripheral elements of secondary structure and turn regions that differ in size and conformation. In some cases, these differing peripheral regions may comprise half the structure. Proteins placed together in the same fold category may not have a common evolutionary origin: the structural similarities could arise simply from the physics and chemistry of proteins favoring certain packing arrangements and chain topologies.
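The 30% figure in the family definition refers to pairwise residue identity, which can be computed from two already-aligned sequences. The following is a minimal sketch, assuming the alignment is given as two equal-length strings with "-" for gaps and that identity is counted only over columns where both sequences have a residue (other denominators are also in use). The function names and the threshold constant are illustrative, not part of SCOP.

```python
FAMILY_IDENTITY_THRESHOLD = 30.0  # rule of thumb quoted in the text, in percent


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def suggests_same_family(aligned_a: str, aligned_b: str) -> bool:
    """True if the pair meets the 30% rule of thumb; not a substitute for curation."""
    return percent_identity(aligned_a, aligned_b) >= FAMILY_IDENTITY_THRESHOLD
```

As the globin example shows, a value below the threshold does not rule out family membership, so a check like this can only suggest a relationship, never exclude one.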
Method
A general paradigm of protein threading consists of the following four steps:
The construction of a structure template database: Select protein structures from the protein structure databases as structural templates. This generally involves selecting protein structures from databases such as PDB, FSSP, SCOP, or CATH, after removing protein structures with high sequence similarities.
The design of the scoring function: Design a good scoring function to measure the fitness between target sequences and templates, based on knowledge of the known relationships between structures and sequences. A good scoring function should contain mutation potential, environment fitness potential, pairwise potential, secondary structure compatibilities, and gap penalties. The quality of the energy function is closely related to the prediction accuracy, especially the alignment accuracy.
Threading alignment: Align the target sequence with each of the structure templates by optimizing the designed scoring function. This step is one of the major tasks of all threading-based structure prediction programs that take the pairwise contact potential into account; otherwise, a dynamic programming algorithm can fulfill it.
Threading prediction: Select the threading alignment that is statistically most probable as the threading prediction. Then construct a structure model for the target by placing the backbone atoms of the target sequence at their aligned backbone positions in the selected structural template.
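The alignment and selection steps above can be sketched for the simple case in which the scoring function has no pairwise potential, so plain dynamic programming applies. This is a toy illustration under stated assumptions: the template is reduced to a burial flag and a secondary structure letter per position, the residue grouping and all score values are invented for the example, and the best template is chosen by raw score, not by a statistical significance estimate.

```python
from dataclasses import dataclass

HYDROPHOBIC = set("AVLIMFWC")  # simplistic residue grouping (assumption)


@dataclass
class TemplatePosition:
    buried: bool      # True if the template position lies in the protein core
    secondary: str    # 'H' (helix), 'E' (strand) or 'C' (coil)


def position_score(residue, predicted_ss, pos):
    """Toy score: environment fitness plus secondary structure compatibility."""
    env = 1.0 if (residue in HYDROPHOBIC) == pos.buried else -1.0
    ss = 0.5 if predicted_ss == pos.secondary else -0.5
    return env + ss


def thread(sequence, predicted_ss, template, gap_penalty=-1.0):
    """Globally align a sequence to a template (Needleman-Wunsch style).

    Returns (score, alignment); alignment pairs are (sequence index,
    template index), with None marking a gap on that side.
    """
    n, m = len(sequence), len(template)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap_penalty
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap_penalty

    def match(i, j):
        return position_score(sequence[i - 1], predicted_ss[i - 1], template[j - 1])

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + match(i, j),
                              score[i - 1][j] + gap_penalty,
                              score[i][j - 1] + gap_penalty)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    alignment, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + match(i, j):
            alignment.append((i - 1, j - 1)); i -= 1; j -= 1
        elif i > 0 and (j == 0 or score[i][j] == score[i - 1][j] + gap_penalty):
            alignment.append((i - 1, None)); i -= 1
        else:
            alignment.append((None, j - 1)); j -= 1
    alignment.reverse()
    return score[n][m], alignment


def best_template(sequence, predicted_ss, library):
    """Return the name of the library template with the highest raw score."""
    return max(library, key=lambda name: thread(sequence, predicted_ss, library[name])[0])
```

Adding a pairwise contact potential makes the score of one position depend on where other residues are placed, which is why this recurrence no longer suffices for that case.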
Comparison with homology modeling
Homology modeling and protein threading are both template-based methods, and there is no rigorous boundary between them in terms of prediction techniques. However, the protein structures of their targets are different. Homology modeling is for targets that have homologous proteins with known structure (usually of the same family), while protein threading is for targets with only fold-level homology found. In other words, homology modeling is for "easier" targets and protein threading is for "harder" targets.
Homology modeling treats the template in an alignment as a sequence, and only sequence homology is used for prediction. Protein threading treats the template in an alignment as a structure, and both sequence and structure information extracted from the alignment are used for prediction. When no significant homology is found, protein threading can still make a prediction based on the structure information. This also explains why protein threading may be more effective than homology modeling in many cases.
In practice, when the sequence identity in a sequence-sequence alignment is low (i.e. <25%), homology modeling may not produce a significant prediction. In this case, if distant homology is found for the target, protein threading can generate a good prediction.
More about threading
Fold recognition methods can be broadly divided into two types: (1) those that derive a 1-D profile for each structure in the fold library and align the target sequence to these profiles; and (2) those that consider the full 3-D structure of the protein template. A simple example of a profile representation would be to take each amino acid in the structure and label it according to whether it is buried in the core of the protein or exposed on the surface. More elaborate profiles might take into account the local secondary structure (e.g. whether the amino acid is part of an alpha helix) or evolutionary information (how conserved the amino acid is). In the 3-D representation, the structure is modeled as a set of inter-atomic distances, i.e. the distances are calculated between some or all of the atom pairs in the structure. This is a much richer and far more flexible description of the structure, but it is much harder to use in calculating an alignment. The profile-based fold recognition approach was first described by Bowie, Lüthy and David Eisenberg in 1991. The term threading was first coined by David Jones, William R. Taylor and Janet Thornton in 1992, and originally referred specifically to the use of a full 3-D structure atomic representation of the protein template in fold recognition. Today, the terms threading and fold recognition are frequently (though somewhat incorrectly) used interchangeably.
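The two representations can be illustrated with a small sketch. It assumes one coordinate per residue (for instance the C-alpha atom) and approximates burial by counting neighboring residues within a cutoff. The cutoff radius and neighbor threshold are arbitrary illustration values, and practical methods typically use solvent accessibility instead.

```python
import math


def distance_matrix(coords):
    """3-D representation: all pairwise distances between residue coordinates."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def burial_profile(coords, radius=10.0, min_neighbors=2):
    """1-D profile: 'B' (buried) or 'E' (exposed) per residue.

    A residue is called buried when at least `min_neighbors` other residues
    lie within `radius` of it (a crude stand-in for solvent accessibility).
    """
    labels = []
    for i, row in enumerate(distance_matrix(coords)):
        neighbors = sum(1 for j, d in enumerate(row) if j != i and d <= radius)
        labels.append("B" if neighbors >= min_neighbors else "E")
    return "".join(labels)
```

The profile is a string that can be aligned to a sequence position by position, whereas the distance matrix ties every position to every other one, which is what makes alignment against the 3-D representation harder.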
Fold recognition methods are widely used and effective because it is believed that there is a strictly limited number of different protein folds in nature, mostly as a result of evolution but also due to constraints imposed by the basic physics and chemistry of polypeptide chains. There is, therefore, a good chance (currently 70-80%) that a protein with a fold similar to that of the target protein has already been studied by X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy and can be found in the PDB. About 1300 different protein folds are currently known, but new folds are still being discovered every year, due in significant part to the ongoing structural genomics projects.
Many different algorithms have been proposed for finding the correct threading of a sequence onto a structure, though many make use of dynamic programming in some form. For full 3-D threading, the problem of identifying the best alignment is very difficult (it is an NP-hard problem for some models of threading). Researchers have made use of many combinatorial optimization methods, such as conditional random fields, simulated annealing, branch and bound, and linear programming, in searching for heuristic solutions.
It is interesting to compare threading methods with methods that attempt to align two protein structures (protein structural alignment), and indeed many of the same algorithms have been applied to both problems.