SFLD Detailed Description
The Structure-Function Linkage Database (SFLD) is a tool for the investigation of protein sequence, structure, and function. In particular, it aims to provide explicit information concerning how a given protein, or family of proteins, delivers chemical functionality. It was developed by the Babbitt Laboratory in collaboration with the UCSF Resource for Biocomputing, Visualization, and Informatics.

It is organized around the concept of mechanistically diverse superfamilies. Members of such superfamilies are evolutionarily related and, in addition to structural similarities, retain a conserved aspect of function. For example, all members of a superfamily could catalyze the same partial reaction or stabilize the same type of intermediate. (For more information, see reviews [Babbitt, 2003] and [Gerlt and Babbitt, 2001].) Superfamilies in the database are divided into families consisting of enzymes that perform the same overall reaction. Some superfamilies include an intermediate classification of families into subgroups, whose definitions are superfamily-specific.

One of the best-characterized examples of a mechanistically diverse enzyme superfamily is the enolase superfamily (ES). As of October 2004, the ES contained over 700 different sequences representing 11 different experimentally characterized functions (eight published and three additional functions yet unpublished). Analyses of available sequences and structures suggest perhaps dozens of new functions are yet to be characterized. Several of the experimentally characterized ES functions are shown in Figure 1. Remarkably, all of these different reactions are mediated by highly similar overall structures and active sites.
The similarities in these active sites are associated with a partial reaction common to all members of the superfamily, i.e., abstraction of a proton from a carbon alpha to a carboxylate group. The active site machinery associated with this proton abstraction step (and the consequent metal-assisted stabilization of the resulting enolate anion intermediate) is conserved across all structurally characterized members of the superfamily [Babbitt et al., 1996]. Although their pairwise sequence identities can be as low as 10%, all sequences assigned to the superfamily show conservation of the proton abstraction machinery and metal-binding ligands.

From superfamily analysis, approximately half of the enolase superfamily sequences can be assigned a specific function that proceeds from the common type of intermediate to produce a range of different products. But even for the hundreds of ES sequences for which we cannot assign a specific function, we can confidently predict that all of their overall reactions will go through an enolate anion intermediate and that all of their substrates will contain the substructure moiety associated with the proton abstraction step. Thus, the superfamily context provides a rules-based approach for inference of function for all of its members.

The SFLD stores information about not only the overall reactions catalyzed, but also the mechanistic steps conserved by families and superfamilies such as the enolase superfamily, and the conserved sequence and structural features that perform them. The SFLD includes curated sequence alignments for superfamilies, subgroups, and families, along with the corresponding Hidden Markov Models (HMMs).
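The hierarchy and the rules-based inference described above can be sketched roughly as follows. This is only an illustration, not the SFLD schema or API: the class names, field names, the `infer_function` helper, and the sequence identifiers are all hypothetical, and the two families shown are included simply as well-known members of the enolase superfamily.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Family:
    """Hypothetical record: enzymes that perform the same overall reaction."""
    name: str
    overall_reaction: str
    subgroup: Optional[str] = None  # optional, superfamily-specific level


@dataclass
class Superfamily:
    """Hypothetical record: related enzymes sharing a conserved mechanistic step."""
    name: str
    partial_reaction: str
    intermediate: str
    families: Dict[str, Family] = field(default_factory=dict)


@dataclass
class Sequence:
    """Hypothetical record: a sequence placed in a superfamily, maybe in a family."""
    identifier: str
    superfamily: Superfamily
    family: Optional[Family] = None


def infer_function(seq: Sequence) -> dict:
    """Return what can be inferred at each level of the hierarchy.

    Superfamily membership alone yields the conserved partial reaction and
    intermediate; the overall reaction is known only with a family assignment.
    """
    return {
        "superfamily": seq.superfamily.name,
        "partial_reaction": seq.superfamily.partial_reaction,
        "intermediate": seq.superfamily.intermediate,
        "overall_reaction": seq.family.overall_reaction if seq.family else None,
    }


# Illustrative data only; identifiers are made up.
es = Superfamily(
    name="enolase superfamily",
    partial_reaction="abstraction of a proton alpha to a carboxylate group",
    intermediate="enolate anion",
)
es.families["enolase"] = Family(
    "enolase", "2-phospho-D-glycerate -> phosphoenolpyruvate + H2O"
)
es.families["mandelate racemase"] = Family(
    "mandelate racemase", "(R)-mandelate <-> (S)-mandelate"
)

assigned = Sequence("seq_A", es, es.families["enolase"])
unassigned = Sequence("seq_B", es)  # superfamily member, family unknown

for seq in (assigned, unassigned):
    print(seq.identifier, infer_function(seq))
```

In this sketch the sequence without a family assignment still carries the partial reaction and intermediate of its superfamily, which mirrors the point that mechanistic predictions remain possible when the specific function is unknown.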
The SFLD can be used in different ways:
The SFLD allows the nature of the data to be examined as well. Nearly every important assignment of function, structure, or family classification comes with a detailed evidence code to allow users to understand how each decision was made. Links to literature references are included when available. In addition, the SFLD includes metadata fields in nearly every table, allowing curators to enter additional information about specific elements.
The SFLD will continue to evolve, both in the addition of content and in the development of new methods to explore and
display the data. Users should be aware of some current limitations. While there will
undoubtedly be growing pains, the goal is to create a resource of use to a wide community. Comments and suggestions
are welcome (send e-mail to sfld-help@cgl.ucsf.edu).
Users wishing to cite the SFLD should use the reference:

The SFLD is developed by the Babbitt Laboratory in collaboration with the UCSF Resource for Biocomputing, Visualization, and Informatics. Funding is provided by NIH grant R01-GM60595 (Babbitt), NSF grant DBI-0234768 (Babbitt), and NIH grant P41-RR01081 (Ferrin).
Contact us at sfld-help@cgl.ucsf.edu.