About the SFLD Archive

The Structure-Function Linkage Database (SFLD) is a hierarchical classification of enzymes that relates specific sequence-structure features to specific chemical capabilities. It was developed by the Babbitt Laboratory in collaboration with the UCSF Resource for Biocomputing, Visualization, and Informatics. As of April 2019, the database is in a static format and is no longer being updated. Regular SFLD users may notice significant changes, as interactive features are no longer available. However, the superfamily hierarchy can still be browsed, and archived alignments, sequence similarity networks, and reaction similarity networks are available for download. Although the SFLD sequence sets, alignments, and networks are no longer updated, they may still provide a useful starting point for exploration and hypothesis generation.

Organization and Terminology

The SFLD classifies evolutionarily related enzymes according to shared chemical functions and maps these shared functions to conserved active site features. The classification is hierarchical: broader levels encompass more distantly related proteins with fewer shared features.

Shared chemical functions could include:

  • catalyzing a specific partial reaction
  • stabilizing a specific type of reaction intermediate
  • binding a metal ion or other cofactor

Levels of classification:

  • A family is a set of evolutionarily related enzymes that catalyze the same overall reaction.
  • A superfamily is a broader set of evolutionarily related enzymes with a shared chemical function that maps to a conserved set of active site features. Superfamilies in which members can be highly divergent and catalyze many different overall reactions are termed functionally diverse or mechanistically diverse. Such superfamilies tend to exhibit complicated structure-function relationships and pose challenges to protein annotation and design.

For example, the figure below shows striking conservation of active site residues among diverse members of the enolase superfamily. The conserved residues (left) participate in a common partial reaction, proton abstraction, within very different overall reactions (right):

[Figure: enolase superfamily active sites (left) and enolase superfamily reactions (right)]
Left: Superposition of active sites of diverse members of the enolase superfamily. Sidechain and ligand carbon atoms are shown in a different color for each structure, divalent metal ions in yellow, oxygens in red, and nitrogens in blue. Right: Some of the chemical reactions catalyzed by enolase superfamily members. The proton abstracted to initiate each reaction is shown in red.

Additional levels in the hierarchy:

  • A functional domain (or enzyme functional domain) is a single member of a family, either a whole protein or the domain(s) responsible for the enzymatic activity.
  • A subgroup is a set of evolutionarily related enzymes that have more shared features than the superfamily as a whole, but may still catalyze different overall reactions (narrower than a superfamily but possibly including more than one family).

See also the SFLD glossary.
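
The levels described above nest inside one another, and a small data model can make that nesting concrete. The following Python sketch is illustrative only and is not the SFLD's actual schema: the class names, field names, and the placeholder subgroup, family, and reaction labels are all assumptions. It also simplifies by placing every family inside a subgroup. Only the enolase superfamily and its shared proton abstraction step come from the text above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FunctionalDomain:
    """A single family member: a whole protein or its catalytic domain(s)."""
    name: str  # hypothetical identifier


@dataclass
class Family:
    """Evolutionarily related enzymes that catalyze the same overall reaction."""
    name: str
    overall_reaction: str
    members: List[FunctionalDomain] = field(default_factory=list)


@dataclass
class Subgroup:
    """Narrower than a superfamily; may contain more than one family."""
    name: str
    families: List[Family] = field(default_factory=list)


@dataclass
class Superfamily:
    """Related enzymes sharing a chemical function tied to active site features."""
    name: str
    shared_function: str
    subgroups: List[Subgroup] = field(default_factory=list)

    def overall_reactions(self) -> List[str]:
        """Return the distinct overall reactions across all families, sorted."""
        return sorted(
            {fam.overall_reaction for sg in self.subgroups for fam in sg.families}
        )

    def is_functionally_diverse(self) -> bool:
        """True when members catalyze more than one overall reaction."""
        return len(self.overall_reactions()) > 1


def build_example() -> Superfamily:
    """Build a toy superfamily; subgroup, family, and reaction names are placeholders."""
    family_a = Family("family A", "reaction A", [FunctionalDomain("domain A1")])
    family_b = Family("family B", "reaction B", [FunctionalDomain("domain B1")])
    family_c = Family("family C", "reaction C", [FunctionalDomain("domain C1")])
    return Superfamily(
        name="enolase superfamily",
        shared_function="proton abstraction",
        subgroups=[
            Subgroup("subgroup 1", [family_a, family_b]),
            Subgroup("subgroup 2", [family_c]),
        ],
    )


if __name__ == "__main__":
    sf = build_example()
    print(sf.name, "shares:", sf.shared_function)
    print("overall reactions:", sf.overall_reactions())
    print("functionally diverse:", sf.is_functionally_diverse())
```

In this sketch, a superfamily counts as functionally diverse as soon as its families cover more than one overall reaction. That is a deliberately crude stand-in for the curated judgment described above, chosen only to show how the shared function sits at the superfamily level while overall reactions are defined at the family level.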

Philosophy and Scope

The SFLD contains highly curated, functionally diverse superfamilies.

Among enzyme resources, the SFLD is unique in its emphasis on how conserved residues map to catalysis of partial reactions or other shared functions, at a finer level of detail than overall reactions. Coverage of the enzyme universe is limited because deciphering sequence-structure-function relationships in functionally diverse superfamilies involves steps that are difficult to automate. The SFLD provides evidence codes to clarify the source of a given piece of information and to give a sense of its reliability. See also the SFLD caveats.

Types of Data Available

  • step-by-step reaction mechanisms
  • alignments of representative sequences at different levels of classification, with important functional residues specified (available for download on Superfamily pages)
  • hidden Markov models (available for download on Superfamily pages)
  • links to 3D protein structures at the PDB
  • active site images for many families with known 3D structures
  • molecule, reaction, and sequence similarity networks for display and analysis with Cytoscape

Selected References

If you're citing the SFLD, please use:

The Structure-Function Linkage Database. Akiva E, Brown S, Almonacid DE, Barber AE 2nd, Custer AF, Hicks MA, Huang CC, Lauck F, Mashiyama ST, Meng EC, Mischel D, Morris JH, Ojha S, Schnoes AM, Stryke D, Yunes JM, Ferrin TE, Holliday GL, Babbitt PC. Nucleic Acids Res. 2014 Jan 1;42(1):D521-30.

Other relevant literature:

Divergent evolution in enolase superfamily: strategies for assigning functions. Gerlt JA, Babbitt PC, Jacobson MP, Almo SC. J Biol Chem. 2012 Jan 2;287(1):29-34.

Inference of functional properties from large-scale analysis of enzyme superfamilies. Brown SD, Babbitt PC. J Biol Chem. 2012 Jan 2;287(1):35-42.

Toward mechanistic classification of enzyme functions. Almonacid DE, Babbitt PC. Curr Opin Chem Biol. 2011 Jun;15(3):435-42.

Annotation error in public databases: misannotation of molecular function in enzyme superfamilies. Schnoes AM, Brown SD, Dodevski I, Babbitt PC. PLoS Comput Biol. 2009 Dec;5(12):e1000605.

Using sequence similarity networks for visualization of relationships across diverse protein superfamilies. Atkinson HJ, Morris JH, Ferrin TE, Babbitt PC. PLoS One. 2009;4(2):e4345.

Using the Structure-Function Linkage Database to characterize functional domains in enzymes. Brown S, Babbitt P. Curr Protoc Bioinformatics. 2006 Mar;Chapter 2:Unit 2.10.

Leveraging enzyme structure-function relationships for functional inference and experimental design: the Structure-Function Linkage Database. Pegg SC, Brown SD, Ojha S, Seffernick J, Meng EC, Morris JH, Chang PJ, Huang CC, Ferrin TE, Babbitt PC. Biochemistry. 2006 Feb 28;45(8):2545-55.

Definitions of enzyme function for the structural genomics era. Babbitt PC. Curr Opin Chem Biol. 2003 Apr;7(2):230-7.

Divergent evolution of enzymatic function: mechanistically diverse superfamilies and functionally distinct suprafamilies. Gerlt JA, Babbitt PC. Annu Rev Biochem. 2001;70:209-46.