The SFLD Glossary


Database links

The SFLD links to several external databases. Depending on the available information, a given sequence may link to its corresponding entry in one or more of the following:

NCBI Entrez Protein via GI number
NCBI Entrez Protein via accession.version number
NCBI RefSeq collection of non-redundant, well-annotated sequences
UniProtKB/SwissProt (the manually reviewed section of UniProt Knowledgebase)
UniProtKB/TrEMBL (the unreviewed section of UniProt Knowledgebase)
Deciphering Enzyme Specificity
Microbes Online integrated resource for comparative genomics of microbial species
The SEED curated genomic data
ModBase 3D structures from comparative (homology) modeling
RCSB Protein Data Bank of experimentally determined 3D structures
Phylogenetic Reference Proteomes at UniProtKB

Evidence code

A three-letter code designating data source or method of derivation. See the list of codes.

Enzyme Structure Function Ontology

Enzyme Structure Function Ontology
Bioportal page for the Ontology


Family

A set of evolutionarily related enzymes that catalyze the same overall reaction; a subset of a superfamily.

Functional domain

A single member of a family, either a whole protein or the domain(s) responsible for the enzymatic activity. Also known as an enzyme functional domain (EFD).

Hidden Markov Model (HMM)

A statistical model used in the SFLD to describe sequences in a family, subgroup, or superfamily. Input sequences are compared to the SFLD HMMs; highly significant hits suggest how proteins may be classified, and by association, what reactions they may catalyze.
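The idea of scoring an input sequence against per-family models and taking the best hit can be sketched as follows. This is a deliberately simplified stand-in, not the SFLD's method: it keeps only the match-state emission probabilities of a profile, has no insert or delete states, assumes gap-free sequences of equal length and a uniform background, and does not compute statistical significance. The family names and sequences are invented for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background frequencies


def build_profile(aligned_seqs, pseudocount=1.0):
    """Estimate per-column emission probabilities from gap-free aligned sequences."""
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all sequences must have the same length")
    total = len(aligned_seqs) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned_seqs)
        profile.append({aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS})
    return profile


def log_odds_score(profile, query):
    """Sum of log2(emission / background) over columns; higher means a better fit."""
    if len(query) != len(profile):
        raise ValueError("query length must match the profile length")
    return sum(math.log2(col[aa] / BACKGROUND) for col, aa in zip(profile, query))


def classify(query, profiles):
    """Return (name, score) for the highest-scoring profile."""
    scores = {name: log_odds_score(p, query) for name, p in profiles.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical toy families (invented sequences, not SFLD data).
profiles = {
    "family_A": build_profile(["KDGHE", "KDGHD", "KEGHE"]),
    "family_B": build_profile(["WSCLP", "WTCLP", "WSCIP"]),
}
best_family, best_score = classify("KDGHE", profiles)
```

A real profile HMM additionally models insertions and deletions and reports a significance estimate for each hit, which is what makes a "highly significant" classification meaningful.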

Overall reaction

The chemical transformation of substrate(s) to product(s) catalyzed by an enzyme, often expressed as a series of partial reactions.

Partial reaction

A mechanistic step within the overall reaction catalyzed by an enzyme.

Sequence similarity network

Representation of a set of proteins as nodes, with links between the nodes indicating similarity in amino acid sequence. The SFLD provides two types of sequence similarity networks: One Sequence per Node, in which each node represents a unique sequence, and Representative, in which each node can represent multiple related sequences. These networks can be viewed, manipulated, and analyzed using Cytoscape.
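The two network types can be illustrated with a minimal sketch. Everything here is an assumption made for the example: the similarity measure is a toy fraction of identical positions over equal-length sequences (a real network would use alignment-based scores), the thresholds are arbitrary, the greedy grouping is just one simple way to pick representatives, and the sequences are invented.

```python
from itertools import combinations


def identity(a, b):
    """Fraction of identical positions; a toy stand-in for a real alignment score."""
    if len(a) != len(b):
        raise ValueError("toy measure requires equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def build_network(seqs, threshold):
    """One node per sequence id; link two nodes when similarity >= threshold."""
    edges = set()
    for (id_a, seq_a), (id_b, seq_b) in combinations(sorted(seqs.items()), 2):
        if identity(seq_a, seq_b) >= threshold:
            edges.add((id_a, id_b))
    return set(seqs), edges


def representative_groups(seqs, group_threshold):
    """Greedy grouping: each sequence joins the first representative it resembles."""
    groups = {}  # representative id -> list of member ids
    for seq_id in sorted(seqs):
        for rep in groups:
            if identity(seqs[seq_id], seqs[rep]) >= group_threshold:
                groups[rep].append(seq_id)
                break
        else:
            groups[seq_id] = [seq_id]  # no close representative: start a new group
    return groups


# Hypothetical sequences, invented for illustration.
seqs = {
    "s1": "KDGHEAAL",
    "s2": "KDGHEAAV",
    "s3": "KDGHDCAV",
    "s4": "WSCLPMMT",
}

# One Sequence per Node: every unique sequence is its own node.
nodes, edges = build_network(seqs, threshold=0.6)

# Representative: collapse near-identical sequences, then link the representatives.
groups = representative_groups(seqs, group_threshold=0.85)
rep_nodes, rep_edges = build_network({rep: seqs[rep] for rep in groups}, threshold=0.6)
```

In this toy data, s1 and s2 collapse into a single representative node, so the representative network has fewer nodes and edges than the one-sequence-per-node network while keeping the same overall connectivity.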


SMILES/SMARTS

Line notation systems for symbolizing chemical structures and reactions (SMARTS is a generalization of SMILES).
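As a small illustration of line notation for reactions, the sketch below splits a reaction written in the common "reactants>agents>products" SMILES layout, with "." separating molecules. The function name is hypothetical and the example reaction was chosen for illustration; the code only splits strings and does not validate the chemistry. Matching SMARTS patterns against structures requires a cheminformatics toolkit and is not shown.

```python
def parse_reaction_smiles(rxn):
    """Split 'reactants>agents>products' into three lists of molecule SMILES."""
    parts = rxn.split(">")
    if len(parts) != 3:
        raise ValueError("expected exactly two '>' separators")
    # An empty section (for example, no agents) becomes an empty list.
    reactants, agents, products = (p.split(".") if p else [] for p in parts)
    return {"reactants": reactants, "agents": agents, "products": products}


# Acetic acid + ethanol -> ethyl acetate + water (illustrative example).
rxn = "CC(=O)O.CCO>>CC(=O)OCC.O"
parsed = parse_reaction_smiles(rxn)
```

Here `parsed["reactants"]` holds the SMILES strings for acetic acid and ethanol, and `parsed["products"]` holds those for ethyl acetate and water.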


Subgroup

A set of evolutionarily related enzymes within a single superfamily that is broader than a family; subgroup definitions are superfamily-specific.


Superfamily

A set of evolutionarily related enzymes whose members retain a conserved aspect of function, performed by conserved active site features. For example, all members of a superfamily might catalyze the same partial reaction or stabilize the same type of intermediate using a characteristic set of conserved residues. Although the defining aspect of function is conserved across a superfamily, its members can be highly divergent and catalyze quite different overall reactions (such a superfamily may be called mechanistically diverse or functionally diverse). For more information, see the references.


Suprafamily

A set of evolutionarily related enzymes whose members share a similar active site architecture but use this conserved architecture in substantially different ways. In some cases, the α-carbons of key catalytic residues may be superimposable, while in other cases the side chains of key catalytic residues may superimpose although the corresponding α-carbons come from different regions in the fold. In still other cases, cofactors or substrates may assume the position of key catalytic residues.