SFLD Caveats

Superfamily content

  • because of the intensive curation involved, the Core SFLD currently contains a limited set of superfamilies; more superfamilies will be added, and existing ones updated, as analyses are performed (see also the Extended SFLD, XSFLD)
  • the Core SFLD (as opposed to the XSFLD) is primarily concerned with mechanistically diverse superfamilies, not superfamilies whose members all catalyze essentially the same chemical transformation (e.g., phosphorylation)

Sequence and structure content

  • the SFLD is intended to include representative sets of sequences rather than all known sequences belonging to a family, subgroup, or superfamily
  • the SFLD may include multiple sequences that describe the same protein but are treated as distinct entries because they differ in length or contain altered residues (for example, a phosphorylated residue in a structure may be represented by an X in the corresponding sequence file)
  • the SFLD includes mutant sequences; however, if a residue required for the family's function is mutated, that sequence is annotated only to the superfamily level (not to the family)
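
The mutant rule above can be sketched as a small check. This is an illustration only, not the SFLD's actual pipeline: the function name annotation_level, the position-to-residue map, and the example sequences are all hypothetical.

```python
def annotation_level(sequence, functional_residues):
    """Return the deepest level a sequence may be annotated to.

    functional_residues maps 1-based positions to the amino acid the
    family requires there (a hypothetical representation for this sketch).
    """
    for position, required in functional_residues.items():
        # A position beyond the end of the sequence counts as missing.
        if position > len(sequence) or sequence[position - 1] != required:
            return "superfamily"
    return "family"


# Hypothetical family requiring K at position 3 and D at position 6.
required = {3: "K", 6: "D"}
wild_type = "MAKLVDG"
mutant = "MAALVDG"  # K3A substitution at a functional position

print(annotation_level(wild_type, required))  # family
print(annotation_level(mutant, required))     # superfamily
```

In practice the positions would come from an alignment to the family rather than from raw sequence indices; the sketch skips that step to keep the rule itself visible.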

Reaction information

  • SMILES/SMARTS were not designed to represent enzymatic reactions and cannot adequately describe active site features such as metal ions or interactions that stabilize an intermediate or transition state
  • not all reactions have been assigned an EC number
  • for highly promiscuous enzymes, some of the reactions catalyzed may not be included, or may not even be known
  • the reaction information consists primarily of GIF images

Hidden Markov Models (HMMs)

  • HMMs are not built for families with insufficient sequence data (those with only one member or two highly similar members)
  • the statistical significance of a match to an HMM is not the same as biological significance; other evidence, such as conservation of functionally important residues and operon context, should also be considered when available
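
One way to act on this caveat is to report statistical and functional evidence side by side instead of relying on the score alone. The sketch below is a hypothetical illustration: the function assess_hit, the dictionary inputs, and the E-value cutoff of 1e-10 are assumptions made for the example, not SFLD values or part of any HMM software.

```python
def assess_hit(e_value, aligned_residues, required_residues, e_value_cutoff=1e-10):
    """Combine statistical and functional evidence for one HMM hit.

    aligned_residues: residue observed in the query at each functional
        position of the model (hypothetical dict, position -> amino acid).
    required_residues: residue the family needs at those positions.
    e_value_cutoff: arbitrary illustrative threshold, not an SFLD value.
    """
    significant = e_value <= e_value_cutoff
    # Functional positions where the query lacks the required residue.
    missing = sorted(
        pos for pos, aa in required_residues.items()
        if aligned_residues.get(pos) != aa
    )
    if significant and not missing:
        verdict = "supported"
    elif significant:
        verdict = "significant match, functional residues not conserved"
    else:
        verdict = "not significant"
    return verdict, missing


# Hypothetical functional positions for an illustrative family.
required = {120: "K", 164: "D"}
print(assess_hit(1e-40, {120: "K", 164: "D"}, required))
print(assess_hit(1e-40, {120: "K", 164: "N"}, required))
```

The second call shows the case the caveat warns about: a strong match to the model that still lacks a residue needed for function. Other evidence, such as operon context, would have to be weighed separately.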

Sequence similarity networks

Curation status

  • entries are marked as valid (curator-validated) or pending (not yet fully curated, highlighted in green); pending entries may not meet the criteria for inclusion in their respective groups, but they provide a starting point for more detailed curation
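
When working with a set of entries, it can help to keep the two curation states apart so that pending entries are not mistaken for validated ones. The following is a minimal sketch; the Entry record, its field names, and the accessions are hypothetical and do not reflect the SFLD's actual data format.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    accession: str  # hypothetical identifier
    status: str     # "valid" or "pending"


def split_by_status(entries):
    """Separate curator-validated entries from those still pending."""
    valid = [e for e in entries if e.status == "valid"]
    pending = [e for e in entries if e.status == "pending"]
    return valid, pending


entries = [
    Entry("seq_001", "valid"),
    Entry("seq_002", "pending"),
    Entry("seq_003", "valid"),
]
valid, pending = split_by_status(entries)
print([e.accession for e in valid])    # ['seq_001', 'seq_003']
print([e.accession for e in pending])  # ['seq_002']
```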