Available Model Information

Model Name: Short name for the model. For modules, the name is composed of the names of the two transcription factor families involved.
Model: This table lists the parameters that were used to define the model with FastM.

For each individual element, the table lists:

  • the type of the element (e.g. Matrix or IUPAC),
  • the name of the element (e.g. matrix family used),
  • the DNA strand on which the element is found,
  • the parameters of the element (e.g. core and matrix similarity), and
  • the allowed distance range (in base pairs) to the following element.

Distances are calculated from the middle of an element to the middle of the next element. For matrices and matrix families, the anchor position is used instead of the middle. The total length of a model is defined as the sum of the element distances.
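
As an illustration of this length definition, the following Python sketch sums the distances between consecutive elements. It is not part of FastM or ModelInspector: the function names and example values are hypothetical, and summing the minimum and maximum of each allowed distance range to get a length range is an assumption made here for illustration only.

```python
from typing import List, Tuple


def model_length_range(distance_ranges: List[Tuple[int, int]]) -> Tuple[int, int]:
    """Return the (minimum, maximum) total model length in bp.

    Each tuple is the allowed (min, max) distance from one element
    to the following element (hypothetical input format).
    """
    min_total = sum(lo for lo, _ in distance_ranges)
    max_total = sum(hi for _, hi in distance_ranges)
    return min_total, max_total


def match_length(anchor_positions: List[int]) -> int:
    """Return the total length of one concrete match in bp.

    anchor_positions holds the middle (or anchor) position of each
    element, in the order the elements appear in the model.
    """
    return sum(b - a for a, b in zip(anchor_positions, anchor_positions[1:]))


# Made-up example: a three-element model with two distance ranges.
print(model_length_range([(10, 30), (5, 15)]))  # (15, 45)
print(match_length([100, 120, 130]))            # 30
```

In this sketch a model with n elements has n - 1 distances, so the flanking halves of the first and last elements do not contribute to the total length.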

Origin: This section gives information on the origin of the model:
  • a reference for further information on the model (e.g. experimental verification of the function of the model)
  • the gene description
  • the accession number of the sequence that was used for definition of the model
  • the organisms in which the module was defined and/or is evolutionarily conserved.
    If no organism is given, the module was defined in an organism that is not yet available in ElDorado. Please see the gene description in this case.
Sequence: The sequence of the promoter module is shown together with the accession number of the sequence from which it was extracted (±5 bp of the module match).
Function: A short description of the function of the model (e.g. how the model is involved in the regulation of the described promoter).
Quality Assessment: The quality assessment shows the results of quality checks applied to the models.

Examples are:

  • Number of false positive matches in a negative test set (false positive rate) with the default ModelInspector threshold.
  • Number of matches per 10,000 base pairs in the human genome with the default ModelInspector threshold.
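
The "matches per 10,000 base pairs" figure is a simple normalization of a match count by the amount of sequence scanned. A minimal sketch of that arithmetic follows; the function name and the example numbers are hypothetical and do not come from ModelInspector.

```python
def matches_per_10kb(match_count: int, scanned_bp: int) -> float:
    """Normalize a raw match count to matches per 10,000 bp scanned."""
    if scanned_bp <= 0:
        raise ValueError("scanned_bp must be positive")
    return match_count * 10_000 / scanned_bp


# Made-up example: 150 matches found in 3,000,000 bp of sequence.
print(matches_per_10kb(150, 3_000_000))  # 0.5
```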
Promoter Matches: The value given is the percentage of scanned promoters in which a match to the module was found. The absolute number of promoter sequences with a module match is given in parentheses. The following promoter sequences were scanned:
  • Vertebrate Modules: 370,000 human, mouse, and rat promoter sequences with an average length of 1100 bp
  • Plant Modules: 62,000 promoter sequences of Arabidopsis thaliana and rice with an average length of 1013 bp
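
The relation between the percentage and the absolute number in parentheses can be sketched as follows. The function name, the output format, and the match count of 1,850 are hypothetical; only the set size of 370,000 vertebrate promoters is taken from the list above.

```python
def promoter_match_summary(matched: int, scanned: int) -> str:
    """Format the share of promoters with a module match as 'P% (N)'."""
    if scanned <= 0:
        raise ValueError("scanned must be positive")
    percentage = 100.0 * matched / scanned
    return f"{percentage:.2f}% ({matched:,})"


# Made-up match count against the 370,000 vertebrate promoters.
print(promoter_match_summary(1850, 370_000))  # 0.50% (1,850)
```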

Further information: