Research Centre Information Centre AgeFactDB
Release 1 - Aug 15, 2013

Sequence Format

Sequences can be specified in two different formats:

FASTA format

  • FASTA is a text-based format for representing either nucleic acid sequences or peptide sequences.
  • A sequence begins with a single description line, followed by one or more lines of sequence data.
    example:
    >sequence_1 example sequence 1
    STAGKVIKCKAAVLWEVKKPFSIEDVEVAPPKAYEVRIKMVAVGICRTDDHVVSGNLVTP
    LPVILGHEAAGIVESVGEGVTTVKPGDKVIPLFTPQCGKCRVCKNPESNYCLKNDLGNPR
    GTLQDGTRRFTCRGKPIHHFLGTSTFSQY
  • The description line starts with a '>' sign, followed immediately by a name. A comment may follow the name, separated by a space character.
    example:
    >name comment
    The '>' sign is essential. The name and comment are optional.
  • Multiple sequences can be specified by concatenating single sequences.
    example:
    >sequence_1 example sequence 1
    STAGKVIKCKAAVLWEVKKPFSIEDVEVAPPKAYEVRIKMVAVGICRTDDHVVSGNLVTP
    LPVILGHEAAGIVESVGEGVTTVKPGDKVIPLFTPQCGKCRVCKNPESNYCLKNDLGNPR
    GTLQDGTRRFTCRGKPIHHFLGTSTFSQY
    >sequence_2 example sequence 2
    LPVILGHEAAGIVESVGEGVTTVKPGDKVIPLFTPQCGKCRVCKNPESNYCLKNDLGNPR
    >sequence_3
    VESVGEGVTTVKPGDKVIPLFTPQCGKCRVCKNPESNYCLKNDLGNPRIHHFLGTSTF
    EVAPPKAYEVRIKMVGVTTVKPGDKVIPLFTPQCGKCRVCKNPES
    Important: Sequence types (peptide/nucleic acid) cannot be mixed!
  • Amino acids are indicated by the standard IUPAC one-letter codes.
  • Nucleotides are indicated by the IUPAC ambiguity codes.
  • The letter 'X' is used for a position where any amino acid is accepted.
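
The FASTA rules above can be illustrated with a minimal Python sketch. It is not the parser used by AgeFactDB; the function name parse_fasta and the (name, comment, sequence) tuple layout are assumptions chosen for illustration. It splits each description line at the first space into name and comment, joins the sequence lines that follow, and treats both name and comment as optional.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (name, comment, sequence) tuples.

    Hypothetical helper for illustration; not part of AgeFactDB.
    """
    records = []
    name = comment = None
    chunks = []
    in_record = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if in_record:
                records.append((name, comment, "".join(chunks)))
            # Name runs up to the first space; the rest is the comment.
            name, _, comment = line[1:].partition(" ")
            chunks = []
            in_record = True
        elif in_record:
            chunks.append(line)  # one more line of sequence data
        else:
            raise ValueError("sequence data found before a '>' description line")
    if in_record:
        records.append((name, comment, "".join(chunks)))
    return records
```

Applied to the multi-sequence example above, this sketch would yield three tuples, with an empty comment for sequence_3.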

Raw sequence

  • Only the sequence data itself is allowed, with no additional information such as a description line.
    example:
    STAGKVIKCKAAVLWEVKKPFSIEDVEVAPPKAYEVRIKMVAVGICRTDDHVVSGNLVTP
    LPVILGHEAAGIVESVGEGVTTVKPGDKVIPLFTPQCGKCRVCKNPESNYCLKNDLGNPR
    GTLQDGTRRFTCRGKPIHHFLGTSTFSQY
  • Only a single sequence can be specified.
  • Amino acids are indicated by the standard IUPAC one-letter codes.
  • Nucleotides are indicated by the IUPAC ambiguity codes.
  • The letter 'X' is used for a position where any amino acid is accepted.
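
As a rough illustration of the raw-sequence rules, the following Python sketch joins the input lines into one sequence, rejects a '>' description line, and guesses the sequence type from the letters used. The names read_raw_sequence and classify are hypothetical, and the alphabets are assumptions: the 20 standard amino acid letters plus 'X', and the IUPAC nucleotide letters including ambiguity codes. How AgeFactDB itself validates input is not documented here.

```python
# Assumed alphabets (not taken from AgeFactDB):
NUCLEOTIDE_CODES = set("ACGTURYSWKMBDHVN")      # IUPAC nucleotide + ambiguity codes
AMINO_ACID_CODES = set("ACDEFGHIKLMNPQRSTVWYX")  # 20 standard amino acids + 'X'


def read_raw_sequence(text):
    """Join the lines of a raw sequence; reject FASTA description lines."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if any(line.startswith(">") for line in lines):
        raise ValueError("raw input must not contain a '>' description line")
    return "".join(lines).upper()


def classify(sequence):
    """Guess the sequence type from its letters (heuristic only)."""
    letters = set(sequence)
    if not letters:
        raise ValueError("empty sequence")
    # Checked first: a sequence using only nucleotide letters is
    # treated as nucleic acid, even though it could be a peptide.
    if letters <= NUCLEOTIDE_CODES:
        return "nucleic acid"
    if letters <= AMINO_ACID_CODES:
        return "peptide"
    unknown = "".join(sorted(letters - AMINO_ACID_CODES))
    raise ValueError("unexpected characters: " + unknown)
```

Because the two alphabets overlap, a short peptide made up only of letters such as A, C, G and T would be reported as nucleic acid by this heuristic. Classifying every record of a multi-sequence input in the same way is one possible check that sequence types are not mixed.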