Sequence Format

Sequences can be specified in two different formats:

FASTA format

The FASTA format is a text-based format representing either nucleic acid sequences or peptide sequences.

A sequence begins with a single description line, followed by one or more lines of sequence data.

example:

>sequence_1 example sequence 1
STAGKVIKCKAAVLWEVKKPFSIEDVEVAPPKAYEVRIKMVAVGICRTDDHVVSGNLVTP
LPVILGHEAAGIVESVGEGVTTVKPGDKVIPLFTPQCGKCRVCKNPESNYCLKNDLGNPR
GTLQDGTRRFTCRGKPIHHFLGTSTFSQY

The description line starts with a '>' sign, followed immediately by a name. A comment may follow the name, separated from it by a space character.
example:
>name comment
The '>' sign is required. The name and comment are optional.

Multiple sequences can be specified by concatenating single sequences.

example:

>sequence_1 example sequence 1
STAGKVIKCKAAVLWEVKKPFSIEDVEVAPPKAYEVRIKMVAVGICRTDDHVVSGNLVTP
LPVILGHEAAGIVESVGEGVTTVKPGDKVIPLFTPQCGKCRVCKNPESNYCLKNDLGNPR
GTLQDGTRRFTCRGKPIHHFLGTSTFSQY
>sequence_2 example sequence 2
LPVILGHEAAGIVESVGEGVTTVKPGDKVIPLFTPQCGKCRVCKNPESNYCLKNDLGNPR
>sequence_3
VESVGEGVTTVKPGDKVIPLFTPQCGKCRVCKNPESNYCLKNDLGNPRIHHFLGTSTF
EVAPPKAYEVRIKMVGVTTVKPGDKVIPLFTPQCGKCRVCKNPES
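
As an illustration of the layout described above, the following Python sketch splits concatenated FASTA text into individual records. The function name parse_fasta and the (name, comment, sequence) tuple layout are hypothetical choices made for this example and are not part of any tool described here. Skipping blank lines is likewise an assumption.

```python
def parse_fasta(text):
    """Split FASTA text into a list of (name, comment, sequence) tuples.

    Hypothetical helper for illustration only.
    """
    records = []
    name = comment = ""
    chunks = []
    in_record = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # assumption: blank lines carry no information
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if in_record:
                records.append((name, comment, "".join(chunks)))
            # The name follows '>' immediately; the comment follows the first space.
            name, _, comment = line[1:].partition(" ")
            comment = comment.strip()
            chunks = []
            in_record = True
        else:
            if not in_record:
                raise ValueError("sequence data found before a '>' description line")
            chunks.append(line)
    if in_record:
        records.append((name, comment, "".join(chunks)))
    return records
```

Sequence data spread over several lines is joined into one string per record, so a record with no comment, such as sequence_3 above, yields an empty comment field.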

Important: Sequence types (peptide/nucleic acid) cannot be mixed!

Amino acids are indicated by the standard IUPAC one-letter codes.
Nucleotides are indicated by the IUPAC nucleotide codes, including the ambiguity codes.
The letter 'X' is used for a position where any amino acid is accepted.
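
The rule that sequence types cannot be mixed can be checked before submission. The sketch below is only a heuristic and rests on assumptions: the function names guess_type and check_not_mixed are hypothetical, the peptide alphabet is taken to be the 20 standard one-letter codes plus 'X', and any sequence made up entirely of nucleotide codes is treated as nucleic acid, so a short peptide such as GATTACA would be misclassified.

```python
# IUPAC nucleotide codes, including the ambiguity codes.
NUCLEOTIDE_CODES = set("ACGTURYSWKMBDHVN")
# Assumption: the 20 standard amino acid one-letter codes plus 'X' (any amino acid).
AMINO_ACID_CODES = set("ACDEFGHIKLMNPQRSTVWYX")


def guess_type(sequence):
    """Guess whether a sequence is 'nucleic acid' or 'peptide' (heuristic)."""
    letters = set(sequence.upper())
    if not letters:
        raise ValueError("empty sequence")
    # Checked first: every nucleotide code except 'U' and 'B' is also an amino acid code.
    if letters <= NUCLEOTIDE_CODES:
        return "nucleic acid"
    if letters <= AMINO_ACID_CODES:
        return "peptide"
    raise ValueError("unexpected characters: " + "".join(sorted(letters - AMINO_ACID_CODES)))


def check_not_mixed(sequences):
    """Return the common type of all sequences, or raise if types are mixed."""
    if not sequences:
        raise ValueError("no sequences given")
    types = {guess_type(s) for s in sequences}
    if len(types) > 1:
        raise ValueError("peptide and nucleic acid sequences cannot be mixed")
    return types.pop()
```

Because the two alphabets overlap, a guess of this kind cannot be fully reliable; it only catches input that is clearly inconsistent.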

Raw sequence

Only the sequence itself is allowed: no description line and no other additional information.

example:

STAGKVIKCKAAVLWEVKKPFSIEDVEVAPPKAYEVRIKMVAVGICRTDDHVVSGNLVTP
LPVILGHEAAGIVESVGEGVTTVKPGDKVIPLFTPQCGKCRVCKNPESNYCLKNDLGNPR
GTLQDGTRRFTCRGKPIHHFLGTSTFSQY

Only a single sequence can be specified.
Amino acids are indicated by the standard IUPAC one-letter codes.
Nucleotides are indicated by the IUPAC nucleotide codes, including the ambiguity codes.
The letter 'X' is used for a position where any amino acid is accepted.
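
A raw sequence can be prepared for submission with a small clean-up step, sketched below in Python. The function name normalize_raw_sequence is hypothetical, and the assumption that line breaks and other whitespace within the sequence can simply be removed is based on the multi-line example above rather than on a stated rule.

```python
def normalize_raw_sequence(text):
    """Join a multi-line raw sequence into one string of letters.

    Hypothetical helper for illustration only.
    """
    # A '>' would indicate a FASTA description line, which raw input must not contain.
    if ">" in text:
        raise ValueError("raw input must not contain a description line")
    # Assumption: whitespace, including line breaks, is not part of the sequence.
    sequence = "".join(text.split())
    if not (sequence.isascii() and sequence.isalpha()):
        raise ValueError("raw input must consist of sequence letters only")
    return sequence
```

Empty input is rejected as well, because the letters-only check fails for an empty string.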