Research Centre Information Centre AgeFactDB
Release 1 - Aug 15, 2013


Pattern Syntax

Sequence patterns can be specified in three different formats:

PROSITE format

Using wildcards

Regular expression
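
To illustrate how the first and third formats relate, here is a minimal sketch that translates a PROSITE-style pattern into a regular expression. It assumes the conventional PROSITE notation (elements separated by '-', 'x' for any residue, [..] for alternatives, {..} for exclusions, (n) or (n,m) for repetition, '<' and '>' as sequence-start and sequence-end anchors). The exact dialect accepted by AgeFactDB is not shown on this page, and the function name prosite_to_regex is hypothetical.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regex (sketch only)."""
    pattern = pattern.rstrip(".")  # PROSITE patterns may end with a period
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # anchor at the start of the sequence
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # anchor at the end of the sequence
            suffix, element = "$", element[:-1]

        # Split an optional repetition such as "(3)" or "(2,4)" off the element.
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError(f"cannot parse pattern element: {element!r}")
        body, low, high = match.groups()

        if body in ("x", "X"):  # any residue
            core = "."
        elif body.startswith("{") and body.endswith("}"):  # excluded residues
            core = "[^" + body[1:-1] + "]"
        else:  # a single residue or a [..] set of alternatives
            core = body

        if low is not None:
            core += "{" + low + ("," + high if high is not None else "") + "}"
        parts.append(prefix + core + suffix)
    return "".join(parts)


print(prosite_to_regex("[ST]-x(2)-{P}-C"))
```

Under these assumptions, the pattern [ST]-x(2)-{P}-C reads as: S or T, then any two residues, then any residue except P, then C.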


Appendix

IUPAC one-letter codes for amino acids

Amino Acid       One-letter Code   Three-letter Code
Alanine          A                 Ala
Arginine         R                 Arg
Asparagine       N                 Asn
Aspartic Acid    D                 Asp
Cysteine         C                 Cys
Glutamic Acid    E                 Glu
Glutamine        Q                 Gln
Glycine          G                 Gly
Histidine        H                 His
Isoleucine       I                 Ile
Leucine          L                 Leu
Lysine           K                 Lys
Methionine       M                 Met
Phenylalanine    F                 Phe
Proline          P                 Pro
Serine           S                 Ser
Threonine        T                 Thr
Tryptophan       W                 Trp
Tyrosine         Y                 Tyr
Valine           V                 Val
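
As a small illustration of how these codes can be used, the sketch below converts a sequence written in one-letter codes into three-letter codes. The dictionary holds the 20 standard IUPAC amino acid codes; the names ONE_TO_THREE and to_three_letter are hypothetical and not part of AgeFactDB.

```python
# The 20 standard IUPAC amino acid codes (one-letter -> three-letter).
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "E": "Glu", "Q": "Gln", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def to_three_letter(sequence: str) -> str:
    """Spell out a one-letter protein sequence using three-letter codes."""
    try:
        return "-".join(ONE_TO_THREE[residue] for residue in sequence.upper())
    except KeyError as err:
        raise ValueError(f"unknown amino acid code: {err.args[0]!r}") from None


print(to_three_letter("MKV"))  # Met-Lys-Val
```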

IUPAC ambiguity codes for nucleotides

Unlike the original IUPAC ambiguity codes, 'T' and 'U' are not equivalent here.
This makes it possible to distinguish between Thymine and Uracil within a search.

Nucleotide(s)         Code
Adenine               A
Cytosine              C
Guanine               G
Thymine               T
Uracil                U
A or C                M
A or C or G           V
A or C or T or U      H
A or G                R
A or G or T or U      D
A or T or U           W
C or G                S
C or G or T or U      B
C or T or U           Y
G or T or U           K
any nucleotide        N
gap                   .
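
One way to see what these codes mean in a search is to expand each code into a regular-expression character class. The sketch below does this with the table above, keeping 'T' and 'U' distinct. It assumes that 'N' stands for exactly A, C, G, T or U and that '.' denotes a literal gap character in the sequence; the names IUPAC_NUCLEOTIDES and iupac_to_regex are hypothetical and not part of AgeFactDB.

```python
import re

# Ambiguity codes from the table above; 'T' and 'U' are NOT interchangeable.
IUPAC_NUCLEOTIDES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "U",
    "M": "AC", "V": "ACG", "H": "ACTU", "R": "AG", "D": "AGTU",
    "W": "ATU", "S": "CG", "B": "CGTU", "Y": "CTU", "K": "GTU",
    "N": "ACGTU",  # assumption: "any nucleotide" means these five
}


def iupac_to_regex(pattern: str) -> str:
    """Expand a nucleotide pattern with ambiguity codes into a regex."""
    parts = []
    for code in pattern.upper():
        if code == ".":
            parts.append(r"\.")  # assumption: '.' is a literal gap character
        elif code in IUPAC_NUCLEOTIDES:
            bases = IUPAC_NUCLEOTIDES[code]
            parts.append(bases if len(bases) == 1 else "[" + bases + "]")
        else:
            raise ValueError(f"unknown nucleotide code: {code!r}")
    return "".join(parts)


print(iupac_to_regex("AYG"))  # A[CTU]G
```

With this expansion, the pattern AYG matches ACG, ATG and AUG, while a plain T in a pattern matches only thymine and never uracil.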