The process of characterizing sequences of biomolecules, particularly the nucleotides of nucleic acids (DNA or RNA) or the amino acids of proteins. Once the order of nucleotides in, say, a genome fragment has been established by DNA sequencing, the sequence data can be analysed using computer software. Such software automatically identifies features such as open reading frames, promoters, enhancers, and repetitive DNA, and translates any putative coding sequences into the corresponding amino acid sequences. The unknown sequence is then compared with existing sequence data held in any of numerous databases. Likely homology is indicated by how closely the sequence aligns with other DNA sequences, which provides clues about its evolutionary relationships with other biomolecules (see phylogenomics) and its possible membership of a protein family. Moreover, coding sequences that correspond to particular functional domains in the encoded protein can be identified. See also bioinformatics.
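As an illustration of the first steps described above, the following minimal Python sketch scans a DNA sequence for open reading frames in all six reading frames and translates each one into an amino acid sequence. It assumes the standard genetic code, treats an open reading frame simply as an ATG start codon followed in frame by a stop codon, and uses hypothetical names (`find_orfs`, `translate`, `reverse_complement`, `min_codons`) chosen only for this example. Real analysis software applies far more elaborate criteria.

```python
# Minimal sketch: locate open reading frames (ORFs) and translate them.
# Assumes the standard genetic code; all names here are illustrative.

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Build the codon table: codons enumerated in TCAG order map onto AMINO_ACIDS.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
STOP_CODONS = {codon for codon, aa in CODON_TABLE.items() if aa == "*"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate codon by codon, stopping at the first stop codon.

    Codons containing unexpected characters are rendered as 'X'.
    """
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE.get(seq[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def find_orfs(seq, min_codons=3):
    """Return (strand, frame, start, end, protein) for each ORF found.

    Coordinates are 0-based and refer to the strand being scanned;
    'end' is exclusive and includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            i = frame
            while i + 3 <= len(s):
                if s[i:i + 3] == "ATG":
                    for j in range(i, len(s) - 2, 3):
                        if s[j:j + 3] in STOP_CODONS:
                            if (j - i) // 3 >= min_codons:
                                orfs.append(
                                    (strand, frame, i, j + 3, translate(s[i:j + 3]))
                                )
                            i = j  # resume scanning after this stop codon
                            break
                    else:
                        break  # no in-frame stop codon remains in this frame
                i += 3
    return orfs


if __name__ == "__main__":
    print(find_orfs("CCATGGCTTGGTAAGG"))
    # prints [('+', 2, 2, 14, 'MAW')]
```

The translated protein sequences produced in this way are what would then be compared against database entries to look for homology.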
Subjects: Biological Sciences — Chemistry.