2014
Book Article
Title
Molecular morphology: Higher order characters derivable from sequence information
Abstract
With the rapid technological developments in high-throughput sequencing, large amounts of genomic sequence data have also become available for nonmodel organisms. Phylogenomics has been highly successful in utilizing this wealth of information, but by no means all open phylogenetic problems have been solved with sequence-based methods alone. Genome sequences implicitly contain a plethora of higher-level features that convey phylogenetic information in their own right. The long list includes genes, protein domains, microRNAs, introns and their relative positions within genes, regulatory DNA elements, insertion points of transposable elements, patterns of insertions and deletions of sequence elements, and the relative order of such annotation items along the genomic DNA. Molecular morphology attempts to systematically extract these features from raw data and harness them for phylogenetic purposes. This poses a series of practical and theoretical problems, ranging from the issue of obtaining unbiased and comparable genome annotations, to the question of how to treat features that can be defined only in a comparative setting, to the mathematics of data types that cannot be captured by character matrices.