Context dependent DNA evolutionary models

by Jens Ledet Jensen
Research Reports Number 458 (May 2005)

This paper is about stochastic models for the evolution of DNA. For a set of aligned DNA sequences connected in a phylogenetic tree, the models should be able to explain, in probabilistic terms, the differences seen in the sequences. From the estimates of the model parameters one can begin to make biological interpretations and draw conclusions about the evolutionary forces at work.

In parallel with the increase in computing power, models have become more complex. Starting with Markov processes on a space with 4 states, later extended to Markov processes with 64 states, we are today studying models on spaces with $4^n$ (or $64^n$) states, with $n$ well above one hundred. For such models it is no longer possible to calculate the transition probabilities analytically, and Markov chain Monte Carlo is often used in connection with likelihood analysis. This is also the approach taken in this paper, and a time discretization of the process is presented to make the calculations more feasible. Apart from the time discretization, we introduce a set of simple estimating equations, together with an EM-type algorithm, for finding the parameter estimates. A detailed derivation of the asymptotic properties of the estimates is also given.
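
To give a feel for the jump in scale described above, the following minimal sketch contrasts a 4-state model, where the transition probabilities have a closed form, with the size of the state space for context-dependent models on $n$ sites. It assumes the Jukes-Cantor model as the simplest 4-state example; that model, the rate parameter `alpha`, and the function names are illustrative choices and are not taken from the report.

```python
import math


def jc69_transition_matrix(t, alpha=1.0):
    """Closed-form 4x4 transition matrix of the Jukes-Cantor model.

    Assumption: every specific substitution (e.g. A -> C) occurs at the
    same rate `alpha`; `t` is the elapsed time.
    """
    decay = math.exp(-4.0 * alpha * t)
    same = 0.25 + 0.75 * decay   # probability of ending in the starting state
    diff = 0.25 - 0.25 * decay   # probability of ending in one given other state
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def state_space_size(n, alphabet=4):
    """Number of states when n sites evolve jointly (4**n or 64**n)."""
    return alphabet ** n


P = jc69_transition_matrix(0.1)
row_sums = [sum(row) for row in P]          # each row is a probability vector

# A transition matrix on n jointly evolving sites would have
# state_space_size(n) rows and columns.
sizes = {n: state_space_size(n) for n in (1, 2, 10, 100)}
```

For a single site the matrix has 16 entries, while for $n = 100$ nucleotide sites the state space already has $4^{100} = 2^{200}$ elements, so the full transition matrix can be neither written down nor exponentiated numerically. This is the setting in which simulation-based likelihood methods become the practical option.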

Format available: PDF (325 KB)