Aarhus University Seal

Markov Renewal Methods in Restart Problems in Complex Systems

by Søren Asmussen, Lester Lipsky and Stephen Thompson
Thiele Research Reports Number 1 (January 2015)

A task with ideal execution time $L$ such as the execution of a computer program or the transmission of a file on a data link may fail, and the task then needs to be restarted. The task is handled by a complex system with features similar to the ones in classical reliability: failures may be mitigated by using server redundancy in parallel or $k$-out-of-$n$ arrangements, standbys may be cold or warm, one or more repairmen may take care of failed components, etc. The total task time $X$ (including restarts and pauses in failed states) is investigated with particular emphasis on the tail $\mathbb{P}(X>x)$. A general alternating Markov renewal model is proposed and an asymptotic exponential form $\mathbb{P}(X>x)\sim C\e^{-\gamma x}$ identified for the case of a deterministic task time $L\equiv \ell$. The rate $\gamma$ is given by equating the spectral radius of a certain matrix to 1, and the asymptotic form of $\gamma=\gamma(\ell)$ as $\ell\to\infty$ is derived, leading to the asymptotics of $\mathbb{P}(X>x)$ for random task times $L$. A main finding is that $X$ is always heavy-tailed if $L$ has unbounded support. The case where the Markov renewal model is derived by lumping in a continuous-time finite Markov process with exponential holding times is given special attention, and the study includes analysis of the effect of processing rates that differ with state or time.

Keywords: Alternating renewal process, computer reliability, data transmission, failure rate, fault-tolerant computing, heavy tails, Markov renewal equation, matrix perturbation, phase-type distribution, RESTART, tail asymptotics, Perron-Frobenius theory, phase-type distribution, spectral radius, Tauberian theorem.

Format available: PDF (625 KB)