High-dimensional classification

Britta Anker Bak
Talk for students
Friday, 12 October 2012, 14:30-15:30, in Aud. D4 (1531-219)
Imagine you have measured the values of tens of thousands of genes of a person, and want to determine whether that person is at high risk of a certain type of cancer or not. The genes are expected to contain some information, but which of the many genes are most informative for classification?

On low-dimensional datasets, well-known procedures work well and often lead to almost perfect classification. But when the number of variables p is much larger than the number of observations n, our optimal classifier, Fisher's rule, will asymptotically classify just as badly as a random guess.

Our saviour is the independence rule. In the absence of most information, it gives up on estimating correlations. In this case we prove an asymptotic upper bound on the classification error.
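
The abstract does not spell out the independence rule. The following is a minimal sketch, assuming the usual two-class form: each variable gets its own pooled variance estimate, and all correlations between variables are treated as zero. The function names, the simulated data, and the parameter values are illustrative assumptions and are not taken from the talk.

```python
import random


def fit_independence_rule(class0, class1):
    """Estimate per-variable means and pooled variances; correlations are ignored."""
    p = len(class0[0])
    n0, n1 = len(class0), len(class1)
    mean0 = [sum(row[j] for row in class0) / n0 for j in range(p)]
    mean1 = [sum(row[j] for row in class1) / n1 for j in range(p)]
    pooled_var = []
    for j in range(p):
        ss0 = sum((row[j] - mean0[j]) ** 2 for row in class0)
        ss1 = sum((row[j] - mean1[j]) ** 2 for row in class1)
        pooled_var.append((ss0 + ss1) / (n0 + n1 - 2))
    return mean0, mean1, pooled_var


def classify(x, mean0, mean1, pooled_var):
    """Return 1 if x is closer to class 1 in the variance-scaled sense, else 0."""
    score = sum(
        (x[j] - (mean0[j] + mean1[j]) / 2) * (mean1[j] - mean0[j]) / pooled_var[j]
        for j in range(len(x))
    )
    return 1 if score > 0 else 0


def simulate_accuracy(p=200, n_train=10, n_test=100, shift=1.0, seed=0):
    """Toy setting with p much larger than n: independent Gaussian variables,
    class 1 shifted by `shift` in every variable (an assumption for illustration)."""
    rng = random.Random(seed)

    def draw(label):
        return [rng.gauss(shift * label, 1.0) for _ in range(p)]

    class0 = [draw(0) for _ in range(n_train)]
    class1 = [draw(1) for _ in range(n_train)]
    model = fit_independence_rule(class0, class1)
    correct = 0
    for i in range(n_test):
        label = i % 2  # alternate between the two classes
        correct += classify(draw(label), *model) == label
    return correct / n_test


print("test accuracy:", simulate_accuracy())
```

Only 2p means and p variances are estimated here, instead of the p(p+1)/2 entries of a full covariance matrix, which is what makes the rule usable when p is much larger than n.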

Prerequisites: 'Introduktion til matematisk modellering' and 'Modellering 1'.

Contact person: Søren Fuglede Jørgensen