Structures in High-Dimensional Data: Intrinsic Dimension and Cluster Analysis
Author
Summary, in English
The first paper in this thesis concerns one specific aspect of a manifold structure, namely its dimension, also called the intrinsic dimension of the data. A novel estimator of intrinsic dimension that takes advantage of "the curse of dimensionality" is proposed and evaluated. It is shown to have, in general, less bias than estimators from the literature and can therefore better distinguish manifolds of different dimensions.
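To make the notion of intrinsic dimension concrete, the following is a minimal sketch of a standard nearest-neighbour maximum-likelihood estimator from the literature (Levina and Bickel). It is not the estimator proposed in the thesis, whose details are not given in this summary. The function name, the choice k=10, and the synthetic test data (a 2-dimensional plane embedded in 5 dimensions) are all assumptions made for illustration.

```python
import math
import random


def mle_intrinsic_dimension(points, k=10):
    """Average the nearest-neighbour MLE of dimension over all points.

    Illustrative only: a literature estimator, not the thesis's method.
    """
    estimates = []
    for i, p in enumerate(points):
        # Sorted distances from p to every other point.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        t_k = dists[k - 1]  # distance to the k-th nearest neighbour
        # Sum of log ratios between the k-th and the j-th neighbour distances.
        log_sum = sum(math.log(t_k / dists[j]) for j in range(k - 1))
        estimates.append((k - 1) / log_sum)
    return sum(estimates) / len(estimates)


# Hypothetical data: points on a 2-dimensional plane embedded in 5 dimensions.
random.seed(0)
plane = [(u, v, u + v, u - v, 2.0 * u)
         for u, v in ((random.random(), random.random()) for _ in range(300))]
estimate = mle_intrinsic_dimension(plane)
print(f"estimated intrinsic dimension: {estimate:.2f}")
```

Although each point has five coordinates, the estimate should land near 2 rather than 5, since the data vary along only two independent directions. Estimators of this kind are known to be biased for small k and near boundaries, which is the sort of bias the summary refers to.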
The second and third papers in this thesis concern cluster analysis of data generated by flow cytometry, a high-throughput single-cell measurement technology. In this area, clustering is routinely performed by manually assigning data points in two-dimensional plots in order to identify cell populations. This is a tedious and subjective task, especially since the data often have four, eight, twelve or even more dimensions, and the analysts need to decide which two dimensions to look at together, and in which order.
In the second paper of the thesis, a new pipeline for automated cell population identification is proposed. It can process multiple flow cytometry samples in parallel using a hierarchical model that shares information between the clusterings of the samples, making corresponding clusters in different samples similar while allowing for variation in cluster location and shape.
In the third and final paper of the thesis, statistical tests for unimodality are investigated as a tool for quality control of automated cell population identification algorithms. It is shown that the different tests have different interpretations of unimodality and thus accept different kinds of clusters as sufficiently close to unimodal.
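As a rough illustration of why "unimodal" admits more than one interpretation, the sketch below counts the modes of a Gaussian kernel density estimate at two different bandwidths. This is only one possible notion of unimodality and is not claimed to be any of the tests studied in the thesis. The function name, the bandwidth values, and the synthetic one-dimensional sample are assumptions made for illustration.

```python
import math
import random


def count_kde_modes(sample, bandwidth, grid_size=200):
    """Count local maxima of a Gaussian kernel density estimate on a grid."""
    lo = min(sample) - 3 * bandwidth
    hi = max(sample) + 3 * bandwidth
    grid = [lo + (hi - lo) * i / (grid_size - 1) for i in range(grid_size)]
    # Unnormalised density: the normalising constant does not affect the modes.
    density = [sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)
               for x in grid]
    return sum(1 for i in range(1, grid_size - 1)
               if density[i - 1] < density[i] > density[i + 1])


# Hypothetical cluster: a mixture of two well-separated subpopulations.
random.seed(1)
sample = ([random.gauss(0.0, 0.5) for _ in range(200)]
          + [random.gauss(6.0, 0.5) for _ in range(200)])

for bandwidth in (0.5, 4.0):
    modes = count_kde_modes(sample, bandwidth)
    print(f"bandwidth {bandwidth}: {modes} mode(s)")
```

With heavy smoothing the two subpopulations merge into a single mode, while with light smoothing they stay separate. Whether such a cluster counts as "sufficiently close to unimodal" therefore depends on the criterion applied, which is the point made in the summary.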
Department(s)
- eSSENCE: The e-Science Collaboration
- Matematik LTH
- BioCARE: Biomarkers in Cancer Medicine improving Health Care, Education and Innovation
Publication date
2016-08-16
Language
English
Full text
- Available as PDF - 16 MB
Document type
Doctoral thesis
Publisher
Centre for Mathematical Sciences, Lund University
Subject
- Computational Mathematics
Status
Published
Supervisor
- Magnus Fontes
ISBN/ISSN/Other
- ISBN: 978-91-7623-921-6
- ISBN: 978-91-7623-920-9
Defence date
9 September 2016
Defence time
13:15
Defence location
Lecture hall MA:1, Annexet, Sölvegatan 20, Lund University, Faculty of Engineering
Opponent
- Benno Schwikowski (Dr.)