Fast iterative gene clustering based on information theoretic criteria for selecting the cluster structure.

Authors

  • Ciprian Doru Giurcăneanu
  • Ioan Tăbuş
  • Jaakko Astola
  • Juha Ollila
  • Mauno Vihinen

Summary, in English

Grouping of genes into clusters according to their expression levels is important for deriving biological information, e.g., on gene functions based on microarray and other related analyses. The paper introduces a method based on the minimum description length (MDL) principle for selecting the number of clusters in gene expression data. The main feature of the new method is the ability to evaluate the number of clusters quickly according to the sound MDL principle, without exhaustive evaluation of all possible partitions of the gene set. The estimation method can be used in conjunction with various clustering algorithms. A recent clustering algorithm using principal component analysis, the "gene shaving" (GS) procedure, can be modified to make use of the new MDL estimation method, replacing the Gap statistic originally used in the GS algorithm. The resulting clustering algorithm is shown to perform better than GS-Gap and CEM (classification expectation maximization) in simulations using artificial data. The proposed method is applied to B-cell differentiation data, and the resulting clusters are compared with those found by self-organizing maps (SOM).
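To make the idea of choosing the number of clusters by description length concrete, here is a minimal sketch. It is not the criterion derived in the paper, which this record does not reproduce. It assumes a generic two-part code length (a Gaussian residual term plus costs for the cluster centres and the cluster labels) and a plain k-means clustering in place of gene shaving. All function names and the toy data are hypothetical.

```python
import math

def dist2(p, q):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=100):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(dist2(p, c) for c in centers))
        centers.append(list(far))
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centers[j])  # keep the centre of an empty cluster
        if updated == centers:
            break
        centers = updated
    labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
    return labels, centers

def description_length(points, labels, centers):
    """Assumed two-part code length in nats (a BIC-like approximation)."""
    n, d, k = len(points), len(points[0]), len(centers)
    rss = sum(dist2(p, centers[lab]) for p, lab in zip(points, labels))
    rss = max(rss, 1e-12)                          # guard against log(0)
    data_cost = 0.5 * n * d * math.log(rss / (n * d))
    model_cost = 0.5 * k * d * math.log(n)         # cost of the k centres
    label_cost = n * math.log(k)                   # cost of the assignments
    return data_cost + model_cost + label_cost

def select_num_clusters(points, candidates):
    """Return the candidate k with the shortest code length, plus all scores."""
    scores = {}
    for k in candidates:
        labels, centers = kmeans(points, k)
        scores[k] = description_length(points, labels, centers)
    return min(scores, key=scores.get), scores

# Toy data: three tight groups of two-dimensional "expression profiles".
offsets = [(-0.5, 0.0), (0.5, 0.0), (0.0, -0.5), (0.0, 0.5), (0.3, 0.3), (-0.3, -0.3)]
points = [[cx + dx, cy + dy]
          for cx, cy in [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
          for dx, dy in offsets]

best_k, scores = select_num_clusters(points, range(1, 6))
print(best_k)
```

Only one clustering per candidate number of clusters is scored, which mirrors the abstract's point that the number of clusters can be evaluated without enumerating all partitions of the gene set. The clustering step could be replaced by any other algorithm that returns labels and cluster centres.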

Publication year

2004

Language

English

Pages

660-682

Publication/Journal/Series

Journal of Computational Biology

Volume

11

Issue

4

Document type

Journal article

Publisher

Mary Ann Liebert, Inc.

Subject

  • Medical Genetics

Keywords

  • B-Lymphocytes: cytology
  • B-Lymphocytes: physiology
  • Gene Expression Profiling: statistics & numerical data

Status

Published

ISBN/ISSN/Other

  • ISSN: 1557-8666