Spectral Biclustering of Microarray Data: Coclustering Genes and Conditions
AUTHOR(S)
Kluger, Yuval
SOURCE
Cold Spring Harbor Laboratory Press
ABSTRACT
Global analyses of RNA expression levels are useful for classifying genes and overall phenotypes. Often these classification problems are linked, and one wants to find “marker genes” that are differentially expressed in particular sets of “conditions.” We have developed a method that simultaneously clusters genes and conditions, finding distinctive “checkerboard” patterns in matrices of gene expression data, if they exist. In a cancer context, these checkerboards correspond to genes that are markedly up- or downregulated in patients with particular types of tumors. Our method, spectral biclustering, is based on the observation that checkerboard structures in matrices of expression data can be found in eigenvectors corresponding to characteristic expression patterns across genes or conditions. In addition, these eigenvectors can be readily identified by commonly used linear algebra approaches, in particular the singular value decomposition (SVD), coupled with closely integrated normalization steps. We present a number of variants of the approach, depending on whether the normalization over genes and conditions is done independently or in a coupled fashion. We then apply spectral biclustering to a selection of publicly available cancer expression data sets, and examine the degree to which the approach is able to identify checkerboard structures. Furthermore, we compare the performance of our biclustering methods against a number of reasonable benchmarks (e.g., direct application of SVD or normalized cuts to raw data).
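The idea in the abstract (normalize the expression matrix, take its SVD, and read the checkerboard off the singular vectors) can be illustrated with a small sketch. This is not the authors' implementation. It assumes the independent row/column rescaling variant, a strictly positive input matrix, and exactly two gene groups and two condition groups, which are split by the sign of the second singular vectors as a simplification chosen here for brevity. The names `make_checkerboard` and `two_way_spectral_bicluster` are hypothetical.

```python
import numpy as np


def make_checkerboard(seed=0):
    """Toy 6x5 'expression' matrix with a 2x2 checkerboard of high/low blocks."""
    rng = np.random.default_rng(seed)
    A = np.ones((6, 5))
    A[:3, :2] = 5.0  # genes 0-2 are high in conditions 0-1
    A[3:, 2:] = 5.0  # genes 3-5 are high in conditions 2-4
    # Small bounded noise so the blocks are not perfectly constant.
    return A + rng.uniform(-0.1, 0.1, size=A.shape)


def two_way_spectral_bicluster(A):
    """Split rows and columns into two groups each (illustrative sketch).

    Assumes every entry of A is positive, so row and column sums are > 0.
    """
    A = np.asarray(A, dtype=float)
    r = A.sum(axis=1)  # row (gene) sums
    c = A.sum(axis=0)  # column (condition) sums

    # Independent rescaling: A_norm = R^{-1/2} A C^{-1/2}.
    A_norm = A / np.sqrt(r)[:, None] / np.sqrt(c)[None, :]

    # The leading singular pair of A_norm is trivial (singular value 1,
    # vectors proportional to sqrt(r) and sqrt(c)), so the block structure
    # is read from the second pair.
    U, s, Vt = np.linalg.svd(A_norm, full_matrices=False)
    u = U[:, 1] / np.sqrt(r)   # roughly piecewise constant over gene groups
    v = Vt[1, :] / np.sqrt(c)  # roughly piecewise constant over condition groups

    # Simplification: with two groups, the sign separates them. Which group
    # gets label 0 or 1 is arbitrary because singular vectors are defined
    # only up to a joint sign flip.
    return (u > 0).astype(int).tolist(), (v > 0).astype(int).tolist()


if __name__ == "__main__":
    gene_labels, condition_labels = two_way_spectral_bicluster(make_checkerboard())
    print("gene groups:     ", gene_labels)
    print("condition groups:", condition_labels)
```

With more than two blocks per axis, a sign split is no longer enough; several singular vectors would have to be clustered together (for example with k-means), and real microarray data would typically need preprocessing to satisfy the positivity assumption made here.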
ARTICLE ACCESS
http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=430175
Related Documents
- Neural network analysis of lymphoma microarray data: prognosis and diagnosis near-perfect
- Exploring Expression Data: Identification and Analysis of Coexpressed Genes
- Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation
- Testing for differentially expressed genes with microarray data
- Coexpression Analysis of Human Genes Across Many Microarray Data Sets