A statistical method to discover significant combinations of genetic aberrations associated with cancer using comparative genomic hybridization profiles

M.A. Newton

October 2001, Technical Report 148, Department of Biostatistics and Medical Informatics, UW Madison

Abstract:

I introduce a model-based statistical methodology for the analysis of copy-number variations in cancer genomes measured by comparative genomic hybridization. The stochastic model involves random genomic instability in an unobserved progenitor cell followed by selection of cell lineages in which oncogenic pathways have been opened. I investigate sampling properties of the model and describe Markov chain Monte Carlo methodology for model fitting. A double-Polya-urn prior is introduced to characterize prior information about the oncogenic pathway structure. The methodology is tested and used to reanalyze genomic aberrations from 116 renal cell carcinomas. In addition to point estimates of the underlying oncogenic pathways, the methodology produces posterior probabilities that any given aberration is relevant to oncogenesis and pairwise posterior probabilities that pairs of aberrations reside on a common pathway. It infers the set of sporadic aberrations, and provides a model-based clustering of all measured aberrations. From the model one can compute the posterior probability that a tumor followed any one of the oncogenic pathways, thus also providing a model-based clustering of the tumors. Limitations and possible extensions of the methodology are discussed.
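The instability-then-selection idea can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that the abstract does not state: aberrations arise independently with a common probability, pathways are disjoint sets of aberrations, and a pathway counts as opened only when all of its aberrations are present. The names `simulate_tumor`, `THETA`, and `PATHWAYS` and all numeric values are hypothetical, and the report's actual model and fitting procedure (Markov chain Monte Carlo with a double-Polya-urn prior) are not reproduced here.

```python
import random

# Hypothetical settings, chosen only for illustration.
N_ABERRATIONS = 6           # aberrations 4 and 5 are sporadic (on no pathway)
PATHWAYS = [(0, 1), (2, 3)]  # assumed disjoint oncogenic pathways
THETA = 0.2                 # assumed common probability of each aberration
N_TUMORS = 2000

def pathway_open(profile, pathway):
    """Assumed rule: a pathway is open when all of its aberrations are present."""
    return all(profile[i] for i in pathway)

def simulate_tumor(rng):
    """Draw random aberrations, keeping only lineages that pass selection."""
    while True:
        # Random genomic instability in the progenitor cell.
        profile = [rng.random() < THETA for _ in range(N_ABERRATIONS)]
        # Selection: the lineage is observed only if some pathway is open.
        if any(pathway_open(profile, p) for p in PATHWAYS):
            return profile

rng = random.Random(0)
profiles = [simulate_tumor(rng) for _ in range(N_TUMORS)]

# Marginal frequency of each aberration among the selected tumors.
freq = [sum(p[i] for p in profiles) / N_TUMORS for i in range(N_ABERRATIONS)]
for i, f in enumerate(freq):
    label = "pathway" if any(i in p for p in PATHWAYS) else "sporadic"
    print(f"aberration {i} ({label}): frequency {f:.3f}")
```

Under these assumptions, selection makes pathway aberrations more frequent in the observed tumors than sporadic ones and induces correlation among aberrations on the same pathway, which is the kind of signal the posterior summaries described above are designed to recover.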

Key words: correlated binary data; genetic instability; Markov chain Monte Carlo; model-based clustering; oncogenic pathways; selection

Further details:

More on this research program