[New England
      Complex Systems Institute]
[Home] [Research] [Education] [Current Section: Activities & Events] [Community] [News] [The Complex World] [About Complex Systems] [About NECSI]
International Conference on Complex Systems (ICCS2006)

A “random chemistry” algorithm for detecting epistatic genetic interactions

Margaret J. Eppstein
Department of Computer Science, University of Vermont

Joshua L. Payne
Department of Computer Science, University of Vermont

Bill C. White
Computational Genetics Laboratory, Dartmouth College

Jason H. Moore
Computational Genetics Laboratory, Dartmouth College

     Full text: Not available
     Last modified: April 25, 2006

There are estimated to be on the order of 106 single nucleotide polymorphisms (SNPs) existing as standing variation in the human genome. Certain combinations of these SNPs can interact in complex ways to predispose individuals for a variety of common diseases, even though individual SNPs may have no ill effects. Detecting which handfuls of SNPs exhibit these nonlinear epistatic interactions is a computationally daunting combinatorial optimization task. Solution of this important problem is further exacerbated by low disease heritability, small sample sizes, little or no marginal effects for individual SNPs, and unknown a priori information regarding how many, if any, SNPs interact epistatically. Brute force approaches are not computationally feasible for all but small sets of SNPs. Trying to use individual or growing subsets of SNPs as building blocks for detection of larger combinations (e.g., via genetic algorithms or genetic programming) is no better than random search, if there is no predictive power until the correct combination of SNPs is found. Hill-climbing in this needle-in-a-haystack landscape is simply not possible. Here, we explore the potential for an alternative methodology, inspired by Kauffman’s “random chemistry” approach to detecting small autocatalytic sets of molecules from within large sets. Starting from a large parent set of candidate SNPs, we generate a small population of overlapping child sets, where each child set comprises a random half of the SNPs from the parent set. These sets are then evaluated using an approximate and noisy fitness function that estimates whether or not the set is likely to contain a subset of epistatically interacting SNPs. The child sets with the highest fitness are then merged into a new parent set, and the process is repeated over subsequent generations until the sets in the population become small enough to support a brute force fitness evaluation. Preliminary results from synthetic data sets with a range of heritabilities are presented.

Conference Home   |   Conference Topics   |   Application to Attend
Submit Abstract/Paper   |   Accommodation and Travel   |   Information for Participants

Maintained by NECSI Webmaster    Copyright © 2000-2005 New England Complex Systems Institute. All rights reserved.