New England Complex Systems Institute
International Conference on Complex Systems (ICCS2006)

Remembrance of experiments past: a redescription approach for knowledge discovery in complex systems

Samantha Kleinberg
Courant Institute Bioinformatics Group, New York University

Marco Antoniotti
Courant Institute Bioinformatics Group, New York University

Satish Tadepalli
Department of Computer Science, Virginia Tech

Naren Ramakrishnan
Department of Computer Science, Virginia Tech

Bud Mishra
Courant Institute Bioinformatics Group, New York University & NYU School of Medicine

     Last modified: August 14, 2006

Abstract
A complex system creates a “whole that is larger than the sum of its parts” by coordinating many interacting, simpler component processes. Yet each of these processes is difficult to decipher, because its visible signatures are seen only against a syntactic background, devoid of context. Examples of such datasets are time-course descriptions of gene-expression abundance levels, neural spike trains, and click-streams for web pages. Collecting voluminous datasets of this nature has become nearly effortless, but how can we make sense of them and draw significant conclusions? For instance, in the case of time-course gene-expression datasets, rather than following small sets of known genes, can we develop a holistic approach that provides a view of the entire system as it evolves through time?

We have developed GOALIE (Gene-Ontology for Algorithmic Logic and Invariant Extraction), a systems biology application that presents global and dynamic perspectives (e.g., invariants) inferred collectively over a gene-expression dataset. Such perspectives are important for obtaining a process-level understanding of the underlying cellular machinery, especially how cells react to, respond to, and recover from environmental changes. GOALIE uncovers formal temporal logic models of biological processes by redescribing time-course microarray data in the vocabulary of biological processes and then piecing these redescriptions together into a Kripke structure. In such a model, possible worlds encode transcriptional states and are connected to future possible worlds by state transitions. An HKM (Hidden Kripke Model) constructed in this manner then supports various query, inference, and comparative assessment tasks, besides providing descriptive process-level summaries. The formal basis for GOALIE is a multi-attribute information bottleneck (IB) formulation, in which we aim to retain the most relevant information about states and their transitions while compressing the number of syntactic signatures used to represent the data. We describe the mathematical formulation, the software implementation, and a case study of the yeast (S. cerevisiae) cell cycle.
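To make the Kripke-structure idea concrete, the following is a minimal sketch, not GOALIE's implementation. It assumes a toy model in which each possible world is a state labeled with a set of biological process terms, and in which two temporal-logic-style queries are answered by graph search. The state names, process labels, transitions, and the functions `reachable`, `eventually`, and `invariant` are all hypothetical and chosen only for illustration.

```python
from collections import deque

# Hypothetical toy model: state names, process labels, and transitions are
# invented for illustration and are not taken from the GOALIE case study.
labels = {
    "s0": {"cell cycle", "DNA replication"},
    "s1": {"cell cycle", "chromosome segregation"},
    "s2": {"cell cycle", "cytokinesis"},
}
transitions = {
    "s0": ["s1"],
    "s1": ["s2"],
    "s2": ["s0"],
}


def reachable(start):
    """Return the set of states reachable from `start`, including `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in transitions.get(state, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def eventually(start, process):
    """True if some reachable state is labeled with `process` (an EF-style query)."""
    return any(process in labels[s] for s in reachable(start))


def invariant(start, process):
    """True if every reachable state is labeled with `process` (an AG-style query)."""
    return all(process in labels[s] for s in reachable(start))
```

In this toy model, `invariant("s0", "cell cycle")` holds because every reachable state carries that label, whereas `invariant("s0", "DNA replication")` does not. The sketch only illustrates how labeled states and transitions support queries; how GOALIE derives states and labels from microarray data is not shown here.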





