Multiscale analysis can glean important information from Big Data. These articles provide an introduction to the multiscale ideas of information and universality.


Changing disease to health and economic instability to growth are among the complex challenges we face today. How can we turn the massive quantities of data that are increasingly available towards addressing these pressing problems? The data provide abundant detail, but generally carry no labels for guidance about which pieces of information are important for determining successful interventions. The questions we need to address are about properties of complex systems—human physiology, the global economy. Addressing questions about such systems requires disentangling the intricate dependencies and multiple causes and effects of behaviors, and recognizing that behaviors range across scales from microscopic to macroscopic.

Here we argue that the key to addressing these questions is to focus on the way behaviors at different scales are related, and how dependencies within a system lead to the large scale patterns of behavior that can be characterized directly without mapping all of the intricate details. The approach builds upon an understanding of how to aggregate component behaviors to identify larger scale behaviors, an approach developed in the “renormalization group” study of phase transitions in physics, and generalized here to multiscale information theory. In this framework, information itself has scale and larger scale information is the most important information to know, with progressively finer scale information only of importance to provide detail when necessary. This analysis focuses attention on information characterizing how to affect the largest scale behaviors of the system. The method is a shortcut for the infeasible effort of mapping all of the causes and effects that extend from molecular to global scales of biological and social systems. The specific causes and effects that need to be studied are few compared to the many that underlie the system behavior at all scales. It is thus a tremendous simplification compared to extending traditional approaches to the desired outcome. On the other hand, the resulting formalism is challenging to execute in particular circumstances. When properly applied, the result is clear guidance about how to intervene and solve major problems, a result justifying the high level of directed effort involved. Successful examples demonstrate it is possible to apply it to a wide variety of scientific questions and real-world problems, though many aspects of how to proceed more generally remain to be developed. Where the effects of interest range across scales, other methods must be applied. Both opportunities and limitations of the method can be understood within the general formalism that describes the approach.
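The idea of aggregating component behaviors can be illustrated with a toy sketch. This is not the formalism of these articles. It assumes a simple block-averaging rule and synthetic Gaussian data, and the names `coarse_grain` and `variance_by_scale` are hypothetical. It compares components that fluctuate independently with components that share a collective term, and reports how much variation survives at each scale of aggregation.

```python
import random
import statistics


def coarse_grain(values, block):
    """Average consecutive non-overlapping blocks of `block` values."""
    n = len(values) - len(values) % block  # drop any incomplete trailing block
    return [sum(values[i:i + block]) / block for i in range(0, n, block)]


def variance_by_scale(values, blocks=(1, 4, 16, 64)):
    """Variance of the block averages at each block size (scale)."""
    return {b: statistics.pvariance(coarse_grain(values, b)) for b in blocks}


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 4096

# Independent components: each value fluctuates on its own.
independent = [rng.gauss(0, 1) for _ in range(n)]

# Dependent components (an illustrative assumption): each run of 64 values
# shares one collective term in addition to its own fluctuation.
shared = [rng.gauss(0, 1) for _ in range(n // 64)]
dependent = [shared[i // 64] + rng.gauss(0, 1) for i in range(n)]

ind_var = variance_by_scale(independent)
dep_var = variance_by_scale(dependent)
for b in ind_var:
    print(b, round(ind_var[b], 3), round(dep_var[b], 3))
```

In this sketch the variance of the independent components shrinks roughly in proportion to the block size, while the shared term of the dependent components persists at the largest block size. That is the sense in which dependencies, rather than component detail, determine what is visible at large scales.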

The approach is complementary to many other strategies that are usefully applied to complex systems, including not just big data but also network models, agent-based models, game theory, system dynamics, machine learning, stochastic modeling, coupled differential equations, and other frameworks whose starting point is a particular representational framework. It is closer in spirit to fractals and chaos in its attention to the role of scales but, as with the other approaches, it does not adopt their specific representational strategies. In the approach described here the strategy is to describe the largest scale behaviors with a minimal but faithful representation, and each of the different representational strategies, or combinations of them, may be used as appropriate.


Space of Possibilities: