Statistics and Biodiversity
Since Darwin, evolutionary theorists have sought to describe the origins of biodiversity. In the 1920s Ronald Fisher developed the genetics-based Neo-Darwinian theory in conjunction with statistics. Biodiversity was a central motivation of Fisher’s work, as prior theories could not describe the high level of biodiversity found in nature.
At the time, statistics was a powerful new approach to describing how parental genomes combine to become those of offspring (Mendelian inheritance). Combining different genes leads to higher diversity than simply averaging over the parents' traits, which is one reason diversity is high. However, this is not the entire story. Populations found in the wild are generally much more diverse than the statistical approach predicts.
In the approach Fisher used, each offspring is treated as one instance of all possible combinations of parents. This corresponds to a well-mixed population where each parent can mate with every other one. In a well-mixed population biodiversity disappears exponentially fast. NECSI has gone beyond traditional statistics, improving on Fisher’s results by showing that even greater diversity arises when the assumption of mixing breaks down.
Diversity persists much longer when a population is not well mixed. Indeed, many of the experiments that test population biology are done in laboratories, where populations are mixed. In this case, the assumptions of the theory and the experimental conditions match. Laboratory populations are known to be very homogeneous in genotype, whereas natural populations, the “wildtype,” are much more diverse, consistent with our results for populations that are not well mixed.
The mathematics Fisher developed continues to play a central role in the analysis of heredity and evolution of traits. A full understanding of what happens when the approximations of statistics do not hold is only beginning to emerge.
Fisher, R. A., The Genetical Theory of Natural Selection (Clarendon Press, Oxford, 1930).
Y. Bar-Yam, From big data to important information, Complexity doi: 10.1002/cplx.21785 (April 25, 2016).
Mendelian inheritance is an example of a statistical approach.
Gene-Centered View
The gene-centered view of evolution, popularized in 1976 by Richard Dawkins in The Selfish Gene, is a statistical approach. Dawkins argued that natural selection acts on single genes. As far as an individual gene is concerned, the rest of the genome, the organism, and the species are merely vehicles for its own reproduction.
Dawkins explains Fisher’s math using what he calls the “rowers’ analogy.” By following this example, we can see the assumptions of the statistical approach—and how they break down.
The analogy involves a group of rowers running races. The rowers represent the gene pool, and the boats represent organisms. Pairs of rowers run heats, and the winners return to the rower pool (duplicating themselves to keep the population size the same).
To represent alleles (different versions of the same gene), each rower speaks either English or German. Pairs of same-language rowers are better able to coordinate and win more races.
If the initial pool happens to start with more speakers of one language (say English), then that language will proliferate, rapidly wiping out the other language group entirely.
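The well-mixed version of the analogy can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the calculation used by Dawkins or in the cited work: the pool size, the rule that a same-language crew always beats a mixed crew, the coin flip for evenly matched boats, and the function names race, mean_field_generation, and run_mean_field are all choices made here for illustration.

```python
import random

def race(boat_a, boat_b, rng):
    """Return the winning boat. Assumed rule: a same-language crew always
    beats a mixed crew; evenly matched boats are decided by a coin flip."""
    a_matched = boat_a[0] == boat_a[1]
    b_matched = boat_b[0] == boat_b[1]
    if a_matched != b_matched:
        return boat_a if a_matched else boat_b
    return boat_a if rng.random() < 0.5 else boat_b

def mean_field_generation(pool, rng):
    """One round of heats in a well-mixed pool.

    Rowers are paired into boats at random, boats race in twos, and each
    winning crew is duplicated so the pool size stays constant.
    The pool size must be a multiple of four."""
    pool = pool[:]
    rng.shuffle(pool)  # random pairing: this is the mean field assumption
    boats = [(pool[i], pool[i + 1]) for i in range(0, len(pool), 2)]
    new_pool = []
    for i in range(0, len(boats), 2):
        winner = race(boats[i], boats[i + 1], rng)
        new_pool.extend(winner * 2)  # two copies of each winning rower
    return new_pool

def run_mean_field(n_english=240, n_german=160, max_generations=200, seed=0):
    """Return the fraction of English speakers after each generation."""
    rng = random.Random(seed)
    pool = ["E"] * n_english + ["G"] * n_german
    history = [pool.count("E") / len(pool)]
    for _ in range(max_generations):
        if len(set(pool)) == 1:  # one language has taken over
            break
        pool = mean_field_generation(pool, rng)
        history.append(pool.count("E") / len(pool))
    return history

if __name__ == "__main__":
    for generation, fraction in enumerate(run_mean_field()):
        print(generation, round(fraction, 3))
```

Under these assumptions the expected change in the English fraction p per generation is p(1 - p)(2p - 1), so whichever language starts in the majority tends to grow, and the growth speeds up as the majority increases.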
Yaneer Bar-Yam, Non-technical explanation of the breakdown of Neo-Darwinian — Gene Centered view, New England Complex Systems Institute (February 29, 2016).
Breakdown of the Mean Field
Dawkins’ analogy seems reasonable, but a hidden assumption has surprisingly far-reaching consequences.
The mean field approximation places winning rowers back into the pool and pairs them up again at random. This is like assuming random mating and a well-mixed population in real organisms. But what if, instead of returning to the pool and being drawn from it at random, winners went to the back of a line, with new pairs selected from the front?
The outcome is remarkably different.
Same-language pairs still have an advantage, but instead of the more populous group wiping out the other, both groups form clumps of their own type. Within its own patches, the minority group is not affected by the larger group. The boundaries between these patches will move and change over time, which is an important behavior not captured by the mean field approximation. Even if one group eventually wins out, it will take much longer.
This result reflects the real world, where English and German speakers both still exist within their own countries. Similarly, in biology, diversity remains much higher than the gene-centered view predicts.
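The line version can be sketched the same way. Again this is only an illustration under assumptions made here: rowers leave the front of the line four at a time, the first two and the next two crew one boat each, a same-language crew always beats a mixed crew, and both copies of the winning crew join the back of the line. The line length is chosen so that it is not a multiple of four, which lets the groups of four shift from one pass to the next. The names race, line_heat, count_domains, and run_line are hypothetical.

```python
import random
from collections import deque

def race(boat_a, boat_b, rng):
    """Return the winning boat. Assumed rule: a same-language crew always
    beats a mixed crew; evenly matched boats are decided by a coin flip."""
    a_matched = boat_a[0] == boat_a[1]
    b_matched = boat_b[0] == boat_b[1]
    if a_matched != b_matched:
        return boat_a if a_matched else boat_b
    return boat_a if rng.random() < 0.5 else boat_b

def line_heat(line, rng):
    """Take four rowers from the front of the line, race two boats, and
    send two copies of the winning crew to the back of the line."""
    boat_a = (line.popleft(), line.popleft())
    boat_b = (line.popleft(), line.popleft())
    winner = race(boat_a, boat_b, rng)
    line.extend(winner * 2)

def count_domains(line):
    """Number of contiguous same-language clumps along the line."""
    rowers = list(line)
    return 1 + sum(1 for i in range(1, len(rowers)) if rowers[i] != rowers[i - 1])

def run_line(n_english=241, n_german=160, passes=200, seed=0):
    """Run the line version and return the final order of rowers."""
    rng = random.Random(seed)
    rowers = ["E"] * n_english + ["G"] * n_german
    rng.shuffle(rowers)  # only the starting order is random
    line = deque(rowers)
    for _ in range(passes):
        if len(set(line)) == 1:  # one language has taken over
            break
        for _ in range(len(line) // 4):  # roughly one trip through the line
            line_heat(line, rng)
    return list(line)

if __name__ == "__main__":
    final = run_line()
    print("English fraction:", round(final.count("E") / len(final), 3))
    print("Number of clumps:", count_domains(final))
```

Because winners re-enter the line next to each other, rowers mostly race against their neighbors, so change happens only where a clump of one language meets a clump of the other. Comparing the number of passes needed for one language to take over with the number of generations needed in a randomly re-paired pool is one way to explore the difference described above.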
Yaneer Bar-Yam, Non-technical explanation of the breakdown of Neo-Darwinian — Gene Centered view, New England Complex Systems Institute (February 29, 2016).
Spontaneous Pattern Formation
Speciation and biodiversity are both examples of spontaneous pattern formation, the appearance of diversity not directly created by external forces. Such patterns can form and change in spatially distributed species even in the context of a spatially homogeneous environment. The spatial distribution and dynamics of biodiversity are controlled by the movement of boundaries between domains of the different genotypes.
Spontaneous pattern formation may lead to the maintenance of genetic diversity of a species in a contiguous habitat, despite reproductive mixing. Moreover, diversity can persist significantly longer in larger habitats and in habitats with irregular geographical features. The link between habitat structure and biodiversity can be leveraged when designing conservation areas. The areas with the most diverse geography and populations will be the most important to target, and existing conservation areas can be greatly strengthened by incorporating different types of neighboring environments.
Spontaneous pattern formation arises from the breakdown of the mean field approximation. Understanding the relationship between spatial geography and population dynamics is key to understanding biodiversity.
Incorporating these ideas into conservation efforts could help stop the mass loss of species diversity before it is too late.
Hiroki Sayama, Les Kaufman, and Yaneer Bar-Yam, Spontaneous pattern formation and diversity in spatially structured evolutionary ecology, Journal of Conservation Biology 17: 893 (2003).
Hiroki Sayama, Marcus A.M. de Aguiar, Yaneer Bar-Yam, and Michel Baranger, Spontaneous pattern formation and genetic invasion in locally mating and competing populations, Physical Review E 65: 051919 (2002).
Hiroki Sayama, Les Kaufman, and Yaneer Bar-Yam, Symmetry breaking and coarsening in spatially distributed evolutionary processes including sexual reproduction and disruptive selection, Physical Review E 62: 7065-7069 (2000).
Yaneer Bar-Yam, Formalizing the gene-centered view of evolution, Advances in Complex Systems 2, 277-281 (1999).
Yaneer Bar-Yam, Making Things Work, Chapter 6 (Knowledge Press, 2004).
Speciation
Biodiversity is often measured by counting the number of species in an area. An important source of biodiversity is speciation, or the separation of one species into two or more species over time. The process by which this takes place has been a subject of much controversy.
If we start by considering a species to be well mixed within a habitat and at every generation of mating, then how does it stop being mixed? If, on the other hand, we use a description of a species that goes beyond the mean field, then speciation results from the progressive separation of types that are more likely to reproduce with those of the same type. The spatial population version of this idea has been shown to describe natural biodiversity very well.
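The species count mentioned at the start of this section, together with the Shannon index named in one of the references on this page, can be computed from a table of abundances. The sketch below is a generic illustration rather than the measure developed in the cited work; the function names and the example counts are hypothetical.

```python
import math

def species_richness(counts):
    """Number of species with at least one individual observed."""
    return sum(1 for n in counts.values() if n > 0)

def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over the observed species,
    where p_i is the fraction of individuals belonging to species i."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    h = 0.0
    for n in counts.values():
        if n > 0:
            p = n / total
            h -= p * math.log(p)
    return h

if __name__ == "__main__":
    # Hypothetical survey counts, for illustration only.
    survey = {"species_a": 40, "species_b": 35, "species_c": 5}
    print("Richness:", species_richness(survey))
    print("Shannon index:", round(shannon_index(survey), 3))
```

Richness treats every species equally, while the Shannon index also reflects how evenly individuals are spread among species, so two areas with the same species count can differ in the index.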
M.A.M. de Aguiar, M. Baranger, E.M. Baptestini, L. Kaufman, Y. Bar-Yam, Global patterns of speciation and diversity, Nature 460: 384-387 (2009).
A. B. Martins, M. A. M. de Aguiar, and Yaneer Bar-Yam, Evolution and Stability of Ring Species, PNAS 201217034 (March 11, 2013).
Population Dynamics
Because geography is important to understanding evolution beyond mean field theory, population dynamics play a crucial role in maintaining biodiversity. Erik Rauch, a NECSI researcher, described how genetic diversity within a single species can vary unevenly across its habitat. Small sub-populations can exhibit surprising amounts of diversity not predicted by traditional theory. These biodiversity hotspots are a key target for conservation efforts, because these small but important populations are more susceptible to disruption and extinction.
Drivers of extinction like habitat loss and climate change can drastically reduce biodiversity, leaving fragmented populations. Smaller populations are inherently less stable, but the duration of the extinction event is crucial. Affected populations can recover more readily from short-term extinction events. This means that there is still a chance to halt activities that are driving thousands of species towards extinction. Many will still be able to rebound, but action must be taken quickly before this narrow window of opportunity closes and too much diversity is lost.
There is a short window in which action can still save much of the biodiversity that appears to be lost.
Erik M. Rauch and Yaneer Bar-Yam, Estimating the total genetic diversity of a spatial field population from a sample and implications of its dependence on habitat areas, Proceedings of the National Academy of Sciences 102: 9826-9829 (2005).
Erik M. Rauch and Yaneer Bar-Yam, Theory predicts uneven distribution of genetic diversity within species, Nature 431: 449-452 (September 23, 2004).
Benjamin Allen, Mark Kon, and Yaneer Bar-Yam, A new phylogenetic diversity measure generalizing the Shannon index and its application to Phyllostomid bats, 174(2) (2009).