Spurred by the conference I have been attending (see previous post), and specifically by the work on the evolution of diversity and complexity, I have decided to post one of my unsuccessful research grant applications as an open research idea. This is an application to the Leverhulme Trust that did not get past the first stage a couple of years ago. I really like this idea, but I will never have the time to pursue it by myself, nor am I likely to find a suitable funder. So I am putting the idea into the public domain: after all, it does no one any good sitting on my hard disk. If you or someone you know would like to pick up on the idea, collaborate with me, maybe even come to work with me (perhaps through a Marie Curie application?) or even work completely independently of me, here it is.
Evolution of unbounded novelty and diversity using computer models of metabolism
Evolution has led to a continuous emergence of novel species, resulting in the diversity and complexity of life that we observe today. It is commonly presumed that the conditions set out by Darwin, of diversity, heredity and selection, are sufficient to explain this emergence. However, this cannot be tested on laboratory timescales, and computer simulations of evolution have so far failed to produce an unending progression of novel, diverse and increasingly complex species, referred to as open-ended evolution (1). The formulation and validation of necessary and sufficient conditions for open-ended evolution is one of the biggest unsolved problems in biology (2).
We aim to address this research gap. Our hypothesis is that to obtain open-ended evolution, there must be positive feedback from the development of novelty in one species leading to the construction of new niches for other species to exploit. We propose that this positive feedback is a missing component from the current formulation of the theory of evolution. We will test this hypothesis using a computer evolution approach, by building a simulation of evolvable artificial single-cell organisms.
The novelty of our proposed approach is the use of real biochemistry to provide a rich and varied context for evolution. In the simulations, different ‘species’ of ‘organism’ will be distinguished because they possess different sets of enzymes. To survive and reproduce, organisms will be required to produce certain quantities and proportions of key chemicals: fatty acids, amino acids, nucleic acids and carbohydrates. Organisms will live in spatial patches, grow on available nutrients, and die to release chemicals that can be reused by other organisms. Fitness in the simulations will be identical to fitness in real biology: species that are better able to survive and reproduce are fitter than those that are not. Diversity will be driven by the gain or loss of enzymes, enabling organisms to use different resources and/or manufacture different biochemicals.
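To make the model description above concrete, here is a minimal Python sketch of a species as a set of enzymes. It is an illustration under stated assumptions, not the proposed implementation: the reaction table `REACTIONS`, the enzyme names `enzA` to `enzD` and the compound labels are hypothetical placeholders rather than real biochemistry, and viability is reduced to whether each of the four key chemical classes can be produced at all, ignoring the quantities and proportions the proposal calls for.

```python
import random

# Hypothetical toy reaction table: enzyme -> (substrates, products).
# These names are placeholders, not entries from any real database.
REACTIONS = {
    "enzA": ({"nutrient"}, {"carbohydrate"}),
    "enzB": ({"carbohydrate"}, {"fatty_acid"}),
    "enzC": ({"carbohydrate", "nitrogen"}, {"amino_acid"}),
    "enzD": ({"amino_acid", "carbohydrate"}, {"nucleic_acid"}),
}

# The four classes of key chemicals named in the proposal.
REQUIRED = {"fatty_acid", "amino_acid", "nucleic_acid", "carbohydrate"}


def producible(enzymes, available):
    """Return every compound reachable from `available` using `enzymes`."""
    compounds = set(available)
    changed = True
    while changed:
        changed = False
        for enzyme in enzymes:
            substrates, products = REACTIONS[enzyme]
            # An enzyme fires once all of its substrates are present.
            if substrates <= compounds and not products <= compounds:
                compounds |= products
                changed = True
    return compounds


def is_viable(enzymes, available):
    """Simplifying assumption: viable if all key chemicals can be made."""
    return REQUIRED <= producible(enzymes, available)


def mutate(enzymes, rng):
    """Return a new enzyme set with one enzyme gained or lost at random."""
    enzymes = set(enzymes)
    absent = sorted(set(REACTIONS) - enzymes)
    present = sorted(enzymes)
    if absent and (not present or rng.random() < 0.5):
        enzymes.add(rng.choice(absent))
    else:
        enzymes.remove(rng.choice(present))
    return frozenset(enzymes)


if __name__ == "__main__":
    rng = random.Random(1)
    species = frozenset(REACTIONS)
    mutant = mutate(species, rng)
    print(sorted(mutant), is_viable(mutant, {"nutrient", "nitrogen"}))
```

In this toy setting a species lacking `enzA` cannot grow on `nutrient` alone, but becomes viable in a patch where another organism has already released `carbohydrate`, which is the kind of dependence between species the simulations are meant to explore.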
Our approach is radically different from previous attempts to evolve open-ended novelty. Laboratory approaches, commonly using the bacterium Escherichia coli, have been successful in evolving novel carbon source utilization (3) and diversification into two sub-types (4, 5). These results are exciting. However, experimental approaches are fundamentally limited because it is impossible to study millions of years of evolution in a laboratory setting. We cannot make broader generalizations from these results.
Computer simulations allow the study of evolution over longer timescales and under more tightly controlled conditions than are possible in the laboratory. The most celebrated examples are Tierra (6) and Avida (7). However, these do not exhibit open-ended evolution; instead they converge to a small number of species (8). Other approaches evolve models that already contain complex building blocks, such as neural controllers or locomotive mechanisms (9, 10). Even though these approaches have evolved novelty, they start with considerable granularity and complexity. It is not clear to what extent these systems allow for unbounded novelty beyond the complexity inherent in the components. Moreover, all of these approaches lack the reusability of compounds and functional plasticity of enzymes present in real chemistry, which underpins biological evolution.
Although it is tempting to perceive the diversity of life in terms of the range of plants and animals familiar to us, single-cell organisms (bacteria and archaea) account for over 95% of genetic diversity (11). Even the bacterial species E. coli contains more genetic diversity than that which distinguishes humans from social amoebae (12). Single-cell organisms differ not by their developmental complexity (limbs, organs etc.), but by their metabolic complexity: their capacity to use different biochemical sources to provide energy and reproduce. Thus we propose that it is sufficient to consider the richness of the biochemical world of single-cell creatures to obtain open-ended evolution. The positive feedback required by our hypothesis arises because the emergence of a new metabolic pathway in one species, leading to biosynthesis of novel compounds, provides opportunities for other species to evolve to exploit those compounds.
The use of an appropriately rich model chemistry is central to a successful simulation environment. A common approach is to use an artificial chemistry, such as string chemistries (13), or rich chemistries using molecular graphs (14, 15). A novelty of our approach is the use of real biochemistry. This bridges the gap between previous unrealistic computational approaches (6, 7) and real-life evolution (3-5). There are three advantages of moving closer to the biology. First, we know that biology has the open-ended evolution property that we seek to reproduce. Second, there are now considerable data available that we can utilize. And third, the results obtained will be more applicable to the biological world.
Preliminary research from DJS’s group on modelling evolution has focussed on the evolution of transcription regulation. We have found that networks evolve repressor functions and hierarchical regulation to control energy usage (16-19) and that basal expression is necessary to obtain realistic network evolution (17). These results will inform the architecture of the transcription regulatory elements in the proposed simulations.
Technical Programme of Work
- Compilation of biochemical compound and enzyme reaction database, to include all known biochemicals and reactions, using relevant sources including KEGG (20) and MetaCyc (21).
- Compilation of data of free energies of formation for each biochemical compound. Measured free energies are available in databases including BRENDA (22) and XPDB (23). Where measurements are unavailable, the group contribution method will be used to make suitable estimates (24, 25).
- Development of evolvable system for simulating a lineage of organisms, given a set of enzymes and any input (local concentrations of biochemicals). Ordinary differential equations will be used to model both the metabolic reactions and the transcription control mechanisms associated with expression of relevant enzymes.
- Development of spatial array that includes organisms and biochemical concentrations in each compartment, and appropriate levels of mixing between neighbouring spatial compartments.
- Development of evolutionary timescale simulation, to include mutations in enzyme sets and regulatory control, competition for resources, growth and death, and thus selection of most successful strains.
- Analysis for diversity and novelty in simulation results, using evolutionary activity statistics (8).
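As a rough illustration of how the metabolic and spatial steps in the programme above could fit together, the following Python sketch integrates a single enzyme reaction in each compartment and then mixes neighbouring compartments. Everything specific here is an assumption made for the example: one hypothetical reaction S -> P with Michaelis-Menten kinetics, explicit Euler integration, compartments arranged in a ring, and the parameter names `vmax`, `km`, `dt` and `mix_rate`. Transcription control, growth, death and mutation are left out.

```python
def mm_rate(vmax, km, s):
    """Michaelis-Menten rate for substrate concentration s."""
    return vmax * s / (km + s)


def react(patch, vmax, km, dt):
    """One explicit Euler step of the reaction S -> P in a single patch."""
    # Cap the flux so the substrate concentration never goes negative.
    flux = min(mm_rate(vmax, km, patch["S"]) * dt, patch["S"])
    return {"S": patch["S"] - flux, "P": patch["P"] + flux}


def mix(patches, rate):
    """Exchange a fraction `rate` of each compound with both ring neighbours.

    Total mass is conserved; keep rate <= 0.5 so concentrations stay >= 0.
    """
    n = len(patches)
    mixed = []
    for i, patch in enumerate(patches):
        left = patches[i - 1]
        right = patches[(i + 1) % n]
        mixed.append({
            c: patch[c] + rate * (left[c] + right[c] - 2 * patch[c])
            for c in patch
        })
    return mixed


def simulate(patches, enzyme_levels, steps, dt=0.1, km=1.0, mix_rate=0.1):
    """Alternate reaction and mixing steps; enzyme_levels holds vmax per patch."""
    for _ in range(steps):
        patches = [
            react(p, vmax, km, dt) for p, vmax in zip(patches, enzyme_levels)
        ]
        patches = mix(patches, mix_rate)
    return patches


if __name__ == "__main__":
    # Only the first patch holds substrate and an organism expressing the enzyme.
    start = [{"S": 10.0, "P": 0.0}, {"S": 0.0, "P": 0.0}, {"S": 0.0, "P": 0.0}]
    for patch in simulate(start, enzyme_levels=[1.0, 0.0, 0.0], steps=50):
        print(patch)
```

Product made in the first patch spreads to its neighbours through mixing, so organisms elsewhere could in principle evolve to use it. A fuller model would replace the Euler step with a proper ODE solver and take reaction parameters from the compiled databases.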
References
1. Bedau, M.A. 2008. In Bullock, S. et al. (eds.) Artificial Life XI: Proceedings of the Eleventh International Conference on the Simulation and Synthesis of Living Systems, p. 750. MIT Press, Cambridge, MA.
2. Bedau, M.A. et al. 2000. Artificial Life 6: 363-376.
3. Blount, Z.D. et al. 2008. Proc Natl Acad Sci USA 105: 7899-7906.
4. Rozen, D.E. and Lenski, R.E. 2000. Am Nat 155: 24-35.
5. Rozen, D.E. et al. 2009. Ecol Lett 12: 34-44.
6. Ray, T.S. 1992. In Langton, C.G. et al. (eds.) Artificial Life II, pp. 371-408. Addison-Wesley, Redwood City, CA.
7. Ofria, C. and Wilke, C.O. 2004. Artificial Life 10: 191-229.
8. Bullock, S. and Bedau, M.A. 2006. Artificial Life 12: 1-5.
9. Channon, A. 2001. In Kelemen, J. and Sosik, P. (eds.) Advances in Artificial Life 2159, pp. 417-426. Springer-Verlag.
10. Turk, G. 2010. In Fellermann, H. et al. (eds.) Proceedings of the Twelfth International Conference on the Synthesis and Simulation of Living Systems, pp. 496-503. MIT Press.
11. Ciccarelli, F.D. et al. 2006. Science 311: 1283-1287.
12. Lukjancenko, O. et al. 2010. Microb Ecol 60: 708-20.
13. Hickinbotham, S. et al. 2010. In Fellermann, H. et al. (eds.) Proceedings of the Twelfth International Conference on the Synthesis and Simulation of Living Systems, pp. 24-31. MIT Press.
14. Benko, G. et al. 2003. J Chem Inf Comput Sci 43: 1085-93.
15. Ullrich, A. et al. 2011. Artificial Life 17: 87-108.
16. Jenkins, D.J. and Stekel, D.J. 2010. Journal of Molecular Evolution 71: 128-40.
17. Jenkins, D.J. and Stekel, D.J. 2010. Journal of Molecular Evolution 70: 215-231.
18. Jenkins, D.J. and Stekel, D.J. 2009. Artificial Life 15: 259-91.
19. Stekel, D.J. and Jenkins, D.J. 2008. BMC Systems Biology 2: 6.
20. Kanehisa, M. et al. 2008. Nucleic Acids Res 36: D480-484.
21. Caspi, R. et al. 2008. Nucleic Acids Res 36: D623-631.
22. Scheer, M. et al. 2011. Nucleic Acids Res 39: D670-676.
23. Goldberg, R.N. et al. 2004. Bioinformatics 20: 2874-2877.
24. Jankowski, M.D. et al. 2008. Biophys J 95: 1487-99.