It is raining DNA outside. On the bank of the Oxford canal at the bottom of my garden is a large willow tree, and it is pumping downy seeds into the air. […] [spreading] DNA whose coded characters spell out specific instructions for building willow trees that will shed a new generation of downy seeds. […] It is raining instructions out there; it’s raining programs; it’s raining tree-growing, fluff-spreading, algorithms. That is not a metaphor, it is the plain truth. It couldn’t be any plainer if it were raining floppy discs.

—Richard Dawkins, The Blind Watchmaker [Dawkins, 2015, p. 158]

In the previous section, we discussed how entities could assemble themselves with the help of their environment, with dividing bubbles of oil being a primary candidate as a precursor for life. While the liposome structures discussed there were able to self-assemble and, with the help of the environment, even replicate themselves, they were strictly dependent on their environment and had no way of actually evolving. Now, let us look at the opposite side, pure information, before combining the chemistry and the information (the phenotype and the genotype) in the next section on the origin of life.


Example

Imagine you have a population of artificial organisms whose only task is to guess a number. To solve this problem, we will use an evolutionary algorithm, which works by applying mutation, procreation, and testing of the organisms against the environment. The environment is the fitness function, which selects those organisms with the best guesses. They procreate, their offspring undergo some mutation, and then the offspring guess a number.
 
Let us say 7 is the number to guess and, in the starting population, all organisms guess “1.” They have all “guessed” equally well, so they all procreate equally. In the new generation, there are now some that guess 1, some that guess 0, and some that guess 2. Those that guess 2 procreate the most because 2 is closer to 7, and soon they take over the whole population. This repeats in subsequent generations, with those guessing closer to 7 procreating the most. After a few generations, you will end up with a population which mostly guesses 7, with a few mutants left and right guessing more or less than 7.
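The guessing example can be sketched in a few lines of Python. This is only a minimal illustration under assumptions that are not part of the example itself: the population size of 100, the mutation rate of 0.2, the rule that the better-guessing half of the population procreates, and all function names are hypothetical choices.

```python
import random
from collections import Counter

TARGET = 7  # the number the organisms should guess

def fitness(guess):
    """The environment: guesses closer to TARGET score higher (0 is best)."""
    return -abs(guess - TARGET)

def mutate(guess, rng, rate=0.2):
    """With probability `rate` (an assumed value), guess one more or one less."""
    if rng.random() < rate:
        return guess + rng.choice((-1, 1))
    return guess

def next_generation(population, rng):
    """Select the better-guessing half, then let each parent leave two offspring."""
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[: len(ranked) // 2]
    return [mutate(parent, rng) for parent in parents for _ in range(2)]

def evolve(size=100, generations=50, seed=0):
    """Start with everyone guessing 1 and return the final population."""
    rng = random.Random(seed)
    population = [1] * size
    for _ in range(generations):
        population = next_generation(population, rng)
    return population

final = evolve()
print(Counter(final).most_common(3))  # most frequent guesses and their counts
```

With these settings, the final population should consist mostly of organisms guessing 7, with a minority of mutants guessing 6 or 8.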


Tierra

Of course, it is important to note that just simulating an evolutionary algorithm on the computer is not “life.” One step closer to nature is the project Tierra by the biologist Tom Ray. In Tierra, the world is one-dimensional. All the organisms reside in the memory of a computer. Initially, a single organism is placed in the world. This initial organism consists of a series of instructions to copy itself. After a copy is finished, that copy is activated and starts copying itself, too. In addition, there are mutations that might improve or destroy the copying process: this fulfills all three conditions of evolution (copying, mutation, operating for its own advantage).

Figures 4.11, 4.12, 4.13, and 4.14 show an evolutionary race between hosts and parasites in a soup of the Tierra Synthetic Life program developed by Tom Ray. Each individual creature is represented by a colored bar; colors correspond to genome size [Ray].

Figure 4.11: Hosts, red, are very common. Parasites, yellow, have appeared but are still rare. Photo credit: Marc Cygnus

Figure 4.12: Hosts are now rare because parasites have become very common. Immune hosts, blue, have appeared but are rare. Photo credit: Marc Cygnus

Figure 4.13: Immune hosts are increasing in frequency, separating the parasites into the top of memory. Photo credit: Marc Cygnus

Figure 4.14: Immune hosts now dominate memory, while parasites and susceptible hosts decline in frequency. The parasites will soon be driven to extinction. Photo credit: Marc Cygnus

What does the initial organism that is put into the memory look like? Here is a (simplified) description:

1. Beginning marker
2. Set first counter to address of beginning marker
3. Set second counter to address of end marker
4. Find free space in memory, save in third counter
5. Start new child organism at address of third counter
6. Loop marker
7. Copy one instruction from address of first counter to address of third counter
8. Increase first counter
9. Increase third counter
10. If first counter is smaller than the second counter, jump to Loop marker
11. Activate child
12. Jump to Beginning marker
13. End marker
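The copy loop described in these steps can be mimicked with a toy model in Python. This is a sketch under stated assumptions, not Tierra's actual instruction set: the memory is a plain list, each step is a single symbolic instruction, the instruction names and the helpers `find_free`, `replicate`, and `run` are hypothetical, and mutations are left out. The sketch also compares with `<=` rather than `<`, an assumption made so that the end marker is copied along with the rest of the organism.

```python
FREE = None  # an unused memory slot

# The 13 simplified steps, one symbolic instruction per memory slot.
ANCESTOR = [
    "BEGIN", "SET_FIRST", "SET_SECOND", "FIND_FREE", "START_CHILD",
    "LOOP", "COPY", "INC_FIRST", "INC_THIRD", "JUMP_IF_LESS",
    "ACTIVATE", "JUMP_BEGIN", "END",
]

def find_free(memory, length):
    """Return the first address of `length` consecutive free slots, or None."""
    run_length = 0
    for addr, slot in enumerate(memory):
        run_length = run_length + 1 if slot is FREE else 0
        if run_length == length:
            return addr - length + 1
    return None

def replicate(memory, begin):
    """Run one copy cycle for the organism starting at `begin`.

    Returns the child's start address, or None if no free space is left.
    """
    first = begin                                  # step 2
    second = memory.index("END", begin)            # step 3
    third = find_free(memory, second - first + 1)  # step 4
    if third is None:
        return None
    child = third                                  # step 5
    while first <= second:                         # steps 6 and 10 (<= copies END too)
        memory[third] = memory[first]              # step 7
        first += 1                                 # step 8
        third += 1                                 # step 9
    return child                                   # step 11: the caller activates it

def run(memory_size=60):
    """Seed memory with one ancestor and let organisms copy until memory is full."""
    memory = ANCESTOR + [FREE] * (memory_size - len(ANCESTOR))
    active = [0]  # start addresses of activated organisms
    queue = [0]
    while queue:
        parent = queue.pop(0)
        child = replicate(memory, parent)
        if child is not None:
            active.append(child)
            queue.extend([parent, child])  # step 12: parent loops, child starts too
    return memory, active

memory, active = run()
print(active)  # start addresses of all organisms once memory is full
```

In a 60-slot memory, the single ancestor fills the space with three further copies of itself, after which no organism finds enough free room to reproduce.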

The actual program is longer (80 instructions) as it uses less powerful instructions and markers that need several memory slots. But this basic setup was enough first to fill the memory with copies of the program, and then to have the programs optimize themselves: shorter programs get copied faster and eventually outrun the older, longer instruction sets.

Eventually, a mutation caused one program to change the beginning marker of a neighboring program: the first virus program emerged. The virus co-opted the copy process of other programs in the memory to have them build copies of the virus instead. Rather than going through all 13 steps itself, the virus looked for a host program (or found one by luck) and manipulated the host’s second and third commands to point to the beginning and end of the virus instead of the host program. This way, the host program made copies of the virus instead of itself, while the virus was able to focus all its processing time on finding hosts and manipulating the beginning and end counter commands.

This then led to a back-and-forth with programs trying to protect themselves against those parasites by hiding instructions for the beginning and end markers better, or even running a form of self-diagnosis. Later, programs developed that worked together with other programs to defend themselves and protect their code.

The setup of Tierra is one step closer to the origin of life as it removes one layer of abstraction: There is no separation between the phenotype (the body) and genotype (the genetic information building the body). Instead, the genetic information is directly exposed to the environment. But where Tierra differs from the real world is that these instructions are read and executed by Tierra itself. Like a computer that reads a program from the hard disk or a DVD, the “reader” and “executor” are not part of the program itself. But in real life, the genetic code is not magically turned into new cells the way it happens in Tierra. In nature, the genetic code is read by specialized cell machinery (which had yet to evolve on the ancient Earth) that uses it as a blueprint to build proteins.

Cellular Automata

So, what are other examples of self-replicating machines?

If we want to find an underlying rule—as we did with the folding or magnetic pendulum—we need to examine how these cells work. The challenge here is that the whole pattern cannot fit into the genetic code. In addition, there is no central organ but the cells themselves to coordinate the creation of the pattern. Hence, we are looking for rules that work on an individual cell. What we are looking for are cellular rules—or cellular automata as they are called in computer science.

Cellular automata are comparable to biological cells. A single cell alone does not know about the body as a whole; it can only communicate with its neighbors using chemical signals. Despite these limited abilities, if you have a lot of cells, they can work like a computer (although a lot more slowly) and produce seemingly complex results.


Example

Rule 30 is a one-dimensional binary cellular automaton rule. In this rule set, each cell can only be either in state 1 or state 0. Rule 30 says that a cell changes its state to 1 when exactly one of these conditions is met: only the cell on its left is in state 1, only the cell on its right is in state 1, only the cell itself is in state 1, or only the cell itself and its right neighbor are in state 1. In all other cases, it changes to (or remains at) state 0. When starting with a single cell in state 1, and adding a new line for each new application of the rule, the following pattern emerges (see Figure 4.15).

Figure 4.15: The first 256 lines of the output of the “rule 30” cellular automaton.
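The four conditions of rule 30 translate directly into a lookup table. The following Python sketch is one possible way to generate the pattern; the function names, the fixed row width, and the assumption that cells beyond the edges are in state 0 are choices made for this illustration.

```python
# (left, center, right) -> new state; every neighborhood not listed becomes 0.
RULE_30 = {
    (1, 0, 0): 1,  # only the left neighbor is in state 1
    (0, 0, 1): 1,  # only the right neighbor is in state 1
    (0, 1, 0): 1,  # only the cell itself is in state 1
    (0, 1, 1): 1,  # only the cell itself and its right neighbor are in state 1
}

def step(row):
    """Apply rule 30 once; cells beyond both edges are treated as 0."""
    padded = [0] + row + [0]
    return [RULE_30.get(tuple(padded[i:i + 3]), 0) for i in range(len(row))]

def run(lines):
    """Return `lines` rows, starting from a single cell in state 1."""
    width = 2 * lines - 1  # wide enough that the pattern never reaches an edge
    row = [0] * width
    row[lines - 1] = 1     # the single starting cell sits in the center
    rows = [row]
    for _ in range(lines - 1):
        row = step(row)
        rows.append(row)
    return rows

for row in run(16):
    print("".join("#" if cell else "." for cell in row))
```

Reading the table as a binary number (neighborhood 111 as the highest bit, 000 as the lowest) gives 00011110, which is 30 in decimal; this is where the rule gets its name.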

As we can see, this very simple rule can produce a complex-looking pattern when applied multiple times. Interestingly, a similar pattern can be found on the shells of some sea molluscs (see Figure 4.16). Similarly, cellular automata can be found for zebra stripes (see Figure 4.17). In that regard, we can determine that the pigmentation is neither complex nor random: we can find rules that, when applied repeatedly, produce these patterns. Let us now use these cellular automata to create rules and patterns with self-replicating properties.

Figure 4.16: The shell of a mollusc showing a pattern reminiscent of the output of a cellular automaton. Copyright 2005 Richard Ling.

Figure 4.17: The pigmentation pattern on a zebra (image source: shutterstock).

Game of Life

Historically, the question about machinery that builds itself was first posed by John von Neumann (who was also involved in creating computers) in the 1940s. In his book Theory of Self-reproducing Automata, he discussed the design of machinery—a “universal constructor”—that can build copies of itself. It is a software program that runs on a computer-simulated infinite two-dimensional plane that is divided into individual, square cells, each with one of 32 states. It consists of a data tape (the genotype) and a constructor that uses the data to construct a new constructor and copy the tape.

While it showed self-replication, the inner workings were too complex to serve as a good example. A simpler and more famous example of artificial life is John Conway’s Game of Life. It is also a cellular automaton and loosely represents basic ideas from nature. Here, each cell can only be “alive” (1) or “dead” (0). If a dead cell has exactly three live neighbors among its eight surrounding cells, it becomes alive (it is “born”). If a live cell has two or three live neighbors, it survives. If it has more or fewer neighbors than that, it dies. In Game of Life, there are many special configurations with various outcomes, including static patterns or oscillating patterns with various intervals (see Figure 4.18).
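The birth and survival rules fit into a single short function. The Python sketch below is one possible formulation; representing the world as a set of live (x, y) coordinates, and the names `step`, `blinker`, and `block`, are assumptions made for this illustration.

```python
from collections import Counter

def step(live):
    """Advance a set of live (x, y) cells by one generation."""
    # Count, for every cell, how many of its eight neighbors are alive.
    neighbor_counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth with exactly three neighbors; survival with two or three.
    return {
        cell
        for cell, count in neighbor_counts.items()
        if count == 3 or (count == 2 and cell in live)
    }

blinker = {(0, 1), (1, 1), (2, 1)}        # an oscillating pattern
block = {(0, 0), (0, 1), (1, 0), (1, 1)}  # a static pattern

print(sorted(step(blinker)))         # the horizontal bar turns vertical
print(step(step(blinker)) == blinker)  # and back again after two steps
print(step(block) == block)          # the block never changes
```

Storing only the live cells keeps the plane effectively unbounded, since no fixed grid size has to be chosen in advance.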

These simple rules make the world of Game of Life computationally universal (Turing complete). That means that it is possible to run any sort of algorithm within such a system, given the right start configuration. More complex examples include digital clocks [Stackexchange, 2017, cf.], a programmable computer simulation connected to a printer [Loizeau, 2016, cf.], or even a simulation of Game of Life within Game of Life itself [Bradbury, 2012, cf.]. The idea of self-replication was later carried over to the standard Game of Life rule set. Gemini is the first self-replicating structure created within Game of Life. It consists of a long instruction tape that coordinates the replication, and one “arm” on either end. It takes around 33 million steps to construct a copy of itself [Scientist, 2010, cf.].

Figure 4.18: A selection of special configurations in Game of Life.

Given their complexity, it is hard to see how either Gemini or the Universal Constructor could evolve out of a non-living environment. The only way imaginable that they could have evolved out of a random configuration of their environment is pure chance. Given their size, the probability is basically zero. But despite both of these artificial machines looking contrived, they show how it is, in principle, possible to build a self-replicating machine with very basic rules and without centralized control. This is the situation we face when looking at chemistry: there is no “central processor” like in Tierra; all rules of the system work at all places, dependent only on the configuration of molecules that float around.

Despite the basic similarities, there are many differences between a Game of Life organism (and the Universal Constructor of John von Neumann) and real life forms. Game of Life certainly has a number of structures that look like they were “building blocks,” but they are never used to build anything; they are byproducts or temporary placeholders. Strictly speaking, the “building blocks” in Game of Life are but “1” (black cells) and “0” (white cells). The building blocks “come alive” only by the computer interpreting them in a specific way. The individual bits that represent an organism in those worlds have no inherent properties that could affect the computer’s memory.

Each example in this section can serve as a good analogy to nature and to how self-replication and even evolution work; a computer simulation allows us to watch evolution in real time without having to wait for thousands of years. Still, none of the examples helps us to explain how life originally developed. With this in mind, let us move on, combining the ideas of copying information with the ideas of self-assembling structures in chemistry in order to explain the origin of life.