OK, which of you is the father?
Shortly before the publication of the first Neanderthal genome, some researchers had seen hints that there might be something strange lurking in the statistics of the human genome. The publication of the genome removed any doubt about these hints and gave the strangeness a clear identity: a few percent of the bases in European and Asian populations came from our now-extinct relatives.
But what if we didn't have the certainty that the Neanderthal genome offers? That is the situation we are in now, as several studies have recently identified "ghost lineages": branches of the human family tree for which we have no DNA sequence, but whose contributions can be detected in the genomes of populations living today. The existence of these ghost lineages rests on statistical arguments and therefore depends heavily on the statistical methods and underlying assumptions used, which means they tend to be disputed within the community that studies human evolution.
Now researchers at the University of Utah argue that they have evidence of a very old ghost lineage that contributed to Neanderthals and Denisovans (and therefore indirectly to us). This will undoubtedly be a claim that others in the field contest, in part because the evidence comes from an analysis that would also revise the dates of many key events in human evolution. Still, it is interesting to see how scientists deal with a question that may never be settled by definitive data.
In search of ghosts
Ghost lineages have made their presence known in two ways. In the first, DNA sequences from different populations can reveal shared ancestral groups. For example, Indians have sequences derived from an ancestral population that contributed DNA to modern East Asians and from another population that contributed to modern Siberians. In West Africans, we have found a significant contribution from a population that does not appear to have contributed to any other existing population (along with contributions from groups that do have present-day descendants).
Although the contribution of this population falls within the range of normal human variation, we still do not know who they were or where they interacted with the ancestors of West Africans. They are a historical ghost at the moment, although further studies could fill in more details.
But there is evidence of additional ghost lineages in our past. In these cases, the contribution comes from something outside the normal range of human variability. Take Neanderthal DNA, for example. European and Asian populations all share ancestors who seem to have left Africa about 50,000 years ago and therefore have a relatively small range of variation in their DNA. In contrast, Neanderthals split off from the lineage that produced modern humans hundreds of thousands of years ago and remained largely separate afterward. They had plenty of time to accumulate their own variations, which are distinct to their lineage and not found in modern human populations.
Neanderthals contributed DNA that had developed its own variations after hundreds of thousands of years of reproductive isolation.
As a result, the DNA Neanderthals contributed to Eurasian populations includes variants that are far outside the range we see in other parts of the genome. And although we know about Neanderthals, it is possible to receive a similar contribution from a group that we do not know about.
The problem is that this type of branching cannot be identified from a single base. There is no way to distinguish a variant that arose recently through mutation from a variant that came from a distantly related lineage. In the following illustration, we take some well-known branches of the recent human family tree and add a potential ghost lineage. We can imagine an example in which modern humans and Neanderthals have an A at a certain position in the genome, while Denisovans have a G.
If a human lineage has a certain variation, we cannot say whether it originated in that lineage or was contributed by interbreeding with a separate branch unless we consider many additional variations.
One explanation is that modern humans got their A from Neanderthals, who we know interbred with us. However, that interbreeding contributed mainly to non-African populations, so this is unlikely. Another option is that a mutation occurred in the Denisovan lineage. A third possibility is that the G reached the Denisovan population from a completely separate human lineage that interbred with them. These last two options cannot be distinguished at the level of an individual base.
Test all things
To distinguish between all the possible models of our evolutionary past, we have to take into account both the information known to us (that Neanderthal DNA is rare in African populations, for example) and statistical arguments. Neighboring DNA variants are usually inherited together. So if a contribution comes from a ghost lineage, there are likely to be several unusual variants spaced closely together in the genome. With enough solid knowledge and careful statistical analysis of enough genomes, it should be possible to find out which models are more likely and which can be excluded.
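The idea that unusual variants from a ghost lineage should sit close together can be sketched in a few lines of Python. This is only an illustration, not the method used in the study: the function name, the gap and count thresholds, and the positions are all made up for the example.

```python
def cluster_variants(positions, max_gap=10_000, min_count=3):
    """Group unusual-variant positions into tightly spaced runs.

    A run continues while consecutive positions are at most `max_gap`
    bases apart; only runs with at least `min_count` variants are kept.
    Both thresholds are arbitrary, illustrative choices.
    """
    clusters = []
    run = []
    for pos in sorted(positions):
        if run and pos - run[-1] > max_gap:
            # The gap is too large: close the current run.
            if len(run) >= min_count:
                clusters.append((run[0], run[-1], len(run)))
            run = []
        run.append(pos)
    if len(run) >= min_count:
        clusters.append((run[0], run[-1], len(run)))
    return clusters


# Hypothetical positions (in bases) of variants outside the usual human range.
example_positions = [1_200, 250_000, 253_500, 255_100, 259_900,
                     800_000, 1_400_000, 1_404_000]
clusters = cluster_variants(example_positions)
print(clusters)
```

Isolated unusual variants are discarded as likely one-off mutations, while a dense run is flagged as a candidate stretch of inherited DNA. A real analysis would also have to account for recombination rates and sequencing error.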
This is more or less what the new research has done. It starts with two Neanderthal genomes, a Denisovan genome, and one genome each from modern English, French, and Yoruban populations. The researchers then build different models of potential evolutionary histories (a branch here, a little interbreeding there) and determine how well each model is supported by the statistics. If enough models are tested, a pattern should emerge that favors a collection of similar trees. And the favored model should be a better match for the things we already know.
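To make "how well each model is supported by the statistics" concrete, here is a minimal sketch that ranks candidate models by a multinomial log-likelihood. It is an assumption-laden illustration, not the authors' procedure: the pattern names, the counts, and each model's predicted probabilities are invented for the example.

```python
import math


def log_likelihood(observed_counts, predicted_probs):
    """Multinomial log-likelihood (up to a constant) of the observed
    counts, given the pattern probabilities a model predicts."""
    return sum(count * math.log(predicted_probs[pattern])
               for pattern, count in observed_counts.items())


# Hypothetical counts of how often each variant-sharing pattern is seen.
observed = {
    "shared_with_neanderthal": 120,
    "shared_with_denisovan": 80,
    "private": 800,
}

# Hypothetical predictions from two candidate evolutionary histories.
models = {
    "no_admixture": {
        "shared_with_neanderthal": 0.05,
        "shared_with_denisovan": 0.05,
        "private": 0.90,
    },
    "with_admixture": {
        "shared_with_neanderthal": 0.12,
        "shared_with_denisovan": 0.08,
        "private": 0.80,
    },
}

# Rank the models from best to worst supported.
ranked = sorted(models,
                key=lambda name: log_likelihood(observed, models[name]),
                reverse=True)
print(ranked)
```

The model whose predictions sit closest to the observed frequencies comes out on top. In practice, models with more branches and more interbreeding events have more freedom to fit the data, so a real comparison also needs some penalty for complexity.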
The rough outline of the tree that emerges from this analysis fits fairly well with what other analyses have found. The relatively recent gene flow from Neanderthals into modern humans is there, as is an earlier one from the ancestors of modern humans into early Neanderthals. There is also an indication of gene flow from a ghost population into the Denisovan lineage, which has been seen in other studies. This ghost lineage would have occupied part of Eurasia as a contemporary of the Neanderthals and Denisovans, which is certainly plausible, since the two groups we do know about managed to get there.
Trees over trees
In the earlier parts of the preferred tree, however, things get a little strange. The same ghost lineage would also have contributed DNA to the common ancestor of Neanderthals and Denisovans, suggesting that it was already a distinct lineage when that ancestor separated from the part of the tree that includes modern humans. However, there is no evidence that it contributed to the modern human lineage (except perhaps indirectly, through its contribution to Neanderthals). This would indicate that the ghost lineage was already outside Africa at the start of the modern human lineage and only met the ancestors of Neanderthals and Denisovans after they migrated into Eurasia.
That is possible, but the only lineages we know existed outside Africa at that time were variants of Homo erectus, a much earlier lineage.
What are we missing?
That brings us to the dates of the different splits. The authors use a low estimate of the mutation rate per generation to work out when the lineages split. This produces earlier splits for all lineages compared with estimates from other sources. But even taking that into account, the lineage splits are older than most other estimates in the literature.
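Why does a lower mutation rate push every split back in time? A rough back-of-the-envelope relation helps: if two lineages differ at a fraction d of sites and each accumulates mutations at a rate mu per site per generation, the split lies about d / (2 * mu) generations in the past. The sketch below applies that relation. The divergence, both mutation rates, and the generation time are illustrative values rather than figures from the paper, and the formula ignores variation that was already present in the ancestral population.

```python
def split_time_years(divergence, mutation_rate, generation_years):
    """Rough split date from pairwise sequence divergence.

    divergence       -- fraction of sites that differ between two lineages
    mutation_rate    -- mutations per site per generation
    generation_years -- average length of a generation in years

    Differences accumulate along both branches, hence the factor of 2.
    """
    generations = divergence / (2 * mutation_rate)
    return generations * generation_years


# Illustrative inputs only; none of these numbers come from the study.
divergence = 0.001
slow_rate_date = split_time_years(divergence, mutation_rate=0.5e-8, generation_years=29)
fast_rate_date = split_time_years(divergence, mutation_rate=1.0e-8, generation_years=29)

print(f"slow mutation rate: about {slow_rate_date:,.0f} years ago")
print(f"fast mutation rate: about {fast_rate_date:,.0f} years ago")
```

Halving the assumed mutation rate doubles the estimated age of the split, which is why the choice of rate shifts every date in the tree at once.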
And that has a pretty dramatic impact on the origin of the ghost lineage. Even using a mutation rate that produces relatively recent splits, the ghost lineage would have been an independent branch of the human family tree about two million years ago. That is right around the time that Homo erectus appears in the fossil record. So this tree would have an extremely early branch of H. erectus that moved into Asia and stayed isolated from the rest of the human lineage until the ancestors of the Neanderthals showed up about a million years later.
There is no shortage of reasons to be skeptical, including the rapid isolation of this lineage from those remaining in Africa and the fact that interbreeding was still possible after so long in reproductive isolation. That, and the fact that the dates do not match so much else in the literature, guarantees that the paper will be controversial.
But the paper was never going to be the last word, as the analysis it describes does not even attempt to include a number of additional events in human evolution that we know are important. We know that Denisovan DNA contributed to a number of Asian and Pacific lineages, but no sequences from modern humans in those lineages are included. We also know that another ghost lineage from the branch that led to modern humans contributed DNA to a small group of West African populations, including an entire Y chromosome. These are not represented here either.
It is not difficult to understand why. More sequences and more branches would mean a longer computing time for each tree evaluated, and adding potential branches means that far more trees need to be evaluated overall. However, incorporating these well-defined cases of interbreeding could provide strong validation of any results obtained from this analysis.
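To get a feel for how quickly the search space grows, consider just the number of distinct rooted, bifurcating trees that can connect n populations, which is the double factorial (2n - 3)!!. The sketch below computes that count. It is only meant to illustrate the growth: the study compared a chosen set of models rather than every possible tree, and allowing interbreeding between branches multiplies the possibilities further.

```python
def rooted_tree_count(n_tips):
    """Number of distinct rooted, bifurcating trees with labeled tips.

    This equals (2n - 3)!!, the product of the odd numbers up to 2n - 3.
    """
    if n_tips < 2:
        raise ValueError("need at least two tips")
    count = 1
    for odd in range(3, 2 * n_tips - 2, 2):
        count *= odd
    return count


for n in range(3, 9):
    print(n, "populations:", rooted_tree_count(n), "possible trees")
```

Going from four populations to six raises the count from 15 to 945, and every one of those trees takes longer to evaluate as more genomes are added.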
Fortunately, all of the data is out there, and no doubt someone will find the computer time to make sure it gets done at some point. However, given the age of these events, this is a case where we are unlikely to gain certainty by obtaining a genome from the ghost lineage. And the signals that remain in populations we can get genomes from may not be strong enough to remove the ambiguities. So it will be interesting to see how researchers in the field deal with all of these remaining uncertainties.
Science Advances, 2019. DOI: 10.1126/sciadv.aay5483.