A unified genealogy of modern and ancient genomes
‘We have basically built a huge family tree, a genealogy for all of humanity that models as exactly as we can the history that generated all the variation in the modern human genome. This genealogy allows us to see how every person’s genetic sequence relates to every other, along all the points of the genome.’
Because each individual genomic region is inherited from only one parent, either the mother or the father, the ancestry of each point on the genome can be thought of as a tree. The set of trees, known as a “tree sequence” or “ancestral recombination graph”, links genetic regions back through time to the ancestors in whom the genetic variation first appeared. Lead author Dr Anthony Wilder Wohns, who undertook the research as part of his DPhil at the BDI and is now a postdoctoral researcher at the Broad Institute of MIT and Harvard, said:
‘Essentially, we are reconstructing the genomes of our ancestors and using them to form a series of linked evolutionary trees that we call a “tree sequence”. We can then estimate when and where these ancestors lived. The power of our approach is that it makes very few assumptions about the underlying data and can also include both modern and ancient DNA samples.’
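The idea of a “tree sequence” can be made concrete with a toy example. The sketch below is only an illustration, not the authors' software: the `LocalTree` class, the helper functions and the node numbers are all hypothetical. It stores one small tree per stretch of genome and looks up the most recent common ancestor of two samples at a given position.

```python
from dataclasses import dataclass


@dataclass
class LocalTree:
    """One tree covering the genome interval [left, right). Hypothetical toy structure."""
    left: int
    right: int
    parent: dict  # maps a node ID to the ID of its parent in this tree


def tree_at(trees, position):
    """Return the local tree whose interval contains the given genome position."""
    for tree in trees:
        if tree.left <= position < tree.right:
            return tree
    raise ValueError(f"position {position} is not covered by any tree")


def ancestors_of(tree, node):
    """List the node followed by its ancestors, walking up to the root."""
    path = [node]
    while node in tree.parent:
        node = tree.parent[node]
        path.append(node)
    return path


def mrca(tree, a, b):
    """Most recent common ancestor of nodes a and b in one local tree."""
    seen = set(ancestors_of(tree, a))
    for node in ancestors_of(tree, b):
        if node in seen:
            return node
    return None


# Invented data: samples are nodes 0, 1 and 2; inferred ancestors are 3, 4 and 5.
trees = [
    LocalTree(0, 500, {0: 3, 1: 3, 3: 4, 2: 4}),
    LocalTree(500, 1000, {1: 5, 2: 5, 5: 4, 0: 4}),
]

print(mrca(tree_at(trees, 100), 0, 1))  # ancestor shared in the first region
print(mrca(tree_at(trees, 800), 0, 1))  # a different ancestor further along
```

In this toy data, samples 0 and 1 share ancestor 3 in the first region but only meet at the older ancestor 4 in the second, which is the sense in which relatedness varies “along all the points of the genome”.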
The study integrated data on modern and ancient human genomes from eight different databases, and included a total of 3,609 individual genome sequences from 215 populations. The ancient genomes included three Neanderthal genomes, a Denisovan* genome, and a family of four people who lived in Siberia around 4,600 years ago.
The algorithms predicted where common ancestors must be present in the evolutionary trees to explain the patterns of genetic variation. The resulting network contained almost 27 million ancestors. After adding location data on these sample genomes, the authors used the network to estimate where the predicted common ancestors had lived. The results successfully recaptured key events in human evolutionary history, including the migration out of Africa.
Although the genealogical map is already an extremely rich resource, the research team plans to make it even more comprehensive by continuing to incorporate genetic data as it becomes available. Because tree sequences store data in a highly efficient way, the dataset could easily accommodate millions of additional genomes. Dr Wong said:
‘This study is laying the groundwork for the next generation of DNA sequencing. As the quality of genome sequences from modern and ancient DNA samples improves, the tree will become even more accurate and we will eventually be able to generate a single, unified map that explains the descent of all the human genetic variation we see today.’
Dr Wohns added:
‘While humans are the focus of this study, the method is valid for most living things, from orang-utans to bacteria. It could be particularly beneficial in medical genetics, in separating out true associations between genetic regions and diseases from spurious connections arising from our shared ancestral history.’
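The step of estimating where predicted ancestors lived, mentioned earlier, can be illustrated with a simple sketch. It assumes, purely for illustration, that each ancestor is placed at the geographic midpoint of its immediate descendants; the article does not specify the estimator used, and the function names, node numbers and coordinates below are hypothetical.

```python
import math


def to_xyz(lat, lon):
    """Convert latitude/longitude in degrees to a point on the unit sphere."""
    lat_r, lon_r = math.radians(lat), math.radians(lon)
    return (
        math.cos(lat_r) * math.cos(lon_r),
        math.cos(lat_r) * math.sin(lon_r),
        math.sin(lat_r),
    )


def midpoint(points):
    """Geographic midpoint of (lat, lon) pairs, averaged in 3D so that
    points either side of the 180-degree meridian are handled sensibly."""
    xs, ys, zs = zip(*(to_xyz(lat, lon) for lat, lon in points))
    n = len(points)
    x, y, z = sum(xs) / n, sum(ys) / n, sum(zs) / n
    lat = math.degrees(math.atan2(z, math.hypot(x, y)))
    lon = math.degrees(math.atan2(y, x))
    return (lat, lon)


def locate_ancestors(children, sample_locations):
    """Assign each ancestor the midpoint of its children's locations.

    children: maps an ancestor ID to the list of its child node IDs.
    sample_locations: maps a sample ID to its known (lat, lon).
    """
    locations = dict(sample_locations)

    def locate(node):
        if node not in locations:
            locations[node] = midpoint([locate(c) for c in children[node]])
        return locations[node]

    for node in children:
        locate(node)
    return locations


# Invented coordinates for three samples (0, 1, 2) and two ancestors (3, 4).
samples = {0: (0.0, 0.0), 1: (0.0, 90.0), 2: (0.0, 135.0)}
children = {3: [0, 1], 4: [3, 2]}
locations = locate_ancestors(children, samples)
print(locations[3], locations[4])
```

Because each ancestor's position depends only on its descendants, locations propagate from the sampled genomes upwards through the tree, so older ancestors are placed using progressively more indirect evidence.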
Notes
* Denisovans are a type of extinct human that is distantly related to Neanderthals. They are thought to have lived in Siberia and East Asia from about 400,000 years ago until around 40,000 years ago.
** The study was a collaboration between the BDI, Oxford; the Broad Institute of MIT and Harvard, USA; Harvard University, USA; and the University of Vienna, Austria.