A genomic portrait of American populations
Date
2021-07-07
Abstract
The evolution of American populations has been addressed by several multidisciplinary studies. Our knowledge of how the genetic diversity of the Americas took shape is still incomplete, although genetic studies keep adding new details on the topic. The development of new technologies such as next-generation sequencing (NGS), together with other technical advances, has made it possible to extract and analyse DNA from ancient specimens, making "ancient genomics" (aDNA) one of the many fundamental tools for understanding our ancestors' past. Moreover, these technologies have enormously increased the amount of genomic data from around the world, including from the American continents.
Although the Americas were the last continents reached by our sapiens ancestors, the processes that shaped their genetic variation have been highly complex. They have been the subject of many genetic studies for more than three decades. At first, population-genetic research on American populations was dominated by uniparental genetic systems, starting with mitochondrial DNA (mtDNA) and soon including Y chromosome (chrY) analysis. The latter showed that the two founding chrY haplogroups of Native Americans were probably hg C and hg Q, found in about 5% and 75% of Native American males, respectively. However, the resolution of these haplogroups did not improve substantially until a few years ago.
The first publication of this doctoral thesis (Ref I) aims to investigate the genetic history of the Americas from a male perspective by dissecting the pan-American haplogroup Q at high resolution, and to build a comprehensive and detailed phylogeography of haplogroup Q and its sub-lineages.
Uniparental genetic systems can be regarded as two loci used to understand the female and male perspectives of human history. They can describe only two of the thousands of ancestors involved in shaping the genetic legacy of present-day populations. A much larger number of ancestors is represented in the autosomal markers of the genome. Autosomal markers are therefore essential for understanding the timing and dynamics of population movements in the Americas. Thanks to archaeological and genetic evidence, it is now accepted that the first people to reach North America came from Siberia, crossing the Bering land bridge after the Late Glacial. The initial settlements were followed by extensive human migrations that reached southern South America relatively quickly, as early as ~15,000 years ago. Several recent studies have provided new information on this topic by reconstructing the genomic history of Indigenous groups from different regions of the Americas, but the Isthmo-Colombian area has so far been missing.
The second publication of this doctoral thesis (Ref II) therefore applies the analysis of both ancient and present-day DNA data to reconstruct the genomic history of the Isthmo-Colombian area. Its aim is to establish the genomic background of the Indigenous populations of Panama in order to assess variability within the Isthmus and to clarify the genomic history of pre-Columbian Americans by evaluating the connections between the Isthmo-Colombian area and the rest of the Americas.
In addition to the initial migrations, American populations derive from several admixture events, beginning with colonisation and the Atlantic slave trade. Furthermore, many waves of migration followed by local admixture took place over the last two centuries, and their impact remains largely unexplored.
The third publication of this doctoral thesis (Ref III) examines how later migrations shaped the genomic background of admixed American populations. Specifically, the study aims to reconstruct ancestry components at high resolution, estimate the time of admixture, examine the demographic evolution of ancestries from different continents after admixture, and assess the extent and strength of sex-biased gene flow dynamics in admixed American populations.
The main results and conclusions of this doctoral thesis are as follows:
• A high-resolution phylogenetic tree of haplogroup Q was established and dated, providing new information on the geographic distribution of its Eurasian and American branches in present-day and ancient samples.
• For the first time, two distinct Y chromosome lineages were identified that reflect the two main ancestry components (SNA and NNA) previously described in recent genomic studies. These lineages probably diverged in eastern Beringia before entry into the Americas, which followed two routes: the coastal route (SNA, Q-Z780/Q-M848) and the inland route (NNA, Q-Y4276). Once there, the two ancestry components probably admixed very early in North America, as indicated by the ancient Kennewick Man, whose nuclear genome belongs to the SNA component (Q-M848) but whose mtDNA haplogroup is from NNA (X2a).
• Two notable expansions of SNA lineages in Meso- and South America were detected, one about 15,000 years ago, right after the initial peopling, and another 3,000 years ago, following climatic changes and local cultural shifts.
• Considerable genetic structure was detected within Panama, broadly overlapping with the past and present Indigenous groups analysed in this study. These groups have also been related for thousands of years, especially in the Caribbean region on the border between western Panama and southeastern Costa Rica. Fewer genetic similarities were found among the Indigenous groups of eastern Panama, and between the Emberá and the pre-Hispanic Panamanians who lived in the area around Old Panama before contact with Europeans.
• A previously undescribed ancestry component was discovered among the ancient Indigenous peoples of the Americas. This component occurs only in this region and is detectable in ancient pre-Hispanic individuals and in people who self-identify as descendants of present-day Indigenous, African and Hispano-Indigenous groups. It reached the Isthmus of Panama more than 10,000 years ago, spread locally in the early Holocene, and left genomic traces that persist to the present day, especially among the Guna people.
• The European genetic contribution to American populations reflects the geopolitical situation at the time of colonisation. Several secondary European sources were found to have contributed to a considerable share of American populations, e.g. Italy in Brazil and Argentina, and Central Europe in Brazil. A differential contribution of African sources to American populations was inferred.
• The times of admixture coincide with waves of migration from Europe and reflect changes over time in the African regions that were exploited.
• Analysis of the demographic impact of admixture reveals a general pattern of decline and recovery in several of the populations studied, corresponding to the beginning and end of the colonial era. However, Peru and Mexico are characterised by different demographic trajectories.
• Analysis of sex-biased admixture dynamics suggests that more American women than men contributed to present-day populations. Conversely, the contribution of European men was greater than that of women from the same continent. By contrast, some populations, but not all, showed evidence of a larger female contribution for African ancestry, which partly conflicts with historical records.
The evolution of American populations has been the subject of several multidisciplinary studies. Our knowledge regarding the formation of the genetic diversity of the Americas is still incomplete, although genetic studies are constantly adding new details on this topic. The development of new technologies, such as Next Generation Sequencing (NGS), together with other technical improvements, has made it possible to extract and analyse DNA from ancient specimens, making "ancient genomics" (aDNA) one of the many fundamental tools to understand our ancestors' past. Moreover, these technologies have enormously increased the amount of genomic data available worldwide, including data from the Americas.
Although the Americas were the last continents to be reached by our sapiens ancestors, their genetic variation processes have been extremely complex. These processes have been the topic of many genetic surveys for more than three decades. In the beginning, uniparental systems dominated the population genetics research of American populations, starting with mitochondrial DNA (mtDNA) and soon including Y chromosome (chrY) analysis. The latter revealed that the two founding Native American chrY haplogroups were probably Hg C and Hg Q, accounting for about 5% and 75% of Native American males, respectively. However, the resolution of these haplogroups did not improve substantially until a few years ago.
The first publication included in this dissertation (Ref I) aims to investigate the genetic history of the Americas from a male perspective through a fine dissection of the Pan-American haplogroup Q, and to reconstruct a comprehensive and detailed phylogeography of haplogroup Q and its sub-lineages.
The uniparental systems can be considered as two loci that are used to understand the female and male perspectives of human history. They can describe only two of the thousands of ancestors involved in shaping the genetic legacy of modern populations. A far larger number of ancestors is represented in the autosomal markers. Therefore, autosomal markers are crucial to understanding the timing and the dynamics of population movements in the Americas. Thanks to archaeological and genetic evidence, it is now accepted that the first people arriving in North America came from Siberia, passing through Beringia after late Glacial times. Initial settlements were followed by widespread population movements that reached southern South America relatively fast, as early as ~15 thousand years ago. Several recent studies have provided new information about this subject, reconstructing the genomic history of Indigenous groups from different regions of the Americas, but data from the Isthmo-Colombian area are still lacking.
Hence, the second publication of this thesis (Ref II) employs both ancient and modern DNA data analysis to reconstruct the genomic history of the Isthmo-Colombian area. It aims to define the genomic background of Panamanian Indigenous populations in order to evaluate the intra-Isthmus variability and to shed light on the genomic history of pre-Columbian Americans by assessing the connection between the Isthmo-Colombian area and the rest of the Americas.
Besides the first migrations, American populations result from several admixture events since the colonial era and the Atlantic slave trade. Moreover, many waves of migration followed by local admixture occurred in the last two centuries, the impact of which has been largely unexplored.
The third publication of this thesis (Ref III) explores how more recent migrations shaped the genomic background of admixed American populations. In particular, this study aims to reconstruct the fine-scale ancestry composition, estimate the time of admixture, examine the demographic evolution of different continental ancestries after admixture, and assess the extent and magnitude of sex-biased gene-flow dynamics in admixed American populations.
The main results and conclusions of this research thesis are the following:
• A high-resolution haplogroup Q phylogeny was ascertained and dated, presenting new insights into the geographic distribution of its Eurasian and American branches in modern and ancient samples.
• For the first time, two distinct Y chromosome lineages were observed that reflect the two main ancestral components (SNA and NNA) described earlier by recent genomic studies. The differentiation of these lineages probably occurred in eastern Beringia before they entered the Americas through two routes: the coastal route (SNA, Q-Z780/Q-M848) and the internal route (NNA, Q-Y4276). Once there, these two ancestral components probably admixed very early in North America, as suggested by the ancient Kennewick individual, whose nuclear genome belongs to SNA (Q-M848) yet who carries an NNA mtDNA haplogroup (X2a).
• Two significant expansions of the SNA lineages in Meso- and South America were revealed, one around 15 kya, early after the first peopling, and another at 3 kya, following climatic changes and local cultural shifts.
• A remarkable genomic structure within Panama was identified, mainly overlapping with the past and present Indigenous groups analysed in this study. These groups also show relatedness over thousands of years, especially in the Caribbean region on the border between western Panama and southeastern Costa Rica. Fewer genetic similarities were identified among the Indigenous groups located in eastern Panama, and between the Emberá and the pre-Hispanic Panamanians who lived in the area around Old Panama before European contact.
• A previously undescribed ancestry among ancient Indigenous peoples of the Americas was revealed. This ancestry is unique to the region and detectable in the ancient pre-Hispanic individuals and the self-identified descendants of current Indigenous, African and Hispano-Indigenous groups. It reached the Panama land bridge over 10 thousand years ago, expanded locally during the early Holocene, and left genomic traces up to the present day, especially among the Guna.
• The European genetic contribution to American populations mirrors the geopolitical situation during colonisation. Several secondary European sources contributing to a substantial proportion of American populations were revealed, e.g. Italy in Brazil and Argentina, and Central Europe in Brazil. A differential contribution of African sources among American populations was inferred.
• Times of admixture are concordant with migration waves from Europe and reflect differences in the African areas exploited through time.
• The investigation of the demographic impact of admixture reveals a general decline and recovery pattern in several of the populations under study, corresponding to the beginning and the end of the Colonial Era. However, Peru and Mexico are characterised by different demographic trajectories.
• The analysis of sex-biased admixture dynamics suggests that more American females than males have contributed to the modern populations. Conversely, European males made a larger contribution than females from the same continent. For African ancestry, some populations, but not all, showed evidence of a higher female contribution, partially conflicting with historical records.
Description
The electronic version of the dissertation does not include the publications
Keywords
America, population, population genetics, genetic diversity, phylogeny