PMC Articles

Stable population structure in Europe since the Iron Age, despite high mobility

PMCID: PMC10827293

PMID: 38288729


Abstract

Ancient DNA research in the past decade has revealed that European population structure changed dramatically in the prehistoric period (14,000–3000 years before present, YBP), reflecting the widespread introduction of Neolithic farmer and Bronze Age Steppe ancestries. However, little is known about how population structure changed from the historical period onward (3000 YBP – present). To address this, we collected whole genomes from 204 individuals from Europe and the Mediterranean, many of which are the first historical period genomes from their region (e.g. Armenia and France). We found that most regions show remarkable inter-individual heterogeneity. At least 7% of historical individuals carry ancestry uncommon in the region where they were sampled, some indicating cross-Mediterranean contacts. Despite this high level of mobility, overall population structure across western Eurasia is relatively stable through the historical period up to the present, mirroring geography. We show that, under standard population genetics models with local panmixia, the observed level of dispersal would lead to a collapse of population structure. Persistent population structure thus suggests a lower effective migration rate than indicated by the observed dispersal. We hypothesize that this phenomenon can be explained by extensive transient dispersal arising from drastically improved transportation networks and the Roman Empire’s mobilization of people for trade, labor, and military. This work highlights the utility of ancient DNA in elucidating finer scale human population dynamics in recent history.


Full Text

Prehistoric ancient genomes have allowed disentangling the movements of people and technologies across two major demographic transitions in prehistoric western Eurasia: first the farming transition ~7500 BCE (Lazaridis et al., 2014; Skoglund et al., 2012), and later the Bronze Age Steppe migrations ~3500 BCE (Haak et al., 2015). Over the course of generations, genetically differentiated peoples across western Eurasia came together and admixed. As a result, most present-day European genomes can be modeled as a three-way mixture of these prehistoric groups: Western Hunter-Gatherers, Neolithic farmers, and Bronze Age Herders from the Steppe (Haak et al., 2015; Lazaridis et al., 2014) with minor contributions from other groups (Antonio et al., 2019; Fernandes et al., 2020; Lazaridis et al., 2016; Morrison et al., 2020; Mathieson et al., 2018). These ancestry components are present at different proportions across western Eurasia, leading to a pattern where the genetic structure of Europe mirrors its geography (Novembre et al., 2008).
Given that the major ancestry components of present-day west Eurasians were largely established by the end of the Bronze Age, it is unclear how and what types of demographic processes impacted the genetic make-up of western Eurasia over the last ~3000 years, from the end of the Bronze Age to present-day. Recent studies of historical period genomes from individual regions shed light on this question; they paint a picture of heterogeneity and mobility, rather than of stable population structure. In the city of Rome alone, the population was dynamic and harbored a large diversity of ancestries from across Europe and the Mediterranean from the Iron Age (~1000 BCE) through the Imperial Roman period (27 BCE-300 CE; Antonio et al., 2019). Historical genomes from the Iberian Peninsula also highlight gene flow from across the Mediterranean (Olalde et al., 2019).
These regional reports fit well with archaeological and historical records. By the Iron Age, sea travel was already common, enabling peoples from across the Mediterranean to come into contact for trade (Abulafia, 2011; Broodbank, 2013). Subsequently, the Roman Empire leveraged its organization, labor force, and military prowess to build upon existing waterways and roads throughout Europe and create a united Mediterranean for the only time in history (Beard, 2015; Harper, 2017; Symonds, 2017). Not only did the Empire provide a means for movement, it also provided a reason for individuals to move. Empire building activities, broadly categorized into military, labor, and trade, pulled in people and resources from inside and outside the Empire (Scheidel, 2019).
We collected whole genomes from 204 individuals across 53 archaeological sites in 18 countries spanning Europe and the Mediterranean (Figure 1—figure supplement 1), 26 of these individuals were recently reported (Moots et al., 2022). This collection includes the first historical genomes (Iron Age and later, i.e. after 1000 BCE) from present-day Armenia, Algeria, Austria, and France. Dates for 126 samples were directly determined through radiocarbon dating, and were used alongside archaeological contexts to infer dates for the remaining samples.
For downstream integration with published data, pseudohaploid genotypes were called for the 1240 k SNP panel (Mathieson et al., 2015), resulting in a median of 685,058 SNPs (167,000–1,029,345) per sample. We analyzed newly reported genomes in conjunction with 2033 present-day genomes, 1998 prehistoric genomes, and 764 published historical period genomes (Clemente et al., 2021; Kovacevic et al., 2014, Mallick et al., 2023; Pagani et al., 2016; Saupe et al., 2021; Žegarac et al., 2021, primary AADR sources cited in Materials and methods). Genomes were grouped by regions and time periods (Figure 1) and analyzed using principal component analysis (PCA) and qpAdm modeling (Haak et al., 2015).
Each circle represents a location, the size of the circle corresponds to the number of individuals sampled from that location. Circles are colored by their time period: Bronze Age is green (Pian Sultano), Iron Age is yellow (two recently reported sites Tarquinia and Kerkouane), Imperial Rome and Late Antiquity is dark blue, Medieval Ages and Early Modern are light blue (Palazzo della Cancelleria, Velić, Gardun, Mirine-Fulfinum). Note that the Bronze Age and Iron Age sites were recently reported in Moots et al., 2022.
To investigate historical population structure, we categorized the data into 14 geographical regions, split into three sub-periods of the historical period: Iron Age (1000–1 BCE), Imperial Rome & Late Antiquity (1–700 CE), and Medieval Ages & Early Modern (700–1950 CE). We then characterized inter-individual heterogeneity within these spatio-temporal groups by examining (1) variation of projections onto a PCA space of present-day genomes (Figure 2—figure supplement 1), (2) genetic groups identified by qpAdm and clustering across time within a region, and (3) admixture modeling of genetic groups.
A majority of regions have highly heterogeneous populations in at least one historical time period (Figure 2—figure supplement 2). This is illustrated by both the visual spread in PCA and the genetically distinct clusters of individuals based on pairwise modeling with qpAdm (Haak et al., 2015; Harney et al., 2021). On average, we identified 10 genetic clusters within each region present during the historical period, with a minimum of two and a maximum of 23. With genetically similar samples grouped together, we have more power relative to individual-level analyses when performing admixture modeling on clusters of interest using qpAdm.
Regional vignettes reveal various patterns of historical population structure. In Armenia, for example, the population is highly homogeneous at any given time (Figure 2). After the Copper Age, there are two distinct genetic clusters, separated by a temporal split around 772–403 BCE (Figure 2BC). The earlier cluster (C1) includes newly reported samples (n=5) from Beniamin and published ones (n=6) from five other sites. This cluster cannot be modeled by any single source of ancestry using existing data. The later cluster (C3), which contains newly reported samples (n=12) from Beniamin dating between 403 BCE-500 CE, is genetically similar to present-day Armenians (excluding two Kurdish individuals; Figure 2C). Despite the split, there is evidence of partial continuity between the earlier and later clusters: the later (C3) can be modeled using around 50% of the earlier cluster (C1) and an additional source of Steppe ancestry. Historical genomes from Northern Europe, particularly newly reported genomes from Lithuania and Poland, exhibit a similar level of homogeneity (Figure 2—figure supplement 2).
Each row displays data from a single study region. The first column shows a map with the sampling locations for the individuals, while columns two through four show the individuals projected onto a PCA space of present-day genomes (gray points) (populations are labeled in the far right panel in row 1 and in Figure 2—figure supplement 1). Individual ancient genomes in the map and PCA panels are colored by ancestry clusters identified using qpAdm. Colors are not matched across regions. Star points are putative outliers, that is individuals with ancestry that is underrepresented in the region. They are not colored by ancestry clusters so as to reduce visual clutter.
In contrast to the homogeneity of the Armenian population, most of the regions, including Italy, Southeastern Europe, and Western Europe, had strikingly heterogeneous populations. Newly collected samples reinforce previous findings of high heterogeneity in Rome, including a large portion of the population having affinities for present-day Near Eastern populations (Antonio et al., 2019; Posth et al., 2021; Figure 3—figure supplement 1). Interestingly, Southeastern European and Western European individuals during the Imperial Roman & Late Antiquity period also exhibit high heterogeneity, on par with that of contemporaneous Italy (Figures 3 and 4).
(A) Sampling locations of genetic clusters are represented by a single point per location. Outlier ancestries are black stars, all others are open circles colored by genetic cluster. (B) Colored bars span the minimum and maximum of the date ranges of samples (95% confidence interval from radiocarbon dating or archaeological range). Points are the mean of an individual’s date range. (C) Projections of the ancient genomes onto a PCA of present-day genomes (gray points). Population labels for the PCA reference space are shown in Figure 2C. Present-day genomes from Southeastern Europe are shown with dark gray open circles.
Ancient Italian genomes (colored points) from the Imperial Roman and Late Antiquity period were projected onto principal components of present-day genomes (gray points, populations labeled in Figure 2—figure supplement 1). Present-day Italian genomes are highlighted by a gray filled ellipse. Star points are outliers and circle points are non-outliers. Outlier clusters that can be modeled using contemporaneous populations are labeled with the potential source region.
(A) Sampling locations of genetic clusters are represented by a single point per location. Outlier ancestries are black stars, all others are open circles colored by genetic cluster. (B) Colored bars span the minimum and maximum of the date ranges of samples (95% confidence interval from radiocarbon dating or archaeological range). Points are the mean of an individual’s date range. (C) Projections of the ancient genomes onto a PCA of present-day genomes (gray points). Population labels for the PCA reference space are shown in Figure 2C. Present-day genomes from Southeastern Europe are shown with dark gray open circles.
Furthermore, these ancestries are often shared across regions. In Southeastern Europe, a core group of individuals have ancestry similar to that of present-day and contemporaneous Central Europeans (C6), while other clusters have ancestry similar to that of Northern Europeans (C7) and Eastern Mediterraneans (C8) (Figure 3C). These ancestry groups are found in contemporaneous Italy and Western Europe as well (Figure 4C, Figure 3—figure supplement 1). We also observe individuals of eastern nomadic ancestry, similar to that of Sarmatian individuals previously reported, in both Western Europe (C8, n=2) and Southeastern Europe (C2, n=2).
In total, we identified 11% of individuals as outliers, and could connect 7% of individuals to a putative source in a different region (Figure 5A). Based on the regions where these outliers and their sources originated, we created a network to illustrate their movements (Figure 5B). This network reveals the interconnectedness of Europe and the Mediterranean during the historical period. For example, as discussed above, the Armenian population is quite homogeneous (Figure 2). Unsurprisingly, no outliers were found within Armenia; however, we found outlier individuals in the Levant and Italy who can be putatively traced back to Armenia according to their ancestry (Figure 5C; blue outgoing arrows from Armenia). In contrast, the heterogeneous population in Italy connects it to many other regions, with bi-directional movement in most cases. In North Africa, outliers found in Iron Age Tunisia (Moots et al., 2022) indicate movements from many regions in Europe, and North African-like outliers were found in Italy and Austria (Western Europe). North African ancestry in Italy is supported by a single previously reported individual from the Imperial Roman period (R132; Antonio et al., 2019). Similar North African ancestry in Western Europe is supported by a single individual, R10667, from Wels, Austria, a site located on the frontier of the Roman Empire (C28 in Figure 4). This individual from Austria can be modeled using Canary Islander individuals from the Medieval Ages or an Iron Age outlier (distinguished by having more sub-Saharan ancestry) from Kerkouane, a Punic city near Carthage in modern-day Tunisia.
The 7% estimate for outliers with source should be considered conservative for the proportion of ‘non-local’ individuals. There are several cases where a cluster comprises more than 5% of the individuals in the region, but are clearly of a different ancestry than the majority and seem to be transient (only found in a single sub-period of the historical period). For example, in Southeastern Europe (Figure 3B), Imperial Roman & Late Antiquity individuals in C8 are (1) of distant ancestry (Near Eastern) and (2) not found in previous or subsequent time periods. However, since there are five individuals in this cluster, it does not meet our strict criteria for outlier consideration. Additionally, many clusters of underrepresented ancestry cannot be modeled as one-component models because they are recently admixed (i.e. require two or more ancestry components) or of ancestry not sampled elsewhere. Thus, we expect the actual proportion of individuals involved in long distance movements to be higher than reported here.
The remarkable amount of heterogeneity and mobility in the historical period leads to the question of what impact this might have had on population structure over time. To investigate this, we sought to quantify the overall change in population structure across time, from prehistoric to present-day. To assess the spatial structure of population differentiation, we calculated FST across groups of individuals on a sliding spatial grid in each time period and related it to their mean geographic distance. In each time period, we recovered the classical pattern of isolation-by-distance (Figure 6A), where individuals closer in geographic space are also more similar genetically. Across time periods, we see a large decrease in overall FST from the Mesolithic & Neolithic periods to the Bronze Age (approximately 10,000–2300 BCE), coinciding with the major prehistoric migrations (Haak et al., 2015; Lazaridis et al., 2014). From the Bronze Age onward, however, FST does not decrease further with time, indicating that the level of genetic differentiation across space is relatively stable from the Bronze Age to present-day.
To assess not only the amount, but also the structure of geographic population differentiation, we compared the ‘genetic maps’ of historical period and present-day genomes. To construct these ‘maps’, we performed principal component analysis on 829 present-day European and Mediterranean genomes sampled across geographical space (Figure 6B, bottom) and projected historical period genomes onto the same PC space. Echoing close correspondence between genetic structure and geographic space in present-day Europeans (Novembre et al., 2008), we recovered similar spatial structure for historical samples as well, although noisier due to a narrower sampling distribution and higher local genetic heterogeneity (Figure 6B, middle). The similarity in structure between present-day and historical period is especially striking in comparison to a projection of prehistoric genomes, which shows much weaker correspondence to the present-day PCA as well as to geographic space (Figure 6B, top). Together, our analyses indicate that European and Mediterranean population structure has been relatively stable over the last 3000 years.
This raises the question: is it surprising for stable population structure to be maintained in the presence of ~7–11% long-range migration? To address this, we simulated Wright-Fisher populations evolving neutrally in continuous space. In these simulations, spatial population structure is established through local mate choice and limited dispersal, which we calibrated to approximately match the spatial differentiation observed in historical-period Europe (Figure 6A, Figure 7A and Figure 7—figure supplement 1, maximum FST of ~0.03). We then allowed a proportion of the population to disperse longer distances, empirically matching the migration distances we observed in the data during the historical period (Figure 7—figure supplement 2). Even with long-range dispersal as low as 4%, we observe decreasing FST over 120 generations (~3000 years with a generation time of 25 years) as individuals become less differentiated genetically across space (Figure 7B). At 8%, FST decreases dramatically within 120 generations as spatial structure collapses to the point that it is hardly detectable in the first two principal components (Figure 7C). These simulations indicate that under a basic spatial population genetics model we would expect structure to collapse by present-day given the levels of movement we observe.
A possible explanation for this apparent paradox is that our simulations did not capture some key features of human behavior and population dynamics. In the simulated populations, migration implies both movement and reproduction with local random mate choice. However, in real human populations migration can be more complex: people do not necessarily reproduce where they migrate, and reproduction is not necessarily random. We hypothesize that in the historical period there was an increasing decoupling of movement and reproduction, compared to prehistoric times. For the spread of Farmer and Steppe ancestry, we know that these prehistoric migrations would take hundreds of years to traverse the continent (Allentoft et al., 2015; Haak et al., 2015; Lazaridis et al., 2016). In contrast, in the historical period, there were dense travel networks of roads and waterways as well as clear incentives for cross-Mediterranean and cross-continental movement (Abulafia, 2011; Beard, 2015; Broodbank, 2013; Symonds, 2017). This enabled people to travel cross-continental distances on the order of weeks or months, well within their lifetimes (Figure 5—figure supplements 2 and 3; Scheidel, 2015).
The Roman Empire is particularly important in understanding how transient mobility could become a unique hallmark of this period. During the expansion of the Empire, existing and new cities quickly expanded as hubs for trade and labor. Urban-military complexes emerged along the frontier as military forces established themselves and drew in local communities which sought protection or economic benefit (Verhagen et al., 2019). To support these rapidly growing economic city-centers, human capital beyond the local population was necessary, thus drawing in people from far away places either freely or forcibly (e.g. slavery, military). According to a longstanding historical hypothesis, the Urban Graveyard Effect, the influx of migrants in city-centers disproportionately contributed to the death rate over the birth rate; a process which would contribute to observing individuals as ‘transient’ migrants (Tacoma, 2016). Long-range, transient migration, combined with the Roman Empire’s highly efficient travel networks (Scheidel et al., 2007; Oleson, 2009; Scheidel, 2015) may explain the genetically heterogeneous populations, especially along the frontier regions (e.g. Serbia, Croatia, and Austria).
With transient mobility as the main contributor to the observed heterogeneity, it remains unclear what additional demographic processes contributed to the maintenance of spatial genetic structure. The collapse of the Empire involved a loss of urban-military complexes and depopulation of cities, followed by ruralization (Burgess, 2007; Dey, 2015; Roymans et al., 2020). Without the Empire incentivising trade and movement, there may have been little motive for individuals to remain in these suddenly remote regions.
If this hypothesis is true, we would expect a reduction in local genetic heterogeneity after the collapse of the Empire. Unfortunately, we do not have this period sampled densely enough to assess this comprehensively. The lack of samples is further amplified by the fact that ancient DNA comes from archaeological excavations, which tend to be enriched in urban areas; a stone mausoleum in the city center, for example, will produce more surface scatter than a wood farmhouse, making urban areas more likely for excavation (Bowes, 2011). This makes it difficult to comprehensively address differences in rural versus urban demography. Collecting more genetic data from both urban and rural contexts across the historical period will be a valuable future step in understanding how spatial population structure was maintained. Furthermore, it could elucidate the role of other historical events and peoples, such as the Franks, Lombards, Visigoths, and Huns, during the Migration Period.
The archaeological context for ancient individuals reported in this study is detailed in Supplementary file 1. Site descriptions were written by the contributing archaeologists. Descriptions of individual-level burials are included where possible.
In the study we use a single date estimate (for both newly reported and published samples) which is the midpoint of the 95% confidence interval when using AMS dates, and the average of the lower and upper bound inference dates when using archaeological context for dating. The dating approach used for each sample is included in files Supplementary file 1 (archaeological context) and Supplementary file 2 (sample metadata). The full AMS and calibration results are reported in Supplementary file 3.
The 204 ancient genomes reported in this study, 26 of which were recently reported in Moots et al., 2022, represent a subset of samples screened from 53 archaeological sites across 18 countries. We isolated and finely ground the cochlear regions of the petrous bones in dedicated clean room facilities at the University of Vienna following the protocols described in Pinhasi et al., 2019; Pinhasi et al., 2015. Using 50 mg of bone powder, DNA was extracted by 18 hr incubation of the powder in a solution of Proteinase-K and EDTA. DNA was eluted in 50 μl 10 mM Tris-HCl, 1 mM EDTA, 0.05% Tween-20, pH 8.0 as in Dabney et al., 2013; Rohland and Hofreiter, 2007. 12.5–25 µL of DNA extract was used to prepare partial uracil–DNA–glycosylase (UDG) double stranded libraries as described in Rohland et al., 2015. After a partial (30 minute) UDG treatment, library preparation followed a modified version of the Meyer and Kircher, 2010 protocol (Meyer and Kircher, 2010): the initial DNA fragmentation step was not required and MinElute PCR purification kits (Qiagen) were used for all library clean-up steps. Libraries were measured on a qPCR to determine the ideal cycle number to avoid over or under-amplification. 10-20 uL of library was then double indexed using Agilent PfuTurbo Cx HotStart DNA Polymerase with conditions: 95 °C for 5 min followed by the qPCR-determined cycles of 95 °C for 15 s, 60 °C for 30 s and 72 °C for 30 s with a final elongation at 72 °C for 5 min. After indexing, the libraries were purified using the MinElute system (Qiagen) and eluted in 25 μL of 1 mM EDTA, 0.05% Tween-20. Libraries were screened based on Qubit concentration and visual validation of Bioanalyzer peaks for an initial low coverage (NextSeq or Novaseq) screening run.
Adapters were removed from sequence reads using Cutadapt (v1.14) (Martin, 2011). Then, for each sample, reads were processed further (a) with the 2 base pairs at either end of the reads trimmed off and (b) without trimming. Since partial UDG treatment was performed on the libraries, a damage signature consisting of elevated C>T transitions on the 5’ end and G>A transitions on the 3’ end should remain at the ends of reads. Therefore, analyzing untrimmed, aligned reads would allow us to assess the amount of the ancient DNA damage signature present in a sample, and to use this as a criteria for authenticating that the sampled DNA is ancient. Other than the variable trimming parameter for the ends of the reads, all other parameters remained the same for both screening and high coverage sequencing data.
Following variable trimming, reads were filtered for minimum length of 30, then aligned to hg19 using bwa (0.7.15-r1140; Li and Durbin, 2009), with seed length disabled (-l 350). For each sample, aligned reads were sorted by coordinate using Picard’s SortSam (version 2.9.0–1-gf5b9f50-SNAPSHOT) and read groups were added using Picard’s AddOrReplaceReadGroups (version 2.9.0–1-gf5b9f50-SNAPSHOT) (http://broadinstitute.github.io/picard/). Reads with mapping quality <25 (including unaligned reads) were filtered out. For higher coverage sequencing runs, this process was parallelized by splitting raw fastq files and merging after alignment, sorting, and quality filtering. Duplicates were removed using samtools rmdup (http://www.htslib.org/doc/samtools.html). Genome-wide and chromosomal coverage were assessed using depth-cover (version 1.0.3, https://github.com/jalvz/depth-cover; Alvarez, 2017).
Samples were screened and selected using the following criteria: (1)>20% reads aligned to the hg19 build of the human genome; (2) a C>T mismatch rate at the 5’-end and G>A at the 3’-end of the sequencing read of 4% or above (characterized with mapDamage v2.0.8) (Jónsson et al., 2013); (3) library complexity estimates indicating that a minimum coverage of 0.5 x would be achievable with further sequencing, (4) with a contamination level ≤ 5%.
Contamination rates were estimated with three methods: (1) damage pattern and polymorphism in mitochondrial DNA with Schmutzi (Renaud et al., 2015), (2) atypical ratios of coverages of X and Y chromosomes to autosomes calculated with ANGSD (Korneliussen et al., 2014), and (3) for male samples, high heterozygosity on non-pseudo-autosomal region of the X chromosome (chrX:5000000–154900000 in hg19) with the ‘contamination’ tool in ANGSD (Korneliussen et al., 2014). If the contamination estimate for any of these three methods was above 5%, we considered the sample contaminated and excluded it from downstream analysis (n=9).
Pseudohaploid genotypes for study samples were called by randomly choosing one allele from each site where there was read coverage, following the approach and software provided by Stephan Schiffels (https://github.com/stschiff/sequenceTools; Schiffels, 2022). Variants were called for the 1240 k SNP panel, which is commonly used for capture-based sequencing of ancient samples (Mathieson et al., 2015). For the newly reported samples, a median of 685,058 SNPs (167,000–1,029,345) were covered per sample. Data was output in eigenstrat format. This pipeline was also used to call genotypes for two published ancient DNA datasets which at the time were only available in BAM (sequence read) format (Clemente et al., 2021; Žegarac et al., 2021).
Newly processed pseudohaploid data was merged with several datasets. Most of the published data was retrieved from the Allen Ancient Data Resource (AADR) v44.3 (January 2021) (Allen Ancient DNA Resource, 2021; Mallick et al., 2023): a compilation of pseudohaploid and diploid genotypes for 5,225 ancient and 3,720 present-day individuals (1000
Auton et al., 2015; Agranat-Tamir et al., 2020; Allentoft et al., 2015; Amorim et al., 2018; Antonio et al., 2019; Bergström et al., 2020; Biagini et al., 2019; Brace et al., 2019; Broushaki et al., 2016; Brunel et al., 2020; Cassidy et al., 2020; Cassidy et al., 2016; Damgaard et al., 2018; de Barros Damgaard et al., 2018; Ebenesersdóttir et al., 2018; Feldman et al., 2019a; Feldman et al., 2019b; Fernandes et al., 2020; Fernandes et al., 2018; Fregel et al., 2018; Fu et al., 2016; Furtwängler et al., 2020; Gamba et al., 2014; Gokhman et al., 2020; González-Fortes et al., 2019; González-Fortes et al., 2017; Günther et al., 2018; Günther et al., 2015; Haber et al., 2020; Haber et al., 2019; Haber et al., 2017; Harney et al., 2018; Broushaki et al., 2016; Järve et al., 2019; Jeong et al., 2019; Jones et al., 2017; Jones et al., 2015; Keller et al., 2012; Kılınç et al., 2016; Krzewińska et al., 2018b; Krzewińska et al., 2018a; Lamnidis et al., 2018; Lazaridis et al., 2017; Lazaridis et al., 2016; Lazaridis et al., 2014; Linderholm et al., 2020; Lipson et al., 2017; Mallick et al., 2016; Malmström et al., 2019; Morrison et al., 2020; Margaryan et al., 2020; Martiniano et al., 2017; Martiniano et al., 2016; Mathieson et al., 2018; Mathieson et al., 2015; Mittnik et al., 2019; Mittnik et al., 2018; Narasimhan et al., 2019; Nikitin et al., 2019; Olalde et al., 2019; Olalde et al., 2018; Olalde et al., 2015; Olalde et al., 2014; Omrak et al., 2016; O’Sullivan et al., 2018; Patterson et al., 2012; Prüfer et al., 2017; Rivollat et al., 2020; Rodríguez-Varela et al., 2017; Saag et al., 2019; Saag et al., 2017; Sánchez-Quinto et al., 2019; Schiffels et al., 2016; Schroeder et al., 2019; Schuenemann et al., 2017; Sikora et al., 2017; Skoglund et al., 2014; Skourtanioti et al., 2020; Unterländer et al., 2017; Valdiosera et al., 2018; van den Brink et al., 2017; Veeramah et al., 2018; Villalba-Mouco et al., 2019; Wang et al., 2019; Zalloua et al., 2018). We also included relevant genetic data made available by authors that were not in the AADR: present-day genomes from the Balkans (Kovacevic et al., 2014), present-day genomes from 4 Poles, 3 Germans, and 2 Moldavians (Pagani et al., 2016), and Bronze Age Italian genomes (Saupe et al., 2021). Pseudohaploid genotypes for published Bronze Age Aegean genomes (Clemente et al., 2021) and Bronze Age Serbian genomes (Žegarac et al., 2021) were generated from BAM files using our pipeline. All published genomes were filtered for contamination based on reported contamination levels in the original study and SNP coverage based on the genomic data. All published samples that contributed to this study are listed in Supplementary file 4. To ensure maximum overlap with present-day and ancient samples in analyses, the merged dataset was subset to SNPs in the Human Origin Panel array, resulting in a total of 481,259 SNPs. For PCA and qpAdm modeling, SNPs that are transitions at CpG sites (n=76,678) were excluded since they may have arisen from DNA damage as opposed to true genetic variation.
Principal component analysis was performed on genotypes from present-day and Mediterranean individuals using smartpca v16000 (https://github.com/chrchang/eigensoft/blob/master/POPGEN/README; Chang, 2013). The following parameters were used: 5 outlier iterations (numoutlieriter), 10 principal components along which to remove outliers (numoutlierevec), altnormstyle set to NO, with least squares projection turned on (lsqproject set to YES). To calculate principal components only using present-day individuals, a file (poplistname) was provided with the population names of present-day individuals, randomly subsampled per population. After outlier removal (which removed 55 samples), 829 individuals and 480,712 SNPs were used in the initial analysis. All individuals (non ‘reference’ present-day genomes, and all of the ancient individuals) whose population was not listed in the poplistname file were projected onto the calculated principal components. In the paper, we refer to the individuals used in the calculation of principal components as belonging to the ‘reference PCA space’. These ‘reference’ genomes were used to calculate the PCs because (1) they represent a wide range of present-day variation and (2) the genotypes tend to be of high quality.
In the figures, present-day genomes used in the reference space are generally colored gray in order to illustrate the background space of genetic variation. To reduce visual clutter and emphasize the ancient genomes, these present-day ‘reference’ genomes are typically unlabeled. Labels for these populations are shown in Figure 2—figure supplement 1.
To assess the extent of genetic differentiation across geographic space within a time period, we calculated the Fixation index (FST) between groups of individuals on a sliding spatial grid. Each grid cell measured 10 degrees longitude by 10 degrees latitude, and was slid by 1 degree in both directions (north and east) nine times to build a total of 10 spatial grids. For each of these grids, pairwise FST was calculated between all populated 10-by-10 grid cells using Hudson’s estimator, correcting for unequal sample size (Bhatia et al., 2013). In addition to FST, we also calculated the average geographic distance (in kilometers) between all individuals across pairs of grid cells to assess how spatial distance relates to genetic differentiation. To visualize this relationship, we used lowess smoothing as implemented in python’s statsmodels package (statsmodels.api.nonparametric.lowess, v. 0.12.2). To infer confidence intervals for the lowess smoothing estimates, we devised a spatial bootstrapping procedure. Our bootstrap approach samples pairs of grid cells in a way that always samples all overlapping cells or none of them, so individuals are either fully included in a bootstrap replicate or not at all. This prevents double-counting individuals since they contribute to several comparisons across space due to the sliding grid.
In this workflow, we heavily utilize what we call one-component models, where we test whether two individuals or clusters of individuals form a clade relative to a chosen set of reference populations (which we define below). To clarify what we mean by one-component model, assume that we have two focal individuals i1 and i2, as well as a set of reference populations. Using the terminology of admixtools 2 as well as (Harney et al., 2021), the following four tests are equivalent with respect to the resulting p-value:
where we use the R-style notation of c(i1, i2) to denote a vector consisting of i1 and i2. In all four cases, we are testing against the null hypothesis that the two individuals do form a clade, with low p-values indicating a rejection of that hypothesis. We will call any implementation of this test a ‘one-component qpAdm model’, but the equivalence stated above shows that our one-component models are equivalent to what was called a ‘qpWave analysis’ by Fernandes et al., 2020.
For our reference populations, we chose a set similar to those previously used to model Eurasian historical genomes (Fernandes et al., 2020). However, we added two Asian populations (Laos_Hoabinhian and Onge) based on evidence of gene flow from further east in a subset of our data. The purpose of a set of reference populations in the qpAdm modeling setting is to represent components of ancestry which are differentially related to the focal individuals being tested (i.e. ‘left’, or ‘sources’ and ‘target’) in order to resolve differences in ancestry, but distally related enough to minimize the chance of recent gene flow between ‘left’ and ‘right’. Our final set of references is:
Model competition involves re-testing the model fit of each potential source after adding another potential source to the right group, first described in Lazaridis et al., 2016, and more recently detailed in Harney et al., 2021. The idea is that if a population in the right group has significantly more allele sharing with the target than the source, then the model will be rejected. (This is why right group populations are chosen to be distal, yet relevant, to target and source). We use this property to our advantage by rotating all n-1 valid sources for the target through the right group set, one at a time, for the same target and a source x (the one source that is not included in the rotation). If the previously valid source x is rejected when including another valid source y in the right group then we remove source x from the list of potential sources. Note that this does not make source y the best source, only a better one than x. Thus, this scheme only eliminates sub-optimal sources, rather than selecting a best source.
Among the outliers identified, we did not find a significant sex bias compared to non-outliers. Overall, there are more males than females in the dataset. However, the proportions of males in non-outliers, outliers with source, and outliers without source do not differ significantly by a Chi-squared test (p-value = 0.4117, df = 2; Figure 5—figure supplement 1). When outliers (with and without source) are treated as one group, there is still no significant association with outlier status and sex (p-value = 0.633, df = 1).
For targeted analyses of other clusters beyond just outlier candidates, for example to annotate Figures 2—4, we also used cluster-based qpAdm. In addition to one-component models as described above, we also used two-component models of admixture. These models test the hypothesis that a focal target cluster can be modeled as a two-way admixture of two sources (or ‘left’ populations). As above, a p-value below the threshold rejects this hypothesis, that is the proposed admixture model is not a good fit and a different model needs to be considered to disentangle the admixture scenario in question.
To assess how spatial population structure would be impacted by different modes of dispersal, we set up forward simulations in continuous 2D space using SLiM v. 3.6 (Haller and Messer, 2019). The aim of these simulations was to approximate the extent of spatial population structure we observe by the beginning of the Iron Age in Western Eurasia, after the major prehistoric migrations had taken place. To achieve that, we decided not to attempt simulating the precise ancestry composition of populations in different regions at that time, but rather to simulate simply the extent of spatial structure as measured by the relationship of population differentiation (FST) and geographic distance. We chose the SLiM simulation framework to make use of its extensive feature set to simulate individuals in continuous space. We simulated diploid genomes made up of a single, 108 bp long chromosome, with recombination rate and mutation rate set to 10–8. We used the default Wright-Fisher simulation mode, where a single population of constant size N is simulated with non-overlapping populations, that is each generation is made up of offspring generated from the previous generation. Spatial structure is established by associating each individual with a continuous 2D coordinate (i.e. latitude and longitude), and by using these coordinates to govern three demographic processes: mate choice, competition, and dispersal. An overview of how these processes can be set up to interact in SLiM can be found in Recipe 15.4 of SLiM v. 3.6 (see e.g. here: https://github.com/MesserLab/SLiM/tree/v3.6/SLiMgui/Recipes; Haller, 2021). Briefly, for mate choice, a Gaussian interaction function with maxDistance = 0.1, maxStrength = 1.0, sigma = 0.02 is used to govern a mateChoice callback using the strength of that interaction function. For competition, another Gaussian interaction function with maxDistance = 0.3, maxStrength = 3.0, sigma = 0.1 is used to calculate competition using the totalNeighborStrength vector of that interaction function to scale an individual’s relative fitness as 1.1 – competition / N. Finally, we establish local dispersal through a modifyChild callback, where a newly generated offspring’s position is drawn from a Gaussian centered at the location of the maternal individual with standard deviation sigmaDisp.
We let this population evolve forward in time for 2000+120 generations during which the processes outlined above lead to spatial population structure, where individuals sampled closely together in 2D space are also more closely related genetically. We do not simulate mutations in SLiM, as this poses a major computational burden. Instead, we use tree sequence recording (Haller et al., 2019; Kelleher et al., 2018) to track the full genealogy of all individuals in the simulation which are either alive at the end of the simulation, or explicitly sampled through time using the treeSeqRememberIndividuals function of SLiM. While 2000 generations are enough to establish spatial structure under the parameters we consider, it is by far insufficient for all sampled individuals to fully coalesce. To accurately assess neutral variation however, we need all sampled individuals to have a common ancestor at some point in the past, as mutations may have arisen at any point leading back to this ultimate coalescence event. Therefore, we approximate the deep history of our population with a panmictic population simulated backwards in time using the coalescent with recombination as implemented in msprime (Kelleher et al., 2016). This process has been referred to as ‘recapitation’ (Haller et al., 2019), where an incomplete genealogy with multiple roots (from SLiM) is ‘recapitated’ using coalescent simulation backwards in time. This is made possible by using the tree sequence data structure to record and simulate genealogies in both SLiM and msprime.
To assess the relationship between FST and spatial distance, we split geographic space into a 10-by-10 grid and calculated all pairwise FST between inhabited grid cells using Hudson’s estimator with unequal sample size correction (Bhatia et al., 2013), as well as the average geographic (euclidean) distance between individuals across grid cells. We used this FST analysis to calibrate the base dispersal sigmaDisp as well as the population size N, so that FST at maximum distance (FSTmax) would approximately match the FSTmax we observed at the start of the historical period (~0.03). We used grid search with a range of sigmaDisp and N values, and found the parameter pair N=50,000 & sigmaDisp = 0.02 to qualitatively produce the closest match (Figure 7—figure supplement 1). We use this parameter set as our base model of population structure without long-range dispersal, where we allow spatial structure to establish over 2000 generations, and then observe the FST – Distance relationship over the following 120 generations for a total of 2120 generations simulated in SLiM (Figure 7A).
We aimed to choose a sigmaDispLR that approximately matches the empirical distribution of long-range dispersing individuals we observe in the analysis displayed in Figure 4. Since the euclidean distances in the simulation are on a different scale than the geodesic distances observed in the data, we aimed to match qualitatively the relationship of long-range dispersal distances to random distances that could be observed if two populated locations were drawn at random. We visually analyzed this relationship from the data, and then performed a search across a range of possible sigmaDistLR to find a qualitative match. This led us to choose a value of sigmaDistLR = 0.20 (Figure 7—figure supplement 2).
Finally, we analyzed simulated ‘present-day’ genomes (i.e. after 2120 generations of SLiM) using PCA. We used the sklearn.decomposition.PCA module (scikit-learn v. 0.24.2) with the svd_solver == ‘arpack’ option to run non-probabilistic PCA to calculate the first 10 principal components. Similarly to how the empirical data was analyzed with smartpca, we also did 5 rounds of iterative outlier removal, removing individuals from the PCA that deviated by more than six standard deviations along any of the 10 principal components. The number of variants contributing to these PCA were 624,617, 625,669, and 626,052 for Figure 7A, B and C respectively, and thus comparable to the number of variants contributing to our data analysis.
This study follows ethics guidelines adopted by the ancient DNA field (Alpaslan-Roodenberg et al., 2021). A clear plan of research was laid out before the collection of samples, leading us to focus on sampling from under-sampled historical regions in Eurasia and therefore minimize unnecessary destruction of human remains. The research intent for these samples was clearly communicated to caretakers of the samples prior to collection. Local anthropologists, archaeologists, and museum directors from each geographic region were involved in the sample acquisition, extraction from skeletal material, and interpretation of genetic results. The genetic findings regarding individual samples from each region were communicated to local collaborators, all of whom were included as co-authors on the paper and were supportive of the final results. The involvement of our local collaborators was essential for the interpretation of the genetic results through their input on the historical and archaeological characterization of the specimens. We have supported our local collaborators with immediate access to the raw genetic data, and by communicating results in written and oral forums. Authorities responsible for all archaeological sites provided written documentation for their specimens to be included in this study through collaboration with the Pinasi Lab (Vienna, Austria).
Projecting historical genomes into the present-day PCA space allows for a convenient visualization that is common in the field of ancient DNA and exhibits an established connection to geographic space that is easy to interpret. This is true especially for more recent ancient and historical genomes, as spatial population structure approaches that of present day. However, there are two challenges: (1) a two-dimensional representation of a fairly high-dimensional ancestry space necessarily incurs some amount of information loss and (2) we know that some axes of genetic variation are not well-represented by the present-day PCA space. This is evident, for example, by projecting our qpAdm reference populations into the present-day PCA, where some ancestries which we know to be quite differentiated project closely together (Author response image 3). Despite this limitation, we continue to use the PCA representation as it is well resolved for visualization and maximizes geographical correspondence across Eurasia.
On the other hand, the qpAdm reference space (used in clustering and outlier detection) has higher resolution to distinguish ancestries by more comprehensively capturing the fairly high-dimensional space of different ancestries. This includes many ancestries that are not well resolved in the present-day PCA space, yet are relevant to our sample set, for example distinguishing Iranian Neolithic ancestry against ancestries from further into central and east Asia, as well as distinguishing between North African and Middle Eastern ancestries (Author response image 3).
To investigate the differences between these two reference spaces, we chose pairwise outgroup-f3 statistics (to Mbuti) as a pairwise similarity metric representing the reference space of f-statistics and qpAdm in a way that’s minimally affected by population-specific drift. We related this similarity measure to the euclidean distance on the first two PCs between the same set of populations (Author response image 4). This analysis shows that while there is almost a linear correspondence between these pairwise measures for some populations, others comparisons fall off the diagonal in a manner consistent with PCA projection (Author response image 3), where samples are close together in PCA but not very similar according to outgroup-f3. Taken together, these analyses highlight the non-equivalence of the two reference spaces.
The fact that historical period individuals are “most closely related to the folks that contribute to present-day people that roughly live in the same geographic location” is exactly the point we were hoping to make with Figures 6 B and C. We do realize, however, that the fact that one set of samples is projected into the PC space established by the other may suggest that this is an obvious result. To make it more clear that it is not, we added an additional panel to Figure 6, which shows pre-historical samples projected into the present-day PC space. This figure shows that pre-historical individuals project all across the PCA space and often outside of present-day diversity, with degraded correlation of geographic location and projection location (see also Author response image 5). This illustrates the contrast we were hoping to communicate, where projection locations of historical individuals start to “settle” close to present-day individuals from similar geographic locations, especially in contrast with pre-historic individuals.


Sections

"[{\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib66\", \"bib120\", \"bib42\", \"bib42\", \"bib66\", \"bib8\", \"bib32\", \"bib67\", \"bib85\", \"bib80\", \"bib88\"], \"section\": \"Introduction\", \"text\": \"Prehistoric ancient genomes have allowed disentangling the movements of people and technologies across two major demographic transitions in prehistoric western Eurasia: first the farming transition ~7500 BCE (Lazaridis et al., 2014; Skoglund et al., 2012), and later the Bronze Age Steppe migrations ~3500 BCE (Haak et al., 2015). Over the course of generations, genetically differentiated peoples across western Eurasia came together and admixed. As a result, most present-day European genomes can be modeled as a three-way mixture of these prehistoric groups: Western Hunter-Gatherers, Neolithic farmers, and Bronze Age Herders from the Steppe (Haak et al., 2015; Lazaridis et al., 2014) with minor contributions from other groups (Antonio et al., 2019; Fernandes et al., 2020; Lazaridis et al., 2016; Morrison et al., 2020; Mathieson et al., 2018). These ancestry components are present at different proportions across western Eurasia, leading to a pattern where the genetic structure of Europe mirrors its geography (Novembre et al., 2008).\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib8\", \"bib92\"], \"section\": \"Introduction\", \"text\": \"Given that the major ancestry components of present-day west Eurasians were largely established by the end of the Bronze Age, it is unclear how and what types of demographic processes impacted the genetic make-up of western Eurasia over the last ~3000 years, from the end of the Bronze Age to present-day. Recent studies of historical period genomes from individual regions shed light on this question; they paint a picture of heterogeneity and mobility, rather than of stable population structure. In the city of Rome alone, the population was dynamic and harbored a large diversity of ancestries from across Europe and the Mediterranean from the Iron Age (~1000 BCE) through the Imperial Roman period (27 BCE-300 CE; Antonio et al., 2019). Historical genomes from the Iberian Peninsula also highlight gene flow from across the Mediterranean (Olalde et al., 2019).\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib1\", \"bib16\", \"bib10\", \"bib51\", \"bib123\", \"bib114\"], \"section\": \"Introduction\", \"text\": \"These regional reports fit well with archaeological and historical records. By the Iron Age, sea travel was already common, enabling peoples from across the Mediterranean to come into contact for trade (Abulafia, 2011; Broodbank, 2013). Subsequently, the Roman Empire leveraged its organization, labor force, and military prowess to build upon existing waterways and roads throughout Europe and create a united Mediterranean for the only time in history (Beard, 2015; Harper, 2017; Symonds, 2017). Not only did the Empire provide a means for movement, it also provided a reason for individuals to move. Empire building activities, broadly categorized into military, labor, and trade, pulled in people and resources from inside and outside the Empire (Scheidel, 2019).\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"fig1s1\", \"bib84\"], \"section\": \"204 new historical genomes from Europe and the Mediterranean\", \"text\": \"We collected whole genomes from 204 individuals across 53 archaeological sites in 18 countries spanning Europe and the Mediterranean (Figure 1\\u2014figure supplement 1), 26 of these individuals were recently reported (Moots et al., 2022). This collection includes the first historical genomes (Iron Age and later, i.e. after 1000 BCE) from present-day Armenia, Algeria, Austria, and France. Dates for 126 samples were directly determined through radiocarbon dating, and were used alongside archaeological contexts to infer dates for the remaining samples.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib79\", \"bib23\", \"bib62\", \"bib73\", \"bib96\", \"bib111\", \"bib133\", \"fig1\", \"bib42\"], \"section\": \"204 new historical genomes from Europe and the Mediterranean\", \"text\": \"For downstream integration with published data, pseudohaploid genotypes were called for the 1240 k SNP panel (Mathieson et al., 2015), resulting in a median of 685,058 SNPs (167,000\\u20131,029,345) per sample. We analyzed newly reported genomes in conjunction with 2033 present-day genomes, 1998 prehistoric genomes, and 764 published historical period genomes (Clemente et al., 2021; Kovacevic et al., 2014, Mallick et al., 2023; Pagani et al., 2016; Saupe et al., 2021; \\u017degarac et al., 2021, primary AADR sources cited in Materials and methods). Genomes were grouped by regions and time periods (Figure 1) and analyzed using principal component analysis (PCA) and qpAdm modeling (Haak et al., 2015).\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib84\"], \"section\": \"Detailed map of locations for newly reported samples.\", \"text\": \"Each circle represents a location, the size of the circle corresponds to the number of individuals sampled from that location. Circles are colored by their time period: Bronze Age is green (Pian Sultano), Iron Age is yellow (two recently reported sites Tarquinia and Kerkouane), Imperial Rome and Late Antiquity is dark blue, Medieval Ages and Early Modern are light blue (Palazzo della Cancelleria, Veli\\u0107, Gardun, Mirine-Fulfinum). Note that the Bronze Age and Iron Age sites were recently reported in Moots et al., 2022.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"fig2s1\"], \"section\": \"Local historical population structure varies across regions\", \"text\": \"To investigate historical population structure, we categorized the data into 14 geographical regions, split into three sub-periods of the historical period: Iron Age (1000\\u20131 BCE), Imperial Rome & Late Antiquity (1\\u2013700 CE), and Medieval Ages & Early Modern (700\\u20131950 CE). We then characterized inter-individual heterogeneity within these spatio-temporal groups by examining (1) variation of projections onto a PCA space of present-day genomes (Figure 2\\u2014figure supplement 1), (2) genetic groups identified by qpAdm and clustering across time within a region, and (3) admixture modeling of genetic groups.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"fig2s2\", \"bib42\", \"bib50\"], \"section\": \"Local historical population structure varies across regions\", \"text\": \"A majority of regions have highly heterogeneous populations in at least one historical time period (Figure 2\\u2014figure supplement 2). This is illustrated by both the visual spread in PCA and the genetically distinct clusters of individuals based on pairwise modeling with qpAdm (Haak et al., 2015; Harney et al., 2021). On average, we identified 10 genetic clusters within each region present during the historical period, with a minimum of two and a maximum of 23. With genetically similar samples grouped together, we have more power relative to individual-level analyses when performing admixture modeling on clusters of interest using qpAdm.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"fig2\", \"fig2\", \"fig2\", \"fig2s2\"], \"section\": \"Local historical population structure varies across regions\", \"text\": \"Regional vignettes reveal various patterns of historical population structure. In Armenia, for example, the population is highly homogeneous at any given time (Figure 2). After the Copper Age, there are two distinct genetic clusters, separated by a temporal split around 772\\u2013403 BCE (Figure 2BC). The earlier cluster (C1) includes newly reported samples (n=5) from Beniamin and published ones (n=6) from five other sites. This cluster cannot be modeled by any single source of ancestry using existing data. The later cluster (C3), which contains newly reported samples (n=12) from Beniamin dating between 403 BCE-500 CE, is genetically similar to present-day Armenians (excluding two Kurdish individuals; Figure 2C). Despite the split, there is evidence of partial continuity between the earlier and later clusters: the later (C3) can be modeled using around 50% of the earlier cluster (C1) and an additional source of Steppe ancestry. Historical genomes from Northern Europe, particularly newly reported genomes from Lithuania and Poland, exhibit a similar level of homogeneity (Figure 2\\u2014figure supplement 2).\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"fig2s1\"], \"section\": \"Ancestry clusters identified within regions.\", \"text\": \"Each row displays data from a single study region. The first column shows a map with the sampling locations for the individuals, while columns two through four show the individuals projected onto a PCA space of present-day genomes (gray points) (populations are labeled in the far right panel in row 1 and in Figure 2\\u2014figure supplement 1). Individual ancient genomes in the map and PCA panels are colored by ancestry clusters identified using qpAdm. Colors are not matched across regions. Star points are putative outliers, that is individuals with ancestry that is underrepresented in the region. They are not colored by ancestry clusters so as to reduce visual clutter.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib8\", \"bib100\", \"fig3s1\", \"fig3\", \"fig4\"], \"section\": \"Local historical population structure varies across regions\", \"text\": \"In contrast to the homogeneity of the Armenian population, most of the regions, including Italy, Southeastern Europe, and Western Europe, had strikingly heterogeneous populations. Newly collected samples reinforce previous findings of high heterogeneity in Rome, including a large portion of the population having affinities for present-day Near Eastern populations (Antonio et al., 2019; Posth et al., 2021; Figure 3\\u2014figure supplement 1). Interestingly, Southeastern European and Western European individuals during the Imperial Roman & Late Antiquity period also exhibit high heterogeneity, on par with that of contemporaneous Italy (Figures 3 and 4).\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"fig2\"], \"section\": \"Southeastern Europe: highly heterogeneous Imperial Roman and Late Antiquity period population.\", \"text\": \"(A) Sampling locations of genetic clusters are represented by a single point per location. Outlier ancestries are black stars, all others are open circles colored by genetic cluster. (B) Colored bars span the minimum and maximum of the date ranges of samples (95% confidence interval from radiocarbon dating or archaeological range). Points are the mean of an individual\\u2019s date range. (C) Projections of the ancient genomes onto a PCA of present-day genomes (gray points). Population labels for the PCA reference space are shown in Figure 2C. Present-day genomes from Southeastern Europe are shown with dark gray open circles.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"fig2s1\"], \"section\": \"Population structure of Italy during the Imperial Roman and Late Antiquity period.\", \"text\": \"Ancient Italian genomes (colored points) from the Imperial Roman and Late Antiquity period were projected onto principal components of present-day genomes (gray points, populations labeled in Figure 2\\u2014figure supplement 1). Present-day Italian genomes are highlighted by a gray filled ellipse. Star points are outliers and circle points are non-outliers. Outlier clusters that can be modeled using contemporaneous populations are labeled with the potential source region.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"fig2\"], \"section\": \"Western Europe: heterogeneous Imperial Roman and Late Antiquity period population.\", \"text\": \"(A) Sampling locations of genetic clusters are represented by a single point per location. Outlier ancestries are black stars, all others are open circles colored by genetic cluster. (B) Colored bars span the minimum and maximum of the date ranges of samples (95% confidence interval from radiocarbon dating or archaeological range). Points are the mean of an individual\\u2019s date range. (C) Projections of the ancient genomes onto a PCA of present-day genomes (gray points). Population labels for the PCA reference space are shown in Figure 2C. Present-day genomes from Southeastern Europe are shown with dark gray open circles.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"fig3\", \"fig4\", \"fig3s1\"], \"section\": \"Local historical population structure varies across regions\", \"text\": \"Furthermore, these ancestries are often shared across regions. In Southeastern Europe, a core group of individuals have ancestry similar to that of present-day and contemporaneous Central Europeans (C6), while other clusters have ancestry similar to that of Northern Europeans (C7) and Eastern Mediterraneans (C8) (Figure 3C). These ancestry groups are found in contemporaneous Italy and Western Europe as well (Figure 4C, Figure 3\\u2014figure supplement 1). We also observe individuals of eastern nomadic ancestry, similar to that of Sarmatian individuals previously reported, in both Western Europe (C8, n=2) and Southeastern Europe (C2, n=2).\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"fig5\", \"fig5\", \"fig2\", \"fig5\", \"bib84\", \"bib8\", \"fig4\"], \"section\": \"At least 7-11% of historical individuals are ancestry outliers\", \"text\": \"In total, we identified 11% of individuals as outliers, and could connect 7% of individuals to a putative source in a different region (Figure 5A). Based on the regions where these outliers and their sources originated, we created a network to illustrate their movements (Figure 5B). This network reveals the interconnectedness of Europe and the Mediterranean during the historical period. For example, as discussed above, the Armenian population is quite homogeneous (Figure 2). Unsurprisingly, no outliers were found within Armenia; however, we found outlier individuals in the Levant and Italy who can be putatively traced back to Armenia according to their ancestry (Figure 5C; blue outgoing arrows from Armenia). In contrast, the heterogeneous population in Italy connects it to many other regions, with bi-directional movement in most cases. In North Africa, outliers found in Iron Age Tunisia (Moots et al., 2022) indicate movements from many regions in Europe, and North African-like outliers were found in Italy and Austria (Western Europe). North African ancestry in Italy is supported by a single previously reported individual from the Imperial Roman period (R132; Antonio et al., 2019). Similar North African ancestry in Western Europe is supported by a single individual, R10667, from Wels, Austria, a site located on the frontier of the Roman Empire (C28 in Figure 4). This individual from Austria can be modeled using Canary Islander individuals from the Medieval Ages or an Iron Age outlier (distinguished by having more sub-Saharan ancestry) from Kerkouane, a Punic city near Carthage in modern-day Tunisia.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"fig3\"], \"section\": \"At least 7-11% of historical individuals are ancestry outliers\", \"text\": \"The 7% estimate for outliers with source should be considered conservative for the proportion of \\u2018non-local\\u2019 individuals. There are several cases where a cluster comprises more than 5% of the individuals in the region, but are clearly of a different ancestry than the majority and seem to be transient (only found in a single sub-period of the historical period). For example, in Southeastern Europe (Figure 3B), Imperial Roman & Late Antiquity individuals in C8 are (1) of distant ancestry (Near Eastern) and (2) not found in previous or subsequent time periods. However, since there are five individuals in this cluster, it does not meet our strict criteria for outlier consideration. Additionally, many clusters of underrepresented ancestry cannot be modeled as one-component models because they are recently admixed (i.e. require two or more ancestry components) or of ancestry not sampled elsewhere. Thus, we expect the actual proportion of individuals involved in long distance movements to be higher than reported here.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"fig6\", \"bib42\", \"bib66\"], \"section\": \"Spatial population structure is relatively stable in the last 3,000 years\", \"text\": \"The remarkable amount of heterogeneity and mobility in the historical period leads to the question of what impact this might have had on population structure over time. To investigate this, we sought to quantify the overall change in population structure across time, from prehistoric to present-day. To assess the spatial structure of population differentiation, we calculated FST across groups of individuals on a sliding spatial grid in each time period and related it to their mean geographic distance. In each time period, we recovered the classical pattern of isolation-by-distance (Figure 6A), where individuals closer in geographic space are also more similar genetically. Across time periods, we see a large decrease in overall FST from the Mesolithic & Neolithic periods to the Bronze Age (approximately 10,000\\u20132300 BCE), coinciding with the major prehistoric migrations (Haak et al., 2015; Lazaridis et al., 2014). From the Bronze Age onward, however, FST does not decrease further with time, indicating that the level of genetic differentiation across space is relatively stable from the Bronze Age to present-day.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"fig6\", \"bib88\", \"fig6\", \"fig6\"], \"section\": \"Spatial population structure is relatively stable in the last 3,000 years\", \"text\": \"To assess not only the amount, but also the structure of geographic population differentiation, we compared the \\u2018genetic maps\\u2019 of historical period and present-day genomes. To construct these \\u2018maps\\u2019, we performed principal component analysis on 829 present-day European and Mediterranean genomes sampled across geographical space (Figure 6B, bottom) and projected historical period genomes onto the same PC space. Echoing close correspondence between genetic structure and geographic space in present-day Europeans (Novembre et al., 2008), we recovered similar spatial structure for historical samples as well, although noisier due to a narrower sampling distribution and higher local genetic heterogeneity (Figure 6B, middle). The similarity in structure between present-day and historical period is especially striking in comparison to a projection of prehistoric genomes, which shows much weaker correspondence to the present-day PCA as well as to geographic space (Figure 6B, top). Together, our analyses indicate that European and Mediterranean population structure has been relatively stable over the last 3000 years.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"fig6\", \"fig7\", \"fig7s1\", \"fig7s2\", \"fig7\", \"fig7\"], \"section\": \"Spatial population structure is relatively stable in the last 3,000 years\", \"text\": \"This raises the question: is it surprising for stable population structure to be maintained in the presence of ~7\\u201311% long-range migration? To address this, we simulated Wright-Fisher populations evolving neutrally in continuous space. In these simulations, spatial population structure is established through local mate choice and limited dispersal, which we calibrated to approximately match the spatial differentiation observed in historical-period Europe (Figure 6A, Figure 7A and Figure 7\\u2014figure supplement 1, maximum FST of ~0.03). We then allowed a proportion of the population to disperse longer distances, empirically matching the migration distances we observed in the data during the historical period (Figure 7\\u2014figure supplement 2). Even with long-range dispersal as low as 4%, we observe decreasing FST over 120 generations (~3000 years with a generation time of 25 years) as individuals become less differentiated genetically across space (Figure 7B). At 8%, FST decreases dramatically within 120 generations as spatial structure collapses to the point that it is hardly detectable in the first two principal components (Figure 7C). These simulations indicate that under a basic spatial population genetics model we would expect structure to collapse by present-day given the levels of movement we observe.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib4\", \"bib42\", \"bib67\", \"bib1\", \"bib10\", \"bib16\", \"bib123\", \"fig5s2\", \"fig5s3\", \"bib113\"], \"section\": \"Discussion\", \"text\": \"A possible explanation for this apparent paradox is that our simulations did not capture some key features of human behavior and population dynamics. In the simulated populations, migration implies both movement and reproduction with local random mate choice. However, in real human populations migration can be more complex: people do not necessarily reproduce where they migrate, and reproduction is not necessarily random. We hypothesize that in the historical period there was an increasing decoupling of movement and reproduction, compared to prehistoric times. For the spread of Farmer and Steppe ancestry, we know that these prehistoric migrations would take hundreds of years to traverse the continent (Allentoft et al., 2015; Haak et al., 2015; Lazaridis et al., 2016). In contrast, in the historical period, there were dense travel networks of roads and waterways as well as clear incentives for cross-Mediterranean and cross-continental movement (Abulafia, 2011; Beard, 2015; Broodbank, 2013; Symonds, 2017). This enabled people to travel cross-continental distances on the order of weeks or months, well within their lifetimes (Figure 5\\u2014figure supplements 2 and 3; Scheidel, 2015).\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib129\", \"bib124\", \"bib112\", \"bib93\", \"bib113\"], \"section\": \"Discussion\", \"text\": \"The Roman Empire is particularly important in understanding how transient mobility could become a unique hallmark of this period. During the expansion of the Empire, existing and new cities quickly expanded as hubs for trade and labor. Urban-military complexes emerged along the frontier as military forces established themselves and drew in local communities which sought protection or economic benefit (Verhagen et al., 2019). To support these rapidly growing economic city-centers, human capital beyond the local population was necessary, thus drawing in people from far away places either freely or forcibly (e.g. slavery, military). According to a longstanding historical hypothesis, the Urban Graveyard Effect, the influx of migrants in city-centers disproportionately contributed to the death rate over the birth rate; a process which would contribute to observing individuals as \\u2018transient\\u2019 migrants (Tacoma, 2016). Long-range, transient migration, combined with the Roman Empire\\u2019s highly efficient travel networks (Scheidel et al., 2007; Oleson, 2009; Scheidel, 2015) may explain the genetically heterogeneous populations, especially along the frontier regions (e.g. Serbia, Croatia, and Austria).\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib19\", \"bib27\", \"bib107\"], \"section\": \"Discussion\", \"text\": \"With transient mobility as the main contributor to the observed heterogeneity, it remains unclear what additional demographic processes contributed to the maintenance of spatial genetic structure. The collapse of the Empire involved a loss of urban-military complexes and depopulation of cities, followed by ruralization (Burgess, 2007; Dey, 2015; Roymans et al., 2020). Without the Empire incentivising trade and movement, there may have been little motive for individuals to remain in these suddenly remote regions.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib14\"], \"section\": \"Discussion\", \"text\": \"If this hypothesis is true, we would expect a reduction in local genetic heterogeneity after the collapse of the Empire. Unfortunately, we do not have this period sampled densely enough to assess this comprehensively. The lack of samples is further amplified by the fact that ancient DNA comes from archaeological excavations, which tend to be enriched in urban areas; a stone mausoleum in the city center, for example, will produce more surface scatter than a wood farmhouse, making urban areas more likely for excavation (Bowes, 2011). This makes it difficult to comprehensively address differences in rural versus urban demography. Collecting more genetic data from both urban and rural contexts across the historical period will be a valuable future step in understanding how spatial population structure was maintained. Furthermore, it could elucidate the role of other historical events and peoples, such as the Franks, Lombards, Visigoths, and Huns, during the Migration Period.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"supp1\"], \"section\": \"Sample collection and archaeological sites\", \"text\": \"The archaeological context for ancient individuals reported in this study is detailed in Supplementary file 1. Site descriptions were written by the contributing archaeologists. Descriptions of individual-level burials are included where possible.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"supp1\", \"supp2\", \"supp3\"], \"section\": \"Date determination for individuals and time periods\", \"text\": \"In the study we use a single date estimate (for both newly reported and published samples) which is the midpoint of the 95% confidence interval when using AMS dates, and the average of the lower and upper bound inference dates when using archaeological context for dating. The dating approach used for each sample is included in files Supplementary file 1 (archaeological context) and Supplementary file 2 (sample metadata). The full AMS and calibration results are reported in Supplementary file 3.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib84\", \"bib99\", \"bib98\", \"bib24\", \"bib105\", \"bib106\", \"bib81\", \"bib81\"], \"section\": \"DNA extraction, library preparation, and sequencing\", \"text\": \"The 204 ancient genomes reported in this study, 26 of which were recently reported in Moots et al., 2022, represent a subset of samples screened from 53 archaeological sites across 18 countries. We isolated and finely ground the cochlear regions of the petrous bones in dedicated clean room facilities at the University of Vienna following the protocols described in Pinhasi et al., 2019; Pinhasi et al., 2015. Using 50 mg of bone powder, DNA was extracted by 18 hr incubation of the powder in a solution of Proteinase-K and EDTA. DNA was eluted in 50 \\u03bcl 10 mM Tris-HCl, 1 mM EDTA, 0.05% Tween-20, pH 8.0 as in Dabney et al., 2013; Rohland and Hofreiter, 2007. 12.5\\u201325 \\u00b5L of DNA extract was used to prepare partial uracil\\u2013DNA\\u2013glycosylase (UDG) double stranded libraries as described in Rohland et al., 2015. After a partial (30 minute) UDG treatment, library preparation followed a modified version of the Meyer and Kircher, 2010 protocol (Meyer and Kircher, 2010): the initial DNA fragmentation step was not required and MinElute PCR purification kits (Qiagen) were used for all library clean-up steps. Libraries were measured on a qPCR to determine the ideal cycle number to avoid over or under-amplification. 10-20 uL of library was then double indexed using Agilent PfuTurbo Cx HotStart DNA Polymerase with conditions: 95 \\u00b0C for 5 min followed by the qPCR-determined cycles of 95 \\u00b0C for 15 s, 60 \\u00b0C for 30 s and 72 \\u00b0C for 30 s with a final elongation at 72 \\u00b0C for 5 min. After indexing, the libraries were purified using the MinElute system (Qiagen) and eluted in 25 \\u03bcL of 1 mM EDTA, 0.05% Tween-20. Libraries were screened based on Qubit concentration and visual validation of Bioanalyzer peaks for an initial low coverage (NextSeq or Novaseq) screening run.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib76\"], \"section\": \"Processing sequence data and sample screening\", \"text\": \"Adapters were removed from sequence reads using Cutadapt (v1.14) (Martin, 2011). Then, for each sample, reads were processed further (a) with the 2 base pairs at either end of the reads trimmed off and (b) without trimming. Since partial UDG treatment was performed on the libraries, a damage signature consisting of elevated C>T transitions on the 5\\u2019 end and G>A transitions on the 3\\u2019 end should remain at the ends of reads. Therefore, analyzing untrimmed, aligned reads would allow us to assess the amount of the ancient DNA damage signature present in a sample, and to use this as a criteria for authenticating that the sampled DNA is ancient. Other than the variable trimming parameter for the ends of the reads, all other parameters remained the same for both screening and high coverage sequencing data.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib69\", \"bib6\"], \"section\": \"Processing sequence data and sample screening\", \"text\": \"Following variable trimming, reads were filtered for minimum length of 30, then aligned to hg19 using bwa (0.7.15-r1140; Li and Durbin, 2009), with seed length disabled (-l 350). For each sample, aligned reads were sorted by coordinate using Picard\\u2019s SortSam (version 2.9.0\\u20131-gf5b9f50-SNAPSHOT) and read groups were added using Picard\\u2019s AddOrReplaceReadGroups (version 2.9.0\\u20131-gf5b9f50-SNAPSHOT) (http://broadinstitute.github.io/picard/). Reads with mapping quality <25 (including unaligned reads) were filtered out. For higher coverage sequencing runs, this process was parallelized by splitting raw fastq files and merging after alignment, sorting, and quality filtering. Duplicates were removed using samtools rmdup (http://www.htslib.org/doc/samtools.html). Genome-wide and chromosomal coverage were assessed using depth-cover (version 1.0.3, https://github.com/jalvz/depth-cover; Alvarez, 2017).\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib56\"], \"section\": \"Processing sequence data and sample screening\", \"text\": \"Samples were screened and selected using the following criteria: (1)>20% reads aligned to the hg19 build of the human genome; (2) a C>T mismatch rate at the 5\\u2019-end and G>A at the 3\\u2019-end of the sequencing read of 4% or above (characterized with mapDamage v2.0.8) (J\\u00f3nsson et al., 2013); (3) library complexity estimates indicating that a minimum coverage of 0.5 x would be achievable with further sequencing, (4) with a contamination level \\u2264 5%.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib102\", \"bib61\", \"bib61\"], \"section\": \"Processing sequence data and sample screening\", \"text\": \"Contamination rates were estimated with three methods: (1) damage pattern and polymorphism in mitochondrial DNA with Schmutzi (Renaud et al., 2015), (2) atypical ratios of coverages of X and Y chromosomes to autosomes calculated with ANGSD (Korneliussen et al., 2014), and (3) for male samples, high heterozygosity on non-pseudo-autosomal region of the X chromosome (chrX:5000000\\u2013154900000 in hg19) with the \\u2018contamination\\u2019 tool in ANGSD (Korneliussen et al., 2014). If the contamination estimate for any of these three methods was above 5%, we considered the sample contaminated and excluded it from downstream analysis (n=9).\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib116\", \"bib79\", \"bib23\", \"bib133\"], \"section\": \"Calling pseudohaploid genotypes\", \"text\": \"Pseudohaploid genotypes for study samples were called by randomly choosing one allele from each site where there was read coverage, following the approach and software provided by Stephan Schiffels (https://github.com/stschiff/sequenceTools; Schiffels, 2022). Variants were called for the 1240 k SNP panel, which is commonly used for capture-based sequencing of ancient samples (Mathieson et al., 2015). For the newly reported samples, a median of 685,058 SNPs (167,000\\u20131,029,345) were covered per sample. Data was output in eigenstrat format. This pipeline was also used to call genotypes for two published ancient DNA datasets which at the time were only available in BAM (sequence read) format (Clemente et al., 2021; \\u017degarac et al., 2021).\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib3\", \"bib73\", \"bib9\", \"bib2\", \"bib4\", \"bib7\", \"bib8\", \"bib11\", \"bib13\", \"bib15\", \"bib17\", \"bib18\", \"bib21\", \"bib20\", \"bib25\", \"bib26\", \"bib28\", \"bib29\", \"bib30\", \"bib32\", \"bib31\", \"bib33\", \"bib34\", \"bib35\", \"bib36\", \"bib37\", \"bib39\", \"bib38\", \"bib41\", \"bib40\", \"bib45\", \"bib44\", \"bib43\", \"bib49\", \"bib17\", \"bib52\", \"bib53\", \"bib55\", \"bib54\", \"bib59\", \"bib60\", \"bib64\", \"bib63\", \"bib65\", \"bib68\", \"bib67\", \"bib66\", \"bib70\", \"bib71\", \"bib72\", \"bib74\", \"bib85\", \"bib75\", \"bib78\", \"bib77\", \"bib80\", \"bib79\", \"bib83\", \"bib82\", \"bib86\", \"bib87\", \"bib92\", \"bib91\", \"bib90\", \"bib89\", \"bib94\", \"bib95\", \"bib97\", \"bib101\", \"bib103\", \"bib104\", \"bib109\", \"bib108\", \"bib110\", \"bib115\", \"bib117\", \"bib118\", \"bib119\", \"bib121\", \"bib122\", \"bib125\", \"bib126\", \"bib127\", \"bib128\", \"bib130\", \"bib131\", \"bib132\", \"bib62\", \"bib96\", \"bib111\", \"bib23\", \"bib133\", \"supp4\"], \"section\": \"Combining new genotypes with ancient and present-day published data\", \"text\": \"Newly processed pseudohaploid data was merged with several datasets. Most of the published data was retrieved from the Allen Ancient Data Resource (AADR) v44.3 (January 2021) (Allen Ancient DNA Resource, 2021; Mallick et al., 2023): a compilation of pseudohaploid and diploid genotypes for 5,225 ancient and 3,720 present-day individuals (1000\\nAuton et al., 2015; Agranat-Tamir et al., 2020; Allentoft et al., 2015; Amorim et al., 2018; Antonio et al., 2019; Bergstr\\u00f6m et al., 2020; Biagini et al., 2019; Brace et al., 2019; Broushaki et al., 2016; Brunel et al., 2020; Cassidy et al., 2020; Cassidy et al., 2016; Damgaard et al., 2018; de Barros Damgaard et al., 2018; Ebenesersd\\u00f3ttir et al., 2018; Feldman et al., 2019a; Feldman et al., 2019b; Fernandes et al., 2020; Fernandes et al., 2018; Fregel et al., 2018; Fu et al., 2016; Furtw\\u00e4ngler et al., 2020; Gamba et al., 2014; Gokhman et al., 2020; Gonz\\u00e1lez-Fortes et al., 2019; Gonz\\u00e1lez-Fortes et al., 2017; G\\u00fcnther et al., 2018; G\\u00fcnther et al., 2015; Haber et al., 2020; Haber et al., 2019; Haber et al., 2017; Harney et al., 2018; Broushaki et al., 2016; J\\u00e4rve et al., 2019; Jeong et al., 2019; Jones et al., 2017; Jones et al., 2015; Keller et al., 2012; K\\u0131l\\u0131n\\u00e7 et al., 2016; Krzewi\\u0144ska et al., 2018b; Krzewi\\u0144ska et al., 2018a; Lamnidis et al., 2018; Lazaridis et al., 2017; Lazaridis et al., 2016; Lazaridis et al., 2014; Linderholm et al., 2020; Lipson et al., 2017; Mallick et al., 2016; Malmstr\\u00f6m et al., 2019; Morrison et al., 2020; Margaryan et al., 2020; Martiniano et al., 2017; Martiniano et al., 2016; Mathieson et al., 2018; Mathieson et al., 2015; Mittnik et al., 2019; Mittnik et al., 2018; Narasimhan et al., 2019; Nikitin et al., 2019; Olalde et al., 2019; Olalde et al., 2018; Olalde et al., 2015; Olalde et al., 2014; Omrak et al., 2016; O\\u2019Sullivan et al., 2018; Patterson et al., 2012; Pr\\u00fcfer et al., 2017; Rivollat et al., 2020; Rodr\\u00edguez-Varela et al., 2017; Saag et al., 2019; Saag et al., 2017; S\\u00e1nchez-Quinto et al., 2019; Schiffels et al., 2016; Schroeder et al., 2019; Schuenemann et al., 2017; Sikora et al., 2017; Skoglund et al., 2014; Skourtanioti et al., 2020; Unterl\\u00e4nder et al., 2017; Valdiosera et al., 2018; van den Brink et al., 2017; Veeramah et al., 2018; Villalba-Mouco et al., 2019; Wang et al., 2019; Zalloua et al., 2018). We also included relevant genetic data made available by authors that were not in the AADR: present-day genomes from the Balkans (Kovacevic et al., 2014), present-day genomes from 4 Poles, 3 Germans, and 2 Moldavians (Pagani et al., 2016), and Bronze Age Italian genomes (Saupe et al., 2021). Pseudohaploid genotypes for published Bronze Age Aegean genomes (Clemente et al., 2021) and Bronze Age Serbian genomes (\\u017degarac et al., 2021) were generated from BAM files using our pipeline. All published genomes were filtered for contamination based on reported contamination levels in the original study and SNP coverage based on the genomic data. All published samples that contributed to this study are listed in Supplementary file 4. To ensure maximum overlap with present-day and ancient samples in analyses, the merged dataset was subset to SNPs in the Human Origin Panel array, resulting in a total of 481,259 SNPs. For PCA and qpAdm modeling, SNPs that are transitions at CpG sites (n=76,678) were excluded since they may have arisen from DNA damage as opposed to true genetic variation.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib22\"], \"section\": \"Setting up the principal component analysis\", \"text\": \"Principal component analysis was performed on genotypes from present-day and Mediterranean individuals using smartpca v16000 (https://github.com/chrchang/eigensoft/blob/master/POPGEN/README; Chang, 2013). The following parameters were used: 5 outlier iterations (numoutlieriter), 10 principal components along which to remove outliers (numoutlierevec), altnormstyle set to NO, with least squares projection turned on (lsqproject set to YES). To calculate principal components only using present-day individuals, a file (poplistname) was provided with the population names of present-day individuals, randomly subsampled per population. After outlier removal (which removed 55 samples), 829 individuals and 480,712 SNPs were used in the initial analysis. All individuals (non \\u2018reference\\u2019 present-day genomes, and all of the ancient individuals) whose population was not listed in the poplistname file were projected onto the calculated principal components. In the paper, we refer to the individuals used in the calculation of principal components as belonging to the \\u2018reference PCA space\\u2019. These \\u2018reference\\u2019 genomes were used to calculate the PCs because (1) they represent a wide range of present-day variation and (2) the genotypes tend to be of high quality.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"fig2s1\"], \"section\": \"Visual representation of PCA\", \"text\": \"In the figures, present-day genomes used in the reference space are generally colored gray in order to illustrate the background space of genetic variation. To reduce visual clutter and emphasize the ancient genomes, these present-day \\u2018reference\\u2019 genomes are typically unlabeled. Labels for these populations are shown in Figure 2\\u2014figure supplement 1.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib12\"], \"section\": \"Calculation of FST\", \"text\": \"To assess the extent of genetic differentiation across geographic space within a time period, we calculated the Fixation index (FST) between groups of individuals on a sliding spatial grid. Each grid cell measured 10 degrees longitude by 10 degrees latitude, and was slid by 1 degree in both directions (north and east) nine times to build a total of 10 spatial grids. For each of these grids, pairwise FST was calculated between all populated 10-by-10 grid cells using Hudson\\u2019s estimator, correcting for unequal sample size (Bhatia et al., 2013). In addition to FST, we also calculated the average geographic distance (in kilometers) between all individuals across pairs of grid cells to assess how spatial distance relates to genetic differentiation. To visualize this relationship, we used lowess smoothing as implemented in python\\u2019s statsmodels package (statsmodels.api.nonparametric.lowess, v. 0.12.2). To infer confidence intervals for the lowess smoothing estimates, we devised a spatial bootstrapping procedure. Our bootstrap approach samples pairs of grid cells in a way that always samples all overlapping cells or none of them, so individuals are either fully included in a bootstrap replicate or not at all. This prevents double-counting individuals since they contribute to several comparisons across space due to the sliding grid.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib50\"], \"section\": \"Modeling ancestry and identifying outliers using qpAdm\", \"text\": \"In this workflow, we heavily utilize what we call one-component models, where we test whether two individuals or clusters of individuals form a clade relative to a chosen set of reference populations (which we define below). To clarify what we mean by one-component model, assume that we have two focal individuals i1 and i2, as well as a set of reference populations. Using the terminology of admixtools 2 as well as (Harney et al., 2021), the following four tests are equivalent with respect to the resulting p-value:\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib32\"], \"section\": \"Modeling ancestry and identifying outliers using qpAdm\", \"text\": \"where we use the R-style notation of c(i1, i2) to denote a vector consisting of i1 and i2. In all four cases, we are testing against the null hypothesis that the two individuals do form a clade, with low p-values indicating a rejection of that hypothesis. We will call any implementation of this test a \\u2018one-component qpAdm model\\u2019, but the equivalence stated above shows that our one-component models are equivalent to what was called a \\u2018qpWave analysis\\u2019 by Fernandes et al., 2020.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib32\"], \"section\": \"Modeling ancestry and identifying outliers using qpAdm\", \"text\": \"For our reference populations, we chose a set similar to those previously used to model Eurasian historical genomes (Fernandes et al., 2020). However, we added two Asian populations (Laos_Hoabinhian and Onge) based on evidence of gene flow from further east in a subset of our data. The purpose of a set of reference populations in the qpAdm modeling setting is to represent components of ancestry which are differentially related to the focal individuals being tested (i.e. \\u2018left\\u2019, or \\u2018sources\\u2019 and \\u2018target\\u2019) in order to resolve differences in ancestry, but distally related enough to minimize the chance of recent gene flow between \\u2018left\\u2019 and \\u2018right\\u2019. Our final set of references is:\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib67\", \"bib50\"], \"section\": \"Identifying ancestry outliers and their potential sources\", \"text\": \"Model competition involves re-testing the model fit of each potential source after adding another potential source to the right group, first described in Lazaridis et al., 2016, and more recently detailed in Harney et al., 2021. The idea is that if a population in the right group has significantly more allele sharing with the target than the source, then the model will be rejected. (This is why right group populations are chosen to be distal, yet relevant, to target and source). We use this property to our advantage by rotating all n-1 valid sources for the target through the right group set, one at a time, for the same target and a source x (the one source that is not included in the rotation). If the previously valid source x is rejected when including another valid source y in the right group then we remove source x from the list of potential sources. Note that this does not make source y the best source, only a better one than x. Thus, this scheme only eliminates sub-optimal sources, rather than selecting a best source.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"fig5s1\"], \"section\": \"Identifying ancestry outliers and their potential sources\", \"text\": \"Among the outliers identified, we did not find a significant sex bias compared to non-outliers. Overall, there are more males than females in the dataset. However, the proportions of males in non-outliers, outliers with source, and outliers without source do not differ significantly by a Chi-squared test (p-value = 0.4117, df = 2; Figure 5\\u2014figure supplement 1). When outliers (with and without source) are treated as one group, there is still no significant association with outlier status and sex (p-value = 0.633, df = 1).\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"fig2\", \"fig4\"], \"section\": \"Admixture modeling with qpAdm\", \"text\": \"For targeted analyses of other clusters beyond just outlier candidates, for example to annotate Figures 2\\u20144, we also used cluster-based qpAdm. In addition to one-component models as described above, we also used two-component models of admixture. These models test the hypothesis that a focal target cluster can be modeled as a two-way admixture of two sources (or \\u2018left\\u2019 populations). As above, a p-value below the threshold rejects this hypothesis, that is the proposed admixture model is not a good fit and a different model needs to be considered to disentangle the admixture scenario in question.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib47\", \"bib48\"], \"section\": \"Simulations\", \"text\": \"To assess how spatial population structure would be impacted by different modes of dispersal, we set up forward simulations in continuous 2D space using SLiM v. 3.6 (Haller and Messer, 2019). The aim of these simulations was to approximate the extent of spatial population structure we observe by the beginning of the Iron Age in Western Eurasia, after the major prehistoric migrations had taken place. To achieve that, we decided not to attempt simulating the precise ancestry composition of populations in different regions at that time, but rather to simulate simply the extent of spatial structure as measured by the relationship of population differentiation (FST) and geographic distance. We chose the SLiM simulation framework to make use of its extensive feature set to simulate individuals in continuous space. We simulated diploid genomes made up of a single, 108 bp long chromosome, with recombination rate and mutation rate set to 10\\u20138. We used the default Wright-Fisher simulation mode, where a single population of constant size N is simulated with non-overlapping populations, that is each generation is made up of offspring generated from the previous generation. Spatial structure is established by associating each individual with a continuous 2D coordinate (i.e. latitude and longitude), and by using these coordinates to govern three demographic processes: mate choice, competition, and dispersal. An overview of how these processes can be set up to interact in SLiM can be found in Recipe 15.4 of SLiM v. 3.6 (see e.g. here: https://github.com/MesserLab/SLiM/tree/v3.6/SLiMgui/Recipes; Haller, 2021). Briefly, for mate choice, a Gaussian interaction function with maxDistance = 0.1, maxStrength = 1.0, sigma = 0.02 is used to govern a mateChoice callback using the strength of that interaction function. For competition, another Gaussian interaction function with maxDistance = 0.3, maxStrength = 3.0, sigma = 0.1 is used to calculate competition using the totalNeighborStrength vector of that interaction function to scale an individual\\u2019s relative fitness as 1.1 - competition / N. Finally, we establish local dispersal through a modifyChild callback, where a newly generated offspring\\u2019s position is drawn from a Gaussian centered at the location of the maternal individual with standard deviation sigmaDisp.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib46\", \"bib58\", \"bib57\", \"bib46\"], \"section\": \"Simulations\", \"text\": \"We let this population evolve forward in time for 2000+120 generations during which the processes outlined above lead to spatial population structure, where individuals sampled closely together in 2D space are also more closely related genetically. We do not simulate mutations in SLiM, as this poses a major computational burden. Instead, we use tree sequence recording (Haller et al., 2019; Kelleher et al., 2018) to track the full genealogy of all individuals in the simulation which are either alive at the end of the simulation, or explicitly sampled through time using the treeSeqRememberIndividuals function of SLiM. While 2000 generations are enough to establish spatial structure under the parameters we consider, it is by far insufficient for all sampled individuals to fully coalesce. To accurately assess neutral variation however, we need all sampled individuals to have a common ancestor at some point in the past, as mutations may have arisen at any point leading back to this ultimate coalescence event. Therefore, we approximate the deep history of our population with a panmictic population simulated backwards in time using the coalescent with recombination as implemented in msprime (Kelleher et al., 2016). This process has been referred to as \\u2018recapitation\\u2019 (Haller et al., 2019), where an incomplete genealogy with multiple roots (from SLiM) is \\u2018recapitated\\u2019 using coalescent simulation backwards in time. This is made possible by using the tree sequence data structure to record and simulate genealogies in both SLiM and msprime.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib12\", \"fig7s1\", \"fig7\"], \"section\": \"Simulations\", \"text\": \"To assess the relationship between FST and spatial distance, we split geographic space into a 10-by-10 grid and calculated all pairwise FST between inhabited grid cells using Hudson\\u2019s estimator with unequal sample size correction (Bhatia et al., 2013), as well as the average geographic (euclidean) distance between individuals across grid cells. We used this FST analysis to calibrate the base dispersal sigmaDisp as well as the population size N, so that FST at maximum distance (FSTmax) would approximately match the FSTmax we observed at the start of the historical period (~0.03). We used grid search with a range of sigmaDisp and N values, and found the parameter pair N=50,000 & sigmaDisp = 0.02 to qualitatively produce the closest match (Figure 7\\u2014figure supplement 1). We use this parameter set as our base model of population structure without long-range dispersal, where we allow spatial structure to establish over 2000 generations, and then observe the FST - Distance relationship over the following 120 generations for a total of 2120 generations simulated in SLiM (Figure 7A).\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"fig4\", \"fig7s2\"], \"section\": \"Simulations\", \"text\": \"We aimed to choose a sigmaDispLR that approximately matches the empirical distribution of long-range dispersing individuals we observe in the analysis displayed in Figure 4. Since the euclidean distances in the simulation are on a different scale than the geodesic distances observed in the data, we aimed to match qualitatively the relationship of long-range dispersal distances to random distances that could be observed if two populated locations were drawn at random. We visually analyzed this relationship from the data, and then performed a search across a range of possible sigmaDistLR to find a qualitative match. This led us to choose a value of sigmaDistLR = 0.20 (Figure 7\\u2014figure supplement 2).\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"fig7\"], \"section\": \"Simulations\", \"text\": \"Finally, we analyzed simulated \\u2018present-day\\u2019 genomes (i.e. after 2120 generations of SLiM) using PCA. We used the sklearn.decomposition.PCA module (scikit-learn v. 0.24.2) with the svd_solver == \\u2018arpack\\u2019 option to run non-probabilistic PCA to calculate the first 10 principal components. Similarly to how the empirical data was analyzed with smartpca, we also did 5 rounds of iterative outlier removal, removing individuals from the PCA that deviated by more than six standard deviations along any of the 10 principal components. The number of variants contributing to these PCA were 624,617, 625,669, and 626,052 for Figure 7A, B and C respectively, and thus comparable to the number of variants contributing to our data analysis.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"bib5\"], \"section\": \"Ethics\", \"text\": \"This study follows ethics guidelines adopted by the ancient DNA field (Alpaslan-Roodenberg et al., 2021). A clear plan of research was laid out before the collection of samples, leading us to focus on sampling from under-sampled historical regions in Eurasia and therefore minimize unnecessary destruction of human remains. The research intent for these samples was clearly communicated to caretakers of the samples prior to collection. Local anthropologists, archaeologists, and museum directors from each geographic region were involved in the sample acquisition, extraction from skeletal material, and interpretation of genetic results. The genetic findings regarding individual samples from each region were communicated to local collaborators, all of whom were included as co-authors on the paper and were supportive of the final results. The involvement of our local collaborators was essential for the interpretation of the genetic results through their input on the historical and archaeological characterization of the specimens. We have supported our local collaborators with immediate access to the raw genetic data, and by communicating results in written and oral forums. Authorities responsible for all archaeological sites provided written documentation for their specimens to be included in this study through collaboration with the Pinasi Lab (Vienna, Austria).\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"sa2fig3\"], \"section\": \"\", \"text\": \"Projecting historical genomes into the present-day PCA space allows for a convenient visualization that is common in the field of ancient DNA and exhibits an established connection to geographic space that is easy to interpret. This is true especially for more recent ancient and historical genomes, as spatial population structure approaches that of present day. However, there are two challenges: (1) a two-dimensional representation of a fairly high-dimensional ancestry space necessarily incurs some amount of information loss and (2) we know that some axes of genetic variation are not well-represented by the present-day PCA space. This is evident, for example, by projecting our qpAdm reference populations into the present-day PCA, where some ancestries which we know to be quite differentiated project closely together (Author response image 3). Despite this limitation, we continue to use the PCA representation as it is well resolved for visualization and maximizes geographical correspondence across Eurasia.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"sa2fig3\"], \"section\": \"\", \"text\": \"On the other hand, the qpAdm reference space (used in clustering and outlier detection) has higher resolution to distinguish ancestries by more comprehensively capturing the fairly high-dimensional space of different ancestries. This includes many ancestries that are not well resolved in the present-day PCA space, yet are relevant to our sample set, for example distinguishing Iranian Neolithic ancestry against ancestries from further into central and east Asia, as well as distinguishing between North African and Middle Eastern ancestries (Author response image 3).\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"sa2fig4\", \"sa2fig3\"], \"section\": \"\", \"text\": \"To investigate the differences between these two reference spaces, we chose pairwise outgroup-f3 statistics (to Mbuti) as a pairwise similarity metric representing the reference space of f-statistics and qpAdm in a way that\\u2019s minimally affected by population-specific drift. We related this similarity measure to the euclidean distance on the first two PCs between the same set of populations (Author response image 4). This analysis shows that while there is almost a linear correspondence between these pairwise measures for some populations, others comparisons fall off the diagonal in a manner consistent with PCA projection (Author response image 3), where samples are close together in PCA but not very similar according to outgroup-f3. Taken together, these analyses highlight the non-equivalence of the two reference spaces.\"}, {\"pmc\": \"PMC10827293\", \"pmid\": \"38288729\", \"reference_ids\": [\"sa2fig5\"], \"section\": \"\", \"text\": \"The fact that historical period individuals are \\u201cmost closely related to the folks that contribute to present-day people that roughly live in the same geographic location\\u201d is exactly the point we were hoping to make with Figures 6 B and C. We do realize, however, that the fact that one set of samples is projected into the PC space established by the other may suggest that this is an obvious result. To make it more clear that it is not, we added an additional panel to Figure 6, which shows pre-historical samples projected into the present-day PC space. This figure shows that pre-historical individuals project all across the PCA space and often outside of present-day diversity, with degraded correlation of geographic location and projection location (see also Author response image 5). This illustrates the contrast we were hoping to communicate, where projection locations of historical individuals start to \\u201csettle\\u201d close to present-day individuals from similar geographic locations, especially in contrast with pre-historic individuals.\"}]"

Metadata

"{}"