
Independent domestication and cultivation histories of two West African indigenous fonio millet crops

PMCID: PMC12044004

PMID: 40307323


Abstract

Crop evolutionary history and domestication processes are key issues for better conservation and effective use of crop genetic diversity. Black and white fonio (Digitaria iburua and D. exilis, respectively) are two small indigenous grain cereals grown in West Africa. The relationship between these two cultivated crops and wild Digitaria species is still unclear. Here, we analyse whole genome sequences of 265 accessions comprising these two cultivated species and their close wild relatives. We show that white and black fonio were the result of two independent domestications without gene flow. We infer a cultivation expansion that began at the outset of the CE era, coinciding with the earliest discovered archaeological fonio remains in Nigeria. Fonio population sizes declined a few centuries ago, probably due to a combination of several factors, including major social and agricultural changes, intensification of the slave trade and the introduction of new, less labour-intensive crops. The key knowledge and genomic resources outlined here will help to promote and conserve these neglected climate-resilient crops and thereby provide an opportunity to tailor agriculture to the changing world. Black and white fonio are two small indigenous grain cereals grown in West Africa. Here, the authors reveal their independent domestication and cultivation histories through population genetics and demographic history analyses.


Full Text

Here we present one of the most geographically complete datasets of cultivated fonio cereals and their associated wild relatives, including 265 genome assemblies (Fig. 1a, Supplementary Table 1, and Supplementary Data 1). We resequenced the genomes of 26 D. iburua accessions and of 22 accessions of its closest wild relative D. ternata. Moreover, we extended the D. exilis sampling to key regions situated at the edge of its geographical distribution range.
The average mapping rates of raw reads on the D. exilis reference genome were 99% and 93% for D. exilis and D. longiflora, respectively (Supplementary Data 2). For D. iburua and D. ternata, the mapped read percentages were 74% and 68%, respectively (Supplementary Data 2). The number of unfiltered SNPs was 15,588,339, with a missing rate of 0.64 and 0.03 for D. iburua and D. exilis, respectively (Supplementary Data 2). After filtering out loci with a missing rate > 0.05 and individuals with a missing rate > 0.40, we kept 1,910,119 high-quality bi-allelic SNPs and 247 individuals, with an average missing rate of 0.006 for D. exilis (n = 200) and 0.09 for D. iburua (n = 21) (Supplementary Data 3). For the wild relatives, we obtained an average missing rate of 0.06 for D. longiflora (n = 14) and 0.18 for D. ternata (n = 11) (Supplementary Data 3). More than half (52%) of these SNPs were rare variants with a minor allele frequency < 0.01 (Supplementary Fig. 1). SNPs were distributed evenly across the 18 chromosomes of the reference assembly, except for chromosomes 8A and 8B, which had a lower SNP density than the other chromosomes (Supplementary Fig. 2).
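The two missing-rate filters described above can be illustrated with a minimal sketch. It assumes a small genotype table held in memory (one row per individual, `None` for a missing call) and assumes that individuals are filtered before loci; the text does not state the order or the software used, so the function names and the ordering are illustrative only.

```python
def missing_rate(calls):
    """Fraction of genotype calls that are missing (None)."""
    return sum(call is None for call in calls) / len(calls)


def filter_by_missingness(genotypes, max_ind_missing=0.40, max_locus_missing=0.05):
    """Drop individuals, then loci, whose missing rate exceeds the thresholds.

    genotypes: list of rows (one per individual), each a list of calls per locus.
    The order (individuals first, loci second) is an assumption of this sketch.
    Returns (kept individual indices, kept locus indices, filtered table).
    """
    # Step 1: keep individuals whose missing rate does not exceed the threshold.
    kept_inds = [i for i, row in enumerate(genotypes)
                 if missing_rate(row) <= max_ind_missing]
    if not kept_inds:
        return [], [], []
    # Step 2: among retained individuals, keep loci within the locus threshold.
    n_loci = len(genotypes[0])
    kept_loci = [j for j in range(n_loci)
                 if missing_rate([genotypes[i][j] for i in kept_inds]) <= max_locus_missing]
    filtered = [[genotypes[i][j] for j in kept_loci] for i in kept_inds]
    return kept_inds, kept_loci, filtered


if __name__ == "__main__":
    toy = [
        [0, 1, 2, 0],
        [0, None, 2, 1],
        [None, None, None, 1],  # 75% missing: removed as an individual
        [1, 1, 0, 0],
    ]
    print(filter_by_missingness(toy))
```

With these thresholds, a locus is discarded as soon as more than 5% of the retained individuals lack a call, which is why the number of SNPs drops sharply relative to the unfiltered set.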
The nucleotide diversity (π) of cultivated D. exilis and D. iburua was lower (P < 2.2 × 10⁻¹⁶; two-tailed Welch t-test, Supplementary Table 2) than that of their wild relatives D. longiflora and D. ternata (Fig. 2a, Supplementary Table 3). Moreover, black fonio displayed higher (P < 2.2 × 10⁻¹⁶; two-tailed Welch t-test, Supplementary Table 2) genetic diversity than white fonio and a smaller reduction in nucleotide diversity compared to its wild relative D. ternata (Fig. 2a, Supplementary Table 3). These patterns were also obtained with the Jaccard dissimilarity index, which can be viewed as a reference-genome-free measure of population genetic diversity computed with k-mers (Fig. 2b). However, for this index, the mean values of D. iburua and D. ternata were not significantly different (P = 0.15; two-tailed Welch t-test, Supplementary Table 2). Watterson’s θ values were higher than π for both species of the D. exilis/D. longiflora pair, but lower than π for the D. iburua/D. ternata pair. Accordingly, Tajima’s D was negative for D. exilis/D. longiflora and positive for D. iburua/D. ternata (Supplementary Table 3). LD decayed with increasing physical distance (kb) for D. exilis and D. longiflora, whereas it dropped very quickly to its minimum value for D. iburua and D. ternata (Supplementary Fig. 3).
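The link between π, Watterson’s θ and the sign of Tajima’s D can be made concrete with a short sketch. It uses the standard textbook definitions (Tajima 1989) on toy haplotypes coded as strings of 0/1 and works on total counts rather than per-site values in sliding windows; the function names are illustrative, and this is not the authors’ pipeline.

```python
from itertools import combinations
from math import sqrt


def pairwise_diversity(haplotypes):
    """Mean number of differences over all pairs of haplotypes (pi)."""
    pairs = list(combinations(haplotypes, 2))
    total = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs)
    return total / len(pairs)


def segregating_sites(haplotypes):
    """Number of alignment columns carrying more than one allele (S)."""
    return sum(len(set(column)) > 1 for column in zip(*haplotypes))


def watterson_theta(haplotypes):
    """Watterson's estimator: S divided by the harmonic number a1."""
    n = len(haplotypes)
    a1 = sum(1.0 / i for i in range(1, n))
    return segregating_sites(haplotypes) / a1


def tajimas_d(haplotypes):
    """Tajima's D; returns None when there are no segregating sites."""
    n = len(haplotypes)
    s = segregating_sites(haplotypes)
    if s == 0:
        return None
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # The numerator is pi minus Watterson's theta, so its sign sets the sign of D.
    return (pairwise_diversity(haplotypes) - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    rare = ["1000", "0100", "0010", "0001"]   # only singletons
    common = ["11", "11", "00", "00"]         # intermediate-frequency variants
    print(tajimas_d(rare), tajimas_d(common))
```

An excess of rare variants inflates θ relative to π and gives a negative D, while an excess of intermediate-frequency variants does the opposite, which matches the contrast reported between the two species pairs.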
Fig. 2: Patterns of genetic diversity in fonio millets and wild relatives, with and without consideration of the D. exilis reference genome for calculations. a Boxplot distribution of nucleotide diversity (π) computed using sliding windows of size 50 kb and step sizes of 10 kb within D. exilis (n = 199), D. longiflora (n = 14), D. iburua (n = 21) and D. ternata (n = 11). b Boxplot distribution of the Jaccard dissimilarity index generated from the 1,000,000 k-mers presence/absence table and calculated within D. exilis (n = 199), D. longiflora (n = 13), D. iburua (n = 21) and D. ternata (n = 11). The centre line represents the median; the box limits represent the upper and lower quartiles; the upper and lower lines represent 1.5 times the interquartile range. Statistical comparisons between each pair of populations are presented in Supplementary Table 2. Source data are provided as a Source Data file.
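As an illustration of the reference-free measure in panel b, the sketch below computes the Jaccard dissimilarity between k-mer presence/absence sets and averages it over all pairs of individuals in a group. The k-mer size, the helper names and the pairwise averaging are assumptions made for the example; the text only states that the index was calculated within each species from a k-mer presence/absence table.

```python
from itertools import combinations


def kmers(sequence, k):
    """Set of all k-mers present in a sequence (presence/absence only)."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard_dissimilarity(a, b):
    """1 - |A intersect B| / |A union B|; defined as 0.0 for two empty sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def mean_pairwise_dissimilarity(kmer_sets):
    """Average Jaccard dissimilarity over all pairs of individuals in a group."""
    pairs = list(combinations(kmer_sets, 2))
    return sum(jaccard_dissimilarity(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    group = [kmers("ACGTA", 3), kmers("ACGTT", 3), kmers("ACGTA", 3)]
    print(mean_pairwise_dissimilarity(group))
```

Because the index depends only on shared and private k-mers, it is unaffected by how well reads from each species map to the D. exilis reference.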
Using SNPs, the first PCA axis (68% of the total variance) differentiated the two cultivated/wild species pairs: D. exilis/D. longiflora versus D. iburua/D. ternata (Fig. 3a). The second axis (2.6% of the total variance) differentiated the group of cultivated D. exilis individuals from their wild relative D. longiflora, with East African individuals being more distant than those from West Africa (Fig. 3a, Supplementary Fig. 4). The third axis (1.65% of the total variance) differentiated the cultivated accessions from their associated wild relatives, for both pairs (Supplementary Fig. 4). We noticed that one individual from Benin that was classified as D. exilis grouped with D. iburua. On the other hand, some accessions identified as D. iburua in Nigeria grouped with D. exilis, suggesting potential misidentification during the sampling in the region where the two species are reported to co-occur. We corrected this classification in our subsequent analyses. The results obtained with the SNP datasets that had been filtered at different missing data thresholds (5%, 10% and 20%) were consistent (Supplementary Fig. 5).
We also performed a PCA of the four Digitaria species using the mapping-free approach (k-mer dataset). The results were similar to those obtained in the SNP analysis (Fig. 3b and Supplementary Fig. 6). The two cultivated fonio millet/wild relative pairs were differentiated on the first axis (56% of the total variance). The differentiation between cultivated and wild accessions was associated with the second axis (2.6% of the total variance).
The genetic variation observed among the four species was confirmed by genetic structure analyses conducted with sNMF. The cross-validation error decreased drastically from K = 2 to K = 4, reaching a minimum at K = 6 before sharply increasing at K = 8 (Supplementary Fig. 7). The two cultivated/wild species pairs were well differentiated at K = 2 (Fig. 3c). At K = 3, a D. longiflora wild relative group from East Africa was split (Supplementary Fig. 8), while a wild relative group of D. ternata individuals from Côte d’Ivoire formed a distinct cluster at K = 4 (blue colour, Fig. 3c, Supplementary Fig. 8). Two D. exilis geographical groups were highlighted at K = 5 (Supplementary Fig. 8) and a West African D. longiflora group appeared at K = 6 (Fig. 3c and Supplementary Fig. 8). Differentiation between D. ternata and D. iburua was observed from K = 7 (Supplementary Fig. 8). At K = 8, a third D. exilis group, mainly from Nigeria, differed (Supplementary Fig. 8). The structure obtained with the k-mer approach corroborated these results (Supplementary Note 1, Supplementary Figs. 9, 10, and 11).
The pairwise population distance (D_XY) and the net pairwise distance (D_a) between genetic clusters indicated a smaller distance within the D. exilis/D. longiflora pair than within the D. iburua/D. ternata pair (Supplementary Table 4). Moreover, the East African D. longiflora population and the Côte d’Ivoire D. ternata population were more differentiated than the other populations of the same species (Supplementary Table 4).
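The difference between the two distances can be sketched as follows, assuming they correspond to Nei’s absolute divergence (mean number of differences between sequences drawn from two populations, often written D_XY) and net divergence (D_XY minus the average within-population diversity, often written D_a). The toy haplotypes and function names are illustrative.

```python
from itertools import combinations, product


def differences(h1, h2):
    """Number of positions at which two equal-length haplotypes differ."""
    return sum(a != b for a, b in zip(h1, h2))


def within_diversity(pop):
    """Mean pairwise differences within one population (pi)."""
    pairs = list(combinations(pop, 2))
    return sum(differences(a, b) for a, b in pairs) / len(pairs)


def absolute_divergence(pop_x, pop_y):
    """Mean pairwise differences between sequences of two populations (D_XY)."""
    pairs = list(product(pop_x, pop_y))
    return sum(differences(a, b) for a, b in pairs) / len(pairs)


def net_divergence(pop_x, pop_y):
    """D_XY corrected for the average within-population diversity (D_a)."""
    mean_pi = (within_diversity(pop_x) + within_diversity(pop_y)) / 2.0
    return absolute_divergence(pop_x, pop_y) - mean_pi


if __name__ == "__main__":
    pop_x = ["0000", "0001"]
    pop_y = ["1111", "1110"]
    print(absolute_divergence(pop_x, pop_y), net_divergence(pop_x, pop_y))
```

The net distance removes the part of the divergence that is explained by polymorphism already present within each population, so two diverse populations can show a large D_XY but a modest D_a.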
The inference of the relationship between the four species (four populations) using Treemix distinguished the two cultivated/wild species pairs D. exilis/D. longiflora and D. iburua/D. ternata, with substantial genetic drift (>0.2, Fig. 4a). The inference without migration event explained >99.9% of the variance in relatedness between populations. Adding one migration event led to a higher log-likelihood but did not increase the explained variance (Supplementary Fig. 12), suggesting a poorer fit of the model.
Topology weights (Twisst) considering the four species as populations corroborated the distinctiveness of the two wild/cultivated species pairs (Fig. 4b). Indeed, with four populations as input, almost all (99.9%) genomic windows supported the topology distinguishing the two respective fonio millet/wild relative pairs (Fig. 4b).
For both methods, the inferred topologies were robust when more populations were defined based on the genetic structure inferred with sNMF (Supplementary Figs. 13 and 14). These results suggested that white fonio (D. exilis) and black fonio (D. iburua) millet species had independent histories and domestication patterns, with no evidence of gene flow.
The best tree topology inferred with fastsimcoal showed divergence of cultivated fonio millets from their respective wild relatives (m01 model, Fig. 4c and Supplementary Fig. 15). The addition of a population bottleneck for cultivated species after their divergence from their respective wild relatives better supported our data (Supplementary Fig. 16). Parameter estimates showed an old divergence of the wild relatives D. longiflora and D. ternata, and an older divergence of white fonio from its wild relative compared to black fonio (Fig. 4c and Supplementary Table 5).
We used the sequentially Markovian coalescent-based method smc++ to analyse demographic changes associated with domestication, expansion and/or contraction of the two cultivated fonio millet species. The effective population size of D. exilis steadily declined from more than 20,000 years ago and reached a minimum ~2000 years ago (Fig. 4d). The effective population size then markedly increased, before declining again from 500 to 200 years ago (Fig. 4d). Similar trends were noted when the two D. exilis populations were differentiated based on genetic clusters, but the onset of the expansion of the Guinean cluster was later than that of the Nigerian cluster (Supplementary Fig. 17). Moreover, only the Nigerian cluster experienced a decrease in population size 500 years ago, whereas the effective population size of the Guinean cluster stopped increasing and remained stable (Supplementary Fig. 17). The patterns of the two clusters of the D. longiflora wild populations differed. From 60,000 to 30,000 years ago, the East African cluster experienced a marked decrease in population size, while the contrary was observed for the West African cluster (Supplementary Fig. 17). Then, from 30,000 to 1000 years ago, the effective population sizes of these populations increased and decreased, respectively, before levelling off (Supplementary Fig. 17). Concerning D. iburua, the effective population size decreased sharply from around 20,000 years ago and reached a minimum earlier than D. exilis, at around 3000 years BP. A smaller increase in the effective population size occurred from this period to 300 years BP. The closest D. ternata wild population decreased in the same way, but did not subsequently increase as the cultivated population did (Supplementary Fig. 17). The effective population size of D. ternata from Côte d’Ivoire showed the same decrease until 8000 years ago, and diverged at 4000 years ago, with a steep increase until 1000 years ago (Supplementary Fig. 17). However, the results for the D. iburua and D. ternata populations should be considered with caution as there was high variance between the different runs (Fig. 4d and Supplementary Fig. 17).
We assembled a large collection of fonio genomic resources, with 265 georeferenced accessions, including 94 new sequences (Fig. 1a, Supplementary Table 1, and Supplementary Data 1). It contains whole-genome sequences of D. iburua and its wild relative D. ternata, as well as new D. exilis sequences, so that the geographical distribution of D. exilis is now likely fully represented. For D. exilis, a dataset (Abrouk et al.) of 157 accessions is available in the European Nucleotide Archive (ENA), but some countries are missing or poorly represented. We sampled 46 new D. exilis accessions originating from Senegal, Benin, Nigeria and Niger to enhance the overall distribution coverage of the species. Our new D. exilis collection now consists of a total of 203 accessions from nine countries. For D. iburua, we sequenced 26 new samples collected in Nigeria. For the wild relatives, we used the 14 D. longiflora accessions analysed in Abrouk et al. and collected 22 accessions from D. ternata specimens stored at the National Museum of Natural History in Paris (France) and in CIRAD herbaria.
Total genomic DNA was extracted from fresh leaves using an automated method adapted from Risterucci et al. on a Biomek FXP (Beckman Coulter, CA, USA) with the NucleoMag Plant Kit (Macherey-Nagel, Germany). For D. ternata herbarium specimens, DNA was extracted using the modified protocol of Doyle and Doyle (Supplementary Method 1). We constructed paired-end DNA libraries using a homemade library construction approach with a double indexing protocol, corresponding to a tag sequence of six nucleotides on the 5′ DNA end and an index sequence of eight nucleotides on the 3′ DNA end (Illumina Unique Dual Index). This protocol was used to avoid contaminated or recombined reads that would not be correctly assigned during the demultiplexing step. The libraries of the 94 new genomes were sequenced using an Illumina NovaSeq 6000 system with a targeted 10× sequencing depth. A duplicate accession was added as a positive sequencing control.
We first inferred between-population relationships with Treemix v.1.13. Treemix v.1.13 uses the genome-wide covariance structure of allele frequencies between populations to maximise the likelihood of the inferred tree topology and migration between populations. We first analysed the four groups together, i.e. the two cultivated species D. iburua and D. exilis, and the two wild species D. longiflora and D. ternata. We also performed an analysis based on the genetic structure inferred with sNMF. Individuals were included in a group if their ancestry coefficients were 0.6 or higher. First, a total of 100 independent bootstrap replicates were generated and a consensus tree was built with phylip v.3.697. We then ran 10 independent Treemix runs with the consensus tree to find the optimal number of migration events (m), for each m from m = 1 to m = 6. The optimal number of migration edges was determined with the optM v0.1.6 R package which allowed us to plot the mean composite log likelihood of each run for each m. Another set of 100 bootstrap replicates was generated to find the new consensus tree with the number of previously inferred migration edges. Finally, 30 independent Treemix runs were performed with the new consensus tree and we visualised the tree having the maximum likelihood with R Treemix plotting functions. At each step of this pipeline, we accounted for linkage disequilibrium by setting SNP block sizes ranging from 500 to 1000. The model was implemented without specifying a root or by specifying D. ternata individuals from Côte d’Ivoire as root, since they appeared to be distant and isolated in the PCA and clustering analyses. This pipeline was implemented based on an available approach (https://github.com/carolindahms/TreeMix). 
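The rule used to define populations from the sNMF structure (ancestry coefficient of 0.6 or higher) can be sketched as below. The accession names and the dictionary layout are hypothetical; only the 0.6 threshold comes from the text, and individuals below it are simply left unassigned in this example.

```python
def assign_by_ancestry(q_matrix, threshold=0.6):
    """Assign each individual to its major genetic cluster, if any.

    q_matrix: dict mapping an individual name to its list of ancestry
    coefficients (one per cluster, summing to about 1).
    Returns a dict mapping each name to a cluster index, or to None when
    no coefficient reaches the threshold (admixed individual).
    """
    assignments = {}
    for name, coeffs in q_matrix.items():
        # Index of the cluster with the largest ancestry coefficient.
        best = max(range(len(coeffs)), key=coeffs.__getitem__)
        assignments[name] = best if coeffs[best] >= threshold else None
    return assignments


if __name__ == "__main__":
    # Hypothetical accessions and coefficients, for illustration only.
    q = {
        "acc_01": [0.95, 0.03, 0.02],
        "acc_02": [0.10, 0.60, 0.30],
        "acc_03": [0.45, 0.40, 0.15],  # admixed: left out of every group
    }
    print(assign_by_ancestry(q))
```

Excluding admixed individuals in this way keeps the allele frequencies of each group representative of a single genetic cluster before tree inference.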
In addition to this genome-wide approach, we summarised relationships among the different populations with a sliding window method implemented in the Twisst pipeline which quantifies the relative weights for each possible sub-tree topology among a set of inferred gene trees (Supplementary Method 2).
We first compared six different topologies (Supplementary Fig. 18) to test: (1) the independence of the cultivated fonio millet domestications, and (2) from which wild species the cultivated species diverged. We assumed a constant population size for these different scenarios. The best tree topology was then used as a backbone to test whether the data supported a domestication bottleneck after population divergence.
For each of the six scenarios, we performed 100 independent runs of 500,000 coalescent simulations to estimate the effective size of each population X (N_X) and the times of population divergence (T). We defined uniform (or log-uniform) prior distributions as well as prior boundaries per parameter (Supplementary Table 6). Parameter estimations were based on a maximum composite likelihood from the SFS with 40 optimisation cycles, with parameter reinitialization after three non-improved cycles while reducing the search ranges by 50%. We used an infinite-sites model (-I) with a fixed mutation rate of 6.5 × 10⁻⁹ and a 1-year generation time, as fonio millets are annual plants.
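To illustrate the difference between the two kinds of prior mentioned above, the following sketch draws starting values from a uniform and from a log-uniform distribution. The bounds are hypothetical (the actual ranges are in Supplementary Table 6, which is not reproduced here), and the code only illustrates the sampling idea rather than the internals of the inference software.

```python
import math
import random


def sample_uniform(rng, low, high):
    """Draw a value with equal density everywhere between low and high."""
    return rng.uniform(low, high)


def sample_log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def generations_to_years(generations, generation_time_years=1.0):
    """Convert a divergence time in generations to years (1 year for annuals)."""
    return generations * generation_time_years


if __name__ == "__main__":
    rng = random.Random(42)
    low, high = 10, 1_000_000  # hypothetical bounds for an effective size
    print(sample_uniform(rng, low, high), sample_log_uniform(rng, low, high))
```

A log-uniform prior gives each order of magnitude the same weight, which is useful when a parameter such as an effective size or a divergence time may plausibly span several orders of magnitude.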
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.


Sections

"[{\"pmc\": \"PMC12044004\", \"pmid\": \"40307323\", \"reference_ids\": [\"Fig1\", \"MOESM1\", \"MOESM4\"], \"section\": \"Genomic diversity and population structure revealed two distinct groups\", \"text\": \"Here we present one of the most geographically complete datasets of cultivated fonio cereals and their associated wild relatives, including 265 genome assemblies (Fig.\\u00a01a, Supplementary Table\\u00a01, and Supplementary Data\\u00a01). We resequenced the genomes of 26 D. iburua accessions and of 22 accessions of its closest wild relative D. ternata. Moreover, we extended the D. exilis sampling to key regions situated at the edge of its geographical distribution range.\"}, {\"pmc\": \"PMC12044004\", \"pmid\": \"40307323\", \"reference_ids\": [\"MOESM5\", \"MOESM5\", \"MOESM5\", \"MOESM6\", \"MOESM6\", \"MOESM1\", \"MOESM1\"], \"section\": \"Genomic diversity and population structure revealed two distinct groups\", \"text\": \"The average mapping rates of raw reads on the D. exilis reference genome were 99% and 93% for D. exilis and D. longiflora, respectively (Supplementary Data\\u00a02). For D. iburua and D. ternata, the mapped read percentages were 74% to 68%, respectively (Supplementary Data\\u00a02). The number of unfiltered SNPs was 15,588,339, with a missing rate of 0.64 and 0.03 for D. iburua and D. exilis, respectively (Supplementary Data\\u00a02). After filtering loci with a missing rate\\u2009>\\u20090.05 and individuals with a missing rate\\u2009>\\u20090.40, we kept 1,910,119 high-quality bi-allelic SNPs and 247 individuals, with an average missing rate of 0.006 for D. exilis (n\\u2009=\\u2009200) and 0.09 for D. iburua (n\\u2009=\\u200921) (Supplementary Data\\u00a03). For the wild relatives, we obtained an average missing rate of 0.06 for D. longiflora (n\\u2009=\\u200914) and 0.18 for D. ternata (n\\u2009=\\u200911) (Supplementary Data\\u00a03). 
More than a half (52%) of these SNPs were rare variants with a minor allele frequency <0.01 (Supplementary Fig.\\u00a01). SNPs were distributed evenly across the 18 chromosomes of the reference assembly, but chromosomes 8A and 8B had a lower SNP density than the other chromosomes (Supplementary Fig.\\u00a02).\"}, {\"pmc\": \"PMC12044004\", \"pmid\": \"40307323\", \"reference_ids\": [\"MOESM1\", \"Fig2\", \"MOESM1\", \"MOESM1\", \"Fig2\", \"MOESM1\", \"Fig2\", \"MOESM1\", \"MOESM1\", \"MOESM1\"], \"section\": \"Genomic diversity and population structure revealed two distinct groups\", \"text\": \"The nucleotide diversity (\\u03c0) of cultivated D. exilis and D. iburua was lower (P\\u2009<\\u20092.2\\u2009\\u00d7\\u200910\\u2009\\u2212\\u200916; two-tailed Welch t-test, Supplementary Table\\u00a02) than that of their wild relatives D. longiflora and D. ternata (Fig.\\u00a02a, Supplementary Table\\u00a03). Moreover, black fonio displayed higher (P\\u2009<\\u20092.2\\u2009\\u00d7\\u200910\\u201316; two-tailed Welch t-test, Supplementary Table\\u00a02) genetic diversity than white fonio and a lower reduction in nucleotide diversity compared to its wild relative D. ternata (Fig.\\u00a02a, Supplementary Table\\u00a03). These patterns were also obtained with the Jaccard dissimilarity index that can be viewed as a genome reference-free measure of the population genetic diversity computed with k-mers (Fig.\\u00a02b). However, the mean values between D. iburua and D. ternata were not significantly different (P\\u2009=\\u20090.15; two-tailed Welch t-test, Supplementary Table\\u00a02). The Watterson\\u2019s \\u03f4 values were higher than \\u03c0 for both species of the D. exilis/D. longiflora pair, while being lower for the D. iburua/D. ternata pair. A negative Tajima\\u2019s D value was obtained for D. exilis/D. longiflora, while being positive for D. iburua/D. ternata (Supplementary Table\\u00a03). 
We identified linkage disequilibrium (LD) decays according to the kb distance for D. exilis and D. longiflora, while LD very quickly decayed to its minimum value for D. iburua and D. ternata (Supplementary Fig.\\u00a03).\"}, {\"pmc\": \"PMC12044004\", \"pmid\": \"40307323\", \"reference_ids\": [\"MOESM1\", \"MOESM8\"], \"section\": \"Patterns of genetic diversity in fonio millets and wild relatives, with and without consideration of the D. exilis reference genome for calculations.\", \"text\": \"a Boxplot distribution of nucleotide diversity (\\u03c0) computed using sliding windows of size 50\\u2009kb and step sizes of 10\\u2009kb within D. exilis (n\\u2009=\\u2009199), D. longiflora (n\\u2009=\\u200914), D. iburua (n\\u2009=\\u200921) and D. ternata (n\\u2009=\\u200911). b Boxplot distribution of the Jaccard dissimilarity index generated from the 1,000,000\\u2009k-mers presence/absence table and calculated within D. exilis (n\\u2009=\\u2009199), D. longiflora (n\\u2009=\\u200913), D. iburua (n\\u2009=\\u200921) and D. ternata (n\\u2009=\\u200911). The centre line represents the median; the box limits represent the upper and lower quartiles; the upper and lower lines represent 1.5 times the interquartile range. Statistical comparisons between each pair of population are presented in Supplementary Table\\u00a02. Source data are provided as a Source Data file.\"}, {\"pmc\": \"PMC12044004\", \"pmid\": \"40307323\", \"reference_ids\": [\"Fig3\", \"Fig3\", \"MOESM1\", \"MOESM1\", \"MOESM1\"], \"section\": \"Genomic diversity and population structure revealed two distinct groups\", \"text\": \"Using SNPs, the first PCA axis (68% of the total variance) differentiated the two cultivated/wild species pairs: D. exilis/D. longiflora versus D. iburua/D. ternata (Fig.\\u00a03a). The second axis (2.6% of the total variance) differentiated the group of cultivated D. exilis individuals from their wild relative D. 
longiflora, with East African individuals being more distant than those from West Africa (Fig.\\u00a03a, Supplementary Fig.\\u00a04). The third axis (1.65% of the total variance) differentiated the cultivated accessions from their associated wild relatives, for both pairs (Supplementary Fig.\\u00a04). We noticed that one individual from Benin that was classified as D. exilis grouped with D. iburua. On the other hand, some accessions identified as D. iburua in Nigeria grouped with D. exilis, suggesting potential misidentification during the sampling in the region where the two species are reported to co-occur. We corrected this classification in our subsequent analyses. The results obtained with the SNP datasets that had been filtered at different missing data thresholds (5%, 10% and 20%) were consistent (Supplementary Fig.\\u00a05).\"}, {\"pmc\": \"PMC12044004\", \"pmid\": \"40307323\", \"reference_ids\": [\"Fig3\", \"MOESM1\"], \"section\": \"Genomic diversity and population structure revealed two distinct groups\", \"text\": \"We also performed a PCA of the four Digitaria species using the mapping free approach (k-mer dataset). The results were similar to those obtained in the SNP analysis (Fig.\\u00a03b and Supplementary Fig.\\u00a06). The two cultivated fonio millet and wild relative pairs were differentiated on the first axis (56% of the total variance). The differentiation between cultivated and wild accessions was associated with the second axis (2.6% of the total variance).\"}, {\"pmc\": \"PMC12044004\", \"pmid\": \"40307323\", \"reference_ids\": [\"MOESM1\", \"Fig3\", \"MOESM1\", \"Fig3\", \"MOESM1\", \"MOESM1\", \"Fig3\", \"MOESM1\", \"MOESM1\", \"MOESM1\", \"MOESM1\", \"MOESM1\", \"MOESM1\", \"MOESM1\"], \"section\": \"Genomic diversity and population structure revealed two distinct groups\", \"text\": \"The genetic variation observed among the four species was confirmed by genetic structure analyses conducted with sNMF. 
The cross-validation error decreased drastically from K\\u2009=\\u20092 to K\\u2009=\\u20094, reaching a minimum at K\\u2009=\\u20096 before sharply increasing at K\\u2009=\\u20098 (Supplementary Fig.\\u00a07). The two cultivated/wild species pairs were well differentiated at K\\u2009=\\u20092 (Fig.\\u00a03c). At K\\u2009=\\u20093, a D. longiflora wild relative group from East Africa was split (Supplementary Fig.\\u00a08), while a wild relative group of D. ternata individuals from C\\u00f4te d\\u2019Ivoire formed a distinct cluster at K\\u2009=\\u20094 (blue colour, Fig.\\u00a03c, Supplementary Fig.\\u00a08). Two D. exilis geographical groups were highlighted at K\\u2009=\\u20095 (Supplementary Fig.\\u00a08) and a West African D. longiflora group appeared at K\\u2009=\\u20096 (Fig.\\u00a03c and Supplementary Fig.\\u00a08). Differentiation between D. ternata and D. iburua was observed from K\\u2009=\\u20097 (Supplementary Fig.\\u00a08). At K\\u2009=\\u20098, a third D. exilis group, mainly from Nigeria, differed (Supplementary Fig.\\u00a08). The structure obtained with the k-mer approach corroborated these results (Supplementary Note\\u00a01, Supplementary Figs.\\u00a09,\\u00a010, and\\u00a011).\"}, {\"pmc\": \"PMC12044004\", \"pmid\": \"40307323\", \"reference_ids\": [\"MOESM1\", \"MOESM1\"], \"section\": \"Genomic diversity and population structure revealed two distinct groups\", \"text\": \"The pairwise population distance (D) and the net pairwise distance (D) between genetic clusters indicated a lower distance between the D. exilis/D. longiflora pair compared to D. iburua/D. ternata (Supplementary Table\\u00a04). Moreover, the East African D. longiflora population and the C\\u00f4te d\\u2019Ivoire D. 
ternata population were more differentiated than the other populations of the same species (Supplementary Table\\u00a04).\"}, {\"pmc\": \"PMC12044004\", \"pmid\": \"40307323\", \"reference_ids\": [\"Fig4\", \"MOESM1\"], \"section\": \"Genomic evidence of independent domestication events\", \"text\": \"The inference of the relationship between the four species (four populations) using Treemix distinguished the two cultivated/wild species pairs D. exilis/D. longiflora and D. iburua/D. ternata, with substantial genetic drift (>0.2, Fig.\\u00a04a). The inference without migration event explained >99.9% of the variance in relatedness between populations. Adding one migration event led to a higher log-likelihood but did not increase the explained variance (Supplementary Fig.\\u00a012), suggesting a poorer fit of the model.\"}, {\"pmc\": \"PMC12044004\", \"pmid\": \"40307323\", \"reference_ids\": [\"Fig4\", \"Fig4\"], \"section\": \"Genomic evidence of independent domestication events\", \"text\": \"Topology weights (Twisst) considering the four species as populations corroborated the distinctiveness of the two wild/cultivated species pairs (Fig.\\u00a04b). Indeed, with four populations as input, almost all (99.9%) genomic windows supported the topology distinguishing the two respective fonio millet/wild relative pairs (Fig.\\u00a04b).\"}, {\"pmc\": \"PMC12044004\", \"pmid\": \"40307323\", \"reference_ids\": [\"MOESM1\", \"MOESM1\"], \"section\": \"Genomic evidence of independent domestication events\", \"text\": \"For both methods, the inferred topologies were robust when more populations were defined based on the genetic structure inferred with sNMF (Supplementary Figs.\\u00a013, and\\u00a014). These results suggested that white fonio (D. exilis) and black fonio (D. 
iburua) millet species had independent histories and domestication patterns, with no evidence of gene flow.\"}, {\"pmc\": \"PMC12044004\", \"pmid\": \"40307323\", \"reference_ids\": [\"Fig4\", \"MOESM1\", \"MOESM1\", \"Fig4\", \"MOESM1\"], \"section\": \"Genomic evidence of independent domestication events\", \"text\": \"The best tree topology inferred with fastsimcoal showed divergence of cultivated fonio millets from their respective wild relatives (m01 model, Fig.\\u00a04c and Supplementary Fig.\\u00a015). The addition of a population bottleneck for cultivated species after their divergence from their respective wild relatives better supported our data (Supplementary Fig.\\u00a016). Parameter estimates showed an old divergence of the wild relatives D. longiflora and D. ternata, and an older divergence of white fonio from its wild relative compared to black fonio (Fig.\\u00a04c and Supplementary Table\\u00a05).\"}, {\"pmc\": \"PMC12044004\", \"pmid\": \"40307323\", \"reference_ids\": [\"Fig4\", \"Fig4\", \"MOESM1\", \"MOESM1\", \"MOESM1\", \"MOESM1\", \"MOESM1\", \"MOESM1\", \"Fig4\", \"MOESM1\"], \"section\": \"Past population size history and relative timing of domestication\", \"text\": \"We used the sequentially Markovian coalescent based method smc++ to analyse demographic changes associated with domestication, expansion and or contraction of the two cultivated fonio millet species. The effective population size of D. exilis steadily declined from more than 20,000 years ago and reached a minimum ~2000 years ago (Fig.\\u00a04d). Then the effective population size markedly increased, and then declined again from 500 to 200 years ago (Fig.\\u00a04d). Similar trends were noted when the two D. exilis populations were differentiated based on genetic clusters, but the onset of the expansion of the Guinean cluster was later than that of the Nigerian cluster (Supplementary Fig.\\u00a017). 
Moreover, only the Nigerian cluster experienced a decrease in population size at 500 years, whereas the effective population size of the Guinean cluster stopped increasing and remained stable (Supplementary Fig.\\u00a017). The patterns of the two clusters of the D. longiflora wild populations differed. From 60,000 to 30,000 years ago, the East African cluster experienced a marked decrease in population size, while the contrary was observed for the West African cluster (Supplementary Fig.\\u00a017). Then, from 30,000 to 1000 years, the effective population sizes of these populations increased or decreased, respectively, before levelling off (Supplementary Fig.\\u00a017). Concerning D. iburua, the population decreased sharply from around 20,000 years ago and reached a minimum before D. exilis, at around 3000 BP. A lesser increase in the effective population size occurred from this period to 300 years BP. The closest D. ternata wild population decreased the same way, but did not increase thereafter like the cultivated population (Supplementary Fig.\\u00a017). The effective population size of D. ternata from C\\u00f4te d\\u2019Ivoire showed the same decrease until 8000 years, and diverged at 4000 years, with a steep increase until 1000 years ago (Supplementary Fig.\\u00a017). However, those results for the D. iburua and D. ternata populations should be considered with caution as there was high variance between the different runs (Fig.\\u00a04d and Supplementary Fig.\\u00a017).\"}, {\"pmc\": \"PMC12044004\", \"pmid\": \"40307323\", \"reference_ids\": [\"Fig1\", \"MOESM1\", \"MOESM4\"], \"section\": \"Plant material\", \"text\": \"We assembled a large collection of fonio genomic resources, with 265 georeferenced accessions, including 94 new sequences (Fig.\\u00a01a, Supplementary Table\\u00a01, and Supplementary Data\\u00a01). It contains whole-genome sequences of D. iburua and its wild relative D. ternata, and new D. 
exilis sequences, so that the geographical distribution of this species is now likely fully represented. For D. exilis, a dataset (Abrouk et al.) of 157 accessions is available in the European Nucleotide Archive (ENA), but some countries are missing or poorly represented. We sampled 46 new D. exilis accessions originating from Senegal, Benin, Nigeria and Niger to enhance the overall distribution coverage of the species. Our new D. exilis collection now consists of a total of 203 accessions from nine countries. For D. iburua, we sequenced 26 new samples collected in Nigeria. For the wild relatives, we used the 14 D. longiflora accessions analysed in Abrouk et al. and collected 22 accessions from D. ternata specimens stored at the National Museum of Natural History in Paris (France) and in CIRAD herbaria.\"}, {\"pmc\": \"PMC12044004\", \"pmid\": \"40307323\", \"reference_ids\": [\"MOESM1\"], \"section\": \"Library preparation and whole genome sequencing\", \"text\": \"Total genomic DNA was extracted from fresh leaves using an automated method adapted from Risterucci et al. on a Biomek FXP (Beckman Coulter, CA, USA) and the NucleoMag Plant Kit (Macherey-Nagel, Germany). For D. ternata herbarium specimens, DNA was extracted using the modified protocol of Doyle and Doyle (Supplementary Method\\u00a01). We constructed paired-end DNA libraries using a homemade library construction approach with a double indexing protocol corresponding to a Tag sequence of six nucleotides on the 5\\u2032 DNA end and an index sequence of eight nucleotides on the 3\\u2032 DNA end (Illumina Unique Dual Index). This protocol was used to avoid contaminated or recombined reads that would not be correctly assigned during the demultiplexing step. The libraries of the 94 new genomes were sequenced using an Illumina NovaSeq 6000 system with a targeted 10X sequencing depth. 
A duplicate accession was added as a positive sequencing control.\"}, {\"pmc\": \"PMC12044004\", \"pmid\": \"40307323\", \"reference_ids\": [\"MOESM1\"], \"section\": \"Historical relationships between populations\", \"text\": \"We first inferred between-population relationships with Treemix v.1.13, which uses the genome-wide covariance structure of allele frequencies between populations to maximise the likelihood of the inferred tree topology and migration between populations. We first analysed the four groups together, i.e. the two cultivated species D. iburua and D. exilis, and the two wild species D. longiflora and D. ternata. We also performed an analysis based on the genetic structure inferred with sNMF. Individuals were included in a group if their ancestry coefficients were 0.6 or higher. First, a total of 100 independent bootstrap replicates were generated and a consensus tree was built with phylip v.3.697. We then ran 10 independent Treemix runs with the consensus tree to find the optimal number of migration events (m), for each m from m\\u2009=\\u20091 to m\\u2009=\\u20096. The optimal number of migration edges was determined with the optM v0.1.6\\u2009R package, which allowed us to plot the mean composite log likelihood of each run for each m. Another set of 100 bootstrap replicates was generated to find the new consensus tree with the number of previously inferred migration edges. Finally, 30 independent Treemix runs were performed with the new consensus tree and we visualised the tree with the maximum likelihood using the R Treemix plotting functions. At each step of this pipeline, we accounted for linkage disequilibrium by setting SNP block sizes ranging from 500 to 1000. The model was implemented without specifying a root or by specifying D. ternata individuals from C\\u00f4te d\\u2019Ivoire as root, since they appeared to be distant and isolated in the PCA and clustering analyses. 
This pipeline was implemented based on an available approach (https://github.com/carolindahms/TreeMix). In addition to this genome-wide approach, we summarised relationships among the different populations with a sliding window method implemented in the Twisst pipeline, which quantifies the relative weights for each possible sub-tree topology among a set of inferred gene trees (Supplementary Method\\u00a02).\"}, {\"pmc\": \"PMC12044004\", \"pmid\": \"40307323\", \"reference_ids\": [\"MOESM1\"], \"section\": \"Population modelling and estimation of the relative timing of fonio millet domestication\", \"text\": \"We first compared six different topologies (Supplementary Fig.\\u00a018) to test: (1) the independence of the cultivated fonio millet domestications, and (2) from which wild species the cultivated species differed. We assumed a constant population size for these different scenarios. The best tree topology was then used as a backbone to test whether the data supported a domestication bottleneck after population divergence.\"}, {\"pmc\": \"PMC12044004\", \"pmid\": \"40307323\", \"reference_ids\": [\"MOESM1\"], \"section\": \"Population modelling and estimation of the relative timing of fonio millet domestication\", \"text\": \"For each of the six scenarios, we performed 100 independent runs of 500,000 coalescent simulations to estimate the effective size of each population X (N) and the times of population divergence (T). We defined uniform (or log-uniform) prior distributions as well as prior boundaries per parameter (Supplementary Table\\u00a06). Parameter estimations were based on a maximum composite likelihood from the SFS with 40 optimisation cycles, with parameter reinitialization after three non-improved cycles while reducing the search ranges by 50%. 
We used an infinite sites model (-I) with a fixed mutation rate of 6.5\\u2009\\u00d7\\u200910^\\u22129 and a 1-year generation time, as fonio millets are annual plants.\"}, {\"pmc\": \"PMC12044004\", \"pmid\": \"40307323\", \"reference_ids\": [\"MOESM7\"], \"section\": \"Reporting summary\", \"text\": \"Further information on research design is available in the\\u00a0Nature Portfolio Reporting Summary linked to this article.\"}]"
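
The grouping rule stated in the Treemix section above (individuals are included in a group if their ancestry coefficient is 0.6 or higher) could be sketched as follows. This is a minimal illustration only and not the authors' code: the function name `assign_groups`, the sample labels and the coefficient values are all hypothetical, and the input layout (one list of ancestry coefficients per individual, as a clustering tool such as sNMF would produce) is an assumption.

```python
from typing import Dict, List, Optional


def assign_groups(
    ancestry: Dict[str, List[float]], threshold: float = 0.6
) -> Dict[str, Optional[int]]:
    """Assign each individual to the cluster with its largest ancestry
    coefficient, or to None when that coefficient is below the threshold."""
    assignments: Dict[str, Optional[int]] = {}
    for sample, coeffs in ancestry.items():
        # Index of the cluster with the highest ancestry coefficient.
        best = max(range(len(coeffs)), key=coeffs.__getitem__)
        # Keep the individual only if it meets the threshold (0.6 or higher).
        assignments[sample] = best if coeffs[best] >= threshold else None
    return assignments


# Hypothetical ancestry coefficients for three made-up accessions (K = 3).
example = {
    "acc_A": [0.92, 0.05, 0.03],
    "acc_B": [0.55, 0.40, 0.05],
    "acc_C": [0.10, 0.60, 0.30],
}
groups = assign_groups(example)
```

Under this rule, admixed individuals such as the hypothetical `acc_B` are assigned to no group, so they would not contribute to the allele frequencies of any population passed to a downstream analysis.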

Metadata

"{}"