Integrated global assessment of the natural forest carbon potential
PMCID: PMC10700142
PMID: 37957399
Abstract
Forests are a substantial terrestrial carbon sink, but anthropogenic changes in land use and climate have considerably reduced the scale of this system (ref. 1). Remote-sensing estimates to quantify carbon losses from global forests (refs. 2–5) are characterized by considerable uncertainty and we lack a comprehensive ground-sourced evaluation to benchmark these estimates. Here we combine several ground-sourced (ref. 6) and satellite-derived approaches (refs. 2, 7, 8) to evaluate the scale of the global forest carbon potential outside agricultural and urban lands. Despite regional variation, the predictions demonstrated remarkable consistency at a global scale, with only a 12% difference between the ground-sourced and satellite-derived estimates. At present, global forest carbon storage is markedly under the natural potential, with a total deficit of 226 Gt (model range = 151–363 Gt) in areas with low human footprint. Most (61%, 139 Gt C) of this potential is in areas with existing forests, in which ecosystem protection can allow forests to recover to maturity. The remaining 39% (87 Gt C) of potential lies in regions in which forests have been removed or fragmented. Although forests cannot be a substitute for emissions reductions, our results support the idea (refs. 2, 3, 9) that the conservation, restoration and sustainable management of diverse forests offer valuable contributions to meeting global climate and biodiversity targets. Analysis of ground-sourced and satellite-derived models reveals a global forest carbon potential of 226 Gt outside agricultural and urban lands, with a difference of only 12% across these modelling approaches.
Full Text
The underlying goal of our analysis was to investigate the impact of human land-use change on forest carbon stocks globally. Of course, many indigenous populations and local communities live in sustainable harmony with natural forests, often with beneficial impacts on ecosystem structure. However, we aimed to isolate the effects of extensive land-use change and anthropogenic degradation. To achieve this, we used a partial-regression approach in the first step, testing for the relationship between aboveground forest biomass and anthropogenic degradation, while controlling for the effects of climate, topography and soil conditions (Fig. 1d,g and Methods). This analysis revealed a consistent decline in tree carbon density along the anthropogenic degradation gradient across all biomes, evident in both the ground-sourced and the satellite-derived biomass observations (Fig. 1e,h).
Our GS models of potential forest biomass combine plot-level aboveground forest carbon measurements with spatially explicit data reflecting climate, soil conditions, topography, forest canopy cover and human disturbance, using random-forest machine-learning models to interpolate our biomass measurements across the globe (see Methods). In the first set of models (GS1), we estimated the global forest carbon potential in the absence of human activity by statistically accounting for the impact of human disturbance, setting all variables directly reflecting human disturbance to zero. By contrast, the second set of GS models (GS2) extrapolated the global forest carbon potential from data derived from protected areas with minimal human disturbance. To account for uncertainties in canopy-cover estimates from the forest inventory plots, we incorporated upper and lower boundaries of canopy cover in each pixel, resulting in a total of four GS models: GS1Upper, GS1Lower, GS2Upper and GS2Lower. We extended this combination of approaches to evaluate the biomass potential for each of the three satellite-derived biomass products (ESA-CCI, Walker et al. and harmonized). The models included either all terrestrial regions (SD1) or only regions with minimal human disturbance (SD2), using the same set of predictor variables as covariates included in the GS models. This resulted in a total of six SD models: SD1ESA-CCI, SD1Walker, SD1Harmonized, SD2ESA-CCI, SD2Walker and SD2Harmonized.
The full combination of models allowed us to disentangle the effects of deforestation and forest degradation on tree carbon losses while representing data and model uncertainties. The total tree carbon potential was determined by summing the forest carbon that would naturally exist (1) outside existing forests (restoration potential) and (2) in existing, degraded forests (conservation potential). The resulting maps provide models of tree carbon potential under current (1979–2013) climate conditions in the hypothetical absence of human disturbance (Fig. 2a).
The coefficients of variation from a bootstrapping procedure showed that existing and potential carbon stocks were estimated with confidence across all models. For 90–100% of the pixels inside the existing and potential forest area, the coefficients of variation were below 20% (Supplementary Figs. 1 and 2). A spatial-validation procedure (spatially buffered leave-one-out cross-validation (LOO-CV)), accounting for the potential effects of spatial autocorrelation on model-validation statistics, showed that the GS and SD models explained 70–77% or 82–87% of the spatial variation in tree biomass, respectively (Supplementary Table 1 and Supplementary Fig. 3). Furthermore, when specifically considering disturbed regions with human-disturbance levels ranging from 10% to 60%, the explained variation in tree biomass remained high (>60%), showing that our models effectively captured the variation of carbon stocks in regions with high human footprint (Supplementary Fig. 4).
Despite discrepancies in certain regions, there was high overall agreement between the ground-sourced and satellite-derived biomass estimations at the global scale (average R2 of 0.72 at a spatial resolution of approximately 1 km2; Supplementary Figs. 5–9). This agreement translated to similar estimates of existing live tree biomass: 367 Gt C (model range = 334–400 Gt C) for the GS models and 394 Gt C (model range 355–445 Gt C) for the SD models (<7% difference). A comparison of existing biomass estimates across the latitudinal gradient also showed high inter-model consistency, with the GS model predicting slightly higher biomass values than the SD model for the equatorial zone and lower biomass values at high-latitude regions of the Southern Hemisphere (>40 °S) (Supplementary Fig. 6). On average, the models predicted that 69% of live tree biomass is stored in tropical regions, with temperate, boreal and dryland regions accounting for 18%, 11% and 1%, respectively (Supplementary Table 3).
Using all sets of GS and SD models, we could estimate the total potential living tree carbon that would exist in the absence of human influence. Our models projected considerable gains in the hypothetical natural forest biomass, with a mean estimate for total potential living tree carbon of 600 Gt C (model range = 487–712 Gt C). The individual model estimates were as follows: GS1Upper = 487 Gt C, GS1Lower = 595 Gt C, GS2Upper = 517 Gt C, GS2Lower = 647 Gt C, SD1Harmonized = 552 Gt C, SD1ESA-CCI = 578 Gt C, SD1Walker = 669 Gt C, SD2Harmonized = 596 Gt C, SD2ESA-CCI = 645 Gt C and SD2Walker = 712 Gt C (Figs. 3 and 4 and Supplementary Tables 2 and 3). The highest estimates were derived from the Walker et al. map, with the GS, harmonized biomass and ESA-CCI estimates being 19%, 17% and 11% lower, respectively. Overall, we predict that, under current climate conditions, a further 217 Gt (model range = 153–267 Gt) of living tree carbon could potentially exist in the absence of humans (Fig. 5b). Of this potential, 123 Gt C (99–153 Gt C) can be attributed to tropical regions, 55 Gt C (40–66 Gt C) to temperate regions, 14 Gt C (5–25 Gt C) to boreal regions and 25 Gt C (9–41 Gt C) to dryland regions (Supplementary Table 3).
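As a cross-check, the ensemble mean, the model range and the relative differences between input datasets can be recomputed from the ten individual estimates listed above. The sketch below assumes that the reported percentages compare simple group means (all four GS models, and the two models per satellite product) against the mean of the two Walker-based models; the text does not state how these percentages were derived, so this grouping is an assumption.

```python
from statistics import mean

# Potential living tree carbon (Gt C) per model, as listed in the text above.
estimates = {
    "GS1Upper": 487, "GS1Lower": 595, "GS2Upper": 517, "GS2Lower": 647,
    "SD1Harmonized": 552, "SD1ESA-CCI": 578, "SD1Walker": 669,
    "SD2Harmonized": 596, "SD2ESA-CCI": 645, "SD2Walker": 712,
}

def group_mean(predicate):
    """Mean estimate over the models whose name satisfies `predicate`."""
    return mean(v for k, v in estimates.items() if predicate(k))

ensemble_mean = mean(estimates.values())
model_range = (min(estimates.values()), max(estimates.values()))

# Assumed grouping: compare each group mean with the Walker-based mean.
walker = group_mean(lambda k: k.endswith("Walker"))
groups = {
    "GS": group_mean(lambda k: k.startswith("GS")),
    "Harmonized": group_mean(lambda k: k.endswith("Harmonized")),
    "ESA-CCI": group_mean(lambda k: k.endswith("ESA-CCI")),
}
percent_lower = {name: 100 * (walker - value) / walker
                 for name, value in groups.items()}

print(round(ensemble_mean), model_range)
print({name: round(p) for name, p in percent_lower.items()})
```

Under these assumptions the recomputed values agree with the reported mean of 600 Gt C, the range of 487–712 Gt C and the 19%, 17% and 11% differences.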
Despite the broad consensus on the global top-down and bottom-up carbon potential estimates, considerable spatial variations were observed in the models. The SD models tended to predict higher potential carbon stocks than the GS models across 82% of pixels, particularly in South American tropical forests (Fig. 2e,f), suggesting possible overestimation of satellite-derived biomass potential in these regions. More ground-sourced data are needed from tropical areas to improve accuracy and balance the high sample sizes available for temperate regions. On the other hand, the GS models predicted slightly higher potential than the SD models in subtropical regions and temperate forests of Europe.
We also show that the type 1 models (GS1 and SD1) predicted a 47 Gt C lower potential than the type 2 models (GS2 and SD2; Fig. 3). The focus on ‘undisturbed’ regions in the type 2 models may introduce bias by favouring regions with unusually high biomass. By contrast, the type 1 models incorporated observations across the full human-disturbance gradient, potentially resulting in an underestimation of potential in regions with incomplete historic-disturbance data. Furthermore, we imposed a constraint on forest biomass potential by limiting forest growth to the potential tree cover range projected in a previous analysis. If this spatial constraint is removed to compare our model with the estimate of Walker et al. of 796 Gt C (without such constraints), our SD2Walker model generates a similar total potential of 760 Gt C (<5% difference). Thus, our mean estimate of Earth’s total potential living tree carbon of 600 Gt C from the ensemble of modelling approaches is probably conservative.
To determine the total carbon storage potential of natural woody ecosystems, we converted our estimates of living tree biomass into total ecosystem carbon stocks by incorporating global data on soil carbon, dead wood and litter. To represent the various sources of uncertainty (Fig. 4), we considered: (1) model type (types 1 and 2); (2) input data (upper and lower canopy cover boundaries for GS models; ESA-CCI, Walker et al. and harmonized for SD models); (3) aboveground biomass potential (bootstrapping); (4) tree root biomass; (5) dead wood and litter; and (6) soil carbon. The GS and SD models exhibited similar uncertainty contributions globally, with 21.2% and 19.0% attributed to aboveground living tree biomass potential, 21.6% and 23.9% to dead wood and litter, 22.8% and 20.7% to aboveground biomass input data, 15.0% to soil carbon, 12.1% and 11.8% to root biomass and 7.3% and 9.6% to model type. Soil carbon emerged as the primary source of uncertainty in regions with high latitudes and elevation. By contrast, aboveground biomass input data and dead wood and litter were the primary sources of uncertainty in dry and humid tropical areas, respectively (Fig. 4).
Considering all carbon pools together, we estimate that current forest carbon storage is 328 Gt (221–472 Gt) lower than the full natural potential (Fig. 5 and Table 1). Of this difference, 226 Gt C (151–363 Gt C) exist outside urban and agricultural areas, with 61% in forested regions in which sustainable management and conservation can promote carbon capture through the recovery of degraded ecosystems and 39% in regions in which forests have been removed (Table 1). These estimates highlight that forest conservation, restoration and sustainable management can help achieve climate targets by mitigating emissions and enhancing carbon sequestration.
Previous work has suggested that up to 80% of the world’s forests are secondary systems that have undergone anthropogenic degradation. Our models corroborate these findings, revealing a considerable potential for carbon capture in existing forests by allowing these degraded ecosystems to regenerate to maturity. The difference between current and potential ecosystem carbon stocks amounts to 139 Gt C (108–228 Gt C) in existing forests, representing 61% of the total difference when excluding urban and agricultural areas (Table 1). Of the total 139 Gt, 11 Gt (8%) can be attributed to biomass loss in existing forest plantations, in which restoring diverse ecosystems could lead to further carbon capture. The remaining 128 Gt can be attributed to human degradation in other forest ecosystems. These findings highlight the importance of forest conservation for carbon capture, as ecosystems are allowed to recover to their mature states. They also suggest that a substantial proportion of carbon capture can be achieved with minimal land-use conflicts. However, it is essential to acknowledge that the demand for wood and other forest-based products imposes limitations on this potential, given their climate benefits as substitutes for carbon-intensive materials such as fossil fuels and concrete. Nonetheless, evidence shows that reductions in harvesting intensity and forest degradation can deliver important climate benefits. Moreover, our model might underestimate the extent of degradation owing to challenges in capturing historical land-use legacies and limited data availability on plantations in certain countries. These observations reinforce the importance of effective forest conservation and management not only in reducing future carbon emissions but also in removing carbon that has already been released into the atmosphere.
In areas in which forests have been removed, the difference between the current and potential forest carbon stocks amounts to 189 Gt C (112–269 Gt C). Of this difference, 30% (57 Gt C) can be attributed to cropland areas, 28% (53 Gt C) to areas experiencing low anthropogenic pressure at present, 23% (43 Gt C) to pasture land, 18% (34 Gt C) to rangeland and 1% (2 Gt C) to urban areas (Fig. 5, Table 1 and Supplementary Fig. 10). It is important to recognize that the scale of this potential is contingent on social land-use constraints. Socially responsible ecosystem restoration must be driven by the land-use decisions of local communities, especially indigenous communities that often face marginalization. Sustainable economic development that promotes approaches that work with nature (for example, agroforestry and ecotourism) can provide critical avenues for long-term financial security as a result of healthy nature. It is also important to acknowledge that forests can lead to reductions in surface albedo, which generally have warming effects in high-latitude regions. Conversely, the local biophysical cooling effects of forests in warmer regions probably enhance the climate-adaptation benefits in the Global South.
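The carbon budget reported in the text can be checked with simple arithmetic. The sketch below uses only figures quoted in the text. One step is an inference rather than a statement from the source: that the 87 Gt C restoration potential equals the sum of the low-pressure and rangeland classes. The variable names are illustrative.

```python
# Differences between current and potential ecosystem carbon (Gt C), from the text.
existing_forests = 139            # potential within existing forests
removed_by_land_use = {           # areas in which forests have been removed
    "cropland": 57,
    "low anthropogenic pressure": 53,
    "pasture": 43,
    "rangeland": 34,
    "urban": 2,
}

removed_total = sum(removed_by_land_use.values())
total_deficit = existing_forests + removed_total

# Land-use classes in which substantial restoration is considered unlikely.
excluded = ("cropland", "pasture", "urban")
unlikely = sum(removed_by_land_use[k] for k in excluded)

# Inferred: the remaining classes make up the restoration potential.
restoration = removed_total - unlikely
constrained = existing_forests + restoration

shares = {k: round(100 * v / removed_total) for k, v in removed_by_land_use.items()}
share_existing = 100 * existing_forests / constrained

print(removed_total, total_deficit, unlikely, restoration, constrained)
print(shares, round(share_existing, 1))
```

The totals reproduce the reported 189, 328, 102, 87 and 226 Gt C. Because the inputs are rounded, the share in existing forests comes out at about 61.5%, which matches the reported 61% only approximately.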
Our integrated estimate of the difference between current and potential global living tree biomass (217 Gt C) falls at the lower end of the range of previous estimates, which ranged from 150 to 446 Gt C (Fig. 3c,d). Also, our estimate of the extra potential for total ecosystem carbon storage outside urban and agricultural land (226 Gt C) aligns closely with recent global-scale estimates of 205 and 287 Gt C (refs. ). However, it is worth noting that three previous data-driven approaches, not included in this meta-analysis because of methodological differences, have suggested carbon potential values below this range. Specifically, Lewis et al. considered more rigorous social constraints and estimated that natural restoration of 350 Mha of deforested, tropical land could capture 42 Gt C in living tree biomass. Scaling this estimate to 900 million hectares yielded a potential of 89–108 Gt tree carbon, which is comparable with our estimate of tree biomass restoration potential of 91 Gt C outside existing forest, urban areas and cropland regions (Table 1). Similarly, Roebroek et al. recently reported that the carbon potential in existing forests could be as low as 44 Gt C. Their estimate is considerably lower than our conservation potential estimate of 139 Gt C. This difference arises because Roebroek et al. focused only on aboveground tree biomass (excluding soil, roots, dead wood and leaf litter) and only considered the tree cover of existing forested regions. When we narrow our analysis to aboveground biomass in these forests, we recover a similar estimate of forest potential of 50 (39–63) Gt C. Nonetheless, when we consider studies that focused on the total ecosystem potential in all forest regions, our analysis reveals a distinct overlap that provides confidence in the scale of carbon losses from the global forest system.
Understanding the potential for carbon storage in natural forests is crucial for comprehending their role in combating climate change. Our combined modelling approach, including ten estimates from this study and nine others from previous studies, allows us to identify the extent of overlap across diverse approaches and increases our confidence about the scale of the forest carbon potential across the globe. We found that total forest carbon storage is, at present, 328 Gt C (model range = 221–472 Gt C) below its full potential. Of this potential, 102 Gt C (69–134 Gt C) exist in urban areas, cropland and permanent pasture sites, in which substantial restoration is highly unlikely. Yet, a potential of 226 Gt C (151–363 Gt C) is in existing forests and regions with low human pressure (Table 1). Of this constrained forest carbon potential, 139 Gt C (61%) can be found in regions that are already forested. This highlights that the prevention of deforestation does not only contribute to the reduction of carbon emissions but has large carbon drawdown potential if ecosystems can be allowed to return to maturity. Improved forest management and restoration to reconnect fragmented forest landscapes contribute a considerable 87 Gt (39%) to the extra carbon drawdown potential. We stress that, despite considering the broad land-use types, we cannot identify detailed land-use activities at a high resolution, so different social and economic considerations may place further constraints on the scale of this potential. Nevertheless, this work highlights the potential contribution of forest conservation, restoration and sustainable management in capturing carbon from the atmosphere.
The development of current and natural forest carbon maps involved several approaches and data sources with varying strengths and weaknesses. This ensemble of modelling approaches can help to identify the extent of agreement and uncertainty across modelling approaches, enabling a comprehensive understanding of carbon potential at a global scale. As new satellite technologies, such as the Global Ecosystem Dynamics Investigation (GEDI) project, begin to reveal high-resolution information about forest structure, it will be increasingly important to refine the spatial and temporal resolution of these carbon stock models. Our multimodel and multidata comparison pinpoints regional variation in the main sources of uncertainty in forest carbon potential, highlighting the need for improved aboveground data-sampling efforts in the tropics and soil carbon sampling at high latitudes (Fig. 4). As such, continuing efforts to refine the confidence in this forest carbon potential require advancements in remote-sensing instrumentation, field-monitoring strategies with sustained funding for research teams and field workers, especially in the Global South, better representation of temporal dynamics in carbon stocks, especially in ecosystems prone to natural disturbances, and methodology to allow for strict and verifiable integration of ground data and remote sensing into comprehensive carbon stock estimates. Fair and equitable funding support for sustaining and sharing tropical forest data is vital to reduce global sampling biases in forest inventory efforts (Supplementary Fig. 11).
Plot-level forest inventory records were obtained from data compiled in the GFBI database (http://www.gfbinitiative.org), which hosts information for 1,188,771 plots (median plot size = 250 m2) from every continent except Antarctica (Fig. 1). Each plot contains information on stem diameter at breast height (DBH) for each tree. Individuals with a DBH < 5 cm were removed from the analysis. Quality controls of tree density values were conducted and we removed plots with tree densities that fell outside the median ± 2.5 times the median absolute deviation (moderately conservative threshold) in each biome (6% of total plots). This resulted in retaining a total of 25,779,993 tree observations in 1,089,026 plots.
Following ref. , we applied back calculation to generate a pseudo dataset for biomass changes along DBH gradients based on each of the 430 allometric equations. To generate the pseudo data, we applied the following rules: (1) for a DBH between 5 and 25 cm, each centimetre was assigned a corresponding pseudo biomass value; (2) for a DBH between 25 and 100 cm, every 5 cm was assigned a corresponding value; (3) for a DBH between 100 and 300 cm (maximum DBH), every 10 cm was assigned a corresponding value. We then trained biome-specific allometric equations (varying in the β0 and β1 parameter estimates) based on the pseudo DBH and biomass dataset (Supplementary Fig. 12 and Supplementary Table 4).
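The pseudo-data rules above can be sketched as follows. Several details are assumptions, not taken from the source: the log-log form ln(biomass) = β0 + β1 ln(DBH), the handling of the interval boundaries at 25 and 100 cm, and the `example_equation` used to generate biomass values, which is purely hypothetical.

```python
import math

def pseudo_dbh_grid():
    """DBH values (cm): 1-cm steps to 25, 5-cm steps to 100, 10-cm steps to 300."""
    return (list(range(5, 26))            # 5, 6, ..., 25
            + list(range(30, 101, 5))     # 30, 35, ..., 100
            + list(range(110, 301, 10)))  # 110, 120, ..., 300

def fit_log_log(dbh, biomass):
    """Least-squares fit of ln(biomass) = b0 + b1 * ln(dbh); returns (b0, b1)."""
    x = [math.log(d) for d in dbh]
    y = [math.log(b) for b in biomass]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sxy / sxx
    return my - b1 * mx, b1

def example_equation(d):
    """Hypothetical allometric equation, used only to generate pseudo biomass."""
    return 0.1 * d ** 2.4

grid = pseudo_dbh_grid()
b0, b1 = fit_log_log(grid, [example_equation(d) for d in grid])
```

In a biome-level fit, pseudo data from all equations assigned to that biome would be pooled before calling `fit_log_log`; here a single equation is used, so the fit simply recovers its parameters.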
After computing the aboveground dry biomass for all approximately 28 million individuals in our dataset, plot-level biomass values were obtained by summing the biomass of all individuals in the respective plot. For plots that contained data for several years, we calculated the mean of these years. The median year of observation across all plots was 2002. Subsequently, the biomass densities (in t ha−1) of each plot were obtained by dividing the total aboveground biomass (W) by the plot area. Carbon values were obtained by multiplying tree biomass by biome-specific wood carbon concentrations, ranging from 45.6% in tropical moist broadleaf forest to 50.1% in temperate conifer forest (see Supplementary Table 5). The spatial modelling was performed at 30-arcsec (about 1-km2) resolution and we therefore averaged tree carbon-density values for plots located in the same 30-arcsec pixel.
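The aggregation from trees to pixels can be sketched as below. The input units (per-tree dry biomass in kg, plot area in m2) and the function names are assumptions for illustration; the carbon fraction would come from the biome-specific values in Supplementary Table 5.

```python
from collections import defaultdict

def plot_carbon_density(tree_biomass_kg, plot_area_m2, carbon_fraction):
    """Carbon density (t C per ha) from per-tree dry biomass (kg) and plot area (m2)."""
    biomass_t = sum(tree_biomass_kg) / 1000.0   # kg -> t
    area_ha = plot_area_m2 / 10000.0            # m2 -> ha
    return biomass_t / area_ha * carbon_fraction

def pixel_means(plot_records):
    """Average plot-level densities that fall in the same pixel.

    `plot_records` is an iterable of (pixel_id, density) pairs.
    """
    by_pixel = defaultdict(list)
    for pixel_id, density in plot_records:
        by_pixel[pixel_id].append(density)
    return {pid: sum(v) / len(v) for pid, v in by_pixel.items()}

# Example: three trees totalling 1 t on a 250 m2 plot in tropical moist forest.
density = plot_carbon_density([500.0, 250.0, 250.0], 250.0, 0.456)
averaged = pixel_means([("px1", density), ("px1", 20.0), ("px2", 5.0)])
```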
To avoid overestimation of carbon densities, we removed (1) values larger than the maximum carbon density ever recorded for forests (1,867 t C ha−1) and (2) values that fell outside the median ± 2.5 times the median absolute deviation (moderately conservative threshold) in each biome. Small outlier values were kept, however, if they fell in human-modified non-forest landscapes, that is, regions with a human-disturbance index >10% and canopy cover <10%. This was done to avoid the underestimation of current carbon in croplands, pasture lands and urban areas that can contain notable amounts of existing biomass in trees outside forests. To obtain normally distributed data, the carbon-density values were log-transformed before the median absolute deviation was calculated (Supplementary Fig. 13).
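A minimal sketch of this two-step filter is given below. It assumes a natural logarithm and an unscaled median absolute deviation (no consistency constant), neither of which is specified in the text. It also leaves out the exemption for small values in human-modified non-forest landscapes and would need to be applied separately in each biome.

```python
import math
from statistics import median

def mad_bounds(values, k=2.5):
    """Return (lower, upper) = median -/+ k * median absolute deviation."""
    m = median(values)
    mad = median(abs(v - m) for v in values)
    return m - k * mad, m + k * mad

def filter_carbon_densities(densities, max_density=1867.0, k=2.5):
    """Keep densities that pass the absolute cap and the log-scale MAD window."""
    capped = [d for d in densities if 0 < d <= max_density]
    logs = [math.log(d) for d in capped]       # assumed: natural log
    lower, upper = mad_bounds(logs, k)
    return [d for d, lg in zip(capped, logs) if lower <= lg <= upper]

# Toy values (t C per ha): one value above the cap and one very small outlier.
kept = filter_carbon_densities([50, 60, 70, 80, 90, 100, 5000, 0.001])
```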
In total, 40 layers, reflecting climate, soil and topographic features, were used as covariates in our analyses (Supplementary Table 6). All layers were standardized to 30-arcsec resolution (1 km2 at the equator). Layers for 19 bioclimatic variables came from the CHELSA version 1.2 open climate database (www.chelsa-climate.org), topographic information (elevation, slope, roughness, eastness, northness, aspect cosine, aspect sine and profile curvature) from the EarthEnv (www.earthenv.org/topography) database, cloud cover (annual mean, inter-annual standard deviation and intra-annual standard deviation) from the EarthEnv (www.earthenv.org/cloud) database and ref. , depth to the water table from ref. , the annual mean of solar radiation and wind speed from the WorldClim database (version 2), absolute depth to bedrock and soil texture (clay content, coarse fragments, sand content, silt content and soil pH), averaged for the depth between 0 to 100 cm below surface, from the SoilGrids database and the Global Aridity Index from the Global Aridity Index and Potential Evapotranspiration (ET0) Climate Database version 2.0 (refs. ).
To train spatially explicit tree carbon models across the world’s forests, we ran random-forest machine-learning models using Google Earth Engine. The models included 40 environmental layers (representing climate, soil and topographic features), eight human disturbance layers, and canopy cover as predictors. In random forest, unlike traditional regression, correlation among variables does not affect the model accuracy. Indeed, the ability to use many correlated predictors is one of the key benefits of machine-learning models. When variables are correlated, the effect of these variables is ‘shared’ across the trees in the random forest. Because random forest does not estimate coefficients as in regression, this correlation does not hinder model fit or performance but, rather, complicates efforts to quantify variable importance, which is also shared across correlated variables (see Supplementary Fig. 14 for an evaluation of variable importance using a reduced, uncorrelated set of variables). Thus, including numerous variables, even if correlated, can improve the predictive power of the model to accurately quantify current carbon.
In a first step, we tested for the existence of spatial autocorrelation in model residuals, which can bias model-validation statistics. This was done by calculating the Moran’s I index of the residuals from generalized additive models at different spatial scales (0–1,000 km). The Moran’s I indices indicated residual spatial autocorrelation at distances of up to 80 km for all GS models (Supplementary Fig. 15a–d). To avoid any bias introduced by the influence of spatial autocorrelation and correct for the uneven sampling across regions, we therefore applied bootstrapped spatial subsampling (100 iterations) to predict both current and potential tree carbon densities (see ‘Geospatial modelling of tree carbon potential’ section). The spatial subsampling was conducted by subsampling one random observation inside each 0.7-arcdegree (about 78-km) grid, resulting in approximately 4,500 observations for each subsample. Given that the model was run with 100 iterations, this resulted in a total of about 450,000 samples used to build our GS models. Parameter tuning for each model was performed through the grid-search procedure of Google Earth Engine to explore the results of a suite of machine-learning models trained on the 49 covariates. For each of the models, we ran 48 discrete parameter sets covering the total grid space of 700 possible parameter combinations. Performance of each model was assessed using the coefficient of determination (R2) values from tenfold cross-validation (Supplementary Table 1) and we retained the best models from each bootstrapped spatial subsample. All R2 values reported throughout the manuscript represent the coefficient of determination relative to the 1:1 line of observed versus predicted values, which is equivalent to a standardized mean squared error.
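The subsampling step and the validation statistic described above can be illustrated in plain Python. This sketch is not the Google Earth Engine workflow: the grid alignment, the seeds and the function names are assumptions. `r2_one_to_one` implements the standard definition 1 − SSE/SST, with the predictions compared directly against the observations.

```python
import math
import random

def spatial_subsample(points, cell_deg=0.7, rng=None):
    """Pick one random observation per cell of a regular lat/lon grid.

    `points` is a list of (lat, lon, value) tuples.
    """
    rng = rng or random.Random()
    cells = {}
    for p in points:
        key = (math.floor(p[0] / cell_deg), math.floor(p[1] / cell_deg))
        cells.setdefault(key, []).append(p)
    return [rng.choice(members) for members in cells.values()]

def bootstrap_subsamples(points, n_iter=100, cell_deg=0.7):
    """Repeat the subsampling n_iter times (seeds are arbitrary, for reproducibility)."""
    return [spatial_subsample(points, cell_deg, random.Random(i))
            for i in range(n_iter)]

def r2_one_to_one(observed, predicted):
    """Coefficient of determination relative to the 1:1 line: 1 - SSE / SST."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst

# Two observations share a cell; the third sits alone in another cell.
points = [(0.1, 0.1, 1.0), (0.2, 0.3, 2.0), (5.0, 5.0, 3.0)]
subsamples = bootstrap_subsamples(points, n_iter=5)
```

Unlike a squared correlation coefficient, this statistic penalizes systematic bias, and it can become negative when predictions are worse than the observed mean.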
As an alternative to testing whether spatial autocorrelation in model residuals affects model-validation statistics, we applied spatially buffered LOO-CV using the respective autocorrelation distances as buffer radii (Supplementary Table 1). In this procedure, each data point is predicted by a model that uses all data outside the buffer radius of the respective data point as training data. To run the LOO-CV, we used the hyperparameter settings of the best-performing random-forest model based on random tenfold cross-validation.
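The buffered leave-one-out procedure can be sketched as follows. The learner is passed in as `fit` and `predict` callables. The example learner here simply predicts the training mean; it is a stand-in to keep the sketch dependency-free and is not the random-forest model used in the study. Distances use the haversine formula with an assumed mean Earth radius of 6,371 km.

```python
import math

def haversine_km(a, b):
    """Great-circle distance (km) between (lat, lon) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def buffered_loo_cv(coords, features, targets, fit, predict, buffer_km):
    """Predict each point from a model trained only on points outside its buffer."""
    predictions = []
    for i, centre in enumerate(coords):
        train = [j for j, c in enumerate(coords)
                 if j != i and haversine_km(centre, c) > buffer_km]
        if not train:                      # nothing left outside the buffer
            predictions.append(None)
            continue
        model = fit([features[j] for j in train], [targets[j] for j in train])
        predictions.append(predict(model, features[i]))
    return predictions

# Stand-in learner (NOT a random forest): predicts the training mean.
def fit_mean(X, y):
    return sum(y) / len(y)

def predict_mean(model, x):
    return model

coords = [(0.0, 0.0), (0.0, 0.1), (0.0, 10.0)]   # first two are about 11 km apart
preds = buffered_loo_cv(coords, [[0], [0], [0]], [1.0, 3.0, 10.0],
                        fit_mean, predict_mean, buffer_km=80.0)
```

With an 80-km buffer, the two neighbouring points are never used to predict each other, which is the property that removes the optimistic bias caused by spatial autocorrelation.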
To account for tree carbon stored belowground as roots, we multiplied our aboveground tree carbon predictions by the pixel-level means or the upper and lower confidence bounds of the proportional contribution of root carbon, using a spatially explicit map of tree root mass fraction (Supplementary Fig. 16). This map was derived from random-forest models based on 5,170 spatially explicit observations of tree biomass ratios between roots and shoots, covering all continents except Antarctica. Confidence ranges of the pixel-level root mass fraction estimates were based on sampling uncertainty, using a stratified bootstrapping procedure (see methods in ref. for details).
To evaluate the extent of model interpolation versus extrapolation, that is, how well our training data represent the full multivariate environmental covariate space, we used an approach based on principal component analysis (PCA). We ran the PCA on the 49 covariates represented in our training data and used the resulting centring values, scaling values and eigenvectors to project the 49 covariate layers into the same PCA space. Then we created convex hulls for each of the bivariate combinations from the top 19 principal components (which collectively covered more than 90% of the sample-space variation), giving 171 hulls (the number of distinct pairs among 19 components). Using the coordinates of these convex hulls, we classified whether each pixel falls within or outside each of these convex hulls. In total, 92% of the potential canopy cover area fell within ≥95% of the 171 PCA convex hull spaces computed from our training data (representing the range of environmental conditions in our training data), with most of the outliers existing in arid regions (Supplementary Fig. 17a).
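The hull-membership test can be sketched in plain Python. The sketch takes principal-component scores as given (the PCA itself is not shown) and, for one pixel, returns the share of bivariate hulls that contain it. The hull construction uses Andrew's monotone chain, chosen here for brevity; the source does not say which algorithm was used.

```python
import math
from itertools import combinations

N_HULLS = math.comb(19, 2)   # 171 bivariate combinations of 19 components

def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def inside_hull(hull, p):
    """True if p lies inside or on the boundary of a counter-clockwise hull."""
    n = len(hull)
    if n < 3:                 # degenerate hull: only exact vertex matches count
        return p in hull
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))

def fraction_of_hulls_containing(train_scores, pixel_scores):
    """Share of bivariate PC hulls (built from training scores) containing the pixel."""
    pairs = list(combinations(range(len(pixel_scores)), 2))
    hits = 0
    for i, j in pairs:
        hull = convex_hull([(s[i], s[j]) for s in train_scores])
        hits += inside_hull(hull, (pixel_scores[i], pixel_scores[j]))
    return hits / len(pairs)

# Toy training scores: the eight corners of a unit cube in three components.
train = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
```

A pixel would then count as interpolated if this fraction is at least 0.95, matching the ≥95% criterion in the text.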
We also tested how well the training data span the variation in the eight human-disturbance layers. In total, 90% of the potential canopy cover area fell within ≥95% of the ten PCA convex hull spaces computed from our training data (Supplementary Fig. 17b).
The ESA-CCI map represents aboveground living tree biomass for the year 2010 and was produced using satellite data from ALOS-2/PALSAR-2 and a physical-based inversion model that estimates biomass from growing stock volume, wood density and biomass expansion factors, with bias adjustment following the validation framework in ref. . The map was averaged from 100-m to 1-km2 spatial resolution to match the resolution of the covariates. The 1-km2 ESA-CCI map was assessed following the validation framework in ref. , wherein map bias is predicted using a model-based approach based on global reference data. This step reduces mapping bias in areas with statistically significant prediction bias and particularly reduces the underestimation of biomass at high-biomass forests >350 t ha−1. The map comes with an uncertainty layer that accounts for spatially correlated errors during spatial averaging. To convert the living tree biomass estimates to carbon, we multiplied tree biomass with biome-specific wood carbon concentrations (see Supplementary Table 5).
After training and parameterizing the GS model of current tree carbon density using equation (3), we estimated the potential tree carbon density in forests that could exist in the absence of human disturbance by modifying this equation, setting the human-disturbance variables to zero and replacing existing canopy cover with potential canopy cover (GS1). In this model, the predictors are the environmental variables, the scaled human-disturbance variables (set to zero) and the current canopy cover, which was replaced by potential canopy cover after model training for the prediction of the total carbon potential. This allowed us to train the model including information on current (2010) forest canopy cover and then to predict the tree carbon potential inside the potential canopy cover by replacing current canopy cover with the ‘natural’ canopy cover expected in the absence of humans.
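The logic of the GS1 prediction (train on observed conditions, then predict with disturbance set to zero and canopy cover swapped) can be illustrated with a toy example. The learner below is a one-nearest-neighbour lookup, used only as a dependency-free stand-in; it is not the random-forest model of the study, and the data and names are invented for illustration.

```python
def fit_nearest(rows, targets):
    """Stand-in learner (NOT a random forest): memorise the training rows."""
    return list(zip(rows, targets))

def predict_nearest(model, row):
    """Return the target of the training row closest to `row` (squared distance)."""
    def dist(pair):
        return sum((a - b) ** 2 for a, b in zip(pair[0], row))
    return min(model, key=dist)[1]

def potential_carbon(model, env, potential_canopy, n_disturbance):
    """GS1-style prediction: disturbance set to zero, canopy cover swapped."""
    row = list(env) + [0.0] * n_disturbance + [potential_canopy]
    return predict_nearest(model, row)

# Toy training rows: [environment, disturbance, canopy cover] -> carbon density.
rows = [[1.0, 0.0, 0.9], [1.0, 0.8, 0.2], [2.0, 0.1, 0.8], [2.0, 0.9, 0.1]]
carbon = [120.0, 20.0, 150.0, 15.0]
model = fit_nearest(rows, carbon)

current = predict_nearest(model, [1.0, 0.8, 0.2])            # observed conditions
potential = potential_carbon(model, [1.0], 0.9, n_disturbance=1)
```

The key design point is that the trained model is left unchanged and only the prediction inputs are altered, so the potential is a counterfactual prediction from the same fitted relationship.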
The two types of SD model were run with the ESA-CCI, Walker et al. and harmonized maps of current woody carbon as input data, resulting in six model combinations (two model types and three input datasets). As for the GS1 model, model structure and parameterization of the first SD model of potential living tree carbon (SD1) followed equation (5). Similarly, as for the GS2 model, the second SD model of potential tree carbon density (SD2) followed equation (6), and we trained the model using only biomass density information from areas with minimal human disturbance inside protected areas (strict nature reserve or wilderness area) and/or intact forest landscapes.
To test for spatial autocorrelation in model residuals, we calculated the Moran’s I index of the residuals from generalized additive models at different spatial scales (0–1,000 km) and, for each model, found spatial autocorrelation at distances of up to 550–900 km (Supplementary Fig. 15e–j). To test for the effect of spatial autocorrelation on model validation statistics, we then ran LOO-CV models for each of the 100 bootstrapped subsamples, using the respective autocorrelation distances as buffer radii and the hyperparameter settings of the best-performing random-forest model based on random tenfold cross-validation (Supplementary Table 1).
To account for forest carbon stored in dead wood and litter, we obtained forest-type-level carbon ratios from previous studies. Means and confidence ranges of the ratios between dead wood and litter carbon and living tree carbon for tropical, temperate and boreal forests were calculated from forest-type estimates of total living biomass, dead wood and litter from Table S3 in ref. . Means and confidence ranges for dryland forests were calculated from Table 1 in ref. , using all sites for which data on plant aboveground and belowground biomass and litter was available. The ratios between dead wood and litter carbon and living tree carbon were 22% (95% confidence range = 15–33%), 33% (30–37%), 80% (68–94%) and 21% (2–40%) for tropical, temperate, boreal and dryland forests, respectively. We then multiplied pixel-level living tree carbon values by these percentages to estimate the means and confidence bounds of dead wood and litter carbon for each pixel (Table 1).
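The pixel-level calculation can be sketched as follows, using the ratios reported above. The function name and the example value of living tree carbon are illustrative assumptions.

```python
from typing import Dict, Tuple

# Ratio of dead wood and litter carbon to living tree carbon:
# (mean, lower 95% bound, upper 95% bound), as reported in the text.
DEAD_WOOD_LITTER_RATIO: Dict[str, Tuple[float, float, float]] = {
    "tropical": (0.22, 0.15, 0.33),
    "temperate": (0.33, 0.30, 0.37),
    "boreal": (0.80, 0.68, 0.94),
    "dryland": (0.21, 0.02, 0.40),
}


def dead_wood_and_litter(living_tree_carbon: float, forest_type: str) -> Dict[str, float]:
    """Scale living tree carbon by the forest-type ratio (mean and bounds)."""
    mean, lower, upper = DEAD_WOOD_LITTER_RATIO[forest_type]
    return {
        "mean": living_tree_carbon * mean,
        "lower": living_tree_carbon * lower,
        "upper": living_tree_carbon * upper,
    }


# Hypothetical boreal pixel with 100 units of living tree carbon.
boreal_pixel = dead_wood_and_litter(100.0, "boreal")
```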
Using the soil carbon potential map from ref. , which represents the effects of anthropogenic land-use and land-cover changes on soil organic carbon in the top 2 m (ref. ) over the past 12,000 years, we extracted estimates of soil carbon potential in the absence of humans (difference between soil carbon in 10,000 BC and current soil carbon) for all pixels that would naturally support trees (potential canopy cover (ref. 3) ≥ 10%; Table 1). Associated spatial-prediction uncertainties (absolute errors) were calculated by fitting a spatial-prediction model to the prediction residuals of the cross-validated original model and applying this error model over the whole area of interest.
For each of the GS and SD models, the 100 bootstrapped models of aboveground tree carbon potential were used to calculate per-pixel coefficient-of-variation values (standard deviation divided by the mean predicted value) as a measure of sampling uncertainty (hereafter referred to as bootstrap prediction uncertainty; Supplementary Figs. 1 and 2). Using the bootstrapped models, we also calculated 95% confidence ranges of estimates, allowing us to represent uncertainty ranges for each aboveground carbon model. To represent the uncertainty in canopy cover of the forest inventory plots, we ran the GS1 and GS2 models for both the upper and lower canopy cover estimates. To represent data uncertainty of the SD models, we ran the SD1 and SD2 models using three different input datasets (ESA-CCI, Walker et al. and harmonized biomass maps). Uncertainty in belowground tree carbon was derived by multiplying the upper and lower confidence ranges of aboveground tree carbon values with the upper and lower confidence ranges of spatially explicit root mass fractions, thus representing uncertainties in both root mass fraction and aboveground biomass. Using the entire confidence range of total (aboveground and belowground) living tree carbon, including sampling and data uncertainty, we then calculated the uncertainty in dead wood and litter biomass by multiplying the upper and lower confidence ranges of total living tree carbon values with the upper and lower confidence ranges of the forest-type-specific ratios between dead wood and litter carbon and living tree carbon (see ‘Dead wood and litter biomass’ section). Dead wood and litter biomass uncertainty was thus the result of uncertainties in both dead wood and litter-to-tree biomass ratios and tree biomass. Spatially explicit uncertainties in soil carbon potential were derived from maps of absolute errors in organic carbon density at 0–200 cm soil depth provided in ref. . 
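The per-pixel bootstrap statistics can be sketched as below. This is a minimal illustration that assumes the sample standard deviation and a percentile-based 95% range with linear interpolation; the exact estimators used in the study are not stated here, and the input values are hypothetical.

```python
import statistics
from typing import List, Sequence, Tuple


def coefficient_of_variation(predictions: Sequence[float]) -> float:
    """Standard deviation divided by the mean of the bootstrapped predictions."""
    return statistics.stdev(predictions) / statistics.fmean(predictions)


def _percentile(sorted_values: List[float], q: float) -> float:
    """Percentile with linear interpolation between order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def confidence_range(predictions: Sequence[float], level: float = 0.95) -> Tuple[float, float]:
    """Central confidence range taken from the bootstrap distribution."""
    ordered = sorted(predictions)
    tail = (1.0 - level) / 2.0
    return _percentile(ordered, tail), _percentile(ordered, 1.0 - tail)


# Hypothetical predictions for one pixel from three bootstrapped models.
cv = coefficient_of_variation([90.0, 100.0, 110.0])

# Hypothetical distribution of 101 evenly spaced predictions.
lower, upper = confidence_range([float(v) for v in range(101)])
```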
Uncertainties were propagated by summing all individual uncertainties, assuming that they are uncorrelated.
To quantify the relative contribution of the different sources of uncertainty to the overall uncertainty in our models, we divided the absolute uncertainty of each uncertainty type by the sum of all uncertainties (Fig. 4). This partitioning allows for relative comparison in uncertainty among sources, but otherwise does not necessarily reflect total model uncertainty owing to overlap and correlation across sources of uncertainty.
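The partitioning reduces to dividing each absolute uncertainty by their sum, as in this minimal sketch. The source names follow the uncertainty types listed in the text, but the absolute values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import Dict


def relative_contributions(absolute: Dict[str, float]) -> Dict[str, float]:
    """Share of each uncertainty source in the summed uncertainty."""
    total = sum(absolute.values())
    return {source: value / total for source, value in absolute.items()}


# Hypothetical absolute uncertainties for one pixel (arbitrary units).
pixel_uncertainty = {
    "aboveground_bootstrap": 20.0,
    "input_data": 20.0,
    "model_type": 10.0,
    "root_biomass": 10.0,
    "dead_wood_and_litter": 25.0,
    "soil_carbon": 15.0,
}
shares = relative_contributions(pixel_uncertainty)
```

Because the shares are normalized by a simple sum, they always add up to one, which is why they support comparison among sources but not inference about total model uncertainty.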
Throughout the text, we refer to conservation potential as the difference between current and potential carbon in existing forests, which was computed by subtracting the carbon stored at present inside existing forests from the expected carbon in these forests in the absence of human disturbance. We refer to restoration potential as the difference between current and potential carbon outside existing forests, which was estimated as the expected carbon in non-forest areas that would naturally support trees in the absence of human disturbances. Finally, the total difference between current and potential carbon refers to the sum of the conservation and restoration potentials (Figs. 3 and 5).
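These definitions can be expressed as a short sketch. The pixel structure, field names and values are hypothetical, and the sketch assumes that both potentials are computed as potential minus current carbon, following the definitions above.

```python
from typing import Dict, Iterable, NamedTuple


class Pixel(NamedTuple):
    current_carbon: float      # carbon stored at present
    potential_carbon: float    # expected carbon without human disturbance
    is_existing_forest: bool   # pixel lies inside existing forest
    supports_trees: bool       # pixel would naturally support trees


def carbon_potentials(pixels: Iterable[Pixel]) -> Dict[str, float]:
    """Sum conservation and restoration potentials over pixels."""
    conservation = 0.0
    restoration = 0.0
    for p in pixels:
        deficit = p.potential_carbon - p.current_carbon
        if p.is_existing_forest:
            conservation += deficit
        elif p.supports_trees:
            restoration += deficit
    return {
        "conservation": conservation,
        "restoration": restoration,
        "total": conservation + restoration,
    }


example = carbon_potentials(
    [
        Pixel(60.0, 100.0, True, True),    # degraded existing forest
        Pixel(5.0, 50.0, False, True),     # deforested land that supports trees
        Pixel(0.0, 0.0, False, False),     # land that does not support trees
    ]
)
```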
To estimate the existing and potential carbon within biomes (Supplementary Table 2), forest classes (tropical, temperate, boreal and dryland; Supplementary Table 3) and countries (Fig. 5e), we used the World Wide Fund for Nature (WWF) biome definitions and country boundaries from the world boundary map. Forests were classified into four broad categories (tropical, temperate, boreal and dryland). Tropical forest includes six biomes: (1) tropical and subtropical moist broadleaf forest; (2) tropical and subtropical dry broadleaf forest; (3) tropical and subtropical coniferous forest; (4) tropical and subtropical grassland, savannah and shrubland; (5) flooded grassland and savannah; and (6) mangroves. Temperate forest includes four biomes: (1) temperate broadleaf and mixed forest; (2) conifer forest; (3) temperate grassland, savannah and shrubland; and (4) montane grassland and shrubland. Boreal forest includes two biomes: boreal forest/taiga and tundra. Dryland refers to two biomes: (1) Mediterranean forest, woodland and scrub; and (2) desert and xeric shrubland.
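The classification can be written as a lookup table, shown here as a minimal sketch. The biome names are copied from the text; the dictionary layout and variable names are illustrative assumptions.

```python
from typing import Dict, Tuple

# Forest classes and their biomes, as listed in the text.
FOREST_CLASS_BIOMES: Dict[str, Tuple[str, ...]] = {
    "tropical": (
        "tropical and subtropical moist broadleaf forest",
        "tropical and subtropical dry broadleaf forest",
        "tropical and subtropical coniferous forest",
        "tropical and subtropical grassland, savannah and shrubland",
        "flooded grassland and savannah",
        "mangroves",
    ),
    "temperate": (
        "temperate broadleaf and mixed forest",
        "conifer forest",
        "temperate grassland, savannah and shrubland",
        "montane grassland and shrubland",
    ),
    "boreal": (
        "boreal forest/taiga",
        "tundra",
    ),
    "dryland": (
        "Mediterranean forest, woodland and scrub",
        "desert and xeric shrubland",
    ),
}

# Inverted lookup: biome name -> forest class.
BIOME_TO_CLASS: Dict[str, str] = {
    biome: forest_class
    for forest_class, biomes in FOREST_CLASS_BIOMES.items()
    for biome in biomes
}
```

Summing pixel-level carbon by `BIOME_TO_CLASS[biome]` would then give totals per forest class.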
To gain insight into the forest carbon potential estimated by previous studies, we reviewed publications that applied diverse approaches to quantify the potential carbon storage capacity of global forests. These studies fall into two types of estimate. The first type included studies reporting the total carbon that could be stored in global forests in the absence of human activities (Fig. 3b). The second type encompassed studies reporting the extra potential carbon that could be stored in the global forests, that is, the difference between current and potential carbon stocks (Fig. 3d). In total, we found 20 estimates of the total carbon potential and nine estimates of the difference between current and potential carbon stocks. These estimates were derived from four different approaches: inventory-based empirical estimates, mechanistic models, ensemble models and data-driven models. Inventory-based estimates comprise studies that estimated the global carbon potential from maximum forest carbon densities observed in climate zones or ecoregions based on inventory data. Mechanistic-model estimates included studies that used mechanistic models, such as Earth system models, to estimate the carbon potential of global forests. Ensemble-model estimates consisted of studies that used a variety of existing biomass maps to estimate the global carbon potential from maximum forest carbon densities in climate zones or ecoregions. Last, the data-driven model category encompassed studies that used extensive global carbon density observations to train global models based on environmental covariates. References to the studies included in this meta-analysis are shown in the legend of Fig. 3 and Supplementary Table 7.
Sections
"[{\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Fig1\", \"Sec9\", \"Fig1\"], \"section\": \"Mapping the human impact on tree biomass\", \"text\": \"The underlying goal of our analysis was to investigate the impact of human land-use change on forest carbon stocks globally. Of course, many indigenous populations and local communities live in sustainable harmony with natural forests, often with beneficial impacts on ecosystem structure. However, we aimed to isolate the effects of extensive land-use change and anthropogenic degradation. To achieve this, we used a partial-regression approach in the first step, testing for the relationship between aboveground forest biomass and anthropogenic degradation, while controlling for the effects of climate, topography and soil conditions (Fig. 1d,g and Methods). This analysis revealed a consistent decline in tree carbon density along the anthropogenic degradation gradient across all biomes, evident in both the ground-sourced and the satellite-derived biomass observations (Fig. 1e,h).\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Sec9\"], \"section\": \"Mapping the human impact on tree biomass\", \"text\": \"Our GS models of potential forest biomass combine plot-level aboveground forest carbon measurements with spatially explicit data reflecting climate, soil conditions, topography, forest canopy cover and human disturbance, using random-forest machine-learning models to interpolate our biomass measurements across the globe (see\\u00a0Methods). In the first set of models (GS1), we estimated the global forest carbon potential in the absence of human activity by statistically accounting for the impact of human disturbance, setting all variables directly reflecting human disturbance to zero. By contrast, the second set of GS models (GS2) extrapolated the global forest carbon potential from data derived from protected areas with minimal human disturbance. 
To account for uncertainties in canopy-cover estimates from the forest inventory plots, we incorporated upper and lower boundaries of canopy cover in each pixel, resulting in a total of four GS models: GS1Upper, GS1Lower, GS2Upper and GS2Lower. We extended this combination of approaches to evaluate the biomass potential for each of the three satellite-derived biomass products (ESA-CCI, Walker et al. and harmonized). The models included either all terrestrial regions (SD1) or only regions with minimal human disturbance (SD2), using the same set of predictor variables as covariates included in the GS models. This resulted in a total of six SD models: SD1ESA-CCI, SD1Walker, SD1Harmonized, SD2ESA-CCI, SD2Walker and SD2Harmonized.\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Fig2\"], \"section\": \"Mapping the human impact on tree biomass\", \"text\": \"The full combination of models allowed us to disentangle the effects of deforestation and forest degradation on tree carbon losses while representing data and model uncertainties. The total tree carbon potential was determined by summing the forest carbon that would naturally exist (1) outside existing forests (restoration potential) and (2) in existing, degraded forests (conservation potential). The resulting maps provide models of tree carbon potential under current (1979\\u20132013) climate conditions in the hypothetical absence of human disturbance (Fig. 2a).\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"MOESM1\", \"MOESM1\", \"MOESM1\", \"MOESM1\", \"MOESM1\"], \"section\": \"Mapping the human impact on tree biomass\", \"text\": \"The coefficients of variation from a bootstrapping procedure showed that existing and potential carbon stocks were estimated with confidence across all models. For 90\\u2013100% of the pixels inside the existing and potential forest area, the coefficients of variation were below 20% (Supplementary Figs. 1 and 2). 
A spatial-validation procedure (spatially buffered leave-one-out cross-validation (LOO-CV)), accounting for the potential effects of spatial autocorrelation on model-validation statistics, showed that the GS and SD models explained 70\\u201377% or 82\\u201387% of the spatial variation in tree biomass, respectively (Supplementary Table 1 and Supplementary Fig. 3). Furthermore, when specifically considering disturbed regions with human-disturbance levels ranging from 10% to 60%, the explained variation in tree biomass remained high (>60%), showing that our models effectively captured the variation of carbon stocks in regions with high human footprint (Supplementary Fig. 4).\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"MOESM1\", \"MOESM1\", \"MOESM1\", \"MOESM1\"], \"section\": \"Comparison between models\", \"text\": \"Despite discrepancies in certain regions, there was high overall agreement between the ground-sourced and satellite-derived biomass estimations at the global scale (average R2 of 0.72 at a spatial resolution of approximately 1\\u2009km2; Supplementary Figs. 5\\u20139). This agreement translated to similar estimates of existing live tree biomass: 367\\u2009Gt\\u2009C (model range\\u2009=\\u2009334\\u2013400\\u2009Gt\\u2009C)\\u00a0for the GS models and 394\\u2009Gt\\u2009C (model range 355\\u2013445\\u2009Gt\\u2009C)\\u00a0for the SD\\u00a0models (<7% difference). A comparison of existing biomass estimates across the latitudinal gradient also showed high inter-model consistency, with the GS model predicting slightly higher biomass values than the SD model for the equatorial zone and lower biomass values at high-latitude regions\\u00a0of\\u00a0the Southern Hemisphere (>40\\u2009\\u00b0S) (Supplementary Fig. 6). 
On average, the models predicted that 69% of live tree biomass is stored in tropical regions, with temperate, boreal and dryland regions accounting for 18%, 11% and 1%, respectively (Supplementary Table 3).\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Fig3\", \"Fig4\", \"MOESM1\", \"MOESM1\", \"Fig5\", \"MOESM1\"], \"section\": \"Comparison between models\", \"text\": \"Using all sets of GS and SD models, we could estimate the total potential living tree carbon that would exist in the absence of human influence. Our models projected considerable gains in the hypothetical natural forest biomass, with a mean estimate for total potential living tree carbon of 600\\u2009Gt\\u2009C (model range\\u2009=\\u2009487\\u2013712\\u2009Gt\\u2009C). The individual model estimates were as follows: GS1Upper\\u2009=\\u2009487\\u2009Gt\\u2009C, GS1Lower\\u2009=\\u2009595\\u2009Gt\\u2009C, GS2Upper\\u2009=\\u2009517\\u2009Gt\\u2009C, GS2Lower\\u2009=\\u2009647\\u2009Gt\\u2009C, SD1Harmonized\\u2009=\\u2009552\\u2009Gt\\u2009C, SD1ESA-CCI\\u2009=\\u2009578\\u2009Gt\\u2009C, SD1Walker\\u2009=\\u2009669\\u2009Gt\\u2009C, SD2Harmonized\\u2009=\\u2009596\\u2009Gt\\u2009C, SD2ESA-CCI\\u2009=\\u2009645\\u2009Gt\\u2009C and SD2Walker\\u2009=\\u2009712\\u2009Gt\\u2009C (Figs. 3 and 4 and Supplementary Tables 2 and 3). The highest estimates were derived from the Walker et al. map, with the GS, harmonized biomass and ESA-CCI estimates being 19%, 17% and 11% lower, respectively. Overall, we predict that, under current climate conditions, a further 217\\u2009Gt (model range\\u2009=\\u2009153\\u2013267\\u2009Gt) of living tree carbon could potentially exist in the absence of humans (Fig. 5b). 
Of this potential, 123\\u2009Gt\\u2009C (99\\u2013153\\u2009Gt\\u2009C) can be attributed to tropical regions, 55\\u2009Gt\\u2009C (40\\u201366\\u2009Gt\\u2009C) to temperate regions, 14\\u2009Gt\\u2009C (5\\u201325\\u2009Gt\\u2009C) to boreal regions and 25\\u2009Gt\\u2009C (9\\u201341\\u2009Gt\\u2009C) to dryland regions (Supplementary Table 3).\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Fig2\"], \"section\": \"Comparison between models\", \"text\": \"Despite the broad consensus on the global top-down and bottom-up carbon potential estimates, considerable spatial variations were observed in the models. The SD models tended to predict higher potential carbon stocks than the GS models across 82% of pixels, particularly in South American tropical forests (Fig. 2e,f), suggesting possible overestimation of satellite-derived biomass potential in these regions. More ground-sourced data are needed from tropical areas to improve accuracy and balance the high sample sizes available for temperate regions. On the other hand, the GS models predicted slightly higher potential than the SD models in subtropical regions and temperate forests of Europe.\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Fig3\"], \"section\": \"Comparison between models\", \"text\": \"We also show that the type 1 models (GS1 and SD1) predicted a 47\\u2009Gt\\u2009C lower potential than the type 2 models (GS2 and SD2; Fig. 3). The focus on \\u2018undisturbed\\u2019 regions in the type 2 models may introduce bias by favouring regions with unusually high biomass. By contrast, the type 1 models incorporated observations across the full human-disturbance gradient, potentially resulting in an underestimation of potential in regions with incomplete historic-disturbance data. Furthermore, we imposed a constraint on forest biomass potential by limiting forest growth to the potential tree cover range projected in a previous analysis. 
If this spatial constraint is removed to compare our model with the estimate of Walker et al. of 796\\u2009Gt\\u2009C (without such constraints), our SD2Walker model generates a similar total potential of 760\\u2009Gt\\u2009C (<5% difference). Thus, our mean estimate of Earth\\u2019s total potential living tree carbon of 600\\u2009Gt\\u2009C from the ensemble of modelling approaches is probably conservative.\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Fig4\", \"Fig4\"], \"section\": \"Total ecosystem carbon potential\", \"text\": \"To determine the total carbon storage potential of natural woody ecosystems, we converted our estimates of living tree biomass into total ecosystem carbon stocks by incorporating global data on soil carbon, dead wood and litter. To represent the various sources of uncertainty (Fig. 4), we considered: (1) model type (types 1 and 2); (2) input data (upper and lower canopy cover boundaries for GS models; ESA-CCI, Walker et al. and harmonized for SD models); (3) aboveground biomass potential (bootstrapping); (4) tree root biomass; (5) dead wood and litter; and (6) soil carbon. The GS and SD models exhibited similar uncertainty contributions globally, with 21.2% and 19.0% attributed to aboveground living tree biomass potential, 21.6% and 23.9% to dead wood and litter, 22.8% and 20.7% to aboveground biomass input data, 15.0% to soil carbon, 12.1% and 11.8% to root biomass and 7.3% and 9.6% to model type. Soil carbon emerged as the primary source of uncertainty in regions with high latitudes and elevation. By contrast, aboveground biomass input data and dead wood and litter were the primary sources of uncertainty in dry and humid tropical areas, respectively (Fig. 
4).\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Fig5\", \"Tab1\", \"Tab1\"], \"section\": \"Total ecosystem carbon potential\", \"text\": \"Considering all carbon pools together, we estimate that current forest carbon storage is 328\\u2009Gt (221\\u2013472\\u2009Gt) lower than the full natural potential (Fig. 5 and Table 1). Of this difference, 226\\u2009Gt\\u2009C (151\\u2013363\\u2009Gt\\u2009C) exist outside urban and agricultural areas, with 61% in forested regions in which sustainable management and conservation can promote carbon capture through the recovery of degraded ecosystems and 39% in regions in which forests have been removed (Table 1). These estimates highlight that forest conservation, restoration and sustainable management can help achieve climate targets by mitigating emissions and enhancing carbon sequestration.\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Tab1\"], \"section\": \"Carbon potential in existing forests\", \"text\": \"Previous work has suggested that up to 80% of the world\\u2019s forests are secondary systems that have undergone anthropogenic degradation. Our models corroborate these findings, revealing a considerable potential for carbon capture in existing forests by allowing these degraded ecosystems to regenerate to maturity. The difference between current and potential\\u00a0ecosystem carbon stocks amounts to 139\\u2009Gt\\u2009C (108\\u2013228\\u2009Gt\\u2009C) in existing forests, representing 61% of the total difference when excluding urban and agricultural areas (Table 1). Of the total 139\\u2009Gt, 11\\u2009Gt (8%) can be attributed to biomass loss in existing forest plantations, in which restoring diverse ecosystems could lead to further carbon capture. The remaining 128\\u2009Gt can be attributed to human degradation in other forest ecosystems. 
These findings highlight the importance of forest conservation for carbon capture, as ecosystems are allowed to recover to their mature states. It suggests that a substantial proportion of carbon capture can be achieved with minimal land-use conflicts. However, it is essential to acknowledge that the demand for wood and other forest-based products imposes limitations on this potential, given their climate benefits as substitutes for carbon-intensive materials such as fossil fuels and concrete. Nonetheless, evidence shows that reductions in harvesting intensity and forest degradation can deliver important climate benefits. Moreover, our model might underestimate the extent of degradation owing to challenges in capturing historical land-use legacies and limited data availability on plantations in certain countries. These observations reinforce the importance of effective forest conservation and management not only in reducing future carbon emissions but also in removing carbon that has already been released into the atmosphere.\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Fig5\", \"Tab1\", \"MOESM1\"], \"section\": \"Carbon potential in converted lands\", \"text\": \"In areas in which forests have been removed, the difference between the current and potential forest carbon stocks amounts to 189\\u2009Gt\\u2009C (112\\u2013269\\u2009Gt\\u2009C). Of this difference, 30% (57\\u2009Gt\\u2009C) can be attributed to cropland areas, 28% (53\\u2009Gt\\u2009C) to areas experiencing low anthropogenic pressure at present, 23% (43\\u2009Gt\\u2009C) to pasture land, 18% (34\\u2009Gt\\u2009C) to rangeland and 1% (2\\u2009Gt\\u2009C) to urban areas (Fig. 5, Table 1 and Supplementary Fig. 10). It is important to recognize that the scale of this potential is contingent on social land-use constraints. 
Socially responsible ecosystem restoration must be driven by the land-use decisions of local communities, especially indigenous communities that often face marginalization. Sustainable economic development that promotes approaches that work with nature (for example, agroforestry, ecotourism etc.) can provide critical avenues for long-term financial security as a result of healthy nature. Also, it is important to acknowledge that forests can lead to reductions in surface albedo, which generally have warming effects in high-latitude regions. Conversely, the local biophysical cooling effects of forests in warmer regions probably enhance the climate-adaptation benefits in the global south.\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Fig3\", \"Tab1\"], \"section\": \"Comparison with previous estimates\", \"text\": \"Our integrated estimate of the difference between current and potential\\u00a0global living tree biomass (217\\u2009Gt\\u2009C) falls at the lower end of the range of previous estimates, which ranged from 150 to 446\\u2009Gt\\u2009C (Fig. 3c,d). Also, our estimate of the extra potential for total ecosystem carbon storage outside urban and agricultural land (226\\u2009Gt\\u2009C) aligns closely with recent global-scale estimates of 205 and 287\\u2009Gt\\u2009C (refs.\\u2009). However, it is worth noting that three previous data-driven approaches, not included in this meta-analysis because of methodological differences, have suggested carbon potential values below this range. Specifically, Lewis et al. considered more rigorous social constraints and estimated that natural restoration of 350\\u2009Mha of deforested, tropical land could capture 42\\u2009Gt\\u2009C in living tree biomass. 
Scaling this estimate to 900 million hectares yielded a potential of 89\\u2013108\\u2009Gt tree carbon, which is comparable with our estimate of tree biomass restoration potential of 91\\u2009Gt\\u2009C outside existing forest, urban areas and cropland regions (Table 1). Similarly, Roebroek et al. recently reported that the carbon potential in existing forests could be as low as 44\\u2009Gt\\u2009C. Their estimate is considerably lower than our conservation potential estimate of 139\\u2009Gt\\u2009C. This difference arises because Roebroek et al. focused only on aboveground tree biomass (excluding soil, roots, dead wood and leaf litter) and only considered the tree cover of existing forested regions. When we narrow our analysis to aboveground biomass in these forests, we recover a similar estimate of forest potential of 50\\u2009(39\\u201363)\\u2009Gt\\u2009C. Nonetheless, when we consider studies that focused on the total ecosystem potential in all forest regions, our analysis reveals a distinct overlap that provides confidence in the scale of carbon losses from the global forest system.\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Tab1\"], \"section\": \"Discussion\", \"text\": \"Understanding the potential for carbon storage in natural forests is crucial for comprehending their role in combating climate change. Our combined modelling approach, including ten estimates from this study and nine others from previous studies, allows us to identify the extent of overlap across diverse approaches and increases our confidence about the scale of the forest carbon potential across the globe. We found that total forest carbon storage is, at present, 328\\u2009Gt\\u2009C (model range\\u2009=\\u2009221\\u2013472\\u2009Gt\\u2009C) below its full potential. Of this potential, 102\\u2009Gt\\u2009C (69\\u2013134\\u2009Gt\\u2009C) exist in urban areas, cropland and permanent pasture sites, in which substantial restoration is highly unlikely. 
Yet, a potential of 226\\u2009Gt\\u2009C (151\\u2013363\\u2009Gt\\u2009C) is in existing forests and regions with low human pressure (Table 1). Of this constrained forest carbon potential, 139\\u2009Gt\\u2009C (61%) can be found in regions that are already forested. This highlights that the prevention of deforestation does not only contribute to the reduction of carbon emissions but has large carbon drawdown potential if ecosystems can be allowed to return to maturity. Improved forest management and restoration to reconnect fragmented forest landscapes contribute a considerable 87\\u2009Gt (39%) to the extra carbon drawdown potential. We stress that, despite considering the broad land-use types, we cannot identify detailed land-use activities at a high resolution, so different social and economic considerations may place further constraints on the scale of this potential. Nevertheless, this work highlights the potential contribution of forest conservation, restoration and sustainable management in capturing carbon from the atmosphere.\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Fig4\", \"MOESM1\"], \"section\": \"Discussion\", \"text\": \"The development of current and natural forest carbon maps involved several approaches and data sources with varying strengths and weaknesses. This ensemble of modelling approaches can help to identify the extent of agreement and uncertainty across modelling approaches, enabling a comprehensive understanding of carbon potential at a global scale. As new satellite technologies, such as the Global Ecosystem Dynamics Investigation (GEDI) project, begin to reveal high-resolution information about forest structure, it will be increasingly important to refine the spatial and temporal resolution of these carbon stock models. 
Our multimodel and multidata comparison pinpoints regional variation in the main sources of uncertainty in forest carbon potential, highlighting the need for improved aboveground data-sampling efforts in the tropics and soil carbon sampling at high latitudes (Fig. 4). As such, continuing efforts to refine the confidence in this forest carbon potential require advancements in remote-sensing instrumentation, field-monitoring strategies with sustained funding for research teams and field workers, especially in the Global South, better representation of temporal dynamics in carbon stocks, especially in ecosystems prone to natural disturbances, and methodology to allow for strict and verifiable integration of ground data and remote sensing into comprehensive carbon stock estimates. Fair and equitable funding support for sustaining and sharing tropical forest data is vital to reduce global sampling biases in forest inventory efforts (Supplementary Fig. 11).\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Fig1\"], \"section\": \"Forest inventory data\", \"text\": \"Plot-level forest inventory records were obtained from data compiled in the GFBI database (http://www.gfbinitiative.org), which hosts information for 1,188,771 plots (median plot size\\u2009=\\u2009250\\u2009m2) from every continent except Antarctica (Fig. 1). Each plot contains information on stem diameter at breast height (DBH) for each tree. Individuals with a DBH\\u2009<\\u20095\\u2009cm were removed from the analysis. Quality controls of tree density values were conducted and we removed plots with tree densities that fell outside the median\\u2009\\u00b1\\u20092.5 times the median absolute deviation (moderately conservative threshold) in each biome (6% of total plots). 
This resulted in retaining a total of 25,779,993 tree observations in 1,089,026 plots.\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"MOESM1\", \"MOESM1\"], \"section\": \"Biomass estimation for individual trees\", \"text\": \"Following ref.\\u2009, we applied back calculation to generate a pseudo dataset for biomass changes along DBH gradients based on each of the 430 allometric equations. To generate the pseudo data, we applied the following rules: (1) for a DBH between 5 and 25\\u2009cm, each centimetre was assigned a corresponding pseudo biomass value; (2) for a DBH between 25 and 100\\u2009cm, every 5\\u2009cm was assigned a corresponding value; (3) for a DBH between 100 and 300\\u2009cm (maximum DBH), every 10\\u2009cm was assigned a corresponding value. We then trained biome-specific allometric equations (varying in the \\u03b20 and \\u03b21 parameter estimates) based on the pseudo DBH and biomass dataset (Supplementary Fig. 12 and Supplementary Table 4).\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"MOESM1\"], \"section\": \"Plot-level tree biomass calculation\", \"text\": \"After computing the aboveground dry biomass for all approximately 28 million individuals in our dataset, plot-level biomass values were obtained by summing the biomass of all individuals in the respective plot. For plots that contained data for several years, we calculated the mean of these years. The median year of observation across all plots was 2002. Subsequently, the biomass densities (in t\\u2009ha\\u22121) of each plot were obtained by dividing the total aboveground biomass (W) by the plot area. Carbon values were obtained by multiplying tree biomass by biome-specific wood carbon concentrations, ranging from 45.6% in tropical moist broadleaf forest to 50.1% in temperate conifer forest (see Supplementary Table 5). 
The spatial modelling was performed at 30-arcsec (about 1-km2) resolution and we therefore averaged tree carbon-density values for plots located in the same 30-arcsec pixel.\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"MOESM1\"], \"section\": \"Plot-level tree biomass calculation\", \"text\": \"To avoid overestimation of carbon densities, we removed (1) values larger than the maximum carbon density ever recorded for forests (1,867\\u2009t\\u2009C\\u2009ha\\u22121) and (2) values that fell outside the median\\u2009\\u00b1\\u20092.5 times the median absolute deviation (moderately conservative threshold) in each biome. Small outlier values were kept, however, if they fell in human-modified non-forest landscapes, that is, regions with a human-disturbance index >10% and canopy cover <10%. This was done to avoid the underestimation of current carbon in croplands, pasture lands and urban areas that can contain notable amounts of existing biomass in trees outside forests. To obtain normally distributed data, the carbon-density values were log-transformed before the median absolute deviation was calculated, using the following equation (Supplementary Fig. 13):\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"MOESM1\"], \"section\": \"Environmental covariates\", \"text\": \"In total, 40 layers, reflecting climate, soil and topographic features, were used as covariates in our analyses (Supplementary Table 6). All layers were standardized to 30-arcsec resolution (1\\u2009km2 at the equator). 
Layers for 19 bioclimatic variables came from the CHELSA version 1.2 open climate database (www.chelsa-climate.org), topographic information (elevation, slope, roughness, eastness, northness, aspect cosine, aspect sine and profile curvature) from the EarthEnv (www.earthenv.org/topography) database, cloud cover (annual mean, inter-annual standard deviation and intra-annual standard deviation) from the EarthEnv (www.earthenv.org/cloud) database and ref.\\u2009, depth to the water table from ref.\\u2009, the annual mean of solar radiation and wind speed from the WorldClim database (version 2), absolute depth to bedrock and soil texture (clay content, coarse fragments, sand content, silt content and soil pH), averaged for the depth between 0 to 100\\u2009cm below surface, from the SoilGrids database and the Global Aridity Index from the Global Aridity Index and Potential Evapotranspiration (ET0) Climate Database version 2.0 (refs.\\u2009).\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"MOESM1\"], \"section\": \"Ground-sourced tree carbon density model\", \"text\": \"To train spatially explicit tree carbon models across the world\\u2019s forests, we ran random-forest machine-learning models using Google Earth Engine. The models included 40 environmental layers (representing climate, soil and topographic features), eight human disturbance\\u00a0layers, and canopy cover as predictors. In random forest, unlike traditional regression, correlation among variables does not affect the model accuracy. Indeed, the ability to use many correlated predictors is one of the key benefits of machine-learning models. When variables are correlated, the effect of these variables is \\u2018shared\\u2019 across the trees in the random forest. 
Because random forest does not estimate coefficients as in regression, this correlation does not hinder model fit or performance but, rather, complicates efforts to quantify variable importance, which is also shared across correlated variables (see Supplementary Fig. 14 for an evaluation of variable importance using a reduced, uncorrelated set of variables). Thus, including numerous variables, even if correlated, can improve the predictive power of the model to accurately quantify current carbon.\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"MOESM1\", \"MOESM1\"], \"section\": \"Ground-sourced tree carbon density model\", \"text\": \"In a first step, we tested for the existence of spatial autocorrelation in model residuals, which can bias model-validation statistics. This was done by calculating the Moran\\u2019s I index of the residuals from generalized additive models at different spatial scales (0\\u20131,000\\u2009km). The Moran\\u2019s I indices indicated residual spatial autocorrelation at distances of up to 80\\u2009km for all GS models (Supplementary Fig. 15a\\u2013d). To avoid any bias introduced by the influence of spatial autocorrelation and correct for the uneven sampling across regions, we therefore applied bootstrapped spatial subsampling (100 iterations) to predict both current and potential tree carbon densities (see \\u2018Geospatial modelling of tree carbon potential\\u2019 section). The spatial subsampling was conducted by subsampling one random observation inside each 0.7-arcdegree (about 78-km) grid, resulting in approximately 4,500 observations for each subsample. Given that the model was run with 100 iterations, this resulted in a total of about 450,000 samples used to build our GS models. Parameter tuning for each model was performed through the grid-search procedure of Google Earth Engine to explore the results of a suite of machine-learning models trained on the 49 covariates. 
For each of the models, we ran 48 discrete parameter sets covering the total grid space of 700 possible parameter combinations. Performance of each model was assessed using the coefficient of determination (R2) values from tenfold cross-validation (Supplementary Table 1) and we retained the best models from each bootstrapped spatial subsample. All R2 values reported throughout the manuscript represent the coefficient of determination relative to the 1:1 line of observed versus predicted values, which is equivalent to a standardized mean squared error.\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"MOESM1\"], \"section\": \"Ground-sourced tree carbon density model\", \"text\": \"As an alternative to testing whether spatial autocorrelation in model residuals affects model-validation statistics, we applied spatially buffered LOO-CV using the respective autocorrelation distances as buffer radii (Supplementary Table 1). In this procedure, each data point is predicted by a model that uses all data outside the buffer radius of the respective data point as training data. To run the LOO-CV, we used the hyperparameter settings of the best-performing random-forest model based on random tenfold cross-validation.\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"MOESM1\"], \"section\": \"Ground-sourced tree carbon density model\", \"text\": \"To account for tree carbon stored belowground as roots, we multiplied our aboveground tree carbon predictions by the pixel-level means or the upper and lower confidence bounds of the proportional contribution of root carbon, using a spatially explicit map of tree root mass fraction (Supplementary Fig. 16). This map was derived from random-forest models based on 5,170 spatially explicit observations of tree biomass ratios between roots and shoots, covering all continents except Antarctica. 
Confidence ranges of the pixel-level root mass fraction estimates were based on sampling uncertainty, using a stratified bootstrapping procedure (see methods in ref.\\u2009 for details).\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"MOESM1\"], \"section\": \"Ground-sourced tree carbon density model\", \"text\": \"To evaluate the extent of model interpolation versus extrapolation, that is, how well our training data represent the full multivariate environmental covariate space, we used an approach based on principal component analysis (PCA). To do so, we performed PCA on the 49 covariates represented in our training data, using the centring values, scaling values and eigenvectors to transform the 49 covariates into the same PCA spaces. Then we created convex hulls for each of the bivariate combinations from the top 19 principal components (which collectively covered more than 90% of the sample-space variation). Using the coordinates of these convex hulls, we classified whether each pixel falls within or outside each of these convex hulls. In total, 92% of the potential canopy cover area fell within \\u226595% of the 171 PCA convex hull spaces computed from our training data (representing the range of environmental conditions in our training data), with most of the outliers existing in arid regions (Supplementary Fig. 17a).\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"MOESM1\"], \"section\": \"Ground-sourced tree carbon density model\", \"text\": \"We also tested how well the training data span the variation in the eight human-disturbance layers. In total, 90% of the potential canopy cover area fell within \\u226595% of the ten PCA convex hull spaces computed from our training data (Supplementary Fig. 
17b).\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"MOESM1\"], \"section\": \"Satellite-derived tree carbon density models\", \"text\": \"The ESA-CCI map represents aboveground living tree biomass for the year 2010 and was produced using satellite data from ALOS-2/PALSAR-2 and a physical-based inversion model that estimates biomass from growing stock volume, wood density and biomass expansion factors, with bias adjustment following the validation framework in ref.\\u2009. The map was averaged from 100-m to 1-km2 spatial resolution to match the resolution of the covariates. The 1-km2 ESA-CCI map was assessed following the validation framework in ref.\\u2009, wherein map bias is predicted using a model-based approach based on global reference data. This step reduces mapping bias in areas with statistically significant prediction bias and particularly reduces the underestimation of biomass in high-biomass forests >350\\u2009t\\u2009ha\\u22121. The map comes with an uncertainty layer that accounts for spatially correlated errors during spatial averaging. To convert the living tree biomass estimates to carbon, we multiplied tree biomass by biome-specific wood carbon concentrations (see Supplementary Table 5).\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Equ3\"], \"section\": \"GS models\", \"text\": \"After training and parameterizing the GS model of current tree carbon density using equation (3), we estimated the potential tree carbon density in forests that could exist in the absence of human disturbance by modifying this equation, setting human-disturbance variables to zero and replacing existing canopy cover with potential canopy cover (GS1). In this model, the environmental variables are unchanged, the scaled human-disturbance variables are set to zero and the current canopy cover is replaced by potential canopy cover after model training for the prediction of the total carbon potential. 
This allowed us to train the model including information on current (2010) forest canopy cover and then to predict the tree carbon potential inside the potential canopy cover by replacing current canopy cover with the \\u2018natural\\u2019 canopy cover expected in the absence of humans.\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Equ5\", \"Equ6\"], \"section\": \"SD models\", \"text\": \"The two types of SD model were run with the ESA-CCI, Walker et al. and harmonized maps of current woody carbon as input data, resulting in six model combinations (two model types and three input datasets). As for the GS1 model, model structure and parameterization of the first SD model of potential living tree carbon (SD1) followed equation (5). Similarly, as for the GS2 model, the second SD model of potential tree carbon density (SD2) followed equation (6), and we trained the model using only biomass density information from areas with minimal human disturbance inside protected areas (strict nature reserve or wilderness area) and/or intact forest landscapes.\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"MOESM1\", \"MOESM1\"], \"section\": \"SD models\", \"text\": \"To test for spatial autocorrelation in model residuals, we calculated the Moran\\u2019s I index of the residuals from generalized additive models at different spatial scales (0\\u20131,000\\u2009km) and, for each model, found spatial autocorrelation at distances of up to 550\\u2013900\\u2009km (Supplementary Fig. 15e\\u2013j). 
To test for the effect of spatial autocorrelation on model validation statistics, we then ran LOO-CV models for each of the 100 bootstrapped subsamples, using the respective autocorrelation distances as buffer radii and the hyperparameter settings of the best-performing random-forest model based on random tenfold cross-validation (Supplementary Table 1).\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Tab1\"], \"section\": \"Dead wood and litter biomass\", \"text\": \"To account for forest carbon stored in dead wood and litter, we obtained forest-type-level carbon ratios from previous studies. Means and confidence ranges of the ratios between dead wood and litter carbon and living tree carbon for tropical, temperate and boreal forests were calculated from forest-type estimates of total living biomass, dead wood and litter from Table\\u00a0S3 in ref.\\u2009. Means and confidence ranges for dryland forests were calculated from Table\\u00a01 in ref.\\u2009, using all sites for which data on plant aboveground and belowground biomass and litter was available. The ratios between dead wood and litter carbon and living tree carbon were 22% (95% confidence range\\u2009=\\u200915\\u201333%), 33% (30\\u201337%), 80% (68\\u201394%) and 21% (2\\u201340%) for tropical, temperate, boreal and dryland forests, respectively. 
We then multiplied pixel-level living tree carbon values by these percentages to estimate the means and confidence bounds of dead wood and litter carbon for each pixel (Table 1).\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Tab1\"], \"section\": \"Soil carbon\", \"text\": \"Using the soil potential map of ref.\\u2009, which represents the effects of anthropogenic land-use and land-cover changes on soil organic carbon in the top 2\\u2009m (ref.\\u2009) over the past 12,000 years, we extracted estimates of soil carbon potential in the absence of humans (difference between soil carbon in 10,000\\u2009BC and current soil carbon) for all pixels that would naturally support trees (potential canopy cover (ref.\\u20093)\\u2009\\u2265\\u200910%; Table 1). Associated spatial-prediction uncertainties (absolute errors) were calculated by fitting a spatial-prediction model to the prediction residuals of the cross-validated original model and applying this error model over the whole area of interest.\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"MOESM1\", \"MOESM1\"], \"section\": \"Model uncertainty\", \"text\": \"For each of the GS and SD models, the 100 bootstrapped models of aboveground tree carbon potential were used to calculate per-pixel coefficient-of-variation values (standard deviation divided by the mean predicted value) as a measure of sampling uncertainty (hereafter referred to as bootstrap prediction uncertainty; Supplementary Figs. 1 and 2). Using the bootstrapped models, we also calculated 95% confidence ranges of estimates, allowing us to represent uncertainty ranges for each aboveground carbon model. To represent the uncertainty in canopy cover of the forest inventory plots, we ran the GS1 and GS2 models for both the upper and lower canopy cover estimates. To represent data uncertainty of the SD models, we ran the SD1 and SD2 models using three different input datasets (ESA-CCI, Walker et al. 
and harmonized biomass maps). Uncertainty in belowground tree carbon was derived by multiplying the upper and lower confidence ranges of aboveground tree carbon values with the upper and lower confidence ranges of spatially explicit root mass fractions, thus representing uncertainties in both root mass fraction and aboveground biomass. Using the entire confidence range of total (aboveground and belowground) living tree carbon, including sampling and data uncertainty, we then calculated the uncertainty in dead wood and litter biomass by multiplying the upper and lower confidence ranges of total living tree carbon values with the upper and lower confidence ranges of the forest-type-specific ratios between dead wood and litter carbon and living tree carbon (see \\u2018Dead wood and litter biomass\\u2019 section). Dead wood and litter biomass uncertainty was thus the result of uncertainties in both dead wood and litter-to-tree biomass ratios and tree biomass. Spatially explicit uncertainties in soil carbon potential were derived from maps of absolute errors in organic carbon density at 0\\u2013200\\u2009cm soil depth provided in ref.\\u2009. Propagation of uncertainty was done by summing all individual uncertainties and assuming that they are uncorrelated.\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Fig4\"], \"section\": \"Model uncertainty\", \"text\": \"To quantify the relative contribution of the different sources of uncertainty to the overall uncertainty in our models, we divided the absolute uncertainty of each uncertainty type by the sum of all uncertainties (Fig. 4). 
This partitioning allows for relative comparison in uncertainty among sources, but otherwise does not necessarily reflect total model uncertainty owing to overlap and correlation across sources of uncertainty.\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Fig3\", \"Fig5\"], \"section\": \"Carbon potential partitioning\", \"text\": \"Throughout the text, we refer to conservation potential as the difference between current and potential\\u00a0carbon in existing forests, which was computed by subtracting the carbon stored at present inside existing forests from the expected carbon in these forests in the absence of human disturbance. We refer to restoration potential as the difference between\\u00a0current\\u00a0and potential carbon outside existing forests, which was estimated as the expected carbon in non-forest areas that would naturally support trees in the absence of human disturbances. Finally, the total difference between current and potential\\u00a0carbon refers to the sum of the conservation and restoration potentials (Figs. 3 and 5).\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"MOESM1\", \"MOESM1\", \"Fig5\"], \"section\": \"Carbon potential partitioning\", \"text\": \"To estimate the existing and potential carbon within biomes (Supplementary Table 2), forest classes (tropical, temperate, boreal and dryland; Supplementary Table 3) and countries (Fig. 5e), we used the World Wide Fund for Nature (WWF) biome definitions and country boundaries from the world boundary map. Forests were classified into four broad categories (tropical, temperate, boreal and dryland). 
Tropical forest includes six biomes: tropical and subtropical moist broadleaf forest, tropical and subtropical dry broadleaf forest, tropical and subtropical coniferous forest, tropical and subtropical grassland, savannah and shrubland, flooded grassland and savannah, and mangroves; temperate forest includes four biomes: temperate broadleaf and mixed forest, conifer forest, temperate grassland, savannah and shrubland, and montane grassland and shrubland; boreal forest includes two biomes: boreal forest/taiga and tundra; dryland refers to the two biomes Mediterranean forest, woodland and scrub, and desert and xeric shrubland.\"}, {\"pmc\": \"PMC10700142\", \"pmid\": \"37957399\", \"reference_ids\": [\"Fig3\", \"Fig3\", \"Fig3\", \"MOESM1\"], \"section\": \"Meta-analysis of previous studies on the global carbon potential\", \"text\": \"To gain insight into the forest carbon potential estimated by previous studies, we reviewed publications that applied diverse approaches to quantify the potential carbon storage capacity of global forests. These studies fall into two types of estimate. The first type included studies reporting the total carbon that could be stored in global forests in the absence of human activities (Fig. 3b). The second type encompassed studies reporting the extra potential carbon that could be stored in the global forests, that is, the difference between current and potential\\u00a0carbon stocks (Fig. 3d). In total, we found 20 estimates of the total carbon potential and nine estimates of the difference between current and potential\\u00a0carbon stocks. These estimates were derived from four different approaches: inventory-based empirical estimates, mechanistic models, ensemble models and data-driven models. Inventory-based estimates comprise studies that estimated the global carbon potential from maximum forest carbon densities observed in climate zones or ecoregions based on inventory data. 
Mechanistic-model estimates included studies that used mechanistic models, such as Earth system models, to estimate the carbon potential of global forests. Ensemble-model estimates consisted of studies that used a variety of existing biomass maps to estimate the global carbon potential from maximum forest carbon densities in climate zones or ecoregions. Last, the data-driven model category encompassed studies that used extensive global carbon density observations to train global models based on environmental covariates. References to the studies included in this meta-analysis are shown in the legend of Fig. 3 and Supplementary Table 7.\"}]"