Tree Census from Space: Quantifying Woody Biomass Using Machine Learning


Using the combined capabilities of the NASA Center for Climate Simulation (NCCS) Discover supercomputer and ADAPT Science Cloud and the National Center for Supercomputing Applications (NCSA) Blue Waters supercomputer, NASA researchers are quantifying the woody biomass (amount of carbon) in discrete trees in the hot dry tropical, arid, and semi-arid regions of Sub-Saharan Africa, an area of 10 million square kilometers (km).

Project Overview

NASA scientist Compton Tucker from NASA Goddard Space Flight Center’s Earth Sciences Division and colleagues are using machine learning (ML) on NCCS and NCSA compute resources to conduct a tree census from space.

Using high-resolution, orthorectified mosaics and ML tools, NASA Goddard scientists first create a high-resolution, 50-centimeter (cm) scale canopy map of all trees within millions of square miles of arid and semi-arid regions of the tropics. By combining these tree canopy maps with stereoscopic images, they can estimate tree heights. With canopy area and height in hand, scientists can calculate the total amount of carbon sequestered in these trees. While tree biomass is the ultimate goal, the first step is mapping individual tree canopies.

Scientific Modeling

NASA scientists are using hundreds of thousands of visible and near-infrared high-resolution images from four different DigitalGlobe satellites. These images are accessed through the NextView License for NASA’s scientific research and have a resolution of approximately 50 cm per pixel. To avoid counting the same tree pixels more than once, the images are first orthorectified and then mosaicked using Python and the Geospatial Data Abstraction Library (GDAL) to create panchromatic and Normalized Difference Vegetation Index (NDVI) mosaics.
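As a rough illustration of what an NDVI mosaic contains, the sketch below computes the standard index, (NIR - red) / (NIR + red), pixel by pixel. The function names, the handling of empty pixels, and the tiny 2 x 2 bands are assumptions made for illustration; the project's actual pipeline works on GDAL rasters and is not described in this article.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index for a single pixel."""
    total = nir + red
    if total == 0:
        # Assumption: treat pixels with no signal in either band as NDVI 0.
        return 0.0
    return (nir - red) / total


def ndvi_grid(nir_band, red_band):
    """Apply ndvi() pixel by pixel to two equally sized 2-D bands."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical reflectance values, for illustration only.
nir_band = [[0.50, 0.30], [0.20, 0.0]]
red_band = [[0.10, 0.30], [0.20, 0.0]]
result = ndvi_grid(nir_band, red_band)

for row in result:
    print([round(value, 3) for value in row])
```

Healthy vegetation reflects strongly in the near-infrared and absorbs red light, so tree pixels tend toward higher NDVI values than the surrounding bare soil.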

Once the mosaics are completed, training data are compiled from multiple sites across Africa. ML then delineates trees by outlining the tree canopies and producing shapefiles of the tree canopies in question. The execution of the ML approach uses a large amount of compute resources.
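To make the idea of delineating discrete canopies concrete, the sketch below takes a binary tree/non-tree mask and groups touching tree pixels into separate crowns, reporting each crown's area at the 50 cm pixel size. This is not the team's ML method: it assumes a classifier has already produced the mask, and the flood-fill grouping, the 4-connectivity rule, and the sample mask are all illustrative assumptions.

```python
from collections import deque

PIXEL_SIZE_M = 0.5  # 50 cm pixels, as stated in the article


def crown_areas(mask):
    """Return the area in square meters of each 4-connected group of 1s."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill over one candidate crown.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = 0
            while queue:
                y, x = queue.popleft()
                pixels += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(pixels * PIXEL_SIZE_M ** 2)
    return areas


# Hypothetical mask: 1 = tree pixel, 0 = background.
mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
]
areas = crown_areas(mask)
print(len(areas), "crowns with areas (m^2):", areas)
```

In a real workflow each group of pixels would be converted to a polygon and written to a shapefile; that step is omitted here to keep the example dependency-free.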

This scientific visualization depicts the mapping of individual tree crowns and the stereographic retrieval of their height. This is done at the 50 x 50 centimeter x-y scale with a vertical accuracy of ± 1 meter. This figure is from the Sahel Zone in northern Senegal, on the south side of the Sahara Desert. Visualization by Katie Melocik, NASA Goddard.

HPC Resources Make it Possible

Using NCCS and NCSA high-performance computing (HPC) resources in tandem, scientists are solving this challenge: to calculate the total amount of carbon sequestered in woody vegetation across Sub-Saharan Africa for the first time.

Processing the images and creating the mosaics requires the substantial compute and storage resources of the NCCS ADAPT Science Cloud, which is ideal for this portion of the work. Virtual machines utilizing a specialized software stack connected to an HPC file system (IBM Spectrum Scale cluster) take massive amounts of raw images as input and create output data for analysis. The input data are a three-dimensional array of approximately 2.0 x 10^13 elements.
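To give a sense of scale, the short calculation below converts an array of roughly 2.0 x 10^13 elements into storage volume. The article does not state how many bytes each element occupies, so the 1-, 2-, and 4-byte cases are assumptions shown side by side, using decimal terabytes (10^12 bytes).

```python
ELEMENTS = 2.0e13  # array size stated in the article


def storage_terabytes(elements: float, bytes_per_element: int) -> float:
    """Uncompressed size in decimal terabytes (1 TB = 1e12 bytes)."""
    return elements * bytes_per_element / 1e12


# Assumed element widths; the actual data type is not given in the article.
for width in (1, 2, 4):
    size_tb = storage_terabytes(ELEMENTS, width)
    print(f"{width} byte(s) per element: {size_tb:.0f} TB")
```

Even at one byte per element the input is on the order of tens of terabytes, which helps explain why the mosaicking step runs against an HPC file system rather than on a workstation.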

Analysis at such a massive scale requires substantial HPC resources. This project combines three systems: ADAPT builds the mosaics, and the NCCS Discover supercomputer runs ML-based analysis to organize the mosaicked satellite data for processing. The scientists then turn to the NCSA Blue Waters supercomputer to execute the ML work to identify tree canopies, estimate tree height, and ultimately calculate the amount of carbon stored in the trees.

This is an oblique photograph from the same area showing discrete trees in the semi-arid area known as the Sahel Zone. "Sahel" in Arabic means "shoreline" and has been used for centuries to describe the southern boundary of the Sahara Desert, stretching uninterrupted more than 5,500 kilometers across Africa from the Atlantic Ocean to the Red Sea. Photo courtesy of Compton Tucker, NASA Goddard.


In work published in October 2020 in the journal Nature, Martin Brandt of the University of Copenhagen, Tucker, and colleagues used 50,000 commercial satellite images with a 50 cm x-y spatial resolution to map tree crowns across West Africa. The 1.3 million square km study area includes the arid southern portion of the Sahara Desert, stretching through the semi-arid Sahel Zone and into the humid sub-tropics.
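The figures above imply a very large pixel count. The sketch below works out how many 50 cm pixels are needed to cover the 1.3 million square km study area once, in a single band. It assumes perfectly tiled, non-overlapping coverage, which real satellite scenes do not provide, so it should be read as a lower bound rather than a description of the actual dataset.

```python
STUDY_AREA_KM2 = 1_300_000   # study area stated in the article
PIXEL_SIZE_M = 0.5           # 50 cm spatial resolution

# Pixels along one kilometer, then pixels in one square kilometer.
pixels_per_km = int(1000 / PIXEL_SIZE_M)
pixels_per_km2 = pixels_per_km ** 2

# Assumption: one band, no overlap between neighboring images.
total_pixels = STUDY_AREA_KM2 * pixels_per_km2

print(f"{pixels_per_km2:,} pixels per square km")
print(f"{total_pixels:.2e} pixels to cover the study area once")
```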

Previous tree-mapping efforts using lower-resolution datasets have greatly underestimated the number of individual trees. In this latest project, the scientists distinguished trees from non-trees and mapped a surprising 1.8 billion trees in the study area. They also determined the area of leaves within the tree crowns and now know every tree’s location to ±5 meters. The team has begun calculating tree heights, which combined with tree crown areas will be an accurate predictor of carbon in the wood of trees over the region.
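The tree count and study area reported for the Nature study can be combined into an average density, which is a useful sanity check on how sparse these landscapes are. The calculation below is simple arithmetic on the two published figures; the variable names are illustrative, and the result is a region-wide mean that hides the large differences between desert, Sahel, and sub-humid zones.

```python
TREES = 1.8e9              # trees mapped in the study area
AREA_KM2 = 1.3e6           # study area in square km
HECTARES_PER_KM2 = 100     # 1 square km = 100 hectares

trees_per_km2 = TREES / AREA_KM2
trees_per_hectare = trees_per_km2 / HECTARES_PER_KM2

print(f"{trees_per_km2:.0f} trees per square km on average")
print(f"{trees_per_hectare:.1f} trees per hectare on average")
```

An average of roughly 14 trees per hectare is consistent with the article's point that these are discrete, widely spaced trees rather than closed-canopy forest.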


It is important to understand how biomass measurements of the Earth change over time, because this information can be related to changing climatic conditions. The world’s arid and semi-arid regions respond quickly to changes in climate, and because of the low density of trees in these regions, changes in these areas are more easily quantified.

This work is also crucial for understanding the albedo effect of trees on global temperatures. As Tucker and colleagues noted in the October 2020 Nature article, “trees in farmlands, savannahs and deserts constitute an important—but very variable—carbon pool, and affect the climate by lowering the albedo, by altering aerodynamic roughness and through transpiration. As non-forest trees are becoming increasingly recognized in environmental initiatives across Africa, there is a growing interest in consistently measuring and monitoring trees outside of forests at the level of single trees.”

Another reason why the tree census research is so impactful: the scientists use only free and open-source software for data analysis, which is critical in terms of data access, repeatability, and validation of results.

As Princeton research scientist Liang Wang noted in his blog about NASA’s recent open science hackweek, “Although it has historically been underestimated, open-source software is extremely important for science. It helps create collaborative communities that bring together scientists with different skill sets and scientific objectives. It requires that developers with a scientific background write more readable and sustainable code and reduce errors. Science, including fundamental sciences, has entered an era where open source is mandatory for deeper involvement of the entire community.”

This visualization from NASA’s Scientific Visualization Studio shows a close-up of 50-centimeter-resolution satellite data overlaid with machine learning-based tree crown regions, individual tree counts, and overall tree counts in a portion of the West African Sahara and Sahel. Visualization by Greg Shirah, Compton Tucker, Erin Glennie, Lori Perkins, Helen-Nicole Kostis, Leann Johnson, Ian Jones, and Laurence Schuler; NASA Goddard; Martin Brandt, University of Copenhagen; and Jérôme Chave, French National Center for Scientific Research.

The Future

Using the satellite data and ML, researchers will produce a census of all trees across the entirety of Sub-Saharan Africa—from 12 degrees north latitude to 24 degrees north latitude, from the Atlantic Ocean to the Red Sea. They will use the techniques from the Nature study to estimate the total amount of carbon in woody biomass over this large region of the African continent. By analyzing data from multiple years, it will be possible to calculate changes in biomass over the past decade and link these to the changing, abiotic aspects of climate.

SC20 Conference

On November 17, Tucker described this research in the Invited Talk “Satellite Tree Enumeration Outside of Forests at the Fifty Centimeter Scale” at SC20—the International Conference for High Performance Computing, Networking, Storage and Analysis. The work is also featured in NASA’s SC20 research exhibit.

Sean Keefe and Jarrett Cohen, NASA Goddard Space Flight Center