NASA Global Weather Forecasting Jumps Forward

How flexible access to supercomputing resources on Discover enabled doubling the forecast model’s spatial resolution and other advances to yield more accurate forecasts

Project Goal:
Every 6 hours, day in, day out, NASA’s Global Modeling and Assimilation Office (GMAO) runs global weather forecasts at the NASA Center for Climate Simulation (NCCS). These computer model forecasts, which range from 2 to 10 days, support NASA satellite instrument teams, field campaigns, and weather and climate research.

The GMAO regularly upgrades its data assimilation and forecasting system to leverage advances in state-of-the-art modeling and assimilation. The transition from a three-dimensional to a four-dimensional data assimilation technique, along with an increase in the spatial resolution of the system, allowed GMAO researchers to maximize the information from observations and increase the accuracy of the analyses and forecasts.

The prior version of the GMAO’s Goddard Earth Observing System Forward-Processing (GEOS FP) forecasting system required 1,372 processor cores to run on the NCCS Discover supercomputer. It used a three-dimensional ensemble assimilation technique, in which the central ensemble member used 25-kilometer (km) grid boxes (resolution) around the modeled globe and the other ensemble members used 50-km resolution. The upgraded GEOS FP doubles the spatial resolution of both the central member and the ensemble, with the central model using 12.5-km resolution. All model members continue to use 72 vertical layers. The new version also employs four-dimensional data assimilation to make dynamic, time-dependent corrections and must ingest 4 million observations every 6 hours. Even with efficient coding to optimize performance, these changes would demand more than six times as much processing power as before.
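To give a rough sense of why halving the grid spacing is so demanding, the back-of-the-envelope sketch below estimates how many grid columns and cells each resolution implies. It assumes uniform square grid boxes tiling Earth's surface (about 510 million square kilometers), which is a simplification and not the actual GEOS FP grid layout; the function names are illustrative only.

```python
# Rough grid-size estimate under simplifying assumptions (not the real GEOS FP grid).
EARTH_SURFACE_KM2 = 510.0e6  # approximate surface area of Earth (assumption)
VERTICAL_LAYERS = 72         # from the article: all members use 72 layers


def grid_columns(spacing_km: float) -> float:
    """Approximate number of horizontal grid boxes for square boxes of this size."""
    return EARTH_SURFACE_KM2 / (spacing_km ** 2)


def grid_cells(spacing_km: float, layers: int = VERTICAL_LAYERS) -> float:
    """Approximate number of 3-D cells: horizontal boxes times vertical layers."""
    return grid_columns(spacing_km) * layers


def horizontal_ratio(old_km: float, new_km: float) -> float:
    """How many times more grid columns the new spacing needs than the old one."""
    return grid_columns(new_km) / grid_columns(old_km)


if __name__ == "__main__":
    for label, old_km, new_km in [("central", 25.0, 12.5), ("ensemble", 50.0, 25.0)]:
        print(
            f"{label}: {old_km} km -> {new_km} km, "
            f"{grid_cells(old_km):,.0f} -> {grid_cells(new_km):,.0f} cells, "
            f"x{horizontal_ratio(old_km, new_km):.0f} columns"
        )
```

Under these assumptions, halving the grid spacing quadruples the number of columns at every level. Higher-resolution models generally also need shorter time steps, so total cost tends to grow faster than the cell count alone.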

The Goddard Earth Observing System Forward-Processing (GEOS FP) system produced this global aerosols forecast for September 8, 2017. Hurricanes Katia, Irma, and Jose are visible in the Atlantic Ocean as large circulations of sea salt particles (blue) caught up in swirling winds. GEOS FP runs every 6 hours at the NASA Center for Climate Simulation (NCCS).

The NCCS highly values its users and tailors its computing systems and services to enable the best science output, creating an ongoing partnership in the process. “Familiarity with the GMAO’s workloads and models allows the NCCS to more effectively evaluate new technologies and make configuration decisions when purchasing new hardware that would best support their codes,” said Bruce Pfaff, NCCS Lead High-Performance Computing System Engineer.

When the NCCS tripled the performance of its Discover supercomputer back in 2015, staff anticipated what the GMAO and other Earth modeling users would need in coming years. Beginning that year with Scalable Compute Unit 10 (SCU10), every new unit of Discover has more than 4 gigabytes of memory per core. The newest SCUs also have internal network speeds of at least 56 gigabits per second. These characteristics are well suited to the vast data-handling and inter-node communications demands of higher-resolution weather and climate models.

With 28 Intel Xeon “Haswell” cores per node for a total of 30,240 cores inside one networking fabric, SCU10 proved to be the perfect environment for running the upgraded GEOS FP. The central atmospheric model and data assimilation system runs on 5,600 cores, while the ensemble uses up to 2,800 additional cores for a total of 8,400 cores.
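The core counts quoted in this article can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the text (28 cores per node, 30,240 cores in SCU10, 5,600 and 2,800 cores for the upgraded system, 1,372 cores for the prior version). The helper names are hypothetical, and the node counts assume that jobs occupy whole nodes.

```python
# Cross-check of the core counts quoted in the article.
CORES_PER_NODE = 28        # Haswell cores per SCU10 node
SCU10_TOTAL_CORES = 30_240 # cores inside one SCU10 networking fabric
CENTRAL_CORES = 5_600      # central model and data assimilation system
ENSEMBLE_CORES = 2_800     # upper bound for the ensemble
PRIOR_CORES = 1_372        # prior GEOS FP version


def nodes_for(cores: int, cores_per_node: int = CORES_PER_NODE) -> int:
    """Whole nodes needed to host the given number of cores (ceiling division)."""
    return -(-cores // cores_per_node)


def summarize() -> dict:
    """Derive node counts and ratios from the quoted core counts."""
    total = CENTRAL_CORES + ENSEMBLE_CORES
    return {
        "scu10_nodes": nodes_for(SCU10_TOTAL_CORES),
        "central_nodes": nodes_for(CENTRAL_CORES),
        "ensemble_nodes": nodes_for(ENSEMBLE_CORES),
        "total_cores": total,
        "total_nodes": nodes_for(total),
        "share_of_scu10": total / SCU10_TOTAL_CORES,
        "growth_vs_prior": total / PRIOR_CORES,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value:.3f}" if isinstance(value, float) else f"{key}: {value}")
```

The 8,400-core total corresponds to 300 of SCU10's 1,080 nodes, a bit more than a quarter of the unit, and to roughly 6.1 times the 1,372 cores of the prior version, consistent with the "more than six times" figure.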

The NCCS also increased Discover’s online disk storage to 65 petabytes, yielding enough capacity to keep all the GEOS FP data on the supercomputer for validation, analysis, visualization, and generating data products. A 10-gigabit-per-second network connects Discover to the DataPortal for sharing GEOS FP data products with GMAO researchers and external customers.

Computing Systems:
Discover is the primary NCCS computing platform and will soon comprise nearly 108,000 cores with a combined peak performance of just over 5 petaflops, or more than 5,000,000,000,000,000 floating-point calculations per second. Discover is particularly appropriate for large, complex, communications-intensive problems that benefit from its ecosystem of system software and tools.

The NCCS DataPortal provides an access mechanism to distribute products controlled by the NCCS and its user community, including the GMAO GEOS FP, the Goddard Institute for Space Studies (GISS) ModelE, and the Land Information System (LIS). With its recent integration into the Advanced Data Analytics Platform (ADAPT), the DataPortal is expandable to meet new data service requirements as they arise.

Value Added:
The GMAO has upgraded GEOS FP twice since the January 2017 major upgrade, with one update to assimilate new observational data types. “Having access to the computational resources necessary to utilize this sophisticated system for both research and mission support is key to extracting the maximum amount of information from NASA’s diverse suite of Earth observations,” said Robert Lucchesi, GMAO scientific programmer. “GMAO field campaign support in particular requires immediately available computing resources as well as special consideration and attentiveness by our NCCS partners for scheduling maintenance, upgrades, and other activities that might affect the timeliness of the forecasts.”

Beyond access to a substantial number of processors and disk on Discover, Pfaff’s system administration team has implemented a priority-based preemption scheme within the GMAO workload that allows forecast jobs to enter immediate execution when necessary and preempt lower-priority GMAO work. “By augmenting the capabilities of the resource manager with some GMAO-targeted enhancements, we can rapidly evolve as requirements and priorities change,” Pfaff said.
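The general idea of priority-based preemption can be illustrated with a toy model. The sketch below is not the NCCS resource manager or its configuration; the `Job` and `Pool` classes, the priority convention, and the requeue behavior are all assumptions made for illustration. A new job starts immediately if enough cores are free; otherwise the pool stops strictly lower-priority jobs, lowest priority first, until the job fits.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    cores: int
    priority: int  # larger number = more urgent (convention chosen for this sketch)


@dataclass
class Pool:
    total_cores: int
    running: list = field(default_factory=list)
    requeued: list = field(default_factory=list)  # preempted jobs waiting to rerun

    @property
    def free_cores(self) -> int:
        return self.total_cores - sum(job.cores for job in self.running)

    def submit(self, job: Job) -> bool:
        """Start the job, preempting lower-priority work if needed.

        Returns True if the job started. If it cannot fit even after
        preempting every eligible job, nothing is preempted.
        """
        # Only strictly lower-priority jobs may be preempted, lowest first.
        victims = sorted(
            (j for j in self.running if j.priority < job.priority),
            key=lambda j: j.priority,
        )
        available = self.free_cores
        chosen = []
        for victim in victims:
            if available >= job.cores:
                break
            chosen.append(victim)
            available += victim.cores
        if available < job.cores:
            return False
        for victim in chosen:
            self.running.remove(victim)
            self.requeued.append(victim)
        self.running.append(job)
        return True


if __name__ == "__main__":
    pool = Pool(total_cores=100)
    pool.submit(Job("research-a", cores=60, priority=1))
    pool.submit(Job("research-b", cores=30, priority=2))
    started = pool.submit(Job("forecast", cores=50, priority=10))
    print(started, [j.name for j in pool.running], [j.name for j in pool.requeued])
```

In this toy model the forecast job displaces only as much lower-priority work as it needs, and displaced jobs are kept on a requeue list rather than discarded. A production scheduler would also handle checkpointing, fairness, and restart policy, which are omitted here.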

The flexibility of the DataPortal enables the GMAO to use several different protocols for distributing GEOS FP data products. The GMAO currently uses the HTTPS, FTP, and OPeNDAP protocols, along with the NetCDF data format. Data products can include map visualizations customized for each satellite mission or field campaign.

For more information about GEOS FP, visit:
GMAO Hybrid Ensemble-Variational Atmospheric Data Assimilation System: Version 2.0