High-Performance Supercontainers Show Promise for
Earth System Prediction

Using a NASA Center for Climate Simulation (NCCS) supercomputer and other high-performance computing (HPC) platforms, the multi-agency Joint Center for Satellite Data Assimilation (JCSDA) successfully demonstrated software “supercontainers” for its next-generation Joint Effort for Data assimilation Integration (JEDI) system.

Demonstration results were presented at the virtual 101st Annual Meeting of the American Meteorological Society (AMS 101) on January 14, 2021, by Mark Miesch, a JEDI core team member and software engineer with the University Corporation for Atmospheric Research.

Software containers are encapsulated user environments that bundle everything needed to run an application. They are portable across computing systems ranging from laptops to the cloud to HPC, and supercontainers extend this portability to runs spanning multiple HPC nodes. Containers also offer reproducibility, version control, efficiency, and a fast path to getting new users up and running.
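As a rough illustration of the idea (the image name, registry path, executable, and config file here are hypothetical, not JCSDA's actual artifacts), running a containerized application with Singularity looks essentially the same on a laptop and on an HPC login node:

```shell
# Pull a (hypothetical) prebuilt JEDI image from a registry;
# the resulting .sif file bundles the OS layer, libraries, and application.
singularity pull jedi-app.sif library://example/jedi/jedi-app:latest

# Run the contained application; the host system only needs
# Singularity itself, not the JEDI software stack.
singularity exec jedi-app.sif fv3jedi_var.x config.yaml
```

Because the entire software environment travels inside the image, the same `.sif` file can be copied between the platforms mentioned above without rebuilding.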

“Making the JEDI software widely available through the supercontainer to the user and developer community is strategically important,” said Tsengdar Lee, NASA High-End Computing Program Manager and ex-officio JCSDA Management Oversight Board member. “It enables the NASA and NOAA Earth system modeling centers to lower the barrier to entry and greatly democratize the sophisticated satellite data assimilation software for community-based open science projects.”

This visualization depicts specific humidity at atmosphere Level 54 (in one of six cubed-sphere grid tiles covering the globe) from the Joint Center for Satellite Data Assimilation (JCSDA) Finite-Volume Cubed-Sphere Dynamical Core Global Forecast System (FV3-GFS) Joint Effort for Data assimilation Integration (JEDI) 3DVar application, which was wrapped into a supercontainer for running on high-performance computing (HPC) platforms.

For their supercontainer benchmarking study, JCSDA scientists focused on a JEDI variational data assimilation application (3DVar) based on NOAA’s Finite-Volume Cubed-Sphere Dynamical Core Global Forecast System (FV3-GFS). The application ran at a c192 (50-kilometer) model resolution and assimilated 12 million conventional and satellite observations. All benchmark runs used 864 MPI tasks, focusing on container performance rather than scalability.

The scientists built the FV3-GFS JEDI 3DVar supercontainer using Singularity software. They then ran the supercontainer on three HPC platforms: the NCCS Discover supercomputer, the S4 supercomputer at the University of Wisconsin’s Space Science and Engineering Center (SSEC), and Amazon Web Services (AWS).
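Multi-node MPI launches of Singularity containers typically follow the "hybrid" model, in which the host's MPI launcher starts one container instance per rank while the host MPI library coordinates communication over the interconnect. A minimal sketch, assuming illustrative module, image, executable, and path names (the 864-task count is from the benchmark configuration):

```shell
# Load the host MPI stack (module name is site-specific).
module load mpi/impi

# Launch 864 MPI tasks, each executing inside the supercontainer;
# --bind mounts a host data directory into the container.
mpirun -np 864 singularity exec --bind /scratch jedi-app.sif \
    fv3jedi_var.x 3dvar_c192.yaml
```

In this model, the MPI implementation inside the container must be compatible with the host's MPI, which is one reason site-specific tuning (such as the Mellanox InfiniBand drivers used on Discover) matters for performance.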

Impact: This work demonstrates that software containers provide a viable means to run numerical weather prediction applications efficiently across HPC systems, without the need for JEDI users and developers to maintain complex software stacks.

Specific HPC platform configurations were as follows:

AWS: 24 c5n.18xlarge compute nodes, 864 cores; 3.6 GHz Intel Xeon Skylake-SP or Cascade Lake processors with 36 cores and 192 gigabytes (GB) of memory per node; Elastic Fabric Adapter (EFA) network.

S4: 27 compute nodes, 864 cores; 2.1 GHz Intel Xeon Scalable Gold 6130 (Skylake) processors with 32 cores and 192 GB per node; EDR/FDR InfiniBand network (100/56 gigabits per second, Gbps).

S4-Ivy: 44 compute nodes, 880 cores; 2.8 GHz Intel Xeon E5-2680 v2 (Ivy Bridge) processors with 20 cores and 128 GB per node; FDR-10 InfiniBand network (56 Gbps).

Discover: 31 SGI C2112 compute nodes, 868 cores; 2.6 GHz Intel Xeon Haswell processors with 28 cores and 128 GB per node; FDR InfiniBand network (56 Gbps).

The JEDI supercontainer benchmark completion times (see graph below) ranged from 8.8 minutes on Discover to 14.59 minutes on AWS, with effectively no overhead from running inside the supercontainer. Performance on Discover was optimized by using Mellanox InfiniBand network drivers, and additional tuning on AWS can reduce completion time to 10.6 minutes.

The graph shows native vs. container performance for the FV3-GFS JEDI 3DVar application running on three HPC platforms: Amazon Web Services (AWS), Skylake and Ivy Bridge nodes of the University of Wisconsin Space Science and Engineering Center’s S4 supercomputer, and the NASA Center for Climate Simulation’s Discover supercomputer. Solid lines illustrate the standard deviation of 10 runs on each platform.

“By providing the capability to deploy Singularity containers on a leading-edge HPC system, NCCS has helped to pave the way for Earth science applications to exploit the portability and reproducibility benefits that containers have to offer,” Miesch said. “Nick Acks and Kenny Peck of the NCCS technical staff were such a huge help in getting the container to work efficiently on Discover.”

A variety of development and application containers and instructions on how to run them are available through JCSDA’s JEDI-FV3 Release page. For future releases, JEDI will leverage Singularity features such as native GPU support “to extend the containerized applications to more heterogeneous HPC configurations,” Miesch noted at AMS.


Jarrett Cohen, NASA Goddard Space Flight Center