High-Performance Supercontainers Show Promise for Earth System Prediction
Using a NASA Center for Climate Simulation (NCCS) supercomputer and other high-performance computing (HPC) platforms, the multi-agency Joint Center for Satellite Data Assimilation (JCSDA) successfully demonstrated software “supercontainers” for its next-generation Joint Effort for Data assimilation Integration (JEDI) system.
Demonstration results were presented at the virtual 101st Annual Meeting of the American Meteorological Society (AMS 101) on January 14, 2021, by Mark Miesch, JEDI core team member and software engineer with the University Corporation for Atmospheric Research.
Software containers are encapsulated user environments that contain everything needed to run an application. They are portable across computing systems ranging from laptops to cloud to HPC; supercontainers can run across multiple HPC nodes. Other container benefits include reproducibility, version control, efficiency, and getting new users up and running quickly.
“Making the JEDI software widely available through the supercontainer to the user and developer community is strategically important,” said Tsengdar Lee, NASA High-End Computing Program Manager and ex-officio JCSDA Management Oversight Board member. “It enables the NASA and NOAA Earth system modeling centers to lower the barrier to entry and greatly democratize the sophisticated satellite data assimilation software for community-based open science projects.”
For their supercontainer benchmarking study, JCSDA scientists focused on a JEDI variational data assimilation application (3DVar) based on NOAA’s Finite-Volume Cubed-Sphere Dynamical Core Global Forecast System (FV3-GFS). The application ran at a c192 (50-kilometer) model resolution and assimilated 12 million conventional and satellite observations. All benchmark runs used 864 MPI tasks, focusing on container performance rather than scalability.
The scientists built the FV3-GFS JEDI 3DVar supercontainer using Singularity software. They then ran the supercontainer on three HPC platforms: the NCCS Discover supercomputer, the S4 supercomputer at the University of Wisconsin’s Space Science and Engineering Center (SSEC), and Amazon Web Services (AWS).
Specific HPC platform configurations were as follows:
| Compute nodes | Processor, cores, and memory per node | Interconnect |
|---|---|---|
| 24 c5n.18xlarge compute nodes | 3.6 GHz Intel Xeon Skylake-SP or Cascade Lake: 36 cores, 192 gigabytes (GB) | Elastic Fabric Adapter (EFA) |
| 27 compute nodes | 2.1 GHz Intel scalable Xeon Gold 6130 (Skylake): 32 cores, 192 GB | EDR/FDR InfiniBand network (100/56 gigabits per second, Gbps) |
| 44 compute nodes | 2.8 GHz Intel Xeon E5-2680 v2 (Ivy Bridge): 20 cores, 128 GB | FDR-10 InfiniBand network |
| 31 SGI C2112 compute nodes | 2.6 GHz Intel Xeon Haswell: 28 cores, 128 GB | FDR InfiniBand network |
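The node counts listed above are consistent with the fixed job size of 864 MPI tasks: each appears to be the smallest whole number of nodes whose cores can hold all 864 tasks. The following is a minimal Python sketch of that arithmetic, assuming one MPI task per physical core (the article does not state the actual task placement). The dictionary keys are illustrative labels for the four configurations, not official platform names.

```python
import math

MPI_TASKS = 864  # task count used for all benchmark runs

# Cores per node, taken from the configuration list.
# The keys are illustrative labels only (an assumption for this sketch).
CORES_PER_NODE = {
    "c5n.18xlarge": 36,
    "Xeon Gold 6130": 32,
    "Xeon E5-2680 v2": 20,
    "Xeon Haswell": 28,
}


def nodes_needed(tasks: int, cores_per_node: int) -> int:
    """Smallest node count that fits `tasks` at one task per core (assumed placement)."""
    return math.ceil(tasks / cores_per_node)


node_counts = {
    name: nodes_needed(MPI_TASKS, cores) for name, cores in CORES_PER_NODE.items()
}

for name, count in node_counts.items():
    # Cores left without a task on the allocated nodes.
    idle = count * CORES_PER_NODE[name] - MPI_TASKS
    print(f"{name}: {count} nodes, {idle} idle cores")
```

Under that assumption, the sketch reproduces the listed counts of 24, 27, 44, and 31 nodes. Only the 36-core and 32-core configurations fill their nodes exactly; the other two leave a few cores idle.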
The JEDI supercontainer benchmark performance (see graph below) ranged from 8.8 minutes on Discover to 14.59 minutes on AWS, with effectively no overhead for running in the supercontainer. The supercontainer running on Discover was optimized by using Mellanox InfiniBand network drivers. Additional tuning on AWS can speed completion time to 10.6 minutes.
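To put the reported completion times in relative terms, here is a short Python sketch that uses only the three figures quoted above. The helper names are hypothetical, and the ratios compare different hardware, so they describe platform differences rather than container overhead.

```python
# Completion times in minutes, as reported in the article.
DISCOVER_MIN = 8.8
AWS_DEFAULT_MIN = 14.59
AWS_TUNED_MIN = 10.6


def percent_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is shorter than `before`."""
    return 100.0 * (before - after) / before


def slowdown(slower: float, faster: float) -> float:
    """How many times longer the slower run takes than the faster one."""
    return slower / faster


aws_tuning_gain = percent_reduction(AWS_DEFAULT_MIN, AWS_TUNED_MIN)
default_ratio = slowdown(AWS_DEFAULT_MIN, DISCOVER_MIN)
tuned_ratio = slowdown(AWS_TUNED_MIN, DISCOVER_MIN)

print(f"AWS tuning shortens the run by about {aws_tuning_gain:.1f}%")
print(f"AWS untuned vs. Discover: {default_ratio:.2f}x")
print(f"AWS tuned vs. Discover: {tuned_ratio:.2f}x")
```

By this arithmetic, tuning shortens the AWS run by roughly 27 percent, which narrows the gap with Discover from about 1.66 times to about 1.20 times the Discover completion time.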
“By providing the capability to deploy Singularity containers on a leading-edge HPC system, NCCS has helped to pave the way for Earth science applications to exploit the portability and reproducibility benefits that containers have to offer,” Miesch said. “Nick Acks and Kenny Peck of the NCCS technical staff were such a huge help in getting the container to work efficiently on Discover.”
A variety of development and application containers and instructions on how to run them are available through JCSDA’s JEDI-FV3 Release page. For future releases, JEDI will leverage Singularity features such as native GPU support “to extend the containerized applications to more heterogeneous HPC configurations,” Miesch noted at AMS.
- Miesch, M., N. Acks, T. Auligne, D. Hahn, S. Herbener, D. Holdaway, S. Nolin, K. Peck, J. Stroik, and Y. Tremolet, 10.2 - High-Performance "Supercontainers" for Earth System Prediction, Virtual Presentation, 101st Annual Meeting of the American Meteorological Society, 1/14/21.
- “JEDI to Go: High-Performance Containers for Earth System Prediction,” JCSDA News Blog, 7/1/20.
Jarrett Cohen, NASA Goddard Space Flight Center