NCCS Expanding Discover Supercomputer to 5 Petaflops

The NASA Center for Climate Simulation (NCCS) is integrating the most powerful unit of its Discover supercomputer to date. 

Scalable Compute Unit 14 (SCU14) is a 20,800-core system capable of 1.6 petaflops, or 1,600,000,000,000,000 floating-point calculations per second. After SCU14’s integration, Discover will offer users a total of nearly 108,000 cores with a combined peak performance of just over 5 petaflops.
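The 1.6-petaflop figure can be roughly cross-checked from the core count and the 2.4 GHz clock quoted for the processors later in this article. The sketch below is a back-of-the-envelope estimate, not NCCS's own calculation: the value of 32 double-precision floating-point operations per core per cycle is an assumption (eight 64-bit elements per AVX-512 register, times two for fused multiply-add, times two vector units), and sustained clock speeds under heavy vector load may be lower than the nominal clock.

```python
# Back-of-the-envelope peak performance estimate for SCU14.
CORES = 20_800          # from the article
CLOCK_HZ = 2.4e9        # nominal Xeon Gold 6148 clock, from the article
FLOPS_PER_CYCLE = 32    # ASSUMPTION: 8 doubles x 2 (fused multiply-add) x 2 vector units


def peak_flops(cores, clock_hz, flops_per_cycle):
    """Theoretical peak: every core retires the maximum FLOPs on every cycle."""
    return cores * clock_hz * flops_per_cycle


peak = peak_flops(CORES, CLOCK_HZ, FLOPS_PER_CYCLE)
print(f"Estimated peak: {peak / 1e15:.2f} petaflops")
```

Under these assumptions the estimate lands at about 1.6 petaflops, consistent with the figure quoted above.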


Scalable Compute Unit 14 (SCU14)—the newest and most powerful unit of the Discover supercomputer—contains 20,800 cores capable of 1.6 petaflops.


NCCS prime contractor CSRA Inc. coordinated the rigorous selection process, which was based on a competitive request for proposals (RFP) to procure and deliver SCUs over the next two years; SCU14 is the first of these deployments. The RFP required bidders to predict performance on a set of six benchmark codes: the standard Linpack and Stream benchmarks, a custom Message Passing Interface (MPI) performance benchmark, and three simulation codes regularly employed by NCCS users. The simulation codes are the Goddard Earth Observing System (GEOS) atmospheric model, the GEOS cube-sphere finite-volume dynamical core, and the NASA-Unified Weather Research and Forecasting (NU-WRF) model.
 
“Everybody had to run the codes and then propose a level of performance for their system,” said Bruce Pfaff, Discover’s lead system administrator.
 
The winning proposal came from Edge Solutions & Consulting, Inc., which served as the prime and proposed an SCU14 based on Super Micro Computer, Inc. computing hardware; Intel Corp. processors, interconnects, and solid-state drives; and Motivair Corp. chiller doors for system cooling.
 
Besides being the top-performing unit of Discover, SCU14 represents firsts for the NCCS supercomputer in terms of processors, instruction set, and interconnect.
 
Processors: The Supermicro FatTwin server nodes each have dual 20-core Intel Xeon Gold 6148 “Skylake” processors (2.4 GHz). Being two generations newer than the “Haswell” processors in Discover’s SCU10–13, Skylake offers performance increases right out of the gate. “Users might need to change the layout of their application based on processors per node, but they will not have to rewrite their code to take advantage of the Skylakes,” Pfaff said.
 
Instruction Set: Skylake adds a new instruction set extension, Advanced Vector Extensions 512 (AVX-512). Pfaff said that if user codes can leverage it, AVX-512 offers the potential for additional performance gains, since each instruction can process twice the number of data elements compared to the AVX and AVX2 instruction sets on Discover’s Haswell and older “Sandy Bridge” cores (SCU9).

Interconnect: Connecting the Skylake nodes at 100 gigabits per second is Intel’s Omni-Path Architecture. Omni-Path has nearly twice the speed of the InfiniBand interconnect deployed in the rest of Discover at a significantly lower cost, “which allows us to purchase more cores,” Pfaff said. “One of the other advantages is that future generations of Omni-Path will be available directly on the Intel chips, which means you can eliminate an external card.”
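The three hardware notes above reduce to a few lines of arithmetic. The sketch below is illustrative only. The node count is derived on the assumption that every SCU14 node has the same dual 20-core layout, and the register widths (256 bits for AVX and AVX2, 512 bits for AVX-512) are general properties of those instruction sets rather than figures from this article.

```python
# Rough numbers behind the processor, instruction-set, and interconnect notes.
CORES_PER_SOCKET = 20     # from the article
SOCKETS_PER_NODE = 2      # "dual" processors per node, from the article
TOTAL_CORES = 20_800      # from the article
LINK_GBITS_PER_S = 100    # Omni-Path link speed, from the article


def lanes(register_bits, element_bits=64):
    """Number of elements of the given size that fit in one vector register."""
    return register_bits // element_bits


cores_per_node = CORES_PER_SOCKET * SOCKETS_PER_NODE
nodes = TOTAL_CORES // cores_per_node   # ASSUMPTION: all nodes are identical

avx2_lanes = lanes(256)     # AVX and AVX2 use 256-bit registers
avx512_lanes = lanes(512)   # AVX-512 uses 512-bit registers

link_gbytes_per_s = LINK_GBITS_PER_S / 8   # 8 bits per byte, ignoring protocol overhead

print(f"{cores_per_node} cores per node, {nodes} nodes")
print(f"64-bit elements per register: AVX2 {avx2_lanes}, AVX-512 {avx512_lanes}")
print(f"Link speed: {link_gbytes_per_s} gigabytes per second")
```

The doubling from four to eight 64-bit elements per register is what lies behind the "twice the number of data elements" comparison. Codes see the benefit only where the compiler or programmer can vectorize the inner loops.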
 
Although available for purchase only since 2016, Omni-Path is used in 38 supercomputers on the June 2017 TOP500 List, including one system in the top 10 and four in the top 20. Before accepting the SCU14 proposal, NCCS staff spoke with representatives from Department of Energy laboratory computing facilities about their experiences with Omni-Path.

Limited applications testing using the small SCU14 Test and Development System showed significant price-performance gains using Omni-Path. After SCU14 comes online, a major focus of the initial pilot user period will be running large-scale NASA modeling applications to learn the differences between Omni-Path and InfiniBand. Test results will aid NCCS staff in developing documentation for the general user community before opening SCU14 for computing time allocations.
 
Having the fastest processors and interconnect plus the most memory per core (4.8 gigabytes) of any Discover SCU makes SCU14 particularly well-suited for weather and climate simulations pushing the boundaries of resolution and complexity that are possible on current-day supercomputers. Discover previously enabled global simulations with resolutions as fine as 1.5 kilometers (km) with GEOS, the flagship model of NASA’s Global Modeling and Assimilation Office (GMAO). SCU14 offers the tantalizing prospect of running GEOS at 1 km or finer resolution, providing a glimpse at future weather prediction and analysis capabilities.
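To give a sense of scale for the step from 1.5 km to 1 km, the sketch below estimates memory per node and the growth in horizontal grid size. It is a rough illustration, not a description of how GEOS is configured. The Earth surface area of about 510 million square kilometers is an approximation, the 40 cores per node follows from the dual 20-core processors described earlier, and the cubic cost scaling assumes the time step shrinks in proportion to the grid spacing while the number of vertical levels stays fixed.

```python
# Rough scaling estimate for finer global resolution on SCU14.
CORES = 20_800              # from the article
CORES_PER_NODE = 40         # dual 20-core processors, from the article
MEM_PER_CORE_GB = 4.8       # from the article
EARTH_SURFACE_KM2 = 5.1e8   # ASSUMPTION: approximate surface area of Earth


def columns(dx_km):
    """Approximate number of horizontal grid columns at spacing dx_km."""
    return EARTH_SURFACE_KM2 / dx_km**2


def relative_cost(dx_old_km, dx_new_km):
    """ASSUMPTION: cost scales with columns (squared) times time steps (linear)."""
    return (dx_old_km / dx_new_km) ** 3


mem_per_node_gb = MEM_PER_CORE_GB * CORES_PER_NODE
total_mem_tb = MEM_PER_CORE_GB * CORES / 1000

print(f"Memory per node: {mem_per_node_gb:.0f} GB; total: {total_mem_tb:.1f} TB")
print(f"Columns at 1.5 km: {columns(1.5):.2e}; at 1 km: {columns(1.0):.2e}")
print(f"Relative cost of 1 km vs. 1.5 km: {relative_cost(1.5, 1.0):.2f}x")
```

Under these assumptions, a 1 km grid has 2.25 times as many columns as a 1.5 km grid and costs roughly 3.4 times as much to advance over the same simulated period.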

Jarrett Cohen, NASA Goddard Space Flight Center

More Information
Discover Supercomputer
GEOS Systems
NASA-Unified Weather Research and Forecasting (NU-WRF)
TOP500 List–June 2017