Discover Scalable Unit 16 (SCU16) Available for NCCS Users

The NCCS is pleased to announce the availability of the newest addition to the Discover High Performance Linux Cluster, Scalable Unit 16 (SCU16), which includes two types of compute nodes and a high-speed Data Transfer Network providing read-only access to curated datasets on the Centralized Storage System (CSS).

SCU16 Intel Cascade Lake Refresh nodes

  • 676 nodes (48 cores per node; a maximum of 46 cores and 190 GB of memory per node are available to user jobs)
  • 32,448 total Intel Cascade Lake Refresh processor cores
  • Interconnect: 100 Gb/s HDR100 InfiniBand

SCU16 GPU nodes

  • 12 nodes, each with
    • Four NVIDIA A100 GPUs (6,912 CUDA cores per GPU, 40 GB of memory per GPU, 600 GB/s NVLink GPU-to-GPU connection)
    • AMD EPYC Rome (48 cores/node, 492 GB available memory/node)
  • Interconnect: dual HDR InfiniBand (2 × 200 Gb/s)
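
Once you have a shell on one of these GPU nodes, a quick way to confirm the hardware is the nvidia-smi utility (assuming, as is typical, that the NVIDIA driver tools are on your PATH):

    # List each A100 and its total memory (expect four GPUs, roughly 40 GB each).
    nvidia-smi --query-gpu=index,name,memory.total --format=csv

    # Show per-link NVLink status for the GPU-to-GPU interconnect.
    nvidia-smi nvlink --status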

For more information on Discover SCU16 hardware, please see the Hardware Specifications page.

Discover SCU16 Read-Only Access to CSS

Discover SCU16 compute nodes (both Cascade Lake and Rome/NVIDIA GPU nodes) have 25 Gb/s Data Transfer Network connections that enable read-only access to CSS. This capability is requested via Slurm directives. Please see Discover CSS Access for details.
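
As a rough sketch of what such a job might look like: Slurm exposes site-defined node features through --constraint, so a batch script could request CSS access alongside its other resources. The feature name “cssro” and the mount path /css below are illustrative assumptions only; the Discover CSS Access page documents the actual directives.

    #!/bin/bash
    # Hedged sketch: request read-only CSS access in a batch job.
    # "cssro" is an assumed feature name and /css an assumed mount point;
    # consult the Discover CSS Access page for the real directives.
    #SBATCH --job-name=css-read
    #SBATCH --constraint=cssro
    #SBATCH --ntasks=1
    #SBATCH --time=00:30:00

    # Read (never write) a curated dataset from the CSS mount.
    ls /css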

Discover SCU16: SLES 12 SP5, No MPT

SCU16 is installed with the SLES 12 SP5 Linux operating system. We anticipate that codes running successfully on Discover’s SLES 12 SP3 nodes will experience few issues on SLES 12 SP5. Note that the SGI MPT libraries are not available on the SCU16 Cascade Lake or GPU nodes; they remain available only on the Discover SCU10 and SCU13 Haswell nodes.
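
To verify which environment a given node provides, the following commands are a reasonable check. They rely only on the standard /etc/os-release file and the environment-modules “module avail” search; the “mpt” module name is an assumption based on common SGI MPT packaging.

    # Report the distribution and release; SLES 12 SP5 shows VERSION="12-SP5".
    grep -E '^(NAME|VERSION)=' /etc/os-release

    # Search for MPT modules; on SCU16 nodes this should come back empty.
    module avail mpt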

How to Access Discover SCU16

  • Access to SCU16 Cascade Lake nodes is available to all NCCS Discover users.
    • Slurm jobs:
      • Include “--constraint=cas” in your Slurm directives (a sample batch script appears after this list). Please see Slurm Best Practices for additional information.
      • By default, user jobs may allocate no more than 46 cores per node. We dedicate two cores per node to system functions that support I/O to our GPFS-based /home and /discover/nobackup filesystems; this approach provides significant benefits for all currently running user jobs. (If you are certain that your job generates minimal I/O overhead, please contact support@nccs.nasa.gov, and we can help you explicitly request 47 or 48 cores on a per-job exception basis.)
    • Login nodes: for an interactive shell on a Cascade Lake node running SLES 12 SP5, either:
      • After connecting to login.nccs.nasa.gov, at the “Host:” prompt, specify ‘discover-cas’, OR
      • Following your initial login to Discover, use ‘ssh discover-cas’.
  • SCU16 GPU nodes:
    • Early access to SCU16 GPU nodes is by specific request. Please see Discover GPU for more information.
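
Putting the Cascade Lake pieces together, a minimal batch script might look like the sketch below. The --constraint=cas directive and the 46-core per-node default come straight from the notes above; the job name, node count, walltime, and executable are placeholders to adapt.

    #!/bin/bash
    # Minimal sketch of a batch job targeting SCU16 Cascade Lake nodes.
    # Job name, node count, walltime, and executable are placeholders.
    #SBATCH --job-name=scu16-test
    # Route the job to Cascade Lake nodes:
    #SBATCH --constraint=cas
    #SBATCH --nodes=1
    # 46 tasks per node is the default per-node maximum for user jobs:
    #SBATCH --ntasks-per-node=46
    #SBATCH --time=01:00:00

    srun ./my_application

Submit the script with “sbatch” and confirm placement with “squeue -u $USER”.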

Please send any questions to the NCCS User Services Group by e-mail at support@nccs.nasa.gov.