// Discover Scalable Unit 17 (SCU17) Available for NCCS Users

The NCCS is pleased to announce the availability of the Discover Scalable Unit 17 (SCU17) for general use. Please note the following:

Hardware

704 nodes, each with 128 cores (126 maximum user cores; see below) and 512 GB of RAM (~4 GB of RAM per core), for a total of 90,112 cores.

OS and Compilers

SCU17 runs SLES15 instead of SLES12 SP5. Your code may run on SLES15 without being recompiled, but you will likely get better performance if you recompile.

Our compiler recommendation is:

  • Use the Intel 2021.4 compiler and MPI if you want reproducibility
  • Use the Intel 2023 compiler if you want an executable with the latest compiler features

For more information, please see this page.

For more information regarding GEOSgcm, please see this page.

Modules

Many new modules have been created, and we are working on more. If you switch between SLES12 SP5 and SLES15 frequently, you may need to manually clear your Lmod cache to see only the modules available on that OS.

To clear the lmod cache when switching between OSes:

  • rm -rf ~/.lmod.d/.cache

To ignore lmod cache:

  • ml --ignore-cache av

Alternatively, you can add the following to your .bashrc file for an automated solution:

# Detect the OS major version and set LMOD_SYSTEM_NAME accordingly
OS_VERSION=$(grep VERSION_ID /etc/os-release | cut -d= -f2 | cut -d. -f1 | sed 's/"//g')
export LMOD_SYSTEM_NAME=SLES${OS_VERSION}
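For reference, the pipeline above simply reduces the VERSION_ID line of /etc/os-release to the OS major version. A minimal sketch of the same parsing, applied to a sample line (the sample string is an assumption mimicking a typical SLES 15 entry, not output captured from an SCU17 node):

```shell
# Hypothetical /etc/os-release line, as it might appear on a SLES 15 node
line='VERSION_ID="15.4"'

# Same pipeline as the .bashrc snippet: take the value after '=',
# keep only the major version, and strip the surrounding quotes
os_version=$(echo "$line" | cut -d= -f2 | cut -d. -f1 | sed 's/"//g')

echo "SLES${os_version}"   # prints: SLES15
```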

1. Packable and Sharable: We are working on making packable and sharable the defaults, and expect that to be in place within a month. This will prevent jobs with small core requirements from consuming an entire 128-core node. If you run many smaller jobs, please contact us to gain access to the packable partition as soon as possible.

2. Read-only access to CSS: Discover SCU17 compute nodes have 25-Gbps Data Transfer Network connections to enable read-only access to CSS. This capability is available via Slurm directives. Please see Discover CSS Access for details.

How to Access SCU17

Access to SCU17 Milan nodes is available to all NCCS Discover users.

Slurm jobs:

  • SCU17 Milan compute nodes are available via all the existing Slurm partitions.
  • Include “--constraint=mil” in your Slurm directives to request Milan compute node(s). Please see Slurm Best Practices for additional information.
  • By default, user jobs may allocate no more than 126 cores per Milan node. We dedicate two cores per node to system functions that support I/O to our GPFS-based /home and /discover/nobackup filesystems and other OS tasks; this approach provides significant benefits for all currently running user jobs.
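As a sketch, a batch script requesting a Milan node might look like the following. The job name, wall time, and executable below are placeholders for illustration, not site defaults; only the --constraint=mil directive and the 126-core ceiling come from this announcement:

```shell
#!/bin/bash
#SBATCH --job-name=milan-test        # placeholder job name
#SBATCH --constraint=mil             # request Milan (SCU17) compute nodes
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=126        # at most 126 user cores per Milan node
#SBATCH --time=00:10:00              # placeholder wall time

srun ./my_executable                 # placeholder executable
```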

SCU17 Login nodes: for an interactive shell on a Milan login node running SLES 15, either:

  • Connect to login.nccs.nasa.gov. At the “Host:” prompt, specify 'discover-mil', OR
  • Following your initial login to Discover, enter 'ssh discover-mil'.

Please send any questions to the NCCS User Services Group by e-mail at support@nccs.nasa.gov.