NCCS Message of the Day


This message is updated throughout the day as needed. To make sure you are seeing the latest version, refresh the page in your web browser.


 
NCCS User Services Group
E-mail: support@nccs.nasa.gov
Phone/Phone-mail: 301-286-9120
Web-site: https://www.nccs.nasa.gov

===== GENERAL INFORMATION (as of 17:05 Tue May 21, 2019)
============================================================================
Disk Issues on the Discover Cluster: The Discover Cluster experienced an
issue with some of its storage LUNs during the morning of Sunday, May 12,
2019.  Symptoms you may have seen include "Stale NFS File Handle" messages
or slower-than-normal I/O operations.  The NCCS Discover System
Administrators recovered from the event and restored service Sunday
afternoon.  Several disks are still rebuilding, however, so some I/O
operations may remain slower than normal until the rebuilds finish.
----------------------------------------------------------------------------
Decommission of SCU9 and the Sandybridge Nodes on Discover: In preparation
for the installation of SCU15, we are decommissioning the remaining SCU9
nodes, which include the last of the Sandybridge nodes on Discover.
Reduction in the number of Sandybridge nodes has already begun, with a
target completion date of Monday, April 29, 2019.  Users explicitly
requesting Sandybridge processors should update their job scripts and
remove the "#SBATCH --constraint=sand" directive.  If you need assistance
updating your jobs to use Haswell processors, contact the NCCS User Services
Group by e-mail at support@nccs.nasa.gov.
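
For users with many job scripts, the directive removal described above can
be sketched in Python.  This is a minimal illustration, not an NCCS tool: it
assumes the directive sits alone on its own line exactly as quoted, and the
function name and sample script contents are hypothetical.

```python
import re

# Matches a line consisting only of "#SBATCH --constraint=sand"
# (assumption: the directive is the only option on that line).
SAND_DIRECTIVE = re.compile(r"^\s*#SBATCH\s+--constraint=sand\s*$")


def strip_sand_constraint(script_text: str) -> str:
    """Return script_text with every '#SBATCH --constraint=sand' line removed."""
    kept = [
        line
        for line in script_text.splitlines(keepends=True)
        if not SAND_DIRECTIVE.match(line)
    ]
    return "".join(kept)


# Hypothetical job script used only to exercise the function.
example = (
    "#!/bin/bash\n"
    "#SBATCH --ntasks=28\n"
    "#SBATCH --constraint=sand\n"
    "srun ./a.out\n"
)
cleaned = strip_sand_constraint(example)
print(cleaned)
```

All other lines are passed through unchanged, so it is worth reviewing the
result before resubmitting a job.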
----------------------------------------------------------------------------
FTP Service Discontinued: To remain in compliance with NASA security
policy, the NCCS has discontinued FTP service.  All data access will be
provided through https://portal.nccs.nasa.gov/datashare/dirac.  Contact the
NCCS User Services Group by e-mail at support@nccs.nasa.gov if you have any
questions or concerns.
----------------------------------------------------------------------------
Discover IP Range: If users have established authorizations with remote data
providers or data consumers that are based on IP addresses, the Discover
interactive and gateway nodes all fall within the IP range from
169.154.145.11 to 169.154.145.70.
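
Remote providers that maintain IP-based allow lists may find it convenient
to express this range as CIDR blocks or to test individual addresses against
it.  The following is a minimal sketch using Python's standard ipaddress
module; the function and variable names are illustrative assumptions, and
only the two endpoint addresses come from this notice.

```python
import ipaddress

# Endpoints of the Discover interactive/gateway range quoted above.
FIRST = ipaddress.IPv4Address("169.154.145.11")
LAST = ipaddress.IPv4Address("169.154.145.70")


def in_discover_range(addr: str) -> bool:
    """Return True if addr lies within the inclusive range FIRST..LAST."""
    return FIRST <= ipaddress.IPv4Address(addr) <= LAST


# Smallest set of CIDR networks that covers exactly FIRST..LAST.
cidr_blocks = list(ipaddress.summarize_address_range(FIRST, LAST))

for block in cidr_blocks:
    print(block)
print(in_discover_range("169.154.145.42"))
```

The range is not aligned to a single subnet boundary, so it summarizes to
several CIDR blocks rather than one.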
----------------------------------------------------------------------------
ADAPT Users with Access to Windows VMs: NASA security policy requires a
scheduled downtime every second Wednesday of the month from 9 to 10 a.m. to
reboot the ADAPT Windows VMs after the implementation of the "Microsoft
Patch Tuesday" updates released every second Tuesday of the month.
----------------------------------------------------------------------------
Discover Users: When creating a ticket to report a crashed Discover job,
please include the job ID if possible.  This will greatly enhance our
ability to diagnose the problem.  Thank you.
----------------------------------------------------------------------------
The MATLAB licenses on Discover are in high demand.  If you are not actively
using MATLAB on the interactive nodes, please exit your session so that
other users may use the local licenses.
============================================================================

============================================================================
===== SCHEDULED SYSTEM UNAVAILABILITY (as of 17:05 Tue May 21, 2019)
============================================================================
Date  Day Duration  System               Reason (Status)
----- --- --------- -------------------- -----------------------------------
05/19 SUN 1600-
05/21 TUE     -1800 Dirac/MSS            Maintenance for SLES12 SP3 Upgrade.
                                         (Completed.)
----- --- --------- -------------------- -----------------------------------
05/21 TUE 1000-1223 ADAPT Science Cloud  Microsoft Security Updates
                                         (Completed: see Details below.)
--------
Details:
--------
The planned work for ADAPT has been completed except for the nodes listed
below.  At this time we do not have an ETA for their restoration.
above102,103,104
beyond108
beyond118
calet101
ecotone03,04,05
himat101
----- --- --------- -------------------- -----------------------------------
EVERY THU 1000-1100 NCCS Bastion Login
                    Services             Maintenance (see Details below).
--------
Details:
--------
Each Thursday, one of the systems supporting the NCCS SSH bastion login
service will be patched and rebooted, potentially disconnecting users.  The
other servers will stay up, allowing affected users to reconnect
immediately.
============================================================================

============================================================================
===== UNSCHEDULED SYSTEM UNAVAILABILITY (as of 17:05 Tue May 21, 2019)
============================================================================
Date  Day Duration  System               Reason (Status)
----- --- --------- -------------------- -----------------------------------
There is currently no unscheduled system unavailability.
============================================================================

============================================================================
===== OTHER OUTAGES (as of 17:05 Tue May 21, 2019)
============================================================================
Date  Day Duration  System               Reason (Status)
----- --- --------- -------------------- -----------------------------------
There are currently no other outages.
============================================================================

 

Curator: Mason Chang
NCCS User Services Group: 301-286-9120
Authorizing NASA Official: Dan Duffy, High-Performance Computing Lead, GSFC Code 606.2