// NCCS Storage

All NCCS compute environments provide access to /home directories for scripts and code, /nobackup directories for code and data, and project /nobackup areas for collaborative work. For more information on best practices or monitoring usage, see the Discover and/or Explore/ADAPT documentation.

The NCCS is continually evolving its storage systems to find efficient, cost-effective solutions for the data-intensive needs of its users. A Centralized Storage System (CSS) was developed to share curated datasets, reduce duplication, and free storage for new data products. CSS provides storage services for both traditional high-performance computing and on-premises cloud, and allows users to move final data products to online storage to support emerging data analytics. This filesystem is available to all NCCS compute environments (Discover, ADAPT/Explore, and DataPortal) and is read-only by default. The NCCS encourages users to share their data by requesting write access and moving their final data products to CSS. Data for CSS must fall under an active Data Management Plan (DMP); contact the NCCS User Services Group for help setting up a plan. You may also view the current list of available CSS datasets.

Historically, the NCCS has provided longer-term storage of data products on tape in the Mass Storage System (MSS). However, with ever-increasing data output and requirements for increased resource efficiency, the NCCS is moving from MSS to a combination of services, including CSS, increased local storage for working datasets, and mechanisms to archive final data products at off-site facilities. The NCCS encourages its users to redirect writes away from /archive to prepare for upcoming changes. See the MSS pages for documentation.

While the NCCS provides, maintains, and monitors storage, users are expected to:

  • Provide Data Management Plans (DMPs) when required and apply data lifecycle management best practices to any storage allocated.
  • Inform their Principal Investigators (PIs) and the NCCS when they leave. PIs/Sponsors must designate another person or project to own the data, or else indicate that it should be removed. If data remains unclaimed or the NCCS receives no direction from the PI, the data will be made inaccessible and will be retained for no longer than 2 years. If storage resources become constrained, data may be deleted sooner.

Storage Locations

Discover and Explore/ADAPT

| Location | Intended Usage | Default Size (Discover) | Default inodes (Discover) | Default Size (Explore/ADAPT) | Default inodes (Explore/ADAPT) | Backup Status |
|---|---|---|---|---|---|---|
| /home | Personal files, including small data sets and software in development. | 1 GB | N/A | 10 GB | N/A | Yes |
| Individual /nobackup | Personal data or software too large for /home directories. | 5 TB | 300,000 | 10 TB | N/A | No |
| Project /nobackup | Data and software shared with project members. | Custom | Custom | Custom | Custom | No |
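
As a rough illustration of how the size and inode defaults interact, the sketch below walks a directory tree and compares what it finds with the Discover defaults for an individual /nobackup area (5 TB, 300,000 inodes). It is not an NCCS tool: the function names are hypothetical, 1 TB is assumed to mean 10**12 bytes, and the totals are approximations (apparent file sizes, hard links counted more than once). The numbers reported by the site's own quota tooling are the authoritative ones.

```python
import os
import sys
from dataclasses import dataclass

# Defaults for an individual /nobackup area on Discover, taken from the table.
# Assumption: 1 TB = 10**12 bytes; the real quota system may count differently.
DEFAULT_SIZE_BYTES = 5 * 10**12
DEFAULT_INODES = 300_000


@dataclass
class Usage:
    size_bytes: int
    inodes: int


def measure_usage(root: str) -> Usage:
    """Approximate the bytes and inodes used under `root`."""
    size_bytes = 0
    inodes = 1  # the root directory itself
    # os.walk does not follow symbolic links to directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        inodes += len(dirnames) + len(filenames)
        for name in filenames:
            try:
                # lstat measures a symbolic link itself, not its target.
                size_bytes += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return Usage(size_bytes, inodes)


def summarize(usage: Usage,
              size_limit: int = DEFAULT_SIZE_BYTES,
              inode_limit: int = DEFAULT_INODES) -> str:
    """Express usage as a percentage of each limit."""
    size_pct = 100 * usage.size_bytes / size_limit
    inode_pct = 100 * usage.inodes / inode_limit
    return f"size {size_pct:.1f}% of limit, inodes {inode_pct:.1f}% of limit"


if __name__ == "__main__":
    # Pass the directory to inspect as the first argument; defaults to ".".
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print(summarize(measure_usage(target)))
```

A directory holding many small files can reach the inode default long before the size default, which is why both figures are worth tracking.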

Other Storage Locations

| System | Intended Usage | Default Size | Default # of inodes | Backup Status |
|---|---|---|---|---|
| CSS (/css) | Curated data products that are archived at an external location. | Custom | Custom | No |
| MSS (/archive) | Offline tape storage used to free up online storage. Note: MSS will be transitioning to a read-only state once all major dataflows have been stopped/redirected. Users are no longer receiving allocations there. | None | None | No |
| NASA Ames (HECC) or AWS Deep Archive | Offline storage for final data products that aren't archived at an official NASA archive. | Custom | N/A | Two copies at HECC |
| GitLab | Software development, CI/CD. | 1 GB | N/A | Yes |

Data Classification

The NCCS categorizes user data as one of the following:

  1. Input – Data that comes from another project and is used to generate new data sets.
  2. Intermediate – Data generated during software runs that may need post analysis and/or quality checks.
    • Not permanent.
    • Not to be shared publicly.
    • Could be restart files, research results, or temporary files.
  3. Final – Data used for publications or shared with the science community and/or collaborators.
    • Permanent.
    • May be shared publicly.
    • Could be in multiple data formats.
    • Could be input to other science programs/projects.
  4. Software – Data needed to run user programs or applications.
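
The four categories can be captured in a small lookup table, for example when tagging datasets in a project's own inventory. The sketch below is one reading of the list above, not an NCCS schema: the names `DataCategory`, `CategoryTraits`, `TRAITS`, and `may_move_to_css` are hypothetical, and traits the page does not state are recorded as `None` rather than guessed. The CSS check reflects only two conditions mentioned on this page (final data products, active Data Management Plan); write access still has to be requested from the NCCS.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DataCategory(Enum):
    INPUT = "input"
    INTERMEDIATE = "intermediate"
    FINAL = "final"
    SOFTWARE = "software"


@dataclass(frozen=True)
class CategoryTraits:
    # None means the classification list does not say either way.
    permanent: Optional[bool]
    publicly_shareable: Optional[bool]


TRAITS = {
    DataCategory.INPUT: CategoryTraits(permanent=None, publicly_shareable=None),
    DataCategory.INTERMEDIATE: CategoryTraits(permanent=False, publicly_shareable=False),
    DataCategory.FINAL: CategoryTraits(permanent=True, publicly_shareable=True),
    DataCategory.SOFTWARE: CategoryTraits(permanent=None, publicly_shareable=None),
}


def may_move_to_css(category: DataCategory, has_active_dmp: bool) -> bool:
    """Assumed rule: only final data covered by an active DMP is a CSS candidate."""
    return category is DataCategory.FINAL and has_active_dmp


if __name__ == "__main__":
    for category in DataCategory:
        print(category.value, TRAITS[category], may_move_to_css(category, True))
```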

// Data Collections

Available datasets on CSS include atmosphere, ocean, land, and flood data, both current and historical, as well as operational Global Modeling and Assimilation Office (GMAO) weather analysis data and forecasts that are updated four times daily.

// Account Information 


See information about account eligibility, set-up, and maintenance, or reset password.
