// NCCS Storage
All NCCS compute environments provide access to /home directories for scripts and code, /nobackup directories for code and data, and project /nobackup areas for collaborative work. For more information on best practices or monitoring usage, see the Discover and/or Explore/ADAPT documentation.
The NCCS is continually evolving its storage systems to find efficient, cost-effective solutions for the data-intensive needs of its users. A Centralized Storage System (CSS) was developed to share curated datasets, reduce duplication, and free storage for new data products. CSS provides storage services for both traditional high-performance computing and on-premises cloud, and it allows users to move final data products to online storage to support emerging data analytics. This filesystem is available to all NCCS compute environments (Discover, Explore/ADAPT, and DataPortal) as read-only by default. The NCCS encourages users to share their data by requesting write access and moving their final data products to CSS. Data for CSS must fall under an active Data Management Plan (DMP); contact the NCCS User Services Group for help setting up a plan. You may also view the current list of available CSS datasets.
Historically, the NCCS has provided longer-term storage of data products on tape in the Mass Storage System (MSS). However, with ever-increasing data output and growing requirements for resource efficiency, the NCCS is moving from MSS to a combination of services, including CSS, increased local storage for working datasets, and mechanisms to archive final data products at off-site facilities. The NCCS encourages its users to redirect writes away from /archive to prepare for upcoming changes. See the MSS pages for documentation.
While the NCCS provides, maintains, and monitors storage, users are expected to:
- Provide Data Management Plans (DMPs) when required and apply data lifecycle management best practices to any storage allocated.
- Inform their PIs and the NCCS when they leave. PIs/Sponsors must designate another person or project to own the data, or else indicate that it should be removed. If data remains unclaimed and/or the NCCS receives no direction from the PI, it will be made inaccessible and will not be retained for longer than 2 years. If storage resources become constrained, data may be deleted sooner.
Storage Locations
Discover and Explore/ADAPT
Location | Intended Usage | Default Size (Discover) | Default inodes (Discover) | Default Size (Explore/ADAPT) | Default inodes (Explore/ADAPT) | Backup Status |
---|---|---|---|---|---|---|
/home | Personal files, including small data sets and software in development. | 1 GB | N/A | 10 GB | N/A | Yes |
Individual /nobackup | Personal data or software too large for /home directories. | 5 TB | 300,000 | 10 TB | N/A | No |
Project /nobackup | Data and software shared with project members. | Custom | Custom | Custom | Custom | No |
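As a rough illustration of how the default limits in the table above might be checked, the following Python sketch walks a directory tree and compares its approximate size and entry count with the Discover defaults for an individual /nobackup area (5 TB, 300,000 inodes). It is only an approximation under stated assumptions: the function names, the 90% warning threshold, and the decimal interpretation of "TB" are illustrative and not part of NCCS tooling, and the filesystem's own quota accounting remains authoritative.

```python
import os
import sys
from dataclasses import dataclass

# Assumption: "TB" is read as 10**12 bytes; the table does not say
# whether decimal or binary units are meant.
TB = 10**12


@dataclass
class Usage:
    bytes_used: int
    inodes_used: int


def measure_usage(root: str) -> Usage:
    """Approximate usage under root by walking the tree.

    Counts one entry per file and directory (including root itself).
    Hard links are counted once per name, so this can overestimate.
    """
    total_bytes = 0
    total_inodes = 1  # the root directory itself
    for dirpath, dirnames, filenames in os.walk(root):
        total_inodes += len(dirnames) + len(filenames)
        for name in filenames:
            try:
                total_bytes += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # File vanished or is unreadable; skip it.
                pass
    return Usage(total_bytes, total_inodes)


def check_limits(usage: Usage,
                 size_limit_bytes: int = 5 * TB,   # Discover individual /nobackup default
                 inode_limit: int = 300_000,       # Discover individual /nobackup default
                 warn_at: float = 0.9) -> list:    # hypothetical warning threshold
    """Return a warning string for each limit at or above warn_at."""
    warnings = []
    if usage.bytes_used / size_limit_bytes >= warn_at:
        warnings.append(
            f"size: {usage.bytes_used} of {size_limit_bytes} bytes used")
    if usage.inodes_used / inode_limit >= warn_at:
        warnings.append(
            f"inodes: {usage.inodes_used} of {inode_limit} used")
    return warnings


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the directory to inspect as the first argument.
    for line in check_limits(measure_usage(sys.argv[1])):
        print("WARNING", line)
```

Walking a large tree this way is slow and adds metadata load to a shared filesystem, so a sketch like this is better suited to occasional spot checks of a subdirectory than to routine monitoring.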
Other Storage Locations
System | Intended Usage | Default Size | Default # of inodes | Backup Status |
---|---|---|---|---|
CSS (/css) | Curated data products that are archived at an external location. | Custom | Custom | No |
MSS (/archive) | Offline tape storage used to free up online storage. Note: MSS will be transitioning to a read-only state once all major dataflows have been stopped/redirected. Users are no longer receiving allocations there. | None | None | No |
NASA Ames (HECC) or AWS Deep Archive | Offline storage for final data products that aren't archived at an official NASA archive. | Custom | N/A | Two copies at HECC |
GitLab | Software development, CI/CD. | 1 GB | N/A | Yes |
Data Classification
The NCCS categorizes user data as one of the following:
- Input – Data that comes from another project and is used to generate new data sets.
- Intermediate – Data generated during software runs that may need post analysis and/or quality checks.
  - Not permanent.
  - Not to be shared publicly.
  - Could be restart files, research results, or temporary files.
- Final – Data used for publications or shared with the science community and/or collaborators.
  - Permanent.
  - May be shared publicly.
  - Could be in multiple data formats.
  - Could be input to other science programs/projects.
- Software – Data needed to run user programs or applications.
// Account Information
See information about account eligibility, setup, and maintenance, or reset your password.