NCCS Discover User FAQ for the Data Analysis nodes (dali)
Q0: What is a Data Analysis Node (Dali)?
A0. The Dali system is designed for direct-access, interactive, large-scale data analysis. It provides a convenient location from which users can access all the file systems within the NCCS environment, including the Discover GPFS file systems and the Dirac archive. Users can then employ a variety of data analysis tools, including GrADS, IDL, Matlab, Mathematica, Python, and more, to interactively analyze large data sets.
Q1: How do I access a Data Analysis Node directly?
A1. From your workstation or a resource outside the NCCS environment, access the Data Analysis Nodes using ssh. For example:
ssh <USERID>@dali.nccs.nasa.gov
Q2: How do I access a Data Analysis Node from a discover login node or from another discover Data Analysis Node?
A2. From the discover or dali prompt, access a Data Analysis Node using ssh. For example:
ssh <USERID>@dali
or
ssh <USERID>@dali## (dali01, dali02, dali03 or dali04)
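The node naming shown above can be wrapped in a small shell helper. This is only a sketch, not an NCCS-provided tool: the function name dali_host is hypothetical, and it assumes nothing beyond the four node names listed above (dali01 through dali04).

```shell
# Hypothetical helper: turn a node number (1-4) into a dali host name.
# With no argument it prints the generic "dali" name.
dali_host() {
    case "${1:-}" in
        "")    echo "dali" ;;
        [1-4]) printf 'dali%02d\n' "$1" ;;
        *)     echo "dali_host: expected a node number from 1 to 4" >&2
               return 1 ;;
    esac
}

# Example, from a discover or dali prompt:
#   ssh "<USERID>@$(dali_host 3)"
```

Rejecting anything outside 1 to 4 keeps a typo from turning into a confusing "unknown host" error from ssh.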
Q3: Where is data stored or read from?
A3. The General Parallel File System (GPFS) file systems on the dali nodes are the same as those used on the other discover nodes. They are accessed via:
/discover/home/<username> - GPFS home directories for discover ($HOME).
/discover/nobackup/<username> - GPFS scratch/short-term storage ($NOBACKUP).
/discover/nobackup/projects/<project name> - GPFS scratch/short-term storage for projects (shared space).
These are all local to the discover environment but shared between all of the nodes on discover.
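As an illustration of how the $NOBACKUP variable mentioned above might be used in a script, here is a minimal sketch. The function name scratch_path is hypothetical, and the fallback assumes the /discover/nobackup/<username> layout listed above applies to your account.

```shell
# Hypothetical helper: build a path under the user's GPFS scratch space.
# Uses $NOBACKUP when it is set, otherwise the documented
# /discover/nobackup/<username> layout.
scratch_path() {
    base="${NOBACKUP:-/discover/nobackup/${USER:-$(id -un)}}"
    echo "$base/$1"
}

# Example:
#   mkdir -p "$(scratch_path run1)" && cd "$(scratch_path run1)"
```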
NFS-mounted file systems are also available:
/archive/ - all paths beginning with /archive are NFS mounted from the DMF server. These are the same paths available from dirac (via CXFS) and from the discover login and gateway nodes (via NFS).
/portal/ - all paths beginning with /portal are NFS mounted from the dataportal GPFS environment.
Rather than using the NFS mounts, we strongly recommend using scp, sftp, bbscp or bbftp to transfer data between dirac and the dataportal. These commands will most likely perform better than NFS.
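A transfer of this kind could be prepared as in the sketch below. The function name print_scp_cmd is hypothetical, and the host names and paths are placeholders that the caller supplies; this FAQ does not state the actual transfer host names. The function only prints the scp command so that it can be reviewed before being run.

```shell
# Hypothetical dry-run helper: print the scp command that copies one file
# from a source host to a destination host, as an alternative to going
# through the NFS mounts. All four arguments are caller-supplied.
print_scp_cmd() {
    if [ "$#" -ne 4 ]; then
        echo "usage: print_scp_cmd SRC_HOST SRC_PATH DEST_HOST DEST_PATH" >&2
        return 1
    fi
    # -p asks scp to preserve modification times and file modes
    echo "scp -p $1:$2 $3:$4"
}

# Example (placeholder hosts and paths):
#   print_scp_cmd <archive host> /archive/<path>/file.nc <portal host> /portal/<path>/
```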
Q4: Where are all the applications?
A4. Commercial applications are all installed and accessible via modules. No modules are loaded by default. If you load modules in the "dot" initialization scripts in your discover home directory, they will also be loaded when you log in to the dali nodes.
To see a list of available applications via modules, run the command:
% module avail
Other third-party freeware applications are installed under the /usr/local/other directory.
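If you load modules from a "dot" initialization script, a guard like the following can keep the script from failing in shells where the module command has not been set up. This is a sketch under assumptions: the function name load_if_available is hypothetical, and the module name in the example is taken from the IDL entry in this FAQ.

```shell
# Hypothetical helper for a dot file: load a module only when the
# "module" command is actually defined in the current shell.
load_if_available() {
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found; skipping $1" >&2
        return 1
    fi
    module load "$1"
}

# Example line for ~/.bashrc:
#   load_if_available tool/idl-6.4
```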
Q5: How do I run IDL?
A5. No IDL module is loaded by default on the dali nodes. List the available IDL modules and load one:
% module avail tool/idl
tool/idl-6.3 tool/idl-6.4
% module load tool/idl-6.4
bash/sh/ksh users:
Remember to run 'ulimit -s 6000000' and 'ulimit -v unlimited'
csh/tcsh users:
Remember to run 'limit stacksize 6000000' and 'limit vmemoryuse unlimited'
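For bash/sh/ksh users, the limits above can be applied to a single command rather than to the whole login session. The wrapper below is a sketch, and the name run_with_limits is hypothetical. It raises the limits inside a subshell, warns if the system refuses, and then runs whatever command it is given.

```shell
# Hypothetical wrapper: run a command with the stack and virtual memory
# limits recommended above. The limits are changed inside a subshell,
# so the calling shell keeps its own settings.
run_with_limits() {
    (
        ulimit -s 6000000 2>/dev/null ||
            echo "warning: could not raise the stack size limit" >&2
        ulimit -v unlimited 2>/dev/null ||
            echo "warning: could not raise the virtual memory limit" >&2
        "$@"
    )
}

# Example, after loading an IDL module:
#   run_with_limits idl
```

The wrapper returns the exit status of the wrapped command, so it can be used in scripts the same way as the command itself.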
Q6: Where do I run cron jobs?
A6. Cron is available on discover-cron and cron-utc. These systems have access to the same GPFS file systems as the other discover nodes (including dali), as well as the same NFS file systems as the dali nodes.
From discover:
% ssh discover-cron
% ssh cron-utc
Q7: The specifications for the IBM servers that make up the dali nodes are:
A7. Specifications for Data Analysis nodes (dali)
Q8: The processor specifications for Data Analysis nodes (dali):
A8. The processor specifications for Data Analysis nodes (dali)
Q9: Who do I contact about an RSA token issue?
A9. Please contact the Code 700 Support Desk:
Building 12, Room E-132
(301) 286-7342
Q10: How do I run MATLAB?
A10. No MATLAB module is loaded by default on the dali nodes. List the available MATLAB modules and load one:
% module avail tool/matlab
tool/matlab-R2009a
% module load tool/matlab-R2009a
% matlab
bash/sh/ksh users:
Remember to run 'ulimit -s 6000000' and 'ulimit -v unlimited'
csh/tcsh users:
Remember to run 'limit stacksize 6000000' and 'limit vmemoryuse unlimited'
Q11: What type of computing is appropriate on Dali?
A11. Dali is intended for interactive analysis of large data sets. Users can run a variety of data analysis tools, including GrADS, IDL, Matlab, Mathematica, Python, and more.
Q12: Are compilers available on Dali?
A12. Yes. Compilers are available via modules, just as on the other discover nodes. The software is installed on a GPFS file system and is available across the entire cluster.
Q13: Can I run batch MPI jobs on Dali?
A13. This is not officially supported. If you attempt to use the standard mpirun commands on Dali, you will get a message indicating that you need to run within PBS.
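A script can check for itself whether it is running under PBS before it tries to launch MPI. The sketch below relies on the PBS_JOBID environment variable, which PBS sets inside the jobs it starts. The function names in_pbs_job and launch_mpi are hypothetical, and the mpirun arguments are placeholders for whatever your job normally uses.

```shell
# Return success only when running inside a PBS job.
# PBS sets PBS_JOBID in the environment of the jobs it starts.
in_pbs_job() {
    [ -n "${PBS_JOBID:-}" ]
}

# Hypothetical guard around an MPI launch: refuse to call mpirun
# outside PBS and print a hint instead.
launch_mpi() {
    if ! in_pbs_job; then
        echo "Not inside a PBS job; submit this work through PBS instead." >&2
        return 1
    fi
    mpirun "$@"
}

# Example, inside a PBS job script:
#   launch_mpi <mpirun options> ./my_program
```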
Q14: How do I launch GrADS on Dali?
A14. Only use the following GrADS installation:
% /discover/nobackup/projects/gmao/share/dasilva/opengrads/Contents/grads
Other GrADS builds on discover (especially those under /usr/local/other) do not contain all the features that users need.