Moving Data from Dirac
- Logging in to Dirac
- How do I recall data from tape?
- Transferring Data Out
- Avoiding MSS Directory Hangs
Logging in to Dirac
Logging in to Dirac works similarly to any other NCCS system. See the full bastion host instructions already available.
To summarize, if you typically “ssh USERNAME@login.nccs.nasa.gov”, all you need to do is type “dirac” at the host prompt.
If you prefer to avoid typing dirac every time, you can add “dirac.nccs.nasa.gov” to the host line in the following stanza in your ~/.ssh/config file:
Host discover.nccs.nasa.gov dirac.nccs.nasa.gov
    User USERNAME
    LogLevel Quiet
    ProxyCommand ssh -l USERNAME login.nccs.nasa.gov direct %h
    Protocol 2
Then, all you need to do is run “ssh USERNAME@dirac.nccs.nasa.gov” and it will take you directly there.
How do I recall data from tape?
Files that are written to tape are "offline" and must be recalled to one of Dirac's disk caches before they can be read or transferred to another location. Dirac has a set of DM commands that can help you determine what state a file is in, among other actions. For example, if you run "dmls -l" in a directory, you may notice some files marked "OFL", which means they are on tape only. Files marked "DUL" are both on tape and on a disk cache.
The simplest way to recall data from tape is with the dmget command. See "man dmget" for full details. There are several ways to use it:
$ dmget filename
$ find /path/directory/ -type f -print | dmget &
$ dmget < file.list &
The "&" runs dmget as a background process.
Important: If you need to recall more than 100 TB of data or more than 10,000 files, please send an email to support@nccs.nasa.gov so we can help you avoid overburdening the disk cache filesystems. If you need to recall thousands of files, we can also help organize your list by tape to streamline the process.
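Before starting a large recall, it can help to check a directory against the two limits quoted above. The following is a minimal sketch, not an NCCS tool: the function name recall_precheck and the MAX_FILES and MAX_BYTES overrides are hypothetical, it assumes GNU find (for -printf), and it reads "100 TB" as 100 x 10^12 bytes. It only reads file metadata, in the same way as the find and du commands that do not recall data.

```shell
# recall_precheck DIR
# Prints "files=N bytes=B" for the regular files under DIR and returns 1
# when either figure exceeds the thresholds quoted in the text.
# The body runs in a subshell, so its variables do not leak into your shell.
recall_precheck() (
    dir=$1
    max_files=${MAX_FILES:-10000}
    max_bytes=${MAX_BYTES:-100000000000000}   # assumption: 100 TB = 100 * 10^12 bytes
    # GNU find prints each file's size from its metadata; awk counts and sums.
    set -- $(find "$dir" -type f -printf '%s\n' |
             awk '{ n++; s += $1 } END { printf "%d %.0f\n", n, s }')
    echo "files=$1 bytes=$2"
    [ "$1" -le "$max_files" ] && [ "$2" -le "$max_bytes" ]
)
```

For example, "recall_precheck /path/directory || echo 'Contact support before recalling'" prints the totals and adds the reminder only when a limit is exceeded.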
Dirac and Discover have separate filesystems. Please log in to Dirac directly to recall your data. This will save you time, improve performance, and reduce network traffic on Discover. Additionally, dmget on Discover is a wrapper script that runs the command on Dirac over an ssh connection, so a long list of file names can exceed the command-line length limit on Discover.
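You can check for files that are on tape only (OFL) or in the process of being recalled (UNM) with the command below. The escaped parentheses group the two states so that "-type f" applies to both:
$ dmfind /path/ -type f \( -state OFL -o -state UNM \)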
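The same dmfind check can be turned into a simple wait loop for use in a script. This is a sketch under assumptions: the function name wait_for_recall and the 60-second default interval are hypothetical, and it assumes dmfind accepts the usual find-style grouping with escaped parentheses. It keeps polling until dmfind reports no regular files in the OFL or UNM state.

```shell
# wait_for_recall DIR [INTERVAL_SECONDS]
# Polls DIR until no regular file is still offline (OFL) or being
# recalled (UNM), sleeping INTERVAL_SECONDS between checks.
wait_for_recall() (
    dir=$1
    interval=${2:-60}
    # head -n 1 is enough: any single match means the recall is not done.
    while [ -n "$(dmfind "$dir" -type f \( -state OFL -o -state UNM \) | head -n 1)" ]; do
        sleep "$interval"
    done
)
```

Running "wait_for_recall /path/directory 300" after a background dmget would return once everything under that directory is on a disk cache.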
Some commands will inherently recall data from tape. The following table helps break this down:
Will *NOT* recall file(s) from tape | Will recall file(s) from tape |
---|---|
ls, dmls | dmget |
du, dmdu | scp, rsync |
rm | cp, mv |
find, dmfind* | vi, nano |
chmod, chown | less, more |
head, tail** | |
* Unless you use -exec to run a command that recalls file(s).
** Only when reading piped output, such as "dmls -l | head". Running head or tail directly on a file will recall it.
Transferring Data Out
When you recall files, they are automatically copied to a disk cache. The Dirac nodes mount those disk caches over a Fibre Channel connection, so from their point of view the /archive filesystems are local mounts. On Discover, by contrast, /archive is a network mount. Because of that difference, you should use the Dirac nodes for outbound transfers. You do NOT need to copy data to Discover before transferring it elsewhere. Running scp, rsync, or any other transfer method from the Dirac nodes is preferred because it saves you time and effort and reduces the load on Discover.
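Recall and transfer can be chained in one step when run on a Dirac node. The sketch below is illustrative only: the function name recall_and_send and the destination user@remote.example.org:/data/ are hypothetical, and it assumes that a foreground dmget returns only once the files are back on disk. It recalls everything under a directory first, then copies the directory with rsync.

```shell
# recall_and_send SRC_DIR DEST
# Recalls every regular file under SRC_DIR, then copies SRC_DIR to DEST.
# DEST is any rsync destination, for example the hypothetical
# user@remote.example.org:/data/
recall_and_send() (
    src=$1
    dest=$2
    # dmget reads file names from standard input. It runs in the foreground
    # here so that the transfer starts only after the recall has finished.
    find "$src" -type f -print | dmget || exit 1
    # -a preserves permissions and timestamps; --partial keeps partially
    # transferred files so an interrupted copy can pick up where it stopped.
    rsync -a --partial "$src" "$dest"
)
```

Recalling first means rsync reads from the disk cache instead of triggering one tape recall per file as it goes.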
If you need help looking for storage alternatives, send an email to support@nccs.nasa.gov. We can list some options, but please know that most will have costs based on the total amount of data stored and how often it is accessed.
Avoiding MSS Directory Hangs
You may have noticed MSS has been experiencing an increasing number of emergency downtimes recently. This is due to an ongoing 'directory hang' issue the NCCS has been working on with the vendor. It is triggered most often when running a 'mv' that uses '..' to refer to a parent directory. For example:
$ mv * ../
$ mv src/ ..
$ mv ../src dest/
Usually the command or a following one hangs indefinitely, effectively freezing your shell. The process cannot be stopped by normal means. Instead, please spell out full paths rather than using '..':
$ mv * /path/dest
$ mv /path/src /path/dest
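If you often move things up one level, a small helper can build the full paths for you. This is a sketch, not an NCCS-provided command: the function name mv_to_parent is hypothetical. It derives the parent directory from $PWD with dirname, which is a string operation, so no '..' component is ever passed to mv. Whether this avoids every trigger of the hang is an assumption.

```shell
# mv_to_parent NAME...
# Moves each NAME from the current directory into the parent directory,
# using absolute paths built from $PWD so that mv never sees "..".
mv_to_parent() (
    here=$PWD
    parent=$(dirname "$here")   # string operation only, e.g. /path/dest -> /path
    for name in "$@"; do
        case $name in
            /*|..|../*|*/..|*/../*)
                echo "mv_to_parent: expected a name under $here without '..': $name" >&2
                exit 1 ;;
        esac
        mv -- "$here/$name" "$parent/" || exit 1
    done
)
```

With this in place, "mv_to_parent *" stands in for "mv * ../" and "mv_to_parent src" for "mv src/ ..".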
Not every hang involves '..', but avoiding it could prevent most occurrences. All of this highlights how important it is to disposition your data before hardware support expires on October 1st.