// GSFC OPeNDAP Servers and "subset downloading"

There are three types of OPeNDAP servers running at Goddard (GSFC):

  • GES/DISC runs a Hyrax type
  • GMAO runs a GrADS DODS type (aka GDS)
  • NCCS runs a THREDDS type

The NCCS supports the Global Modeling and Assimilation Office's (GMAO) OPeNDAP server and also assists the Goddard Earth Sciences Data and Information Services Center (GES/DISC) with publishing its data to LLNL ESGF.

The major advantage of OPeNDAP is that it lets users select and view a subset (in time and spatial domain) of a large virtual dataset (e.g., a high-resolution global historical dataset) instead of downloading the whole dataset.

Both Hyrax and THREDDS provide a web interface that lets users download netCDF data and can also help them create a shell script (using wget) to download the data. GDS (DODS) has no such interface, so users who want to select a subset of data from GDS must use an OPeNDAP-enabled application. On Discover, the applications that can be used are ferret, grads, and python (xarray/netcdf4).
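
For background, a subset request to an OPeNDAP (DAP2) server is expressed as a constraint appended to the dataset URL after a "?", with one [start:stride:stop] index range per dimension. The sketch below only builds such a constraint string. The helper name hyperslab, the base URL, and the index values are hypothetical and for illustration only; real index ranges depend on the dataset's grid, and the response suffix that returns netCDF differs between server types.

```python
def hyperslab(var, *ranges):
    """Build a DAP2 hyperslab constraint such as 'epv[0:1:40][0:1:71]'.

    Each range is (start, stop) or (start, stride, stop), using
    zero-based, inclusive array indices (not coordinate values).
    """
    parts = []
    for r in ranges:
        if len(r) == 2:
            start, stop = r
            stride = 1
        else:
            start, stride, stop = r
        if start < 0 or stop < start or stride < 1:
            raise ValueError(f"invalid index range: {r!r}")
        parts.append(f"[{start}:{stride}:{stop}]")
    return var + "".join(parts)


# Hypothetical dataset URL and index ranges (time, lev, lat, lon).
BASE = "https://example.org/opendap/some_dataset"
expr = hyperslab("epv", (0, 40), (0, 71), (680, 688), (345, 352))

# ".dods" requests the binary DAP2 data response for the constrained variable.
print(f"{BASE}.dods?{expr}")
```

Applications such as ferret, grads, and xarray generate requests of this kind automatically from coordinate ranges, which is why the examples in this article never spell out array indices.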

Examples

We would like to select the variable epv from GEOS-5 for latitudes from 80N to 82N, longitudes from 72W to 70W, and times from Feb 20, 2021 to Feb 25, 2021.
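
Ferret accepts hemisphere notation such as 72W:70W directly, while the grads and python examples express the same region in signed degrees (west and south negative). A minimal sketch of that conversion, using a hypothetical helper named signed_degrees:

```python
def signed_degrees(text):
    """Convert hemisphere notation ('72W', '80N') to signed degrees."""
    text = text.strip().upper()
    value, hemisphere = float(text[:-1]), text[-1]
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere in {text!r}")
    # South and west are negative in the signed convention.
    return -value if hemisphere in ("S", "W") else value


lon_range = (signed_degrees("72W"), signed_degrees("70W"))
lat_range = (signed_degrees("80N"), signed_degrees("82N"))
print(lon_range, lat_range)
```

The resulting bounds are the ones passed to "set lon"/"set lat" in grads and to the lon/lat slices in xarray.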

Using ferret:

$ module load ferret/7.6.0
$ ferret
! NOAA/PMEL TMAP
! FERRET v7.6 (optimized)
! Linux 3.10.0-1127.10.1.el7.x86_64 64-bit - 06/25/20
! 27-Feb-21 06:50
yes? use "https://opendap.nccs.nasa.gov/dods/GEOS-5/fp/0.25_deg/assim/tavg3_3d_asm_Nv"
yes? SET REGION/X=72W:70W/Y=80N:82N/T="2021-02-20T01:30:00":"2021-02-25T01:30:00"
yes? SAVE/FILE=ferret_subset.cdf EPV
yes? exit

To run these ferret commands non-interactively from the shell, put them in a jnl file (e.g., mysubset.jnl):
$ ferret < mysubset.jnl

Using grads:

$ grads -b
ga-> sdfopen https://opendap.nccs.nasa.gov/dods/GEOS-5/fp/0.25_deg/assim/tavg3_3d_asm_Nv
ga-> set lon -72 -70
ga-> set lat 80 82
ga-> set time 00Z20FEB2021 00Z25FEB2021
ga-> set z 1 72
ga-> define epv = epv
ga-> set sdfwrite grads_subset.nc
ga-> sdfwrite epv
ga-> exit

To run these grads commands non-interactively from the shell, put them in a gs file (e.g., mysubset.gs), enclosing each command in single quotes (' '):
$ grads -blcx "mysubset.gs"

Using python:

$ module load python/GEOSpyD/Min4.8.3
$ conda activate python
$ python
>>> import xarray as xr
>>> URL = 'https://opendap.nccs.nasa.gov/dods/GEOS-5/fp/0.25_deg/assim/tavg3_3d_asm_Nv'
>>> d = xr.open_dataset(URL, engine='netcdf4')
>>> ds = d.epv.sel(lat=slice(80, 82), lon=slice(-72, -70), time=slice("2021-02-20T01:30:00", "2021-02-25T01:30:00"))
>>> ds.to_netcdf("xarray_subset.nc", engine='netcdf4')
>>> quit()

To run these commands non-interactively from the shell, put them in a py file (e.g., mysub.py):
$ python mysub.py
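
As a quick sanity check on an output file, the number of time steps it contains can be compared with what the request implies. The sketch below assumes the collection holds one record every 3 hours, stamped at 01:30, 04:30, and so on. That interval is an assumption suggested by the "tavg3" name and the 01:30 time stamps used in the examples, not something confirmed here, and count_time_steps is a hypothetical helper.

```python
from datetime import datetime, timedelta


def count_time_steps(start, end, step):
    """Number of records from start to end inclusive, spaced by step."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    return (end - start) // step + 1


# Time bounds from the ferret and xarray requests.
start = datetime(2021, 2, 20, 1, 30)
end = datetime(2021, 2, 25, 1, 30)

# ASSUMPTION: one record every 3 hours.
expected = count_time_steps(start, end, timedelta(hours=3))
print(expected)  # 41
```

In the xarray session, this value can be compared with ds.sizes["time"] before the file is written. The grads example uses 00Z bounds, so its count may differ.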