Evaluating a New Analytics Platform Using the Diurnal Cycle of High Frequency Surface Temperature Data

Mughilan Muthupari

Abstract

This project addresses two problems. The first is a big data problem: the volume of temperature data is too large to handle easily. To handle the data, NASA has developed a new analytics platform called the Earth Data Analytics System (EDAS), built on a file system, or analytics cluster, called the Data Analytics Storage System (DASS). In theory, EDAS is more efficient than NASA's older system, but practical results are needed to confirm this. We therefore compare the system currently in use, Visgpu02, with the newer EDAS by running various analytics on reanalysis data and measuring the difference in efficiency. Timings of multiple runs indicate that EDAS is nearly 50 times faster than Visgpu02.

The second problem is to use hourly temperature data to examine how the average diurnal cycle differs between 5-year time slices taken from the beginning and the end of the available reanalyses, and so to determine whether the data are consistent with our understanding of climate change. Multiple cities, one region, and the globe were examined. In the future, these analyses will be placed in Jupyter Notebooks to provide a more interactive experience for the user.
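
As an illustration of the diurnal-cycle comparison described above, the following is a minimal sketch in Python with NumPy. It is not the EDAS or Visgpu02 workflow. It assumes each 5-year slice is available as a one-dimensional hourly series that starts at hour 0 and contains whole days. The function names and the synthetic input values are hypothetical and were invented for this example; they are not results from the reanalyses.

```python
import numpy as np


def mean_diurnal_cycle(hourly_temps):
    """Return the average temperature for each hour of the day (24 values).

    Assumes the series starts at hour 0 and covers a whole number of days.
    """
    temps = np.asarray(hourly_temps, dtype=float)
    if temps.size % 24 != 0:
        raise ValueError("expected a whole number of days of hourly data")
    # Each row is one day; averaging down the rows gives one value per hour.
    return temps.reshape(-1, 24).mean(axis=0)


def diurnal_cycle_change(early_slice, late_slice):
    """Late-minus-early difference of the mean diurnal cycle, hour by hour."""
    return mean_diurnal_cycle(late_slice) - mean_diurnal_cycle(early_slice)


def synthetic_slice(n_days, mean, amplitude):
    """Hypothetical stand-in for reanalysis data: a sinusoid peaking at hour 15."""
    hours = np.arange(n_days * 24)
    return mean + amplitude * np.sin(2 * np.pi * (hours % 24 - 9) / 24)


# Two invented 5-year slices (leap days ignored for simplicity).
early = synthetic_slice(5 * 365, mean=15.0, amplitude=5.0)
late = synthetic_slice(5 * 365, mean=16.0, amplitude=4.5)

change = diurnal_cycle_change(early, late)
for hour, delta in enumerate(change):
    print(f"hour {hour:02d}: {delta:+.2f}")
```

With real data, the same two functions would be applied to the first and last 5-year slices for a city, a regional average, or the global average.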