NASA Goddard’s AI Center of Excellence Connects Scientists and Engineers to the Latest Advances in Artificial Intelligence

Overview

NASA Goddard Space Flight Center’s Artificial Intelligence Center of Excellence (AI CoE) is a new, collaborative umbrella organization for several NASA AI/machine learning (ML) partners. Officially inaugurated in May 2020, the AI CoE is supported by NASA Goddard’s Sciences and Exploration Directorate (Code 600). The mission of the AI CoE is to enable new AI techniques for scientific discovery, providing scientists within NASA Goddard and their partners beyond NASA with resources for increased collaboration, innovation, and co-learning.

The NASA Goddard AI CoE is managed by a cross-disciplinary team, Dr. Nargess Memarsadeghi of the Engineering and Technology Directorate (Code 500) and Dr. Mark Carroll of Code 600, with strong support from engineers and scientists across Goddard, across NASA, and across the public and private sectors.

The AI CoE hosts monthly seminars on diverse topics, featuring new collaboration and funding opportunities; advanced computational resources and technology; examples of AI applied to specific science problems; innovative methods for AI, ML, and deep learning for scientific discovery; and helpful co-learning workshops.

Interested participants can engage with the organization and its members through the monthly seminars. NASA Goddard employees have two additional channels for engagement: the AI CoE website and the AI CoE MS Teams channel, where members can learn, connect, and collaborate.

ORGANIZATIONAL EVOLUTION

The AI CoE began as a proposal that Memarsadeghi submitted to the NASA Digital Transformation initiative in early 2019. The idea for a Center of Excellence was picked up in 2019 and supported by NASA Goddard’s Computational and Information Sciences and Technology Office (CISTO, Code 606), the GSFC Sciences and Exploration Directorate (Code 600), and the Office of the Chief Technologist. Together, they developed a nucleus, including an AI CoE website and an MS Teams group, to facilitate collaboration and communication among interested parties.

An AI CoE kickoff meeting was held in Spring 2020, and membership has been increasing steadily since that time. The concept of the AI CoE was included in a Science Task Group (STG) proposal submitted by Carroll to Code 600 in Fall 2020. This STG was selected, and the AI CoE will continue to grow and recruit members. As of this writing, Memarsadeghi and Carroll are actively working to identify the full community of practice for AI/ML across all divisions at NASA Goddard. To find and contact authors of publications that use AI or ML methods, they have applied AI techniques developed by Dr. Brian Thomas, including Natural Language Processing, to scour the records at the NASA Goddard Library.

*Dr. Nargess Memarsadeghi and Dr. Mark Carroll of the NASA Goddard Space Flight Center.*

MEMBER ORGANIZATIONS

The AI CoE is open to new collaborations across and beyond NASA Goddard. The roster of AI CoE collaborators is continuously growing; the organization currently benefits from the following member organizations and collaborative communities of practice:

CISTO's Data Science Group

The Intelligent Systems and Data Analysis Technologies (ISDAT) 

The Center for Helioanalytics

The Goddard Cloud Computing Program (GCC)

The Goddard Machine Learning Academy

ML Science Task Group (STG)

COLLABORATORS

The Frontier Development Lab (FDL)
NASA Strategic Data Management Working Group (SDMWG)
Information Science and Technology Colloquium (IS&T)
The AI and Data Science Workshop Program Committee
Langley Research Center (LaRC)
NASA Jet Propulsion Laboratory (JPL)
California Institute of Technology (Caltech)
NASA Ames Research Center (Ames)
NASA Johnson Space Center (JSC)

AI CoE RESOURCES

The goal of the AI CoE is to provide a single point of contact for AI resources for members and participants. These resources include AI best practices, exposure to research projects and potential research collaborators, regular seminars, AI project technical monitors, AI/ML subject matter experts for projects or proposals, and learning opportunities such as continuous learning seminars with the ML Academy (led by Dr. Barbara Thompson).

The AI CoE also connects scientists and engineers to needed computational resources including High End Computing in CISTO as well as commercial cloud opportunities through the Science Managed Cloud Environment (SMCE) and the Goddard Cloud Computing Mission Cloud Platform (GCC-MCP). In conjunction with the GCC-MCP, the AI CoE conducted a solicitation of proposals for pilot projects using Amazon Web Services (AWS). Credits for compute time on AWS were provided from the development team at AWS through the GCC-MCP. Ten proposals were received, and five proposals were selected for support. Results of these projects will be presented at a monthly AI CoE webinar in the Spring of 2021.

The AI CoE is also developing academic partnerships through the Established Program to Stimulate Competitive Research (EPSCoR) in collaboration with CISTO’s EPSCoR lead, James Harrington. Successful partnerships have already been formed with the University of Delaware, University of Vermont, Wichita State, College of Charleston at South Carolina, University of Wyoming, Louisiana State University, and West Virginia University.

The AI CoE helped Harrington draft the recently announced EPSCoR open solicitation for 2021 Rapid Response Research, found on NSPIRES. The AI CoE helped define the solicitation’s scientific research requirements, identify case studies, review proposals, and assign GSFC technical mentors (often AI CoE members) to awarded proposals. These mentors can then guide awardees in using ML techniques and tools in their research and connect them to NCCS resources as needed.

Monthly Seminars

The AI CoE provides wide-ranging seminars that usually take place on the third Wednesday of the month from 2:00 to 3:00 p.m. Eastern time. Those with access to internal NASA sites can use the AI CoE website's Events page or MS Teams channel to learn about upcoming events. Contact Nargess Memarsadeghi to add your email address to the AI CoE distribution list and to be added to the "GSFC AI CoE" team.

FUNDING OPPORTUNITIES

The AI CoE connects NASA Goddard researchers and their collaborators/partners to new funding opportunities and connects funded projects to the AI experts and technical monitors (TMs) they need. Specifically, the AI CoE provides a forum where individuals planning to submit a proposal, whether internal or external to Goddard, can find collaborators and subject matter experts to build a stronger, more competitive proposal team.

IMPACT

The AI CoE has already helped connect research projects to NASA resources through its technical monitors (TMs), specifically the resources of CISTO’s Innovation Lab and advanced NCCS computing resources such as the Discover supercomputer, the ADAPT Science Cloud, the ADAPT GPU Cluster, and the Science Managed Cloud Environment (SMCE). For example, as a TM for a project at the University of Delaware, Memarsadeghi recently helped bridge that project to NCCS computing resources.

One example of the AI CoE connecting NASA projects to AI resources is the Goddard Landslides Team, which uses NCCS resources to apply ML to identify landslides from optical imagery. Using tree-based models, the team then predicts when and where landslides are most likely to occur.

The AI CoE also serves another need—proposal support. Some new Internal Research & Development Program (IRAD) project principal investigators (PIs) now seek endorsements from the AI CoE in the form of letters of support for their proposed, NASA-funded projects.

Below are a few ongoing and past projects within the broader AI community of practice. A complete list of projects is too long to gather because the adoption of AI tools for research is becoming so ubiquitous at NASA, and communities of practice in AI are continually growing. “In fact,” Memarsadeghi observed, “Mark Carroll often says that, if we tried to list all awarded scientific research proposals that use AI/ML across NASA Goddard, the list would include almost everyone, because almost every project involves doing some data analysis using AI or ML techniques.”

Ongoing Projects

Application of Random Forest to Quantify Lake Depth in Arctic Lakes: Our objective is to use field measurements of water depth from lakes on the North Slope of Alaska to generate a random forest (RFA) model that can be applied to remotely sensed data from Landsat to create depth maps for lakes across the North Slope of Alaska. Specific goals are: 1) Apply RFA to training data from 17 lakes where previous work using linear models has been done and quantify results (Simpson et al. 2019); 2) Recalibrate the model with training points from prior collections (Arp 2018; Hinkel 2013) and with data from Landsat 5 and 7 chosen to match the collection dates of the field data; 3) Apply the model to a time series of Landsat data and evaluate the stability of calculations through time with fixed-location buoy data (Hinkel et al. 2012); and 4) Recalibrate the model and apply it to WorldView-3 data to determine future capabilities.
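To make the random forest idea above concrete, the following is a minimal sketch in plain Python, not the project's actual code. It assumes each training sample is a list of band reflectances paired with a measured depth. The function names, the synthetic data generator, and the depth-versus-reflectance relationship are all hypothetical and chosen only for illustration; a real workflow would more likely rely on an established library and real Landsat pixels.

```python
import random
from statistics import fmean


def best_split(rows, targets, feature_ids):
    """Return the (feature, threshold) pair with the lowest squared error, or None."""
    best, best_sse = None, float("inf")
    for f in feature_ids:
        # Skip the smallest value so both sides of every split are non-empty.
        for thr in sorted({r[f] for r in rows})[1:]:
            left = [t for r, t in zip(rows, targets) if r[f] < thr]
            right = [t for r, t in zip(rows, targets) if r[f] >= thr]
            ml, mr = fmean(left), fmean(right)
            sse = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
            if sse < best_sse:
                best, best_sse = (f, thr), sse
    return best


def grow_tree(rows, targets, rng, n_try, depth=0, max_depth=6, min_split=6):
    """Grow one regression tree; leaves are floats, internal nodes are tuples."""
    if depth >= max_depth or len(rows) < min_split:
        return fmean(targets)
    # Each node only considers a random subset of the features.
    feature_ids = rng.sample(range(len(rows[0])), n_try)
    split = best_split(rows, targets, feature_ids)
    if split is None:
        return fmean(targets)
    f, thr = split
    lo = [(r, t) for r, t in zip(rows, targets) if r[f] < thr]
    hi = [(r, t) for r, t in zip(rows, targets) if r[f] >= thr]
    return (
        f,
        thr,
        grow_tree([r for r, _ in lo], [t for _, t in lo], rng, n_try, depth + 1, max_depth, min_split),
        grow_tree([r for r, _ in hi], [t for _, t in hi], rng, n_try, depth + 1, max_depth, min_split),
    )


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's mean depth."""
    while isinstance(node, tuple):
        f, thr, lo, hi = node
        node = lo if row[f] < thr else hi
    return node


def fit_forest(rows, targets, n_trees=15, n_try=2, seed=0):
    """Train n_trees trees, each on a bootstrap resample of the training data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(grow_tree([rows[i] for i in idx], [targets[i] for i in idx], rng, n_try))
    return forest


def predict_forest(forest, row):
    """Average the per-tree predictions."""
    return fmean([predict_tree(tree, row) for tree in forest])


def make_demo_lakes(n=120, seed=2):
    """Synthetic stand-in data: three fake band reflectances and a made-up depth."""
    rng = random.Random(seed)
    rows = [[rng.random() for _ in range(3)] for _ in range(n)]
    depths = [10 * (1 - r[0]) + rng.gauss(0, 0.3) for r in rows]  # hypothetical relation
    return rows, depths
```

Under these assumptions, one would train on part of the field data with `fit_forest(rows[:90], depths[:90])` and then call `predict_forest(forest, pixel)` for every pixel in a scene to build a depth map. Comparing held-out error against a simple baseline, such as always predicting the mean depth, gives a quick check that the model has learned something.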

From Data to Discovery in Astronomical Datasets: Astronomical datasets are seemingly as vast and diverse as space itself, each presenting unique challenges but worth attempting to overcome in order to find the incredibly valuable scientific gems hidden inside. Satellites observing space are collecting hundreds of terabytes of data that need to be investigated for potential discoveries. AI, ML, and High-Performance Computing (HPC) provide the tools to process these large volumes of data from their raw state all the way to discovery. This seminar will show early results from two different investigations using entirely different AI/ML/HPC approaches to different astronomical datasets looking for obscure but valuable X-ray energy sources (with spectra from observatories such as Chandra, XMM, Suzaku, and Swift) and multiple star systems (with data from TESS Full-Frame Images). This work uses systems located in CISTO/NCCS, including the Discover supercomputer to create millions of light curves and Deep Learning in the ADAPT GPU cluster to create high-dimensional embedding spaces.

Object Identification in Sub Meter Commercial Very High Resolution Data Using Convolutional Neural Networks: In the past twenty years, several commercial companies (Planet Labs, Digital Globe, etc.) have launched Earth observing satellites that capture images of the planet at spatial resolution as fine as 0.5 meters. This opens up new avenues for identification of features on the Earth's surface including individual agricultural fields, individual trees, and specific geologic formations. It also presents new challenges with small clouds and cloud shadows, variability within the crown of individual trees, tree and building shadows, and small water bodies. In this project, we are testing different types of Convolutional Neural Networks on the GPU cluster hosted by CISTO. The outcome of this project will be optimized methods for processing commercial high resolution optical imagery.

MERRA/Max: Harnessing the Potential of Climate Model Outputs in Studies of Ecosystem Change: There is growing interest in using Intergovernmental Panel on Climate Change (IPCC)-class climate model outputs in ecological research. These models provide realistic, global representations of the climate system, projections for hundreds of variables (including Essential Climate Variables), and combine observations from an array of satellite, airborne, and in-situ sensors. Unfortunately, direct use of this important class of data has been limited due to the large size and complexity of model output collections, internal file complexity, and limited means for dynamically creating derived products of interest. To address these limitations, we have developed an AI-based stochastic convergence technology, called MERRA/Max, that combines HPC and Princeton's Maximum Entropy (MaxEnt) software to rapidly subset and identify potential drivers of change among the hundreds of variables in a climate model output collection. MERRA/Max reduces dimensionality by iteratively drawing on MaxEnt's capacity for feature selection to winnow randomly selected climate variables until a stable set of predictors is found. Preliminary work focuses on the MERRA reanalysis, a product of NASA's GEOS-5 modeling framework. At 1 petabyte in size, MERRA comprises over 700 climate variables and spans 1970 to the present at high temporal resolution. We evaluated MERRA/Max by modeling the bioclimatic envelope of Cassin's Sparrow using MERRA and BioClim variables.
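The winnowing loop described above (draw random variables, keep the most informative, repeat until the retained set stops changing) can be sketched roughly as follows. This is an illustration under stated assumptions, not the MERRA/Max implementation: the relevance score here is a plain absolute Pearson correlation standing in for MaxEnt's feature selection, and the function names, parameters, and synthetic data are all hypothetical.

```python
import math
import random


def abs_correlation(xs, ys):
    """Stand-in relevance score: absolute Pearson correlation (NOT MaxEnt)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def winnow(variables, response, keep=2, draw_size=4, patience=5, seed=0, max_rounds=500):
    """Repeatedly pit randomly drawn variables against the current survivors.

    Stops once the surviving set has been unchanged for `patience` rounds
    in a row, or after `max_rounds` rounds.
    """
    rng = random.Random(seed)
    kept, unchanged = [], 0
    for _ in range(max_rounds):
        pool = [name for name in sorted(variables) if name not in kept]
        drawn = rng.sample(pool, min(draw_size, len(pool)))
        ranked = sorted(
            kept + drawn,
            key=lambda name: abs_correlation(variables[name], response),
            reverse=True,
        )
        survivors = sorted(ranked[:keep])
        unchanged = unchanged + 1 if survivors == kept else 0
        kept = survivors
        if unchanged >= patience:
            break
    return kept


def make_demo_data(n=300, n_noise=8, seed=1):
    """Synthetic data: two real drivers hidden among unrelated noise variables."""
    rng = random.Random(seed)
    variables = {f"noise_{i}": [rng.gauss(0, 1) for _ in range(n)] for i in range(n_noise)}
    variables["driver_a"] = [rng.gauss(0, 1) for _ in range(n)]
    variables["driver_b"] = [rng.gauss(0, 1) for _ in range(n)]
    response = [
        2 * a - 3 * b + rng.gauss(0, 0.1)
        for a, b in zip(variables["driver_a"], variables["driver_b"])
    ]
    return variables, response
```

With `variables, response = make_demo_data()`, calling `winnow(variables, response)` returns the names of the variables that survive. The point of the random draws is that only a small subset of a very large collection has to be scored in any one round, which is what makes the approach attractive for a collection with hundreds of variables.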

Past Projects

A Machine Learning Examination of Tropospheric Hydroxyl Radical Differences Among Chemistry-Climate Models: The hydroxyl radical (OH) is the primary oxidant of Earth’s lowermost atmosphere, determining the lifetimes of methane and numerous other pollutants and climate forcers. The fast, complex chemistry that controls OH is notoriously difficult to validate and assess in global chemistry-climate models. We developed a neural network method to understand and quantify the factors causing large differences in global OH abundances among a suite of models. We further applied the same neural network framework to understand the drivers of OH variations in time, over the course of multi-decadal chemistry-climate model simulations. Julie Nicely.

Machine Learning for 3D Lunar Data Analysis: This project leverages brand new advances in 3D ML, applied to 3D Point Clouds, to enable automated lunar terrain and geologic feature recognition important for Lunar Science and exploration. NASA's new Lunar Program, Artemis, involves extensive human and robotic exploration of the Moon's surface, with associated lunar surface science requiring detailed geologic site analysis for tasks such as landing site selection, hazard avoidance, and path planning, as well as scientific site selection, investigation, and analysis. Next-generation Lidars will be a key instrument used for many of these tasks. Lidars generate 3D "Point Clouds" of positional data, coupled with associated RGB imagery, resulting in very large data volumes. Recent innovations in "Deep" Machine Learning (ML/DL) using Convolutional Neural Nets (CNNs) now allow this 3D data to be "learned" so that 3D objects and features (e.g., craters) can be recognized quickly and accurately from newly input Lidar scans, resulting in significant cost savings and new science-generation capabilities. Matthew Brandt.

Community Concept Mapping for Data-to-Decisions: We're exploring ways to capture the knowledge of subject matter experts across science, technology, and policy areas in the form of use cases represented as concept maps, focusing on risk management related to climate adaptation and mitigation against the impacts of extreme events. For now, these concept maps are intended more as a communication and learning tool for humans, rather than as a basis for machine reasoning. The latter, though, is what we are aiming for (e.g., context-aware knowledge discovery). Currently, this effort is primarily driven by our involvement in the Earth Science Information Partners Agriculture and Climate Cluster. Bill Teng.

MAtISSE: MAchine Intelligence for Small SatEllites: The MAtISSE project seeks to develop a new approach to rapid, real-time extraction and classification of photometric light curves using a modern differencing technique and advanced DL integrated onto a compact graphics processing unit. MAtISSE will develop this technique, which has the potential to greatly reduce the amount of data transmitted by an observatory, for implementation on a future CubeSat-based science payload with a thorough assessment of power requirements vs. processing and communications bandwidth. This technology will be especially applicable to small, power-limited spacecraft and may enable observations and science return that would be challenging or even impossible otherwise. The ML portion of this project, RAMjET: RApid Machine lEarned Triage, based on Tensorflow, is nearly complete and is running on the ADAPT GPU cluster. Richard Barry.

Sean Keefe, NASA Goddard Space Flight Center

