Machine Learning for Non-parametric Data Assimilation in Hydrological Models

Jonathan Frame

We test a hybrid machine learning (ML) and physics-based modeling approach for predicting soil moisture in hydrologic models. The method is an alternative to data assimilation, addresses a grand challenge of integrating machine learning with physics, and has the added benefit of dynamically correcting model structural error. The ML component is Gaussian Process Regression (GPR), using a parallelized code developed by Dr. Pelissier, and the physics-based component is the Noah-Multiparameterization (Noah-MP) land surface model. We test the method over annual soil moisture cycles at FluxNet towers with high-quality observations. The results show that the hybrid approach significantly improves out-of-sample soil moisture predictions compared with those made by a calibrated Noah-MP model. We also compare GPR with ‘traditional’ data assimilation by running Noah-MP with an Ensemble Kalman Filter (EnKF). GPR and the EnKF yield similar performance improvements when run in sample, but the EnKF provides no benefit for predictions made more than a few time steps after an observation. In contrast, the hybrid approach continues to improve model predictions even without soil moisture observations. This result is significant for improving the efficiency of satellite data assimilation into large-scale hydrologic models.
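
The idea of using GPR to correct a physics-based model can be illustrated with a small, self-contained sketch. This is not the parallelized GPR code or the Noah-MP configuration used in the study. It is a toy example under stated assumptions: `physics_model` is a hypothetical stand-in for a land surface model, `structural_error` is an invented state-dependent error term, the "observations" are synthetic, and the kernel hyperparameters (`LENGTH`, `VAR`, `NOISE`) are chosen by hand rather than fitted. The GPR is trained on the residual between observation and model as a function of the simulated state, so the correction can be applied at times when no observation is available.

```python
import math

def physics_model(day):
    """Toy stand-in for a land surface model: a smooth annual soil moisture cycle."""
    return 0.25 + 0.10 * math.sin(2.0 * math.pi * day / 365.0)

def structural_error(state):
    """Hypothetical state-dependent error that the toy model does not represent."""
    return 0.03 * math.sin(20.0 * state)

def observed(day):
    """Synthetic 'observation': simulated state plus the missing structural term."""
    s = physics_model(day)
    return s + structural_error(s)

def rbf(a, b, length, var):
    """Squared-exponential kernel between two scalar inputs."""
    return var * math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def chol_solve(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def fit_gp(x, y, length, var, noise):
    """Return the weights alpha = (K + noise * I)^-1 y for the GP posterior mean."""
    n = len(x)
    K = [[rbf(x[i], x[j], length, var) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return chol_solve(cholesky(K), y)

def gp_mean(x_train, alpha, x_new, length, var):
    """GP posterior mean at a single new input."""
    return sum(rbf(x_new, xi, length, var) * a for xi, a in zip(x_train, alpha))

def rmse(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Hand-picked hyperparameters (assumptions, not fitted values).
LENGTH, VAR, NOISE = 0.05, 1e-3, 1e-6

# Training: every 10th day has an observation; the GP learns residual vs. simulated state.
train_days = list(range(0, 365, 10))
x_train = [physics_model(d) for d in train_days]
y_train = [observed(d) - physics_model(d) for d in train_days]
alpha = fit_gp(x_train, y_train, LENGTH, VAR, NOISE)

# Out-of-sample: days with no observation; the correction uses only the simulated state.
test_days = list(range(5, 365, 10))
physics_rmse = rmse([observed(d) - physics_model(d) for d in test_days])
hybrid_rmse = rmse([
    observed(d) - (physics_model(d)
                   + gp_mean(x_train, alpha, physics_model(d), LENGTH, VAR))
    for d in test_days
])
print("physics-only RMSE:", physics_rmse)
print("hybrid RMSE:", hybrid_rmse)
```

Because the correction depends on the simulated state rather than on a recent observation, it persists between observations. A filter update, by contrast, adjusts the state only at observation times. In this toy setting the structural error is a deterministic function of the state, which is a far easier case than a real land surface model.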