Journal of Machine Learning for Modeling and Computing

Published 4 issues per year

ISSN Print: 2689-3967

ISSN Online: 2689-3975


IMPUTATION OF CONTIGUOUS GAPS AND EXTREMES OF SUBHOURLY GROUNDWATER TIME SERIES USING RANDOM FORESTS

Volume 3, Issue 2, 2022, pp. 1-22
DOI: 10.1615/JMachLearnModelComput.2021038774

ABSTRACT

Machine learning can provide sustainable solutions to gap-fill groundwater (GW) data needed to adequately constrain watershed models. However, imputing missing extremes is more challenging than other parts of a hydrograph. To impute missing subhourly data, including extremes, within GW time-series data collected at multiple wells in the East River watershed, located in southwestern Colorado, we consider a single-well imputation (SWI) and a multiple-well imputation (MWI) approach. SWI gap-fills missing GW entries in a well using the same well's time-series data; MWI gap-fills a specific well's missing GW entry using the time series of neighboring wells. SWI takes advantage of linear interpolation and random forest (RF) approaches, whereas MWI exploits only the RF approach. We also use an information entropy framework to develop insights into how missing data patterns impact imputation. We discovered that if gaps were at random intervals, SWI could accurately impute up to 90% of missing data over an approximately two-year period. Contiguous gaps constituted more complex scenarios for imputation and required the use of MWI. Information entropy suggested that if gaps were contiguous, up to 50% of missing GW data could be estimated accurately over an approximately two-year period. The RF-feature importance suggested that a time feature (months) and a space feature (neighboring wells) were the most important predictors in the SWI and MWI. We also noted that neither SWI nor MWI methods could capture the missing extremes of a hydrograph. To counter this, we developed a new sequential approach and demonstrated the imputation of missing extremes in a GW time series with high accuracy.
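As a rough illustration of the linear-interpolation part of SWI described above, the following Python sketch fills interior gaps in a single well's series using only that well's own observations. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `linear_interpolate` and `longest_gap`, the use of `None` to mark missing entries, and the sample water levels are all hypothetical.

```python
from typing import List, Optional


def linear_interpolate(series: List[Optional[float]]) -> List[Optional[float]]:
    """Fill interior gaps (None) by linear interpolation between the
    nearest observed neighbors. Leading and trailing gaps stay None.
    Hypothetical helper, not taken from the paper."""
    filled = list(series)
    last_obs = None  # index of the most recent observed value
    for i, value in enumerate(series):
        if value is None:
            continue
        if last_obs is not None and i - last_obs > 1:
            start, end = series[last_obs], value
            span = i - last_obs
            # Place evenly spaced values on the line from start to end.
            for k in range(1, span):
                filled[last_obs + k] = start + (end - start) * k / span
        last_obs = i
    return filled


def longest_gap(series: List[Optional[float]]) -> int:
    """Length of the longest run of consecutive missing entries."""
    longest = current = 0
    for value in series:
        current = current + 1 if value is None else 0
        longest = max(longest, current)
    return longest


# Made-up groundwater levels with missing entries marked as None.
levels = [2.0, None, None, 3.5, None, 4.5, None]
filled = linear_interpolate(levels)
gap = longest_gap(levels)
```

A straight line between two observed endpoints never rises above or falls below them, so any peak or trough that lies inside a gap is lost. This is one reason long contiguous gaps and missing extremes call for the neighboring-well (MWI) and sequential approaches that the abstract describes.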
