Assessing Differences in Large Spatio-temporal Climate Datasets with a New Python package

Alexander Pinard, Dorit M. Hammerling, Allison H. Baker

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Output data from modern Earth system model simulations are consuming increasingly massive amounts of storage resources, and storing these climate model data is not economically sustainable. Previous work has motivated lossy compression as a potential solution because it achieves greater compression ratios than lossless compression. This further reduction comes at the cost of a loss of information, and therefore care must be taken to avoid introducing artifacts in the data that could affect scientific conclusions. In this paper, we introduce a Python package designed to aid in the analysis of differences in large spatio-temporal datasets, such as those produced by global climate models. While the new package is agnostic to the source of the differences, our motivation is to enable climate scientists to more easily assess the effects of lossy data compression by visualizing and computing derived spatio-temporal quantities that compare lossily compressed datasets to the original dataset. Because Python is quickly becoming the tool of choice for scientific data analysis in the geoscience community, this new package makes use of the Python software stack in Pangeo (an active NSF-funded community platform for Big Data geoscience). Interoperability with other Pangeo software tools means that the new package integrates easily into climate scientists' post-processing and analysis workflows, which we hope will facilitate the adoption of lossy compression in the climate modeling community.

Original language: English
Title of host publication: Proceedings - 2020 IEEE International Conference on Big Data, Big Data 2020
Editors: Xintao Wu, Chris Jermaine, Li Xiong, Xiaohua Tony Hu, Olivera Kotevska, Siyuan Lu, Weijia Xu, Srinivas Aluru, Chengxiang Zhai, Eyhab Al-Masri, Zhiyuan Chen, Jeff Saltz
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2699-2707
Number of pages: 9
ISBN (Electronic): 9781728162515
State: Published - Dec 10 2020
Event: 8th IEEE International Conference on Big Data, Big Data 2020 - Virtual, Online, United States
Duration: Dec 10 2020 to Dec 13 2020

Publication series

Name: Proceedings - 2020 IEEE International Conference on Big Data, Big Data 2020

Conference

Conference: 8th IEEE International Conference on Big Data, Big Data 2020
Country/Territory: United States
City: Virtual, Online
Period: 12/10/20 to 12/13/20

Keywords

  • Pangeo
  • Python
  • climate model
  • compression
