What can real information content tell us about compressing climate model data?

Hayden Sather, Alexander Pinard, Allison H. Baker, Dorit M. Hammerling

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

The massive data volumes produced by climate simulation models create an urgent need for data reduction. Lossy compression in particular is a solution that can significantly reduce storage requirements; however, a tradeoff must be made between the amount of compression applied and the scientific integrity of the data. Determining how much compression can be applied is therefore vital for applying lossy compression. One particular metric for gauging the quality of compression is the percentage of real information present in the original data that is preserved in the compressed data. We compute bitwise real information content for several climate variables from the popular Community Earth System Model, and we investigate the amount of compression that can be applied to each of these climate variables using two popular compression algorithms designed for floating-point data while preserving 99% of the real information content. The analysis of the real information content of data after lossy compression has been applied shows a helpful visualization of how compression artifacts have been introduced to the data. Finally, we demonstrate how this real information content can be used in a straightforward manner to determine compressor settings for our data.
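
As a rough illustration of the idea in the abstract, the sketch below estimates a per-bit information content for float32 data and then picks how many leading bits are needed to retain 99% of it. It assumes one common definition of bitwise real information (the mutual information between the same bit position in adjacent values); the abstract does not spell out the exact procedure used in the paper, so this should not be read as the authors' implementation. The function names `bitwise_information` and `bits_to_keep` are hypothetical, and refinements such as filtering out statistically insignificant information are omitted.

```python
import numpy as np

def bitwise_information(a):
    """Estimate, for each of the 32 bit positions of float32 values,
    the mutual information (in bits) between adjacent elements."""
    bits = np.ascontiguousarray(a, dtype=np.float32).ravel().view(np.uint32)
    left, right = bits[:-1], bits[1:]
    info = np.zeros(32)
    for pos in range(32):  # pos 0 = sign, 1-8 = exponent, 9-31 = mantissa
        shift = np.uint32(31 - pos)
        x = ((left >> shift) & np.uint32(1)).astype(np.int64)
        y = ((right >> shift) & np.uint32(1)).astype(np.int64)
        # 2x2 joint probability table of (bit in left value, bit in right value)
        joint = np.bincount(2 * x + y, minlength=4).reshape(2, 2) / x.size
        px = joint.sum(axis=1, keepdims=True)
        py = joint.sum(axis=0, keepdims=True)
        nz = joint > 0
        info[pos] = np.sum(joint[nz] * np.log2(joint[nz] / (px * py)[nz]))
    return info

def bits_to_keep(info, fraction=0.99):
    """Smallest number of leading bits (sign first) whose cumulative
    information reaches the requested fraction of the total."""
    info = np.clip(info, 0.0, None)  # guard against tiny negative round-off
    total = info.sum()
    if total == 0:
        return 0
    cumulative = np.cumsum(info) / total
    return int(np.searchsorted(cumulative, fraction) + 1)
```

For a variable stored as a float32 array `field`, `bits_to_keep(bitwise_information(field))` gives a count of leading bits that includes the sign bit and the 8 exponent bits. Under these assumptions, subtracting 9 gives a candidate number of mantissa bits to retain when configuring a precision-based lossy compressor.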

Original language: English
Title of host publication: Proceedings of DRBSD-8 2022
Subtitle of host publication: 8th International Workshop on Data Analysis and Reduction for Big Scientific Data, Held in conjunction with SC 2022: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 29-36
Number of pages: 8
ISBN (Electronic): 9781665463379
State: Published - 2022
Event: 8th IEEE/ACM International Workshop on Data Analysis and Reduction for Big Scientific Data, DRBSD-8 2022 - Dallas, United States
Duration: Nov 13 2022 → Nov 18 2022

Publication series

Name: Proceedings of DRBSD-8 2022: 8th International Workshop on Data Analysis and Reduction for Big Scientific Data, Held in conjunction with SC 2022: The International Conference for High Performance Computing, Networking, Storage and Analysis

Conference

Conference: 8th IEEE/ACM International Workshop on Data Analysis and Reduction for Big Scientific Data, DRBSD-8 2022
Country/Territory: United States
City: Dallas
Period: 11/13/22 → 11/18/22

Keywords

  • Bit Grooming
  • Climate Model Data
  • Entropy
  • Lossy Compression
  • Real Information Content
  • ZFP
