TY - GEN
T1 - A methodology for evaluating the impact of data compression on climate simulation data
AU - Baker, Allison H.
AU - Levy, Michael N.
AU - Xu, Haiying
AU - Nychka, Doug
AU - Dennis, John M.
AU - Mickelson, Sheri A.
PY - 2014
Y1 - 2014
AB - High-resolution climate simulations require tremendous computing resources and can generate massive datasets. At present, preserving the data from these simulations consumes vast storage resources at institutions such as the National Center for Atmospheric Research (NCAR). The historical data generation trends are economically unsustainable, and storage resources are already beginning to limit science objectives. To mitigate this problem, we investigate the use of data compression techniques on climate simulation data from the Community Earth System Model. Ultimately, to convince climate scientists to compress their simulation data, we must be able to demonstrate that the reconstructed data reveals the same mean climate as the original data, and this paper is a first step toward that goal. To that end, we develop an approach for verifying the climate data and use it to evaluate several compression algorithms. We find that the diversity of the climate data requires the individual treatment of variables, and, in doing so, the reconstructed data can fall within the natural variability of the system, while achieving compression rates of up to 5:1.
KW - Data compression
KW - High performance computing
UR - https://www.scopus.com/pages/publications/84904421921
U2 - 10.1145/2600212.2600217
DO - 10.1145/2600212.2600217
M3 - Conference contribution
AN - SCOPUS:84904421921
SN - 9781450327480
T3 - HPDC 2014 - Proceedings of the 23rd International Symposium on High-Performance Parallel and Distributed Computing
SP - 203
EP - 214
BT - HPDC 2014 - Proceedings of the 23rd International Symposium on High-Performance Parallel and Distributed Computing
PB - Association for Computing Machinery
T2 - 23rd ACM Symposium on High-Performance Parallel and Distributed Computing, HPDC 2014
Y2 - 23 June 2014 through 27 June 2014
ER -