Parallel high-resolution climate data analysis using Swift

Matthew Woitaszek, John M. Dennis, Taleena R. Sines

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

10 Scopus citations

Abstract

Advances in software parallelism and high-performance systems have resulted in an order of magnitude increase in the volume of output data produced by the Community Earth System Model (CESM). As the volume of data produced by CESM increases, the single-threaded script-based software packages traditionally used to post-process model output data have become a bottleneck in the analysis process. This paper presents a parallel version of the CESM atmosphere model data analysis workflow implemented using the Swift scripting language. Using the Swift implementation of the workflow, the time to analyze a 10-year atmosphere simulation on a typical cluster is reduced from 95 to 32 minutes on a single 8-core node and to 20 minutes on two nodes. The parallelized workflow is then used to evaluate several new data-intensive computational systems that feature RAM-based and flash-based storage. Even when constraining parallelism to limit the amount of file system space used by intermediate temporary data, our results show that the Swift-based implementation significantly reduces data analysis time.
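The timings quoted in the abstract can be turned into speedup and parallel-efficiency figures with a few lines of arithmetic. The following is a minimal sketch, not part of the paper: the function names are hypothetical, and the 16-core count for the two-node run is an assumption (the abstract only states that a single node has 8 cores).

```python
def speedup(baseline_minutes: float, parallel_minutes: float) -> float:
    """Ratio of the original run time to the parallel run time."""
    return baseline_minutes / parallel_minutes


def efficiency(speedup_value: float, cores: int) -> float:
    """Speedup divided by the number of cores used."""
    return speedup_value / cores


# Timings from the abstract, in minutes, for a 10-year atmosphere simulation.
BASELINE_MINUTES = 95.0
runs = {
    "1 node, 8 cores": (32.0, 8),
    # Assumption: the second node also has 8 cores.
    "2 nodes, 16 cores (assumed)": (20.0, 16),
}

results = {}
for label, (minutes, cores) in runs.items():
    s = speedup(BASELINE_MINUTES, minutes)
    results[label] = (s, efficiency(s, cores))
    print(f"{label}: speedup {s:.2f}x, efficiency {results[label][1]:.0%}")
```

Under these assumptions the speedup is just under 3x on one node and 4.75x on two nodes, so efficiency per core falls as cores are added.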

Original language: English
Title of host publication: MTAGS'11 - Proceedings of the 2011 ACM International Workshop on Many Task Computing on Grids and Supercomputers, Co-located with SC'11
Pages: 5-14
Number of pages: 10
DOIs
State: Published - 2011
Event: 2011 ACM International Workshop on Many Task Computing on Grids and Supercomputers, MTAGS'11, Co-located with SC'11 - Seattle, WA, United States
Duration: Nov 14 2011 to Nov 14 2011

Publication series

Name: MTAGS'11 - Proceedings of the 2011 ACM International Workshop on Many Task Computing on Grids and Supercomputers, Co-located with SC'11

Conference

Conference: 2011 ACM International Workshop on Many Task Computing on Grids and Supercomputers, MTAGS'11, Co-located with SC'11
Country/Territory: United States
City: Seattle, WA
Period: 11/14/11 to 11/14/11

Keywords

  • climate modeling
  • data-intensive computing
  • many-task computing
  • workflow orchestration
