Abstract
This chapter describes efforts to improve the performance of WSM6, a microphysics package widely used in numerical weather prediction, through a variety of techniques: threading, vectorization, array alignment, improved data locality, and optional use of compile-time constants for loop and array index bounds. Code examples illustrate each technique along with its performance benefits on Intel Xeon processors and Intel Xeon Phi coprocessors. The use of tools such as the Intel Thread Inspector to speed up the performance tuning process is also described.
| Field | Value |
|---|---|
| Original language | English |
| Title of host publication | High Performance Parallelism Pearls |
| Subtitle of host publication | Multicore and Many-core Programming Approaches |
| Publisher | Elsevier Inc. |
| Pages | 7-23 |
| Number of pages | 17 |
| Volume | 2 |
| ISBN (Electronic) | 9780128038901 |
| ISBN (Print) | 9780128038192 |
| State | Published - Jul 23 2015 |
Keywords
- Compile-time constants
- Fine-grained parallelism
- Intel Thread Inspector
- Many-core
- Multicore
- Numerical weather prediction
- Xeon
- Xeon Phi