Parallelize Over Data Particle Advection: Participation, Ping Pong Particles, and Overhead
Overview
KVL, along with international partners, investigated the performance challenges of the Parallelize over Data (POD) algorithm, a common method for parallel particle advection in scientific visualization. Particle advection is crucial for understanding fluid flow in simulations, but achieving efficient performance in distributed-memory settings is difficult. The research analyzed the POD algorithm's behavior and identified the factors that contribute to its performance bottlenecks.
Work Summary
The POD algorithm is widely used due to its simplicity and minimal data movement. However, it suffers from scaling issues, especially when particle advection workloads become unbalanced. This paper introduces two novel metrics, rank participation and aggregated rank participation, to quantify workload imbalance over time. It also analyzes behavior from the perspective of individual particles and finds that the overhead of moving particles between processes significantly impacts execution time. The study reveals that repeated circulation of particles between blocks, termed “ping pong particles,” is a major cause of performance degradation. The research involved designing representative workloads, executing them on a supercomputer, and collecting timing and statistical data. The findings shed light on the underlying causes of poor performance and offer directions for future research.
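To make the two ideas above more concrete, here is a minimal Python sketch. It is an illustration only, not the paper's formal definitions or its instrumentation: it assumes rank participation at a time t can be read as the fraction of ranks that are actively advecting at t, and it assumes a "ping pong" can be counted as a particle returning directly to the block it just left (A to B to A). The function names, the `busy_intervals` layout, and the sample data are all hypothetical. Aggregated rank participation is not sketched because this summary does not state its definition.

```python
def rank_participation(busy_intervals, num_ranks, t):
    """Fraction of ranks advecting particles at time t.

    busy_intervals is a hypothetical layout: rank id -> list of
    (start, end) intervals during which that rank was advecting.
    This is an assumed reading of the metric, not the paper's formula.
    """
    active = sum(
        1
        for rank in range(num_ranks)
        if any(start <= t < end for start, end in busy_intervals.get(rank, []))
    )
    return active / num_ranks


def count_ping_pongs(block_history):
    """Count direct returns to the block a particle just left (A -> B -> A).

    block_history is the ordered list of block ids the particle visited.
    This counting rule is an assumption made for illustration.
    """
    return sum(
        1
        for i in range(2, len(block_history))
        if block_history[i] == block_history[i - 2]
        and block_history[i] != block_history[i - 1]
    )


# Made-up timings for four ranks (seconds); rank 3 never receives work.
busy = {
    0: [(0.0, 4.0)],
    1: [(0.0, 1.0), (3.0, 4.0)],
    2: [(0.0, 0.5)],
    3: [],
}

early = rank_participation(busy, 4, 0.25)  # ranks 0, 1, 2 are busy
middle = rank_participation(busy, 4, 2.0)  # only rank 0 is busy

# A particle that bounces between blocks 0 and 1 before moving on to block 2.
bounces = count_ping_pongs([0, 1, 0, 1, 2])
```

Under these assumptions, a participation value that drops well below 1.0 for long stretches indicates that most ranks are idle while a few do the work, and a high bounce count for a particle means it repeatedly incurs the inter-process communication overhead described above.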
Impact
This research provides valuable insights into the performance limitations of the POD algorithm, a foundational tool in scientific visualization. By identifying and quantifying the factors that cause performance bottlenecks, this work can guide the development of more efficient algorithms for parallel particle advection. Such algorithms are particularly important for handling large datasets and complex simulations on high-performance computing systems, which in turn supports scientific discovery in fields like fluid dynamics and climate modeling. This work will help particle advection workloads on Shaheen III at KAUST run more efficiently and make better use of compute resources.