System Software and Services

Performance Interference in Next-Generation Systems

Contact: Patrick G. Bridges (bridges@cs.unm.edu); Oscar Mondragon (oscar.mondragon@gmail.com)

System software for next-generation HPC systems must make complex resource allocation decisions, but these decisions can interfere with application performance. We are researching new techniques to characterize and optimize how system software interacts with applications on forthcoming HPC systems. These techniques leverage novel extreme value models of the impact of interference on HPC application performance. Our initial results, to be presented this year at Supercomputing, demonstrate that this approach can accurately predict the impact of a wide range of interference workloads on modern HPC applications.
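To illustrate why extreme value statistics are a natural fit here, consider a bulk-synchronous application in which each step finishes only when the slowest process finishes, so the step time is governed by the maximum per-process delay. The following is a minimal sketch of that idea only, not the project's actual model: it assumes exponentially distributed interference delays, fits a Gumbel distribution by the method of moments, and uses hypothetical names (`simulate_step_maxima`, `fit_gumbel`, `gumbel_quantile`) and parameters chosen purely for illustration.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def simulate_step_maxima(num_procs, num_steps, mean_delay, rng):
    """Per-step maximum interference delay across all processes.

    Assumption: delays are independent and exponentially distributed.
    Real interference is workload dependent; this is illustrative only.
    """
    rate = 1.0 / mean_delay
    return [
        max(rng.expovariate(rate) for _ in range(num_procs))
        for _ in range(num_steps)
    ]


def fit_gumbel(maxima):
    """Method-of-moments Gumbel fit; returns (location, scale)."""
    scale = statistics.stdev(maxima) * math.sqrt(6) / math.pi
    location = statistics.fmean(maxima) - EULER_GAMMA * scale
    return location, scale


def gumbel_quantile(location, scale, p):
    """Delay that a step's maximum stays below with probability p."""
    return location - scale * math.log(-math.log(p))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
maxima = simulate_step_maxima(num_procs=256, num_steps=2000,
                              mean_delay=1.0, rng=rng)
location, scale = fit_gumbel(maxima)
median_delay = gumbel_quantile(location, scale, 0.50)
tail_delay = gumbel_quantile(location, scale, 0.99)
```

Under these assumptions, the fitted location grows roughly with the logarithm of the process count, which is one way to see why even small per-process interference can matter at scale.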

OS Support for Application Composition

Contact: Noah Evans (nevans@sandia.gov); Patrick G. Bridges (bridges@cs.unm.edu)

Emerging applications increasingly rely on multiple cooperating components to model and analyze complex phenomena. We are researching novel OS mechanisms to support such composed applications, for example by efficiently handling data movement and control transfer between co-located application components. Our research leverages features of the Hobbes Exascale Operating System to effectively support emerging composed applications.

Supporting Thread-Level Heterogeneity in Coupled Applications

Contact: Sam Gutierrez (samuel@cs.unm.edu); Dorian C. Arnold (darnold@cs.unm.edu)

Hybrid parallel programming models that combine message passing and multithreading (MP+MT) are becoming more popular, extending the basic message passing (MP) model, which relies on single-threaded processes for parallelism. As a consequence, coupled parallel applications increasingly combine MP libraries with MP+MT libraries that prefer different degrees of threading, resulting in thread-level heterogeneity. Our approach enables full utilization of all available compute resources throughout an application's execution by providing programmable facilities that dynamically reconfigure runtime environments for compute phases with differing threading factors and affinities.