High-Performance Communication

Power and Performance Tradeoffs in Exascale Networks

Contact: Taylor Groves (tgroves@cs.unm.edu); Dorian C. Arnold (darnold@cs.unm.edu)

Networks play a crucial role in building responsive and effective high-performance systems. Our research aims to develop a better understanding of the tradeoffs between power and performance in network design. Large-scale simulation and new approaches to network analysis and visualization allow us to extract port-level details and provide valuable insights into how best to design Exascale networks.

Additionally, our research characterized contention in the memory subsystem caused by one-sided communication. This work demonstrated that this contention could reduce memory bandwidth by 56%. We analyzed the performance impact on a variety of hardware architectures and demonstrated how machine learning could detect the contention and predict the resulting increase in application runtime. Furthermore, we showed that this contention becomes worse at scale, increasing application runtime by a factor of three (for 8,192-process runs). Lastly, we evaluated the benefits of three candidate hardware and software solutions.

HPC Data Movement Reduction for Improved Performance

Contact: Dewan Ibtesham (dewan@cs.unm.edu); Dorian C. Arnold (darnold@cs.unm.edu)

Memory and I/O operations can no longer keep up with improvements in processing. We study lossless data compression techniques that trade some processing capability for reduced data movement between memory, stable storage, and compute cores in current and future HPC software/hardware systems. Our research demonstrates similarities among communication messages, which we exploit to reduce network contention for communication-heavy applications. We also simulate a memory subsystem that allocates a portion of memory for storing compressed pages, thereby increasing the per-core memory capacity of HPC systems and providing direction for future multi-level memory systems.
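The basic tradeoff (spend CPU cycles to move fewer bytes) can be illustrated with a minimal sketch. This is not the project's implementation: it assumes Python's standard zlib module as the lossless compressor, and the helper names pack_message and unpack_message, along with the one-byte flag format, are hypothetical choices made for illustration only.

```python
import zlib

# Hypothetical one-byte flags marking whether the body is compressed.
RAW = b"\x00"
COMPRESSED = b"\x01"


def pack_message(payload: bytes, level: int = 6) -> bytes:
    """Compress a payload before it is moved, but only when that saves bytes.

    Incompressible payloads are sent as-is so that compression never
    increases the amount of data moved by more than the one-byte flag.
    """
    compressed = zlib.compress(payload, level)
    if len(compressed) < len(payload):
        return COMPRESSED + compressed
    return RAW + payload


def unpack_message(message: bytes) -> bytes:
    """Recover the original payload from a packed message."""
    flag, body = message[:1], message[1:]
    return zlib.decompress(body) if flag == COMPRESSED else body


if __name__ == "__main__":
    # A highly repetitive buffer stands in for messages with similar content.
    payload = b"boundary-cell-value;" * 512
    message = pack_message(payload)
    assert unpack_message(message) == payload
    print(f"original: {len(payload)} bytes, moved: {len(message)} bytes")
```

Falling back to the raw payload when compression does not help keeps the worst-case overhead at one byte per message; whether the CPU time spent compressing pays off depends on the application and the system, which is the question the research above investigates.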

Multi-core HPC Communication Systems

Contact: Matthew Dosanjh (mdosanjh@cs.unm.edu); Ryan Grant (regrant@sandia.gov); Nathan Hjelm (hjelmn@lanl.gov)

High-performance communication is essential to efficient parallel computation, but it is difficult to achieve on modern many-core systems. We are researching the performance of, and optimizations for, different multi-core HPC communication systems, including one-sided and two-sided operations in Open MPI and thread extensions for OpenSHMEM.