[Colloquium] Achieving High Read Performance from a Write Optimized File System

November 4, 2011

Watch Colloquium: 

M4V file (619 MB)

  • Date: Friday, November 4, 2011 
  • Time: 12:00 pm — 12:50 pm
  • Place: Centennial Engineering Center 1041

Adam Manzanares
Los Alamos National Laboratory

This talk will focus on the Parallel Log Structured File System (PLFS), which was developed at Los Alamos National Laboratory (LANL) to improve shared-file write performance. PLFS improves write performance by transparently transforming writes so that each process, while logically writing to a shared file, physically writes to its own unique file. By removing this concurrency, PLFS improved the write performance of many applications by multiple orders of magnitude. However, reconstructing the logical file from the multitude of physical files has proven difficult. To alleviate this issue, we developed several collective techniques to aggregate information from the multiple component pieces. This enables PLFS to maintain its large write improvements without sacrificing read performance for many workloads. There are other workloads, however, that remain challenging. Currently, Los Alamos is developing a scalable HPC key-value store to address these remaining challenges. Additionally, the transformative properties of PLFS have recently been leveraged to improve the metadata performance of a production parallel file system.
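
The write transformation described in the abstract can be illustrated with a small toy model. The sketch below is not PLFS code and does not use the PLFS API; the class ToyLogFile, the data.<writer> file names, and the sequence counter that stands in for timestamps are all hypothetical names chosen for illustration. It assumes that each writer appends to its own physical log and records an index entry, and that a reader merges every writer's index to rebuild the logical file.

```python
import os
import tempfile


class ToyLogFile:
    """Toy model of a shared logical file stored as one append-only log per writer.

    Hypothetical illustration only; this is not how PLFS is actually implemented.
    """

    def __init__(self, container_dir):
        self.container_dir = container_dir
        os.makedirs(container_dir, exist_ok=True)
        self.indexes = {}    # writer_id -> list of index entries for that writer
        self.log_sizes = {}  # writer_id -> bytes appended to that writer's log so far
        self.seq = 0         # global counter, a stand-in for write timestamps

    def _log_path(self, writer_id):
        return os.path.join(self.container_dir, f"data.{writer_id}")

    def write(self, writer_id, logical_offset, data):
        """Append data to the writer's own log and remember where it belongs logically."""
        physical_offset = self.log_sizes.get(writer_id, 0)
        with open(self._log_path(writer_id), "ab") as log:
            log.write(data)
        self.log_sizes[writer_id] = physical_offset + len(data)
        entry = (self.seq, logical_offset, len(data), writer_id, physical_offset)
        self.indexes.setdefault(writer_id, []).append(entry)
        self.seq += 1

    def read_all(self):
        """Merge every writer's index, then replay entries in write order."""
        entries = sorted(e for per_writer in self.indexes.values() for e in per_writer)
        size = max((off + length for _, off, length, _, _ in entries), default=0)
        logical = bytearray(size)
        for _, off, length, writer_id, phys in entries:
            with open(self._log_path(writer_id), "rb") as log:
                log.seek(phys)
                logical[off:off + length] = log.read(length)
        return bytes(logical)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        shared = ToyLogFile(os.path.join(tmp, "shared.out"))
        shared.write(0, 0, b"AAAA")  # writer 0, logical bytes 0-3
        shared.write(1, 4, b"BBBB")  # writer 1, logical bytes 4-7
        shared.write(0, 8, b"CCCC")  # writer 0, logical bytes 8-11
        shared.write(1, 2, b"xx")    # a later write overrides bytes 2-3
        return shared.read_all(), sorted(os.listdir(shared.container_dir))


if __name__ == "__main__":
    data, files = demo()
    print(data)
    print(files)
```

In this toy, writes never contend for a shared physical file, but a read has to consult every writer's index before it can return any data. That index merge is the kind of reconstruction cost the abstract says the collective aggregation techniques are meant to reduce.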

 

Bio: Adam Manzanares is currently a Nicholas C. Metropolis postdoctoral fellow at the Los Alamos National Laboratory (LANL). He was appointed to this position in November 2010 after joining LANL in July 2010 as a postdoctoral researcher. Dr. Manzanares received his Ph.D. from Auburn University in May 2010 with a focus on energy-efficient storage systems. He is currently focused on storage systems for high performance computing applications and develops middleware layers to improve the performance of HPC storage systems. He is also researching compression techniques and data formatting libraries for scientific data sets.