Research projects

I collaborate with a whole bunch of smart people at UNM, Georgia Tech, Sandia National Laboratories, and Oak Ridge National Laboratory. Here are the major ways in which I spend my time.
Scalable Data Services for Petascale Applications
Providing high-performance I/O for data-intensive high-end computing applications requires working with I/O systems at an unproductively low level of abstraction. This project seeks to provide higher-level I/O abstractions that make complex I/O tasks possible. The central abstraction of this approach is the structured stream. Structured streams provide a model for embedding application-specific functionality between application components. This functionality is applied to data as it moves through I/O graphs, which perform routing and in-band modification. The metadata necessary to connect these components is constructed out-of-band by autonomous metabots, which moves the performance impact of metadata maintenance out of the "fast path" of high-end computing applications.
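To give a flavor of what in-band modification might look like, here is a minimal Python sketch. It is not the project's actual interface: the names `Record`, `scale`, `keep_if`, and `run_graph` are all hypothetical, and a simple linear chain of stages stands in for a full I/O graph with routing.

```python
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical types for illustration only; not taken from the project.
Record = Dict[str, float]                        # one structured record in the stream
Stage = Callable[[Iterator[Record]], Iterator[Record]]


def scale(field: str, factor: float) -> Stage:
    """Build a stage that rescales one field of every record as it passes."""
    def stage(records: Iterator[Record]) -> Iterator[Record]:
        for rec in records:
            out = dict(rec)                      # copy so the input is not mutated
            out[field] = rec[field] * factor
            yield out
    return stage


def keep_if(predicate: Callable[[Record], bool]) -> Stage:
    """Build a stage that forwards only the records matching the predicate."""
    def stage(records: Iterator[Record]) -> Iterator[Record]:
        return (rec for rec in records if predicate(rec))
    return stage


def run_graph(source: Iterable[Record], stages: List[Stage]) -> List[Record]:
    """Push records from the source through each stage in order."""
    stream = iter(source)
    for stage in stages:
        stream = stage(stream)                   # stages are lazy generators
    return list(stream)


source = [
    {"step": 0, "temp": 300.0},
    {"step": 1, "temp": 305.0},
    {"step": 2, "temp": 310.0},
]
result = run_graph(source, [keep_if(lambda r: r["step"] >= 1), scale("temp", 0.5)])
```

The application components at either end see only records going in and records coming out; the filtering and rescaling happen in between, which is the general idea behind placing functionality in the stream rather than in the application.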
publications
Scalable Proactive Control Planes
We argue that traditional passive client interfaces to directory services are not sufficient for the application environments enabled by grid and pervasive computing, where data are updated at high frequency. In particular, an exclusively passive interface not only hinders service scalability but also indirectly restricts the behavior of potential applications. Consequently, we have proposed a customizable active mode through which clients can subscribe to be notified of changes to data of interest. We have designed and implemented the Proactive Directory Service (PDS) to test our ideas. PDS clients can dynamically tune the level of detail and granularity of these notifications through filter functions instantiated at the server or at the object's owner, and by remotely tuning the functionality of those filters. We are currently building the next generation of these tools as part of the Petascale Storage project.
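The subscribe-with-filter idea can be sketched in a few lines of Python. This is only an illustration under assumed names: `ProactiveDirectory`, `subscribe`, `update`, and `changed_by_at_least` are hypothetical and do not reflect the real PDS interface, and everything runs in one process rather than across a network.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical signatures for illustration; not the real PDS API.
Filter = Callable[[str, Any, Any], bool]     # (key, old value, new value) -> notify?
Callback = Callable[[str, Any], None]        # (key, new value)


class ProactiveDirectory:
    """Toy in-process directory that pushes changes to subscribers."""

    def __init__(self) -> None:
        self._entries: Dict[str, Any] = {}
        self._subs: List[Tuple[Filter, Callback]] = []

    def subscribe(self, filt: Filter, callback: Callback) -> None:
        """Register a callback; the filter runs on the directory side."""
        self._subs.append((filt, callback))

    def update(self, key: str, value: Any) -> None:
        """Store a new value and notify subscribers whose filter accepts it."""
        old = self._entries.get(key)
        self._entries[key] = value
        for filt, callback in self._subs:
            if filt(key, old, value):
                callback(key, value)

    def lookup(self, key: str) -> Any:
        """Traditional passive interface: the client asks, the directory answers."""
        return self._entries[key]


def changed_by_at_least(threshold: float) -> Filter:
    """Filter that passes first values and changes of at least `threshold`."""
    def filt(key: str, old: Any, new: Any) -> bool:
        return old is None or abs(new - old) >= threshold
    return filt


seen: List[Tuple[str, float]] = []
directory = ProactiveDirectory()
directory.subscribe(changed_by_at_least(10.0), lambda k, v: seen.append((k, v)))
directory.update("node7/load", 50.0)   # first value: subscriber is notified
directory.update("node7/load", 53.0)   # small change: filtered out
directory.update("node7/load", 70.0)   # large change from 53.0: notified
```

Because the filter runs where the data lives, the small update never reaches the client at all, which is where the scalability benefit over polling a passive interface would come from.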
publications
Middleware and Control Structures for Sensor Networks
Currently, users of sensor network applications must adopt custom methods and interfaces to perform common management tasks. This makes monitoring, tasking, diagnosing, and debugging sensor networks and their applications cumbersome; for example, users often cannot transfer skills learned for one application to another. Our work provides standardized end-to-end (from user to sensor nodes) communication and control over a sensor network. A POSIX-style filesystem interface enables users to view and update data, organize groups of sensors, and retask sensor nodes. Sensor nodes appear as directories containing sensor and data files. Users are then able to use common command-line utilities to interact with the sensor network. We are currently testing the fidelity of our approach by applying management tools such as file system visualizers to our work.
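As a rough illustration of why a filesystem view is convenient, the Python sketch below builds a mock directory tree in a temporary folder and then reads and retasks "nodes" using nothing but ordinary file operations. The layout is an assumption made for this example: the node names and the `temperature` and `sample_rate` files are hypothetical, and no real sensor network is involved.

```python
import tempfile
from pathlib import Path
from typing import Dict


def make_mock_network(root: Path) -> None:
    """Create a stand-in tree: one directory per node, one file per sensor or setting."""
    for node, temp in (("node1", "21.5"), ("node2", "19.0")):
        node_dir = root / node
        node_dir.mkdir()
        (node_dir / "temperature").write_text(temp + "\n")   # hypothetical sensor file
        (node_dir / "sample_rate").write_text("1\n")         # hypothetical control file


def read_all(root: Path, sensor: str) -> Dict[str, float]:
    """Read the same sensor file from every node directory."""
    return {
        node_dir.name: float((node_dir / sensor).read_text())
        for node_dir in sorted(root.iterdir())
        if node_dir.is_dir()
    }


def retask(root: Path, node: str, rate_hz: int) -> None:
    """Retask a node by writing to its control file, as a shell redirect would."""
    (root / node / "sample_rate").write_text(f"{rate_hz}\n")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    make_mock_network(root)
    readings = read_all(root, "temperature")
    retask(root, "node2", 10)
    new_rate = int((root / "node2" / "sample_rate").read_text())
```

With a layout like this, the same operations could be done from a shell with tools such as `ls` and `cat`, which is the kind of skill transfer the filesystem interface is meant to enable.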
publications
Dynamic Differential Data Protection
A key concern among developers of extensible systems is the ability to provide adaptation approaches without sacrificing the level of security achievable with more static (and less adaptable) solutions. For example, where will adaptations execute? In what environment will they run? What level of access will they have to the existing system? Dynamic Differential Data Protection (D3P) has addressed these issues through the creation and evaluation of protection mechanisms for middleware infrastructures. D3P provides control over the data typing space for such middleware as well as over the abstractions provided by the middleware itself. D3P also provides a general and flexible extension/customization model for distributed applications based on publish/subscribe middleware. We are now exploring how D3P concepts can be applied to advanced software architectures for high-performance computing.
publications
Lightweight Storage for High-Performance Computing
Today's high-end massively parallel processing machines have thousands to tens of thousands of processors, with next-generation systems planned to have in excess of one hundred thousand processors. For systems of such scale, efficient I/O is a significant challenge that cannot be solved using traditional approaches. In particular, general purpose parallel file systems that limit applications to standard interfaces and access policies do not scale and are a performance bottleneck for many scientific applications. This project investigates the use of a "lightweight" approach to I/O that requires the application or I/O-library developer to extend a core set of critical I/O functionality with the minimum set of features and services required by its target applications. We argue that this approach allows the development of I/O libraries that are both scalable and secure.
publications
High-Performance Structured Data Exchange
The EVPath communication infrastructure, along with FFS, its companion data representation library, is the foundation for many research efforts in the systems group at Georgia Tech. I continue to collaborate with researchers there on topics ranging from wire formats for heterogeneous communication to peer interaction paradigms (pull/push styles of communication) to required services and driving applications.
publications