Colloquia
Date: Friday, December 9, 2011
Time: 12:00 pm — 12:50 pm
Place: Centennial Engineering Center 1041
Andrew Wilson
Sandia National Laboratories
As data sizes grow, so too does the cognitive load on the scientists
who want to use the data and the computational load for running their
analyses and queries. The common visualization paradigm of "show
everything and let the analyst sort through it" is already failing on
medium-to-large data sets (tens of terabytes) because of the difficulty
of identifying exactly which parts of the data are 'interesting'.
I will argue that the separation of computation and analysis is
improper when working with large data. The process of identifying and
labeling higher-order structure in the data -- the fundamental goal of
analysis -- must begin in the computation itself. Moreover, the
metaphors and abstractions used for analysis must preserve and
summarize meaning at some desired scale so that a high-level overview
will give immediate clues to small-scale features of interest.
Bio:
Andrew Wilson
is a senior member of the technical staff at Sandia
National Laboratories in Albuquerque, New Mexico. The problem of
computing with large data descended upon him during his first week of
graduate school and has occupied his attention since then.
While orbiting the issue, he has worked on facets of the end-to-end
processing of large data, starting with data import and ending with
visual representations, with excursions into cybersecurity,
information visualization, graph algorithms, statistical analysis of
ensembles of simulation runs, parallel topic modeling and system
architectures for data-intensive computing. He received his
Ph.D. from the University of North Carolina in 2002.
Date: Friday, December 2, 2011
Time: 12:00 pm — 12:50 pm
Place: Centennial Engineering Center 1041
James Theiler
Los Alamos National Laboratory
Many problems in image processing require that a covariance matrix be
accurately estimated, often from a limited number of data samples.
This is particularly challenging for hyperspectral imagery, where the
number of spectral channels can run into the hundreds. The Sparse
Matrix Transform (SMT) provides a parsimonious, computation-friendly,
and full-rank estimator of covariance matrices. But unlike other
covariance regularization schemes, which deal with the eigenvalues of
a sample covariance, the SMT works with the eigenvectors. This talk
will describe the SMT and its utility for a range of problems that
arise in hyperspectral data analysis, including weak signal detection,
dimension reduction, anomaly detection, and anomalous change
detection.
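To make the estimator concrete, here is a hedged Python sketch of an SMT-style covariance estimate in the spirit of the greedy Givens-rotation construction; the function name, the rotation count K, and the stopping rule are illustrative assumptions, not the exact estimator discussed in the talk:

```python
import numpy as np

def smt_covariance(X, K):
    """Hedged sketch of a Sparse Matrix Transform covariance estimate:
    constrain the eigenvector estimate to a product of K Givens rotations,
    each chosen greedily to decorrelate the most correlated pair."""
    S = np.cov(X, rowvar=False)               # sample covariance (often rank-deficient)
    p = S.shape[0]
    E = np.eye(p)                             # accumulated sparse rotations
    for _ in range(K):
        C = S**2 / np.outer(np.diag(S), np.diag(S))
        np.fill_diagonal(C, 0.0)
        if C.max() == 0.0:                    # already diagonal
            break
        i, j = np.unravel_index(np.argmax(C), C.shape)
        theta = 0.5 * np.arctan2(2.0 * S[i, j], S[i, i] - S[j, j])
        G = np.eye(p)
        c, s = np.cos(theta), np.sin(theta)
        G[i, i] = G[j, j] = c
        G[i, j], G[j, i] = -s, s
        S = G.T @ S @ G                       # zeroes S[i, j]
        E = E @ G
    return E @ np.diag(np.diag(S)) @ E.T      # full-rank, parsimonious estimate
```

Because only K rotation angles and the rotated variances are estimated, the result stays full rank even when the number of spectral channels far exceeds the number of samples.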
Bio:
James Theiler
finished his doctorate at Caltech in 1987,
with a thesis on statistical and computational aspects of identifying
chaos in time series. He followed a nonlinear trajectory to UCSD, MIT
Lincoln Laboratory, Los Alamos National Laboratory, and the Santa Fe
Institute. His interests in statistical data analysis and in having a
real job were combined in 1994, when he joined the Space and Remote
Sensing Sciences Group at Los Alamos. In 2005, he was named a Los
Alamos Laboratory Fellow. His professional interests include
statistical modeling, image processing, remote sensing, and machine
learning. Also, covariance matrices.
Date: Friday, November 11, 2011
Time: 12:00 pm — 12:50 pm
Place: Centennial Engineering Center 1041
David Danks
Carnegie Mellon University
In the past twenty years, multiple machine learning
algorithms have been developed that learn causal structure from
observational or experimental data. Most of the algorithms were
designed, however, for relatively "clean" data from linear systems,
and so are often not applicable to real-world problems. In this talk,
I will first outline the principles underlying this type of causal
learning, and then examine three new algorithms developed for more
complex causal learning: specifically, for non-linear and/or
non-Gaussian data, and for learning from multiple, overlapping
datasets. Time permitting, I will provide case studies (e.g., from
oceanography) showing these algorithms in action.
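To convey the flavor of the underlying principles, here is a generic additive-noise-model heuristic from the causal-discovery literature, not one of the three new algorithms in the talk; the cubic-polynomial regressor, the HSIC independence proxy, and the function names are illustrative assumptions:

```python
import numpy as np

def _hsic(a, b, sigma=1.0):
    """Biased HSIC estimate with Gaussian kernels, used as an independence proxy."""
    n = len(a)
    def gram(v):
        d = (v[:, None] - v[None, :]) ** 2
        return np.exp(-d / (2 * sigma ** 2))
    H = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    K, L = gram(a), gram(b)
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def anm_direction(x, y, degree=3):
    """Crude additive-noise-model test: regress each way with a cubic
    polynomial and prefer the direction whose residuals look more
    independent of the putative cause."""
    rxy = y - np.polyval(np.polyfit(x, y, degree), x)   # residuals for x -> y
    ryx = x - np.polyval(np.polyfit(y, x, degree), y)   # residuals for y -> x
    return "x -> y" if _hsic(x, rxy) < _hsic(y, ryx) else "y -> x"
```

The asymmetry this exploits only exists for non-linear or non-Gaussian data, which is exactly the regime the talk addresses.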
Bio:
David Danks
is an Associate Professor of Philosophy & Psychology
at Carnegie Mellon University, and a Research Scientist at the
Institute for Human & Machine Cognition. His research focuses on the
interface of cognitive science and machine learning: using the tools
of machine learning to better understand complex human cognition, and
developing novel machine learning algorithms based on human cognitive
capacities. His research has centered on causal learning and
reasoning, category development and application, and decision-making.
Date: Friday, November 4, 2011
Time: 12:00 pm — 12:50 pm
Place: Centennial Engineering Center 1041
Adam Manzanares
Los Alamos National Laboratory
This talk will focus on the Parallel Log-Structured File System (PLFS), which
was developed at Los Alamos National Laboratory (LANL) to improve shared-file
write performance. Write performance is improved because PLFS transparently
transforms the writes such that each process, while logically writing to a
shared file, is physically writing to a unique file. By removing this
concurrency, PLFS improved the write performance of many applications by
multiple orders of magnitude. However, reconstructing the logical file from
the multitude of physical files has proven difficult. To alleviate this
issue, we developed several collective techniques to aggregate information
from the multiple component pieces. These enable PLFS to maintain its large
write improvements without sacrificing read performance for many workloads.
There are other workloads, however, which remain challenging. Currently,
Los Alamos is developing a scalable HPC key-value store to address these
remaining challenges. Additionally, the transformative properties of PLFS
have recently also been leveraged to improve the metadata performance of a
production parallel file system.
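A toy sketch of the transformation (not the actual PLFS code or its API; the class, file names, and JSON index format are invented for illustration) shows why concurrent writes stop contending: every process appends to its own log and leaves behind index records that a reader later aggregates to rebuild the logical file.

```python
import json
import os

class PlfsLikeWriter:
    """Toy N-to-N transformation: each rank appends to its own physical log
    and records (logical offset, length) -> (file, physical offset) entries."""

    def __init__(self, container, rank):
        os.makedirs(container, exist_ok=True)
        self.data_name = f"data.{rank}"
        self.data = open(os.path.join(container, self.data_name), "wb")    # fresh per-rank log
        self.index = open(os.path.join(container, f"index.{rank}"), "w")

    def write(self, logical_offset, payload):
        physical_offset = self.data.tell()
        self.data.write(payload)                      # contiguous append, no locking
        self.index.write(json.dumps({"logical": logical_offset,
                                     "length": len(payload),
                                     "file": self.data_name,
                                     "physical": physical_offset}) + "\n")

    def close(self):
        self.data.close()
        self.index.close()
```

A read of the shared logical file then consults the merged index records to find which physical log holds each byte range; the collective aggregation techniques mentioned above amortize that merge across processes.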
Bio:
Adam Manzanares
is currently a Nicholas C. Metropolis postdoctoral
fellow at Los Alamos National Laboratory (LANL). He was appointed to this
position in November 2010 after joining LANL in July 2010 as a postdoctoral
researcher. He received his Ph.D. from Auburn University in May 2010 with a
focus on energy-efficient storage systems. His current work centers on storage
systems for high-performance computing applications; he develops middleware
layers to improve the performance of HPC storage systems and is also
researching compression techniques and data-formatting libraries for
scientific data sets.
Date: Friday, October 28, 2011
Time: 12:00 pm — 12:50 pm
Place: Centennial Engineering Center 1041
Christof Teuscher
Portland State University,
Department of Electrical and Computer Engineering
The computing disciplines face difficult challenges by further scaling down CMOS technology.
One solution path is to use an innovative combination of novel devices, compute paradigms,
and architectures to create new information processing technology. It is reasonable to
expect that future devices will increasingly exhibit extreme physical variation, and
thus have a partially or entirely unknown structure with limited functional control.
Emerging devices are also expected to behave in time-dependent, nonlinear ways
that go well beyond simple behavior. It is premature to say which computing model is the best
match for such devices. To address this question, our research focuses on a design
space exploration of building information processing technology with spatially
extended, heterogeneous, disordered, dynamical, and probabilistic devices that
we cannot fully control or understand. In this talk I will present recent results on
computing with such systems. We draw inspiration from the field of reservoir
computing to obtain a "designed" computation from the "intrinsic" computing
capabilities of the underlying device networks. We study the structural and
functional influence of the underlying devices, their network, and the cost
on the computing task performance and robustness. The goal is to find optima
in the design space. The technological promise of harnessing intrinsic computation
has enormous potential for cheaper, faster, more robust, and more energy-efficient information processing technology.
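For readers new to reservoir computing, a minimal echo-state-network sketch conveys the central idea of harnessing intrinsic dynamics: the reservoir (here a random matrix standing in for a disordered physical device network) is left untrained, and only a linear readout is fit. The function name, sizes, and parameters below are illustrative assumptions.

```python
import numpy as np

def train_reservoir_readout(u, y, n_res=200, rho=0.9, ridge=1e-6, seed=0):
    """Minimal echo-state-network sketch: drive a fixed random reservoir
    with the input signal u and train only a linear readout (ridge
    regression) to reproduce the target signal y."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-0.5, 0.5, n_res)           # input weights (untrained)
    W = rng.normal(size=(n_res, n_res))            # reservoir weights (untrained)
    W *= rho / max(abs(np.linalg.eigvals(W)))      # scale spectral radius to rho
    x = np.zeros(n_res)
    states = []
    for u_t in u:                                  # collect reservoir states
        x = np.tanh(W @ x + W_in * u_t)
        states.append(x.copy())
    X = np.array(states)
    # ridge-regression readout: W_out = (X^T X + lambda I)^{-1} X^T y
    W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ np.asarray(y))
    return W_out, X @ W_out                        # readout weights and training fit
```

For example, training the readout to reproduce a delayed copy of a random input signal is a standard sanity check of how much memory the reservoir's intrinsic dynamics provide.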
Bio:
Christof Teuscher
is an assistant professor in the Department of Electrical and Computer Engineering (ECE)
with joint appointments in the Department of Computer Science and the Systems Science
Graduate Program. He also holds an Adjunct Assistant Professor appointment in Computer
Science at the University of New Mexico (UNM). Dr. Teuscher obtained his M.Sc. and Ph.D.
degrees in computer science from the Swiss Federal Institute of Technology in Lausanne (EPFL)
in 2000 and 2004, respectively. His main research focuses on emerging computing architectures and paradigms.
Date: Tuesday, October 25, 2011
Time: 11:00 am — 11:50 am
Place: Farris Engineering Center, Room 141
Robert Geist
School of Computing,
Clemson University
Modeling and rendering natural phenomena, which includes all components of biophysical ecology,
atmospherics, photon transport, and air and water flow, remains a challenging area for computer graphics research.
Whether models are physically based or procedural, model processing is almost always
characterized by substantial computational demands that have traditionally precluded real-time performance.
Nevertheless, the recent development of new, highly parallel computational models,
coupled with dramatic performance improvements in GPU-based execution platforms,
has brought real-time modeling and rendering within reach. The talk will focus on the
natural synergy between GPU-based computing and the so-called lattice-Boltzmann
methods for solutions to PDEs. Examples will include photon transport for
global illumination and modeling and rendering of atmospheric clouds, forest ecosystems, and ocean waves.
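As a hedged sketch of that synergy (a generic D2Q9 BGK step with periodic boundaries, not the speaker's GPU code), note that every lattice site is updated from purely local information, which is exactly the data-parallel structure GPUs reward:

```python
import numpy as np

# D2Q9 lattice: discrete velocity set and weights
E = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
W = np.array([4/9] + [1/9] * 4 + [1/36] * 4)

def lbm_step(f, tau=0.6):
    """One BGK collision + streaming step of a D2Q9 lattice-Boltzmann solver.
    f has shape (9, ny, nx); each site depends only on itself and its
    immediate neighbors, so the update is embarrassingly parallel."""
    rho = f.sum(axis=0)                                    # macroscopic density
    ux = (f * E[:, 0, None, None]).sum(axis=0) / rho       # macroscopic velocity
    uy = (f * E[:, 1, None, None]).sum(axis=0) / rho
    usq = ux**2 + uy**2
    for i in range(9):                                     # BGK collision
        eu = E[i, 0] * ux + E[i, 1] * uy
        feq = W[i] * rho * (1 + 3*eu + 4.5*eu**2 - 1.5*usq)
        f[i] += -(f[i] - feq) / tau
    for i in range(9):                                     # streaming (periodic)
        f[i] = np.roll(np.roll(f[i], E[i, 0], axis=1), E[i, 1], axis=0)
    return f
```

On a GPU each lattice site maps to a thread, which is why these solvers have been able to reach real-time rates for the phenomena listed above.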
Bio:
Robert Geist
is a Professor in the School of Computing at Clemson University.
He served as Interim Director of the School in 2007-2008, and he is co-founder
of Clemson's Digital Production Arts Program. He received an M.A. in computer
science from Duke University and a Ph.D. in mathematics from the University of
Notre Dame. He was an Associate Professor of Mathematics at the University of
North Carolina at Pembroke and an Associate Professor of Computer Science at
Duke University before joining the faculty at Clemson University. He is a member
of IFIP WG 7.3, a recipient of the Gunther Enderle Award (Best Paper, Eurographics),
and a Distinguished Educator of the ACM.
Date: Friday, October 21, 2011
Time: 12:00 pm — 12:50 pm
Place: Centennial Engineering Center 1041
David Lowenthal
Department of Computer Science,
The University of Arizona
Traditionally, high-performance computing (HPC) has been performance
based, with speedup serving as the dominant metric. However, this is
no longer the case; other metrics have become important, such as
power, energy, and total cost of ownership. Underlying all of these
is the notion of efficiency.
In this talk we focus on two different areas in which efficiency in
HPC is important: power efficiency and performance efficiency. First,
we discuss the Adagio run-time system, which uses dynamic frequency
and voltage scaling to improve power efficiency in HPC programs while
sacrificing little performance. Adagio locates tasks that lie off the
critical path at run time and executes them at lower frequencies on
subsequent timesteps. Second, we discuss our work to improve
performance efficiency. We describe a regression-based technique to
accurately predict program scalability. We have applied our technique
to both strong scaling, where the problem size is fixed as the number
of processors increases, and to time-constrained scaling, where
the problem size instead increases with the number of processors such
that the total run time is constant. With the former, we avoid using
processors that result in inefficiency, and with the latter, we allow
accurate time-constrained scaling, which is commonly desired by
application scientists yet nontrivial. We conclude with some ideas
about where efficiency will be important in HPC in the future.
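To give a concrete sense of regression-based scalability prediction, here is a hedged sketch; the model form a + b/p + c log p and the function name are illustrative assumptions, not the exact formulation used in the work described:

```python
import numpy as np

def fit_scaling_model(procs, times):
    """Fit T(p) ~= a + b/p + c*log2(p) by least squares from runs at a few
    small processor counts, then return a predictor for larger counts."""
    p = np.asarray(procs, dtype=float)
    A = np.column_stack([np.ones_like(p), 1.0 / p, np.log2(p)])
    coef, *_ = np.linalg.lstsq(A, np.asarray(times, dtype=float), rcond=None)
    return lambda q: coef[0] + coef[1] / q + coef[2] * np.log2(q)

# e.g. predict = fit_scaling_model([16, 32, 64, 128], measured_seconds)
#      predict(1024)   # estimated strong-scaling run time on 1024 processes
```

Such a predictor makes it possible to stop adding processors once the marginal speedup (and hence the efficiency) falls below a chosen threshold.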
Bio:
David Lowenthal
is a Professor of Computer Science at the University
of Arizona. He received his Ph.D at the University of Arizona in
1996, and was on the faculty at the University of Georgia from
1996-2008 before returning to Arizona in 2009. His research centers
on addressing fundamental problems in parallel and distributed
computing, such as scalability prediction and power/energy reduction,
through a system software perspective. His current focus is on
solving pressing power and energy problems that will allow exascale
computing to become a reality within the decade.
Date: Friday, October 7, 2011
Time: 12:00 pm — 12:50 pm
Place: Centennial Engineering Center 1041
Peter Dinda
Northwestern University
Although it is largely invisible to the user, systems software makes a
wide range of decisions that directly impact the user's experience
through their effects on performance. Most systems software assumes
a canonical user. However, we have demonstrated that the measured
user satisfaction with any given decision varies broadly across
actual users. This effect appears consistently in areas as diverse as
client-side CPU and display power management, server-side virtual
machine scheduling, and in networks. Empathic systems acknowledge
this effect and employ direct global feedback from the individual
end-user in the systems-level decision-making process. This makes
it possible to (a) satisfy individual users despite this diversity in
response, and (b) do so with low resource costs, typically far lower
than under the assumption of a canonical user. However, it is
challenging to build empathic systems because the user interface to the
systems software must present minimal distractions. In this talk, I
will expand on the empathic systems model and our results in applying
it in the areas described above. I will also describe some of our
current efforts in using biometrics to make the empathic systems user
interface largely invisible. More information about this work can be
found at
empathicsystems.org.
Bio:
Peter Dinda
is a professor in the Department of Electrical Engineering
and Computer Science at Northwestern University, and head of its
Computer Engineering and Systems division, which includes 17 faculty
members. He holds a B.S. in electrical and computer engineering from
the University of Wisconsin and a Ph.D. in computer science from
Carnegie Mellon University. He works in experimental computer
systems, particularly parallel and distributed systems. His research
currently involves virtualization for distributed and parallel
computing, programming languages for parallel computing, programming
languages for sensor networks, and empathic systems for bridging
individual user satisfaction and systems-level decision-making. You
can find out more about him at pdinda.org.
Date: Friday, September 30, 2011
Time: 12:00 pm — 12:50 pm
Place: Centennial Engineering Center 1041
Daniel Quist
Advanced Computing Solutions, Los Alamos National Laboratory
Reverse engineering malware is a vital skill that is in constant demand. The existing tools
require high-level training and understanding of computer architecture, software, compilers,
and many other areas of computer science. Our work covers several areas intended to lower the
barrier to entry for reverse engineering. First, we will introduce a hypervisor-based automatic
malware analysis system. Second, we will showcase our binary instrumentation framework for
analyzing commercial software. Finally, we will show our graph-based dynamic malware
execution tracing system named VERA. Each of these systems reduces the complexity
of the reverse engineering process, and enhances productivity.
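As a toy illustration of the graph-based tracing idea (this is not VERA; the function and the example addresses are invented), an execution trace of basic-block addresses can be reduced to a weighted transition graph whose heavily weighted edges tend to correspond to loops:

```python
from collections import Counter

def build_transition_graph(trace):
    """Toy version of graph-based execution tracing: count observed
    transitions between consecutively executed basic-block addresses."""
    return Counter(zip(trace, trace[1:]))     # (src, dst) -> times taken

# e.g. a tiny recorded trace of block addresses:
# build_transition_graph([0x401000, 0x401020, 0x401000, 0x401020, 0x401050])
#   -> Counter({(0x401000, 0x401020): 2, (0x401020, 0x401000): 1, (0x401020, 0x401050): 1})
```

Rendering such a graph gives an analyst a map of program behavior without reading every instruction, which is the productivity gain the talk emphasizes.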
Bio:
Daniel Quist
is a research scientist at Los Alamos National Laboratory, and founder of Offensive Computing,
an open malware research site. His research is in automated analysis methods for malware with
software and hardware assisted techniques. He has written several defensive systems to mitigate
virus attacks on networks and developed a generic network quarantine technology. He consults with
both private and public sectors on system and network security. His interests include malware
defense, reverse engineering, exploitation methods, virtual machines, and automatic classification
systems. Danny holds a Ph.D. from the New Mexico Institute of Mining and Technology.
He has presented at several industry conferences including Blackhat, RSA, and Defcon.
Date: Friday, September 23, 2011
Time: 12:00 pm — 12:50 pm
Place: Centennial Engineering Center 1041
Nathan DeBardeleben
Los Alamos National Laboratory
Over the next decade the field of high performance computing (supercomputing) will undoubtedly
see major changes in the ways leadership class machines are built, used, and maintained.
There are any number of challenges, including operating systems, programming models and languages,
power, and file systems, to name but a few. This talk will focus on one of those challenges:
the cross-cutting goal of providing reliable computation on fundamentally unreliable components.
Nathan will provide an overview of the field of resilience, point to the major obstacles of the coming decade,
look at promising potential solutions, and discuss areas that need more emphasis.
Nathan's own new research on a soft error fault injection (SEFI) framework will be presented,
as will some early results. SEFI is intended as a framework for determining the resilience
of a target application to soft errors. The initial implementation, which uses a processor-emulator
virtual machine, will be discussed, as will the reasons SEFI may be moving away from it toward a dynamic instrumentation approach.
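To give a concrete flavor of soft-error injection (a toy, application-level stand-in for SEFI rather than its emulator-based implementation; the function name is ours), one can flip a random bit in an application's floating-point state and check whether the run still produces an acceptable answer:

```python
import random
import struct
import numpy as np

def flip_random_bit(arr, rng=random):
    """Emulate a single-event upset: flip one random bit of one random
    element of a float64 NumPy array, in place."""
    idx = rng.randrange(arr.size)
    bits = struct.unpack("<Q", struct.pack("<d", float(arr.flat[idx])))[0]
    bits ^= 1 << rng.randrange(64)            # the injected soft error
    arr.flat[idx] = struct.unpack("<d", struct.pack("<Q", bits))[0]
    return idx                                # where the fault landed
```

Injecting many such faults at random points during a run and comparing the results against a fault-free baseline gives a rough picture of an application's sensitivity to transient errors.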
Bio:
Nathan DeBardeleben
is a research scientist at Los Alamos National Laboratory leading the HPC Resilience
effort in the Ultrascale Systems Research Center (USRC). He joined LANL in 2004 after
receiving his PhD, Master's, and Bachelor's in computer engineering from Clemson University.
At LANL, Nathan was an early developer and designer of the Eclipse Parallel Tools Platform (PTP) project,
spent several years optimizing application codes, and has since turned to focus on resilient computation.
Nathan is active in the resilience community and spent 2010 on an IPA assignment at the
U.S. Department of Defense, where he led the Resilience Thrust of the Advanced Computing Systems Research Program.
Active on several program committees, Nathan leads the Fault-Tolerance at Extreme Scale Workshop.
His own research interests are in the field of reliable computation, particularly the area of HPC resilience.
This includes, but is not limited to, fault-tolerance, resilient programming models,
resilient application design, and soft errors (particularly those transient in nature).
(Students interested in Dr. DeBardeleben's research who want to meet with him
over lunch should contact Dorian Arnold (darnold@cs.unm.edu).)
Date: Friday, September 16, 2011
Time: 12:00 pm — 12:50 pm
Place: Centennial Engineering Center 1041
Satyajayant Misra
New Mexico State University
The deployment characteristics of sensor nodes and their energy-limited nature affect the
network connectivity, lifetime, and fault tolerance of wireless sensor networks (WSNs).
One approach to address these issues is to deploy some relay nodes to communicate
with the sensor nodes, other relay nodes, and the base stations in the network.
The relay node placement problem for WSNs is concerned with placing a minimum
number of relay nodes into a WSN to meet certain connectivity or survivability requirements.
Previous studies have concentrated on the unconstrained version of the problem in the
sense that relay nodes can be placed anywhere. In practice, there may be some physical
constraints on the placement of relay nodes. To address this issue, we have studied
constrained versions of the relay node placement problem, where relay nodes can only be placed at a set of candidate locations.
I will talk about relay node placement for connectivity and survivability; we will discuss the
computational complexity of the problems and look at a framework of polynomial-time
O(1)-approximation algorithms with small approximation ratios. I will share our numerical results.
We will also talk about some pertinent extensions of this work in the area of high performance computing.
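For intuition, here is a greedy sketch of the simplest constrained question, covering every sensor from a fixed set of candidate relay locations. This is a set-cover-style heuristic only; the connectivity and survivability variants discussed in the talk need considerably more machinery, and the function name and range model are illustrative assumptions.

```python
import math

def greedy_relay_placement(sensors, candidates, r):
    """Greedy constrained placement sketch: repeatedly pick the candidate
    location that covers the most still-uncovered sensors within range r."""
    def covers(c, s):
        return math.dist(c, s) <= r
    uncovered = set(range(len(sensors)))
    chosen = []
    while uncovered:
        best = max(range(len(candidates)),
                   key=lambda j: sum(covers(candidates[j], sensors[i]) for i in uncovered))
        gained = {i for i in uncovered if covers(candidates[best], sensors[i])}
        if not gained:
            raise ValueError("some sensors cannot be covered from the candidate set")
        chosen.append(candidates[best])
        uncovered -= gained
    return chosen
```

The constrained setting matters precisely because, unlike in the unconstrained problem, a feasible placement may not even exist for a given candidate set.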
Bio:
Dr. Satyajayant Misra
is an assistant professor at New Mexico State University (from fall 2009).
His research interests include anonymity, security, and survivability in wireless sensor networks,
wireless ad hoc networks, and vehicular networks, as well as optimized protocol designs for next-generation supercomputing architectures.
Dr. Misra serves on the editorial boards of IEEE Communications Surveys and Tutorials
and the IEEE Wireless Communications Magazine. He is the TPC Vice-Chair of Information Systems
for the IEEE INFOCOM 2012. He has served on the executive committees of IEEE SECON 2011 and
IEEE IPCCC 2010. He is the recipient of New Mexico State University's University Research
Council Early Career Award for Exceptional Achievement in Creative Scholastic Activity, for the year 2011.
Date: Friday, September 9, 2011
Time: 12:00 pm — 12:50 pm
Place: Centennial Engineering Center 1041
Sudip Dosanjh
Sandia National Labs
Achieving a thousand-fold increase in supercomputing technology to reach
exascale computing (10^18 operations per second) in this decade will
revolutionize the way supercomputers are used. Predictive computer simulations
will play a critical role in achieving energy security, developing climate
change mitigation strategies, lowering CO2 emissions and ensuring a safe and
reliable 21st century nuclear stockpile. Scientific discovery, national
competitiveness, homeland security and quality of life issues will also greatly
benefit from the next leap in supercomputing technology. This dramatic increase
in computing power will be driven by a rapid escalation in the parallelism
incorporated in microprocessors. The transition from massively parallel
architectures to hierarchical systems (hundreds of processor cores per CPU chip)
will be as profound and challenging as the change from vector architectures to
massively parallel computers that occurred in the early 1990s. Through a
collaborative effort between laboratories and key university and industrial
partners, the architectural bottlenecks that limit supercomputer scalability
and performance can be overcome. In addition, such an effort will help make
petascale computing pervasive by lowering the costs for these systems and dramatically improving their power efficiency.
The U.S. Department of Energy's strategy for reaching exascale includes:
- Collaborations with the computer industry to identify gaps
- Prioritizing research based on return on investment and risk assessment
- Leveraging existing industry and government investments and extending technology in strategic focus areas
- Building sustainable infrastructure with broad market support
- Extending beyond natural evolution of commodity hardware to create new markets
- Creating system building blocks that offer superior price/performance/programmability at all scales (exascale, departmental, and embedded)
- Co-designing hardware, system software and applications
The last element, co-design, is a particularly important area of emphasis.
Applications and system software will need to change as architectures evolve
during the next decade. At the same time, there is an unprecedented
opportunity for the applications and algorithms community to influence
future computer architectures. A new co-design methodology is needed to
make sure that exascale applications will work effectively on exascale supercomputers.
Bio:
Sudip Dosanjh
heads the extreme-scale computing group at Sandia which spans architectures,
system software, scalable algorithms and disruptive computing technologies.
He is also Sandia's Exascale and platforms lead, co-director of the Alliance
for Computing at the Extreme Scale (ACES) and the Science Partnership
for Extreme-scale Computing (SPEC), and program manager for Sandia's Computer
Systems and Software Environments (CSSE) effort under DOE's Advanced Simulation
and Computing Program. In partnership with Cray, ACES has developed and is deploying
the Cielo Petascale capability platform. He and Jeff Nichols founded the ORNL/Sandia
Institute for Advanced Architectures and Algorithms. His research interests include computational
science, exascale computing, system simulation, and co-design. He has a Ph.D. in mechanical engineering from U.C. Berkeley.
Date: Friday, September 2, 2011
Time: 12:00 pm — 12:50 pm
Place: Centennial Engineering Center 1041
Jed Crandall
Assistant Professor, University of
New Mexico, Department of Computer Science
It used to be that using a computer on a college campus or in other
places was something you didn't really need to think too much about.
You worried about backups, antivirus, patches, and the like, but most
students, faculty, and staff didn't need to worry about being singled
out as targets by governments and other organizations who would like
to violate our privacy. In this talk I'll try to convince you that
individual members of the University community can be singled out as
targets by various organizations for different reasons, and tell you
what you can do to protect yourself.
I'll also talk some about how computer security research is changing.
When the United Nations has summits about "computer security" these
days, the discussions are more about content such as blog posts or
videos that threaten sovereignty or challenge social norms. Worms and
viruses are something that only the Western countries seem to be
concerned about. Computer security researchers will still worry about
computational games with well-structured rules (ARP spoofing,
asymmetric crypto authentication, password entropy, etc.), but
increasingly human psychology and motivations mean more on the
Internet than RFCs and assembly language do. I'll talk about the
opportunities for research that this entails.
Bio:
Jed Crandall
is an Assistant Professor and Qforma Lecturer in the UNM
Computer Science department. He and his graduate students do research
in computer and network security and privacy, including Internet
censorship, forensics, privacy, advanced network reconnaissance, and
natural language processing.
Date: Thursday, August 25, 2011
Time: 11:00 am — 11:50 am
Place: Centennial Engineering Center
Stamm Room (next to the southeast entrance)
Pedro Lopez-Garcia, PhD,
Researcher, IMDEA Software Institute, Madrid, Spain
We present a general resource usage analysis framework which is
parametric with respect to resources and type of approximation (lower
and upper bounds). The user can define the parameters of the analysis
for a particular resource by means of assertions that associate basic
cost functions with elementary operations of programs, thus expressing
how they affect the usage of a particular resource. A global static
analysis can then infer bounds on the resource usage of all the
procedures in the program, providing such usage bounds as functions of
input data sizes. We show how to instantiate such a framework for
execution time analysis. Other examples of resources that can be
analyzed by instantiating the framework are execution steps, energy
consumption, as well as other user-defined resources, like the number
of bits sent or received by an application over a socket, number of
calls to a procedure, or number of accesses to a database. Based on
the general analysis, we also present a framework for (static)
verification of general resource usage program properties. The
framework extends the criteria of correctness to the conformance of a
program to a specification expressing upper and/or lower bounds on
resource usage (given as functions of input data sizes). We have
defined an abstract semantics for resource usage properties and
operations to compare the (approximated) intended semantics of a
program (i.e., the specification) with approximated semantics inferred
by static analysis. These operations include the comparison of
arithmetic functions. A novel aspect of our framework is that the
outcome of the static checking of assertions can express intervals for
the input data sizes such that a given specification can be proved for
some intervals but disproved for others. We have implemented these
techniques within the Ciao/CiaoPP system in a natural way, resulting
in a framework that unifies static verification and static debugging
(as well as run-time verification and unit testing).
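As a small illustration of the interval-producing comparison of arithmetic functions (a toy in Python rather than Ciao/CiaoPP, with invented function names), checking an inferred cost bound against a specified budget over a range of input sizes yields exactly the kind of proved/disproved intervals described above:

```python
def check_intervals(inferred, spec, sizes):
    """Compare an inferred upper-bound cost function against a specified
    budget over a range of input sizes; report where the bound holds."""
    proved, disproved = [], []
    for n in sizes:
        (proved if inferred(n) <= spec(n) else disproved).append(n)
    return proved, disproved

# e.g. inferred resource usage 3*n*n + 10 vs. a specified budget of 100*n:
# check_intervals(lambda n: 3*n*n + 10, lambda n: 100*n, range(1, 50))
# proves the assertion for n in roughly 1..33 and disproves it for larger n.
```

A real analysis works symbolically on the cost functions rather than sampling sizes, but the user-visible outcome is the same: intervals of input sizes on which the specification is verified, falsified, or left undecided.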
Bio:
Pedro Lopez-Garcia, PhD,
received a MS degree and a Ph.D. in Computer Science from the Technical University of Madrid (UPM),
Spain, in 1994 and 2000, respectively. On May 28, 2008, he obtained a Scientific Researcher position
at the Spanish Council for Scientific Research (CSIC) and joined the IMDEA Software Institute.
Immediately prior to this position, he held associate and assistant professor positions at UPM
and was deputy director of the Artificial Intelligence unit at the Computer Science Department.
He has published about 30 refereed scientific papers (50% of them at conferences and journals
of high or very high impact). He has also been the coordinator of the international project ES_PASS
and participated as a researcher in many other national and international projects.
His main areas of interest include automatic analysis and verification of global and complex program
properties such as resource usage (user defined, execution time, memory, etc.), non-failure and
determinism; performance debugging; (automatic) granularity analysis/control for parallel and
distributed computing; profiling; unit-testing; type systems; constraint and logic programming.