UNM Computer Science

Colloquia



Metagenomics and High Performance Computing

Date: Thursday, May 6, 2010
Time: 11 am — 12:15 pm
Place: Mechanical Engineering, Room 218

Bill Feiereisen

Director of High Performance Computing, DoD
Lockheed Martin Corporation

Abstract:
It has become a cliche to state that the biological sciences have become information sciences. Vastly increased volumes of experimentally acquired genomic and proteomic data hint at rich new insights in many areas of the biological sciences, but the demands they place on computing for their analysis are just as great. This is one of the many reasons why scientists from the more traditional areas of high performance computing have been attracted into biology. However, the character of this computing has changed -- away from modeling and simulation, upon which much of our high performance computing expertise is based, to the extraction of scientific insight from data analysis.

This talk discusses my journey from traditional computational simulation into high performance bioinformatics. The motivation comes from global climate modeling and the very large contribution that the microbial biology of the ocean makes to the carbon dioxide budget in ocean models. Current microbial models incorporated into ocean models presume knowledge of the organisms present and their metabolism. In reality, recent "metagenomic" ocean surveys have shown that most organisms are not known or understood, nor do we know about their spatial and temporal distribution. So, how would we use this new information to evaluate the performance of current models or build new ones?

Metagenomics is the study of microbial communities in situ. Over 99% of microbes in the ocean cannot be studied in the lab because they cannot survive when separated from the symbiosis of their community. Their genomes must be acquired together and teased apart with new computational algorithms. I will discuss work on sequence-based and similarity-based algorithms to categorize the mixed fragments of DNA for assembly into complete genomes. Comparison of these genome fragments and complete genomes can be performed through multiple alignment algorithms. Both of these algorithmic tasks are now overwhelming our high performance computing capability and point the way to fertile new fields for algorithm developers.
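
For readers unfamiliar with sequence-based categorization, here is a minimal Python sketch that groups DNA fragments by their tetranucleotide composition, one common flavor of such binning; the function names, the fixed k=4, and the use of scikit-learn's KMeans are illustrative assumptions, not the speaker's actual pipeline.

    # Illustrative sketch: composition-based binning of metagenomic fragments.
    # Fragments with similar tetranucleotide frequencies are grouped together,
    # a simplified stand-in for the sequence-based categorization described above.
    from collections import Counter
    from itertools import product

    import numpy as np
    from sklearn.cluster import KMeans  # assumed dependency for this sketch

    KMERS = ["".join(p) for p in product("ACGT", repeat=4)]

    def kmer_profile(seq, k=4):
        """Normalized k-mer frequency vector for one DNA fragment."""
        counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
        total = max(sum(counts[m] for m in KMERS), 1)
        return np.array([counts[m] / total for m in KMERS])

    def bin_fragments(fragments, n_bins=10):
        """Cluster fragments by composition; each bin approximates one genome."""
        profiles = np.vstack([kmer_profile(f) for f in fragments])
        return KMeans(n_clusters=n_bins, n_init=10).fit_predict(profiles)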

Bio:
Bill Feiereisen is the Director of High Performance Computing, DoD for Lockheed Martin Corporation. He was formerly the laboratory Chief Technologist and Division Director of the Computer and Computational Sciences Division at Los Alamos and before that the head of the NASA Advanced Supercomputing Facility at Ames Research Center.

He is active in the broader computer science community, serving on the editorial board of IEEE Computers in Science and Engineering, as the former chairman of the Advisory Committee for the Open Grid Forum, on the council of the NSF Computing Community Consortium and as a founding member of the New Mexico Computing Applications Center. He is a member of numerous review boards and advisory committees.

Bill's original interests in high performance computing come from computational fluid dynamics and range from turbulence modeling to rarefied gas dynamics, with applications to machinery and hypersonic flows. However, the computer science of high performance computing that underlies computational science has been a motivator for his work since the nineties. His recent computational interests are in the field of bioinformatics and high performance computing.

He holds a Doctorate and Masters in Mechanical Engineering from Stanford University and a Bachelors Degree from the University of Wisconsin.

In his copious free time he is a wannabe bicycle racer and can usually be found running last in club races in New Mexico.

The evolution of the 'funny bone': a neurocomputational theory of mirth and humor

Date: Tuesday, May 4, 2010
Time: 11 am — 12:15 pm
Place: Mechanical Engineering, Room 218

Daniel Dennett
University Professor and Austin B. Fletcher Professor of Philosophy
Center for Cognitive Studies
Tufts University

Bio:
Daniel Dennett is currently Miller Fellow at SFI, and University Professor and Co-Director of the Center for Cognitive Studies at Tufts University. He is the author of CONSCIOUSNESS EXPLAINED (1991), DARWIN'S DANGEROUS IDEA (1995), and various articles on robotics, AI, computers and the mind, technology, etc. His most recent book is BREAKING THE SPELL: RELIGION AS A NATURAL PHENOMENON (2006).

Talkative Sensors: Collaborative Machine Learning for Volcano Sensor Networks

Date: Tuesday, April 27, 2010
Time: 11 am — 12:15 pm
Place: Mechanical Engineering, Room 218

Kiri Wagstaff

Senior Researcher in artificial intelligence and machine learning
Jet Propulsion Laboratory

Abstract:
Imagine a machine learning agent deployed at each station in a sensor network, so that it can analyze incoming data and determine when something interesting happens. Traditionally, this analysis would be done independently at each station. But what if each agent could talk to its neighbors and find out what they're seeing? We've developed a learning system that enables collaboration so that the agents can autonomously (without human input) improve their performance. Each agent can ask its neighbors for their opinions, then use them to refine its own results. When each agent is given the task of clustering the observed data, the opinions are expressed in the form of pairwise clustering constraints. We evaluated several heuristics for selecting which items an agent should query and found that the best strategy was to select one item close to its assigned cluster and one item at the boundary between two clusters. We applied this technique to seismic and infrasonic data collected by the Mount Erebus Volcano Observatory, where the goal was to separate eruptions from non-eruptions. Collaborative clustering achieved a 150% improvement over regular, non-collaborative clustering. This is joint work with Jillian Green (California State Univ., Los Angeles), Rich Aster and Hunter Knox (New Mexico Institute of Mining and Technology), Terran Lane (Univ. of NM), and Umaa Rebbapragada (Tufts Univ.), funded by the NSF.
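
A minimal Python sketch of the query-selection heuristic mentioned above, assuming a k-means-style clustering at each agent: pick one item close to its assigned cluster center and one item near the boundary between two clusters, then turn a neighbor's opinion on the pair into a must-link or cannot-link constraint. The function names and constraint encoding are assumptions, not the deployed system.

    import numpy as np

    def select_query_pair(X, centers, labels):
        """Return (confident_idx, boundary_idx) for a pairwise constraint query."""
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)  # n x k distances
        assigned = d[np.arange(len(X)), labels]
        confident_idx = int(np.argmin(assigned))       # item closest to its own center
        two_best = np.sort(d, axis=1)[:, :2]
        boundary_idx = int(np.argmin(two_best[:, 1] - two_best[:, 0]))  # smallest margin
        return confident_idx, boundary_idx

    def ask_neighbor(neighbor_labels, i, j):
        """A neighbor's opinion becomes a pairwise clustering constraint."""
        same = neighbor_labels[i] == neighbor_labels[j]
        return ("must-link" if same else "cannot-link", i, j)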

Bio:
Kiri Wagstaff is a senior researcher in artificial intelligence and machine learning at the Jet Propulsion Laboratory. Her focus is on developing new machine learning and data analysis methods, particularly those that can be used for in situ analysis onboard spacecraft (orbiters, landers, etc.). She has developed several classifiers and detectors for data collected by instruments on the EO-1 Earth orbiter, Mars Pathfinder, and Mars Odyssey. The applications range from detecting dust storms on Mars to predicting crop yield on Earth. She holds a Ph.D. in Computer Science from Cornell University (2002) and an M.S. in Geological Sciences from the University of Southern California (2008).

Understanding Cyberattack as an Instrument of U.S. Policy

Date: Thursday, April 22, 2010
Time: 2 pm (panel begins at 3 pm)
Place: Mechanical Engineering, Room 218

Herbert S. Lin
Chief Scientist, CSTB
National Academies

Abstract:
Much has been written about the possibility that terrorists or hostile nations might conduct cyberattacks against critical sectors of the U.S. economy. However, the possibility that the United States might conduct its own cyberattacks -- defensively or otherwise -- has received almost no public discussion. Recently, the US National Academies performed a comprehensive unclassified study of the technical, legal, ethical, and policy issues surrounding cyberattack as an instrument of U.S. policy. This talk will provide a framework for understanding this emerging topic and the critical issues that surround it.

Bio:
Herbert S. Lin is chief scientist for the National Research Council's Computer Science and Telecommunications Board where he directs major study projects at the intersection of public policy and information technology. Relevant for this talk, he was study director of the 2009 Academy study "Technology, Policy, Law, and Ethics Regarding U.S. Acquisition and Use of Cyberattack Capabilities." He previously served as staff scientist in defense policy and arms control for the House Armed Services Committee. Lin holds a doctorate in physics from the Massachusetts Institute of Technology.

Panel:
David Ackley (Associate Professor, UNM Dept. of Computer Science; External Professor at the Santa Fe Institute)
Daniel Dennett (University Professor and Austin B. Fletcher Professor of Philosophy, Tufts University; Miller Scholar, Santa Fe Institute)
Robert Hutchinson (Senior Manager for Computer Science and Information Operations, Sandia National Laboratories)
Herbert Lin (Chief Scientist at the Computer Science and Telecommunications Board, National Research Council of the National Academies)
Andrew Ross (Director, UNM Center for Science, Technology and Policy; Professor, UNM Dept. of Political Science)

The report is available here and as a PDF.
Questions about this event can be directed to crandall at cs.unm.edu

Tor and censorship: lessons learned

Date: Monday, April 12, 2010
Time: 2 pm — 3 pm
Place: George Pearl Hall, Room 101

Roger Dingledine

Abstract:
Tor is a free-software anonymizing network that helps people around the world use the Internet in safety. Tor's 1600 volunteer relays carry traffic for several hundred thousand users, including ordinary citizens who want protection from identity theft and prying corporations, corporations who want to look at a competitor's website in private, and soldiers and aid workers in the Middle East who need to contact their home servers without fear of physical harm.

Tor was originally designed as a civil liberties tool for people in the West. But if governments can block connections *to* the Tor network, who cares that it provides great anonymity? A few years ago we started adapting Tor to be more robust in countries like China. We streamlined its network communications to look more like ordinary SSL, and we introduced "bridge relays" that are harder for an attacker to find and block than Tor's public relays.

In the aftermath of the Iranian elections in June, and then the late September blockings in China, we've learned a lot about how circumvention tools work in reality for activists in tough situations. I'll give an overview of the Tor architecture, and summarize the variety of people who use it and what security it provides. Then we'll focus on the use of tools like Tor in countries like Iran and China: why anonymity is important for circumvention, why transparency in design and operation is critical for trust, the role of popular media in helping -- and harming -- the effectiveness of the tools, and tradeoffs between usability and security. After describing Tor's strategy for secure circumvention (what we *thought* would work), I'll talk about how the arms race actually seems to be going in practice.

Bio:
Roger Dingledine is project leader for The Tor Project, a US non-profit working on anonymity research and development for such diverse organizations as the US Navy, the Electronic Frontier Foundation, and Voice of America. In addition to all the hats he wears for Tor, Roger organizes academic conferences on anonymity, speaks at a wide variety of industry and hacker conferences, and also does tutorials on anonymity for national and foreign law enforcement.

Using Analogy-Making to Discover the Meaning of Images

Date: Thursday, April 8, 2010
Time: 11 am — 12:15 pm
Place: CEC auditorium (Not the normal place at Mechanical Engineering, Room 218)

Melanie Mitchell
Portland State University
Santa Fe Institute

Abstract:
Enabling computers to understand images remains one of the hardest open problems in artificial intelligence. No machine vision system comes close to matching human ability at identifying the contents of images or visual scenes or at recognizing similarity between different scenes, even though such abilities pervade human cognition. In this talk I will describe research---currently in early stages---on bridging the gap between low-level perception and higher-level image understanding by integrating a cognitive model of perceptual organization and analogy-making with a neural model of the visual cortex.

Bio:
Melanie Mitchell is Professor of Computer Science at Portland State University and External Professor at the Santa Fe Institute. She attended Brown University, where she majored in mathematics and did research in astronomy, and the University of Michigan, where she received a Ph.D. in computer science, working with her advisor Douglas Hofstadter on the Copycat project, a computer program that makes analogies. She is the author or editor of five books and over 70 scholarly papers in the fields of artificial intelligence, cognitive science, and complex systems. Her most recent book, "Complexity: A Guided Tour", published in 2009 by Oxford University Press, was named by Amazon.com as one of the ten best science books of 2009.

Parallel Computing in Phylogenetic Tree Reconstruction

Date: Thursday, April 1st, 2010
Time: 11 am — 12:15 pm
Place: Mechanical Engineering, Room 218

Alexandros Stamatakis

Department of Computer Science
Technical University of Munich

Abstract:
The reconstruction of phylogenetic (evolutionary) trees from molecular sequence data under the Maximum Likelihood model is a compute- and memory-intensive task.

In this talk I will address how to orchestrate the phylogenetic likelihood function on a large variety of parallel architectures, ranging from FPGAs and multi-core processors to the IBM BlueGene architecture. I will also address the challenges of maintaining a production-level sequential and parallel open-source code for phylogeny reconstruction and discuss some interesting bugs. I will conclude with future challenges in the field.
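
To make the computational kernel concrete, here is a hedged Python sketch of the per-node step of the phylogenetic likelihood function (Felsenstein's pruning recursion) under a simple Jukes-Cantor model; the simplified model and function names are assumptions, not the speaker's production code.

    import numpy as np

    def jc_transition(t):
        """4x4 Jukes-Cantor transition matrix for branch length t."""
        off = 0.25 - 0.25 * np.exp(-4.0 * t / 3.0)
        P = np.full((4, 4), off)
        np.fill_diagonal(P, 1.0 - 3.0 * off)
        return P

    def conditional_likelihood(L_left, L_right, t_left, t_right):
        """Combine two child conditional-likelihood arrays (sites x 4 states).

        Each alignment site is independent, which is what makes this inner
        loop a natural target for fine-grained parallelism."""
        Pl, Pr = jc_transition(t_left), jc_transition(t_right)
        return (L_left @ Pl.T) * (L_right @ Pr.T)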

Date: Thursday, March 11, 2010
Time: 11 am — 12:15 pm
Place: Mechanical Engineering, Room 218

Joao Dias

Department of Electrical Engineering and Computer Science
Tufts University

Abstract:
In programming languages, a new idea may help programmers express algorithms more easily or may guarantee that programs behave better. To evaluate such an idea, you need to see how programmers use it in practice, so you need an implementation which is good enough that programmers will actually use it. Historically, while programmers may be forgiving at first, eventually they demand compilers that generate native machine code of high quality. Such compilers are difficult to build. The goal of my research is to develop new ways of building high-quality compilers cheaply.

In building a high-quality compiler, one of the big costs is finding a person who is an expert in *both* the compiler *and* the target machine. The main consequence of my work is that such an expert is no longer needed: from a formal description of the semantics of a target machine, I can *generate* a translator that chooses target-machine instructions. Generating translators for such machines as x86, PowerPC, and ARM takes just minutes. These results rest on three technical contributions:

- I proved that the problem is undecidable in general, so any attack must involve heuristic search.
- I developed a new search algorithm that, unlike prior work, explores *only* computations that can be implemented on the target machine.
- I developed a new pruning heuristic that enables my algorithm to explore long sequences of instructions without allowing search times to explode.
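
The sketch below illustrates the overall shape of such a search in Python: explore sequences of target-machine instructions best-first by cost, pruning sequences that are too long or whose next step cannot be expressed on the machine. The instruction interface (`applies`, `reduce`, `cost`) is a hypothetical placeholder, not the actual generator.

    import heapq
    from itertools import count

    def find_sequence(goal, instructions, max_len=6):
        """Best-first search for an instruction sequence implementing `goal`."""
        tie = count()                      # tiebreaker so heapq never compares states
        frontier = [(0, next(tie), [], goal)]
        while frontier:
            cost, _, seq, remaining = heapq.heappop(frontier)
            if remaining.is_done():
                return seq
            if len(seq) >= max_len:
                continue                   # prune: bound the sequence length
            for instr in instructions:
                if not instr.applies(remaining):
                    continue               # prune: only machine-expressible steps
                heapq.heappush(frontier, (cost + instr.cost, next(tie),
                                          seq + [instr], instr.reduce(remaining)))
        return None                        # the problem is undecidable in general: search may fail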

From Robots to Molecules: Intelligent Motion Planning and Analysis with Probabilistic Roadmaps

Date: Tuesday, March 9, 2010
Time: 11 am — 12:15 pm
Place: Mechanical Engineering, Room 218

Lydia Tapia

Institute for Computational Engineering and Sciences (ICES)
The University of Texas at Austin

Abstract:
At first glance, robots and proteins have little in common. Robots are commonly thought of as tools that perform tasks such as vacuuming the floor, while proteins play essential roles in many biochemical processes. However, the functionality of both robots and proteins is highly dependent on their motions. In the case of robots, complex spaces and many specialized planning methods can make finding feasible motions an expert task. In the case of protein molecules, several diseases such as Alzheimer's, Parkinson's, and Mad Cow Disease are associated with protein misfolding and aggregation. Understanding of molecular motion is still very limited because it is difficult to observe experimentally. Therefore, intelligent computational tools are essential to enable researchers to plan and understand motions.

In this talk, we draw on our perspective from robotics to present a novel computational approach to approximating the complex motions of proteins and robots. Our technique builds a roadmap, or graph, to capture the movable object's behavior. This roadmap-based approach has also proven successful in domains such as animation and RNA folding. With this roadmap, we can find likely motions (e.g., roadmap paths). For proteins, we demonstrate new learning-based map analysis techniques that allow us to study critical folding events such as the ordered formation of structural features and the time-based population of roadmap conformers. We will show results that capture biological findings for several proteins, including Protein G and its structurally similar mutants, NuG1 and NuG2, which demonstrate different folding behaviors. For robots, we demonstrate new learning-based map construction techniques that allow us to intelligently decide where and when to apply specialized planning methods. We will show results that demonstrate automated planning in complex spaces with little to no overhead.
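
As a concrete illustration of the roadmap idea, here is a minimal probabilistic-roadmap (PRM) sketch in Python: sample configurations, keep the collision-free ones, connect nearby pairs whose connecting motion is valid, and answer queries with graph search. The sampling and validity checks (`sample`, `is_free`, `edge_free`) are domain-specific placeholders, and the use of networkx is an assumption of this sketch.

    import numpy as np
    import networkx as nx

    def build_roadmap(sample, is_free, edge_free, n_samples=500, radius=0.3):
        """Build a roadmap graph over randomly sampled, collision-free configurations."""
        nodes = [q for q in (sample() for _ in range(n_samples)) if is_free(q)]
        G = nx.Graph()
        G.add_nodes_from(range(len(nodes)))
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                dist = float(np.linalg.norm(nodes[i] - nodes[j]))
                if dist < radius and edge_free(nodes[i], nodes[j]):
                    G.add_edge(i, j, weight=dist)
        return G, nodes

    # A likely motion between two roadmap nodes is then a weighted shortest path:
    #   nx.shortest_path(G, source, target, weight="weight")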

Bio:
Lydia Tapia is a Computing Innovation Post Doctoral Fellow in the Institute for Computational Engineering and Sciences at the University of Texas at Austin, working with Prof. Ron Elber. She received a Ph.D. in 2009 from Texas A&M University after working with Prof. Nancy Amato. At A&M she participated as a fellow in the Molecular Biophysics Training and GAANN programs and was awarded a Sloan Scholarship and a P.E.O. Scholars Award. Lydia also attended Tulane University, where she received a BS in Computer Science with academic and research honors. Prior to graduate school, she worked as a member of the technical research staff in the Virtual Reality Laboratory at Sandia National Laboratories. More information about Lydia Tapia's research and publications can be found at http://parasol.tamu.edu/~ltapia

Perpetual Systems: Building systems to survive in the wild

Date: Thursday, March 4, 2010
Time: 11 am — 12:15 pm
Place: Mechanical Engineering, Room 218

Jacob Sorber

Dept. of Computer Science
University of Massachusetts-Amherst

Abstract:
Recent advances in low-power electronics, energy harvesting, and sensor technologies are poised to revolutionize the field of mobile computing by enabling mobile systems that are long-lived, energy-aware, and self-managing. When realized, this new generation of perpetual systems will have a transformational impact, improving our ability to observe natural phenomena, providing network services to remote communities, and enabling many ubiquitous computing applications for which regular maintenance is not feasible. Unfortunately, energy and mobility make building even the simplest systems challenging. Energy harvesting is highly variable, battery storage is limited, and mobility introduces sparse connectivity. Instead of rising to meet these challenges, current mobile applications, operating systems, and languages have evolved very little from their desktop computing origins.

In this talk, I will describe challenges, results, and lessons learned from developing self-tuning mobile sensing systems in the context of two ongoing wildlife studies, focused on endangered tortoises and invasive mongooses. Specifically, I will describe language, runtime, and network techniques that simplify programming energy-aware systems and provide energy efficiency and fairness in energy-constrained networks.

Bio:
Jacob Sorber is a PhD candidate in Computer Science at the University of Massachusetts-Amherst, graduating Summer 2010. His research focuses on mobile systems, pervasive computing, and sensor networks, with an emphasis on making mobile computing systems more flexible, energy-aware, and self-managing.

Bias in Computer Systems Experiments

Date: Tuesday, March 2, 2010
Time: 11 am — 12:15 pm
Place: Mechanical Engineering, Room 218

Todd Mytkowicz

Dept. of Computer Science
University of Colorado

Abstract:
To evaluate an innovation in computer systems, a performance analyst measures execution time or other metrics using one or more standard workloads. In short, the analyst runs an experiment. To ensure the experiment is free from error, s/he carefully minimizes the amount of instrumentation, controls the environment in which the measurement takes place, repeats the measurement multiple times, and uses statistical techniques to characterize her/his data. Unfortunately, even with such a responsible approach, the analyst's experiment may still be misleading because of bias. An experiment is biased when its setup, or the environment in which the measurements are carried out, inadvertently favors a particular outcome over others. In this talk, I demonstrate that bias is large enough to mislead systems experiments and common enough that it cannot be ignored by the systems community. I describe tools and methodologies that my co-authors and I developed to mitigate the impact of bias on our experiments. Finally, I conclude with my future plans for research---tools that aid performance analysts in understanding the complex behavior of their systems.
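
One mitigation consistent with the talk's message is to randomize the experimental setup itself and report an interval rather than a single number; the Python sketch below (with `run_benchmark` and `randomize_setup` as hypothetical placeholders) shows that pattern, not the authors' actual tools.

    import random
    import statistics

    def measure(run_benchmark, randomize_setup, trials=30):
        """Run a benchmark under many randomized setups; report mean and ~95% CI."""
        times = []
        for _ in range(trials):
            setup = randomize_setup(random.Random())  # e.g., shuffle environment contents
            times.append(run_benchmark(setup))
        mean = statistics.mean(times)
        half = 1.96 * statistics.stdev(times) / (len(times) ** 0.5)
        return mean, (mean - half, mean + half)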

Bio:
Todd Mytkowicz recently defended his Ph.D. in Computer Science at the University of Colorado, advised by Amer Diwan and co-advised by Elizabeth Bradley. During his graduate tenure he was lucky enough to intern at both Xerox PARC and IBM's T.J. Watson research lab. He was also a visiting scholar at the University of Lugano, Switzerland. His research interests focus on the performance analysis of computer systems---specifically, he develops tools that aid programmers in understanding and optimizing their systems.

Automatically Finding Patches Using Genetic Programming

Date: Thursday, February 25th, 2010
Time: 11 am — 12:15 pm
Place: Mechanical Engineering, Room 218

Westley Weimer
Assistant Professor
Dept. of Computer Science
University of Virginia

Abstract:
Automatic program repair has been a longstanding goal in software engineering, yet debugging remains a largely manual process. We introduce a fully automated method for locating and repairing bugs in software. The approach works on off-the-shelf legacy applications and does not require formal specifications, program annotations or special coding practices. Once a program fault is discovered, an extended form of genetic programming is used to evolve program variants until one is found that both retains required functionality and also avoids the defect in question. Standard test cases are used to exercise the fault and to encode program requirements. After a successful repair has been discovered, it is minimized using structural differencing algorithms and delta debugging. We describe the proposed method and report experimental results demonstrating that it can successfully and rapidly repair multiple types of defects from many different programs.
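
The repair loop can be pictured with the following hedged Python sketch: variants are scored by how many test cases they pass, and evolution stops when some variant passes them all. The `mutate` and `run_tests` helpers are placeholders, and this is a schematic of the idea rather than the authors' implementation (which also localizes the fault and minimizes the final patch).

    import random

    def repair(program, mutate, run_tests, total_tests, pop_size=40, generations=100):
        """Evolve program variants until one passes all test cases.

        run_tests(variant) -> number of test cases the variant passes."""
        population = [program] + [mutate(program) for _ in range(pop_size - 1)]
        for _ in range(generations):
            scored = sorted(population, key=run_tests, reverse=True)  # fitness = tests passed
            if run_tests(scored[0]) == total_tests:
                return scored[0]          # candidate repair; minimize it afterwards
            parents = scored[: pop_size // 2]
            population = parents + [mutate(random.choice(parents))
                                    for _ in range(pop_size - len(parents))]
        return None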

Predicate Invention and Transfer Learning

Date: Thursday, February 18th, 2010
Time: 11 am — 12:15 pm
Place: Mechanical Engineering 218

Jesse Davis

Department of Computer Science and Engineering
University of Washington

Abstract:
Machine learning has become an essential tool for analyzing biological and clinical data, but significant technical hurdles prevent it from fulfilling its promise. Standard algorithms make three key assumptions: the training data consist of independent examples, each example is described by a pre-defined set of attributes, and the training and test instances come from the same distribution. Biomedical domains consist of complex, inter-related, structured data, such as patient clinical histories, molecular structures and protein-protein interaction information. The representation chosen to store the data often does not explicitly encode all the necessary features and relations for building an accurate model. For example, when analyzing a mammogram, a radiologist records many properties of each abnormality, but does not explicitly encode how quickly a mass grows, which is a crucial indicator of malignancy. In the first part of this talk, I will focus on the concrete task of predicting whether an abnormality on a mammogram is malignant. I will describe an approach I developed for automatically discovering unseen features and relations from data, which has advanced the state-of-the-art for machine classification of abnormalities on a mammogram. It achieves superior performance compared to both previous machine learning approaches and radiologists.

Bio:
Jesse Davis is a post-doctoral researcher at the University of Washington. He received his Ph.D. in computer science from the University of Wisconsin-Madison in 2007 and a B.A. in computer science from Williams College in 2002. His research interests include machine learning, statistical relational learning, transfer learning, inductive logic programming and data mining for biomedical domains.

Service Reliability and Speed in Distributed Computing Systems with Stochastic Node Failures and Communication Delays

Date: Tuesday, January 26th, 2010
Time: 11 am — 12:15 pm
Place: Mechanical Engineering 218

Majeed M. Hayat
Professor
Electrical and Computer Engineering
University of New Mexico

Abstract:
The ability to model and optimize reliability and task-execution speed is central in designing survivable distributed computing systems (DCSs) where servers are prone to fail, possibly permanently and in a spatially correlated manner. Correlated component failures in networks have been receiving attention in recent years from government agencies due to their association with damage from weapons of mass destruction. In this talk we discuss the problem of modeling the service reliability and task-execution speed of a DCS with uncertain topology, as well as the problem of load balancing in such environments. Service reliability and the mean task-execution time are analytically characterized by means of a novel regeneration-based probabilistic technique. The analysis takes into account the stochastic failure times of servers, the heterogeneity and uncertainty in service times and communication delays, as well as arbitrary task-reallocation policies. Two models are presented: the first assumes Markovian (exponentially distributed) communication and service times, and the second relaxes this assumption. The theory is utilized to optimize certain load-balancing policies for maximal service reliability or minimal task-execution time; the optimization is carried out by means of an algorithm that scales linearly with the number of nodes in the system. The analytical model is validated using both Monte-Carlo simulations and experiments.
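
To give a flavor of Monte-Carlo validation under the Markovian assumption, the Python sketch below estimates, for a single node, the probability that an exponentially distributed service time finishes before an exponentially distributed failure time; the rates and the single-node setting are illustrative assumptions, not the full DCS model or the regeneration-based analysis.

    import random

    def service_reliability(service_rate, failure_rate, trials=100_000):
        """Fraction of runs in which service completes before the host fails."""
        wins = 0
        for _ in range(trials):
            wins += random.expovariate(service_rate) < random.expovariate(failure_rate)
        return wins / trials

    # For exponential times this estimate should approach the closed form
    #   service_rate / (service_rate + failure_rate).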

Bio:
Majeed M. Hayat was born in Kuwait, in 1963. He received the B.S. degree (summa cum laude) in electrical engineering from the University of the Pacific, Stockton, CA, in 1985, and the M.S. and Ph.D. degrees in electrical and computer engineering from the University of Wisconsin-Madison, Madison, in 1988 and 1992. He is currently a Professor of Electrical and Computer Engineering and a member of the Center for High Technology Materials at the University of New Mexico, Albuquerque. His research contributions cover a broad range of topics in signal/image processing and applied probability. His current areas of interest include image processing and noise reduction in thermal images, algorithms for infrared spectral sensing and recognition, queuing models and strategies for resilient distributed systems and networks, modeling of noise and stochastic carrier dynamics in avalanche photodiodes, and performance characterization of optical receivers and photon counters.