Colloquia
Google Calendar of Colloquia
Future Colloquia (Tentative Schedule)
Info on Colloquia Requirements for Students
For students taking the colloquia course, here is some information on requirements for Spring 2008.
Visual Computing - A walk down the Visual Information Processing Pipeline
Date: Tuesday, April 10th, 2008
Time: 11 am — 12:15 pm
Place: ME 218
Marcus Magnor
Braunschweig University of Technology, Germany
Abstract:
Ground-breaking advances in science and engineering are frequently triggered by either novel technologies becoming available, new application-driven demands, or progress in related disciplines. In visual computing, all three driving forces are currently coming together: imaging technology as well as graphics hardware are progressing at breath-taking pace, increasing hardware capabilities in conjunction with mass-market proliferation create ceaseless demand for novel applications, while at the same time cognitive neuroscience is beginning to provide quantitative models of visual perception that for the first time allow to computationally emulate how we see. The field of visual computing can be expected to continue to experience a tremendous increase not only in scientific recognition but also in economic relevance. In my talk, I will present examples of research in visual computing to convince you of the scientific as well as economic prospects of visual computing.
Bio:
Marcus Magnor heads the Computer Graphics Lab at Braunschweig University of Technology, Germany. He received his BA (1995) and MS (1997) in Physics from the University of Wuerzburg and the University of New Mexico, respectively, and his PhD (2000) in Electrical Engineering from the University of Erlangen, Germany. For his post-graduate studies, he first joined the Computer Graphics Lab at Stanford University before establishing his own Independent Research Group focusing on "Graphics-Optics-Vision" at the Max-Planck-Institut Informatik in Saarbruecken, Germany. His research interests meander along the visual information processing pipeline, from image formation, acquisition, and analysis to image synthesis, display, perception, and cognition. Recent and ongoing research topics include video-based rendering, 3D-TV, augmented vision, video editing, optical phenomena, as well as astrophysical visualization
Data Mining in Word Learning and the Internet Classroom
Date: Tuesday, April 1st, 2008
Time: 11 am — 12:15 pm
Place: ME 218
Paul R. Cohen
Information Sciences Institute University of Southern California
Abstract:
The first part of this talk is about a project to have softbots and robots learn the meanings of words. Our goal is to mimic the context in which children learn word meanings, where the learner communicates about what's going on in the immediate environment with a facilitative, forgiving, capable language user. I will describe recent results from Wubble World, a game environment in which kids and their softbots communicate in English, and the softbots gradually learn word meanings. The second part of the talk is about K12 education and a project we call the internet classroom. I will quickly survey opportunities afforded by the internet classroom for research in many areas of computer science, then focus on the problem of selecting the next task or learning activity in a way that is customized to each student.
Bio:
Paul Cohen attended UCSD as an undergraduate and Stanford for his PhD in Computer Science and Psychology. He was a professor at the University of Massachusetts from 1983 - 2003, when he joined the Information Sciences Institute at the University of Southern California. At ISI, he serves as director of the Center for Research on Unexpected Events and as deputy director of the Intelligent Systems Division. At Stanford, Cohen edited the Handbook of Artificial Intelligence with Avron Barr and Edward A. Feigenbaum (extending to four volumes, eventually) which probably explains why he tries hard, though with mixed success, to understand how AI and other cognitive sciences are co-developing. Cohen is especially interested in challenge problems and research methods, and he wrote a book and many articles, and advises government agencies, on these subjects. His research develops new methods for learning, planning, and data mining, and applies them to modeling cognitive development, intelligence analysis, various military problems, and education. Cohen is a fellow of the American Association for Artificial Intelligence.
Truthful Adaptive Sponsored Search Mechanisms
Date: Tuesday, March 25th, 2008
Time: 11 am — 12:15 pm
Place: ME 218
Rica Gonen
Yahoo! Research
Abstract:
The talk will focus on two recent papers in sponsored search auctions. The first presents a truthful sponsored search auction based on an incentive-compatible multi-armed bandit mechanism. The mechanism described combines several desirable traits.
The mechanism gives advertisers the incentive to report their true bid, learns the click-through rate for advertisements, allows for slots with different quality, and loses the minimum welfare during the sampling process.
The underlying generalization of the multi-armed bandit mechanism addresses the interplay between exploration and exploitation in an online setting that is truthful in high probability while allowing for slots of different quality.
As the mechanism progresses the algorithm more closely approximates the hidden variables (click-though rates) in order to allocate advertising slots to the best advertisements. The resulting mechanism obtains the optimal welfare apart from a tightly bounded loss of welfare caused by the bandit sampling process.
The second paper extends the model of the first paper by allowing for time constrained and budget constrained advertisers who arrive and depart in an online fashion. The paper presents an online sponsored search auction that motivates advertisers to report their true budget, arrival time, departure time, and value per click.
In tackling the problem of truthful budget, arrival and departure times, it turns out that it is not possible to achieve truthfulness in the classical sense. As such, we define a new concept called delta-gain.
delta-gain bounds the utility a player can gain by lying as opposed to his utility when telling the truth. We argue that for many practical applications if the delta-gain is small, then players will not invest time and effort in making strategic choices but will truthtell as a default strategy. These concepts capture the essence of dominant strategy mechanisms as they lead the advertiser to choose truthtelling over other strategies.
In order to achieve delta-gain truthful mechanism this paper also presents a new payment scheme, for an online budget-constrained auction mechanism. The payment scheme is a generalization of the VCG principles for an online scheduling environment with budgeted players.
Using the concepts of delta-gain truthful we present the only known budget-constrained sponsored search auction with truthful guarantees on budget, arrivals, departures, and valuations. Previous works that deal with advertiser budgets only deal with the non-strategic case.
Research Areas:
My research focus is mechanism design or essentially any topic that captures the border between computer science theory, game theory and microeconomic theory. Among the topics I work on are: Combinatorial Auctions & Markets, Sponsored Search Mechanisms, Social Networks, Coalitions, Information Markets, Rational Cryptography, RationalDistributed Computation, Online Algorithms and Computation, and Approximation Algorithms.
Walking Your Dog in the Woods in Polynomial Time
Date: Thursday, March 27th, 2008
Time: 11 am — 12:15 pm
Place: ME 218
Shripad Thite
Center for the Mathematics of Information
Caltech
The Frechet distance between two curves in the plane is the minimum length of a leash that allows a dog and its owner to walk along their respective curves, from one end to the other, without backtracking. We propose a natural extension of Frechet distance to more general metric spaces, which requires the leash itself to move continuously over time. For example, for curves in the punctured plane, the leash cannot pass through or jump over point obstacles ('trees'). Thus, we introduce the homotopic Frechet distance between two curves embedded in a general metric space. We describe a polynomial-time algorithm to compute the homotopic Frechet distance between two given polygonal curves in the plane minus a given set of obstacles, which are either points or polygons.
This is joint work with Erin Wolf Chambers, Eric Colin de Verdiere, Jeff Erickson, Sylvain Lazard, and Francis Lazarus, to appear at SoCG'08 and invited to a CGTA special issue.
When is a good time? Mediating Notification Delivery to Reduce Cost of Interruption
Date: Tuesday, March 11th, 2008
Time: 11 am — 12:15 pm
Place: ME 218
Shamsi T. Iqbal
PhD Candidate
Department of Computer Science
University of Illinois at Urbana-Champaign
Abstract:
Interruptions in the workplace are becoming increasingly prevalent due to the proliferation of proactive behavior within communication applications and collaborative practices. Research has shown that interruptions at inopportune moments often result in substantial costs to users and their tasks, e.g. frustration and reduced productivity. However, information conveyed by notifications is also often beneficial to users. A current thrust within the HCI community has been to develop solutions that reduce the cost of interruption caused by notifications while maintaining their utility.
In this talk I will present my research on developing and evaluating a new solution to the problem of notification management - mediating notifications to be delivered at breakpoints during user tasks. I will first present empirical results from a study that applies theories of memory and action to understand why breakpoints have lower interruption costs. I will describe new techniques derived from theories of event perception to detect breakpoints with varying interruption costs during interactive tasks. I will then present OASIS, a system for scheduling notification delivery at moments it detects as breakpoints. OASIS allows effects of notification scheduling to be studied in practical settings and provides a test bed for experimenting with various scheduling policies. Finally, I will discuss empirical results demonstrating the utility of the system in context of authentic tasks and discuss its impacts on productivity and user affect. This work provides the first empirical evidence that intelligent notification management benefits the end user and contributes new lessons for designing effective notification management systems.
Bio:
Shamsi T. Iqbal is a Ph.D. candidate in the Department of Computer Science at the University of Illinois at Urbana-Champaign, with a focus in the area of Human-Computer Interaction. She received an M.S. in Computer Science from the University of Illinois in 2004 and a B.S. in Computer Science and Engineering from Bangladesh University of Engineering and Technology in 2001. Her dissertation work focuses on developing and evaluating computational systems for intelligently managing notifications in the desktop. Her broader research interests are in attention management, development of models of user activity, physiological measures as evaluation metrics and educational technology. Shamsi's work on interruption management has been featured in the media, e.g. the New York Times and the American Way magazine in 2007. She has served on program committees of several leading conferences in the field of human-computer interaction, including ACM UIST (Poster committee, 2007) and ACM CHI (Notes committee, 2008).
Optimization in the Dark
Date: Thursday, March 6th, 2008
Time: 11 am — 12:15 pm
Place: ME 218
Thomas Hayes
Research Assistant Professor
TTI Chicago
Consider the following model for sequential decision making in the presence of uncertainty. In each of T rounds, a decision maker must choose an action, which results in some amount of reward (or loss). This process is online: the rewards are initially unknown, but after each action, the decision maker gets the reward for that action as feedback, hopefully enabling better decisions at later times. A key feature of this problem is the need for carefully balancing Exploitation (i.e. making what seems like the best action) versus Exploration (trying an action just to see what happens).
We will focus on the case when the rewards may be viewed as (unknown) linear functions of the action space. This framework includes the classical "K-armed bandit" problem (what is the best way to play K slot machines, or "one-armed bandits"?), but is much more general. In particular, the action space may be much much larger than the dimension of the linear space it is embedded in (for the K-armed bandit, the cardinality of the action space equals the dimension). Traditional algorithms for online linear optimization, based on reduction to the K-armed bandit, suffer performance penalties proportional to the number of actions. In many practical settings, this number is too large by an exponential amount.
I will present some new algorithms, based on techniques from statistical regression, with provably near-optimal reward guarantees.
Privacy-preserving Data Aggregation over Participatory Networks
Date: Tuesday, March 4th, 2008
Time: 11 am — 12:15 pm
Place: ME 218
PhD Candidate
University of Illinois at Urbana-Champaign
The emerging participatory networked embedded systems, designed for aggregated information collection with fine-granularity, are viewed as new generation of pervasive computing systems. We expect both social and economic impact of participatory networks. Applications of participatory networks are likely to deal with highly sensitive or private information. Hence, one of the pressing concerns of users is the confidentiality of the data collected about them. Therefore, we need a way to collect aggregated information while at the same time preserve data privacy.
In this talk, I will present two privacy-preserving data aggregation schemes for additive aggregation functions. The first scheme, Cluster-based Private Data Aggregation (CPDA), leverages clustering protocol and algebraic properties of polynomials. It has the advantage of incurring less communication overhead. The second scheme, Slice-Mix-AggRegaTe (SMART), builds on slicing techniques and the associative property of addition. It has the advantage of incurring less computation overhead. The goal of this work is to bridge the gap between collaborative data collection and privacy preservation of individual data. I assess the two schemes by privacy-preservation efficacy, communication overhead, and data aggregation accuracy. Since both schemes trade message overhead for privacy, I will propose efficiency enhancement method for privacy preserving data aggregation when message overhead is a big concern. Finally, I will conclude this talk by discussing my future plan.
Bio:
Wenbo He is a Ph.D. candidate in Department of Computer Science, at University of Illinois at Urbana-Champaign, where she is supervised by Professor Klara Nahrstedt. Wenbo's research focuses on pervasive computing, security and privacy issues in networked embedded systems. Wenbo received the Mavis Memorial Fund Scholarship Award from College of Engineering of UIUC in 2006, and the C. W. Gear Outstanding Graduate Award in 2007 from the Department of Computer Science at UIUC. She is also a recipient of Vodafone Fellowship from 2005 to 2008, and NSF TRUST Fellowship in 2007. Wenbo got her M.S. from Department of Electrical and Computer Engineering at UIUC and M.Eng. from Department of Automation at Tsinghua University, Beijing, China, in 2000 and 1998 respectively. She received her B.S. from Department of Automation at Harbin Engineering University, Harbin, China in 1995. During August 2000 to January 2005, Wenbo was a software engineer in Cisco Systems, Inc.
Secure Web Applications and Expressive Security Policies
Date: Thursday, February 28, 2008
Time: 11 am — 12:15 pm
Place: ME 218
Stephen Chong
PhD Candidate
Cornell University
Abstract:
In this talk, I'll present two recent projects that make programming with strong information security more practical: a new way of writing secure web applications, and a framework for expressing and enforcing an application's security requirements.
Swift is a new way to write secure, efficient web applications. Application code is written as Java-like code, annotated with security policies. Using these policies, Swift partitions the application into JavaScript code to run on the client, and Java code to run on the server. Code and data are placed to ensure that the specified policies are obeyed, and also to provide good interactive performance. Security critical code and data are always placed on the server. Swift makes it easier to write secure web applications: the programmer does not need to worry about the secure or efficient placement of code and data.
Declassification occurs when the confidentiality of information is weakened, for example, allowing more people to read. Erasure is the opposite, and occurs when confidentiality is strengthened, for example, allowing fewer people to read, perhaps removing the information from the system entirely. We have designed a policy framework to express, and provable enforce, applications' declassification and erasure requirements. We have used the policies in the implementation of a secure remote voting service, giving increased assurance that the voting service satisfies its information security requirements.
Bio:
Stephen Chong is a Ph.D. candidate at Cornell University, in Ithaca, NY, where he is advised by Andrew Myers. Steve's research focuses on language-based security and programming languages. He received a bachelor's degree from Victoria University of Wellington, New Zealand, and plans to complete his doctorate by May 2008.
Tree-based Overlay Networks for Scalable, Reliable Tools and Applications
Date: Tuesday, February 26, 2008
Time: 11 am — 12:15 pm
Place: ME 218
Dorian Arnold
Computer Sciences Department
University of Wisconsin-Madison
Abstract:
HPC systems continue to grow in size and complexity making the development of scalable software systems increasingly difficult. As a result, very few tools and applications run effectively or at all at today's largest scales (tens and hundreds of thousands of processors). To make matters worse, million processor systems are scheduled for availability within the next two to four years.
Tree-based Overlay Networks (TBONs) have proven to be an effective computational model for scalable distributed tools and applications. A TBON is a network of hierarchically organized processes that exploits the logarithmic scaling properties of trees to provide scalable data multicast, gather, and in-network aggregation. In this talk, I will describe the TBON model, demonstrating its power and flexibility with unprecedented scalability results from a variety of application domains. I also will describe our novel TBON failure recovery model, state compensation, which relies on inherent information redundancies amongst TBON processes. State compensation features fast, decentralized tree reconstruction and state recovery protocols involving a small subset of the tree and no process coordination. The protocols are scalable because their performance is a function of the tree's fan-out, not total size. A tree with a fan-out of 64 recovers from failures in milliseconds: with only four levels, such a tree supports over a 16,000,000 processes!
Bio:
Dorian Arnold is a doctoral candidate and Intel Foundation Ph.D. fellow in the Computer Sciences Department at the University of Wisconsin. He holds a M.S. degree in Computer Science from the University of Tennessee and a B.S. degree in Mathematics and Computer Science from Regis University (Denver, CO). From 1999 to 2001, Dorian served as technical lead of the NetSolve project at the University of Tennessee's Innovative Computing Laboratory. In 2006, Dorian was a technical scholar at Lawrence Livermore National Laboratory. His research focuses on the performance and scalability issues of large distributed systems including efficient communication and runtime data analysis, fault-tolerance, and system deployment.
Techniques and Tools for Engineering Secure Web Applications
Date: Thursday, February 21, 2008
Time: 11 am — 12:15 pm
Place: ME 218
Gary Wassermann
Ph.D. candidate
UC Davis
Abstract:
Web applications enable much of today's online business including banking, shopping, university admissions, and various governmental activities. Anyone with a web browser can access them, and the data they manage typically has significant value both to the users and to the service providers. Cross-site scripting (XSS) and SQL injection are classes of attacks in which an attacker interacts with a client or database, respectively, through vulnerabilities in the server thereby gaining the trust level of the server. These classes of attacks are pervasive: since 2005, they have been the most frequently reported classes of vulnerabilities. These vulnerabilities arise because web applications' layers (client, server, and database) communicate via unstructured strings, and validating untrusted input for use in these commands is error-prone and introduces a challenging software engineering problem.
In this talk, I will present a general characterization of these classes of input validation-based errors and a set of dynamic and static techniques to detect and prevent XSS and SQL injection attacks. Programmers usually do not specify their intentions explicitly regarding SQL query construction, but I will show how we can use principled techniques to characterize programmer intentions. We can then prevent attack queries from being sent to the database with a low-overhead, runtime check that precisely distinguishes legitimate queries from attacks. In order to help find bugs early in the software development process, I also pursued static analysis, and I will describe a sound and precise analysis that scales to large, real-world web applications and found known and unknown SQL injection vulnerabilities. I will further present how we extended this static analysis to the related but more difficult problem of XSS. I will conclude this talk by discussing future challenges in this domain.
Bio:
Gary Wassermann is a Ph.D. candidate in Computer Science at UC Davis, where he specializes in software engineering and programming languages. His current research focuses on software reliability and security. He received his B.S. in Computer Science also from UC Davis. Gary is a recipient of the GAANN fellowship.
Knowledge Transfer in Reinforcement Learning
Date: Tuesday, February 19, 2008
Time: 11 am — 12:15 pm
Place: ME 218
Soumya Ray
Postdoctoral Researcher
Oregon State University
Abstract:
Humans are remarkably good at using knowledge acquired while solving past problems to efficiently solve novel, related problems. How can we build artificial agents with similar capabilities? In this talk, I focus on "reinforcement learning" (RL)—a setting where an agent must make a sequence of decisions to reach a goal, with intermittent feedback from the environment about the cost of its current decision. I describe an approach that allows agents to leverage experience gained from solving prior RL tasks. To do this, the agent learns a hierarchical Bayesian model from previously solved RL tasks and uses it to quickly infer the characteristics of a novel RL task. I present empirical evidence on navigation problems and tactical battle scenarios in a real-time strategy game, Wargus, that show that leveraging experience from prior tasks improves the rate of convergence to a solution in a new task.
Bio:
Soumya Ray obtained his baccalaureate degree from the Indian Institute of Technology, Kharagpur, and his doctorate from the University of Wisconsin, Madison in 2005. Since 2006, he has been a postdoctoral researcher in the machine learning group at Oregon State University. His research interests are in statistical machine learning, reinforcement learning and planning, and bioinformatics.
Virtual Environments: a multi-disciplinary research tool
Date: Thursday, February 14, 2008
Time: 11 am — 12:15 pm
Place: ME 218
Betty Mohler
Post-doctoral Researcher
Max Planck Institute for Biological Cybernetics
Abstract:
Virtual environments (VEs) are computer-simulations of real or fictional environments with which users can interact. Potential applications of VEs include training, visualization, entertainment, design, rehabilitation, education, and research. The ultimate goal of VEs is to provide the full sensory experience of being in a simulated world. VEs are a very powerful tool to answer scientific research questions from many disciplines. In this talk, four specific projects will be described which use virtual environments to investigate human behavior and also provide suggestions to improve upon the current technology used for VEs. First, an empirical study of space perception within immersive VEs will be presented. Second, results which demonstrate the visual influence on human locomotor behavior will be discussed. Third, a project that analyzes the illusion of self-motion within a large-screen projection VE will be presented. Finally, the implementation of fully articulated full-body avatars within a VE in real time will be described. My research vision is to use virtual environments as a multi-disciplinary tool to provide scientists from various research backgrounds with a rich collection of data on human behavior and interaction. My goal as a scientist has always been to simultaneously investigate human perception while gaining insight from the user in order to improve virtual environments hardware and applications.
Bio:
Betty Mohler received her PhD in computer science from the University of Utah in 2007. She is now in her second year of a post-doctoral research position at the Max Planck Institute for Biological Cybernetics in Tübingen, Germany. Her main research interest is in understanding the human observer towards the aim of improving virtual environment applications. Her approach has always been a multi-disciplinary one and, therefore, she currently collaborates with engineers, neuroscientists and psychologists.
Predicting Bugs by Analyzing Software History
Date: Tuesday, February 12, 2008
Time: 11 am — 12:15 pm
Place: ME 218
Sunghun Kim
Postdoctoral Associate
MIT
Abstract:
Almost all software contains undiscovered bugs, ones that have not yet been exposed by testing or by users. What is the location of these bugs? This talk presents two approaches for predicting the location of bugs by analyzing software history. First, the bug cache contains 10% of the files in a software project. Through an analysis of the software's development history and the location of bugs, files are added and removed from the cache based on four bug localities: temporal, spatial, changed-entity, and new-entity locality. After processing, files in the bug cache contain 73-95% of undiscovered bugs. Second, to further improve the localization of predicted bugs, automatic change classification uses information from the configuration management commit transactions. Using machine learning techniques (Bayes Net, Support Vector Machines), we classify commits as being likely to have a fault, or unlikely to have a fault. The best precision and recall figures for each project are typically in the mid-70's. Hence, it is possible for a configuration management system to inform a developer, post-commit, that they have just created a bug (with approximately 94% likelihood).
Bio:
Sunghun Kim is a postdoctoral associate at MIT and a member of the Program Analysis Group. He completed his Ph.D. in the Computer Science Department at the University of California, Santa Cruz in 2006. He was a Chief Technical Officer (CTO), and led a 25-person team at the Nara Vision Co. Ltd, a leading Internet software company in Korea for six years. His core research area is Software Engineering, focusing on software evolution, program analysis, and empirical studies.
Some Open Problems in Supercomputing I/O
Date: Tuesday, January 29, 2008
Time: 11 am — 12:15 pm
Place: ME 218
Lee Ward
Sandia National Laboratories
Abstract:
The dominant paradigm for I/O interfaces and implementation in the supercomputing world is based on the POSIX standards. While this is well known to be sub-optimal, many acceptable mitigation strategies and alternative approaches have not yet been investigated. This talk will give a quick overview of a few current, popular, representative, system I/O architectures, application I/O strategies, and I/O middleware. Then, we will examine and discuss a consensus-based list of open problems in the field that is used by various U.S. government agencies, such as DOE and NSF, to motivate research proposals.
Bio:
Lee Ward is a principal member of technical staff at Sandia National Laboratories. As an inveterate student of operating systems and file systems, his interests have provided the opportunity to make contributions in high performance, parallel file systems, IO libraries, hierarchical storage management, and compute cluster integration/management systems.
Learning to Walk and Swim with Shaped Manifold Control
Date: Thursday, January 24th, 2008
Time: 11 am — 12:15 pm
Place: ME 218
Bill Smart
Department of Computer Science and Engineering
Washington University in St. Louis.
Abstract:
Learning to control high-dimensional, non-linear dynamical systems is hard, in part because of the Curse of Dimensionality. The volume of the state space increases exponentially with the number of state variables used to describe the system. Learning a controller over this space often requires an exponential amount of training data, limiting us to relatively low-dimensional systems.
Many robotic systems, however, do not inhabit the entire volume of the state space. In fact, any system with a periodic gait lives on a 1-dimensional manifold embedded in the full state space of the system. In this talk, we introduce Shaped Manifold Control (SMC), which simultaneously estimates the manifold over which the system operates and learns an effective controller over this manifold. SMC sidesteps the curse of dimensionality because is learns over a 1-dimensional manifold, regardless of the dimensionality of the full state space. We have successfully applied SMC to a number of simulated high-dimensional continuous dynamical systems, including swimming and walking robots, and will demonstrate the resulting learned controllers.
Bio:
Bill Smart holds a B.Sc. (Hons) in computer science from the University of Dundee (1991), an M.Sc. in Intelligent robotics from the University of Edinburgh (1992), and both an M.S. (1996) and Ph.D. (2002) in computer science from Brown University. He is currently an assistant professor of computer science and engineering at Washington University in St. Louis. He co-directs the Media and Machines Laboratory, a multi-disciplinary laboratory performing research in the areas of robotics, computer vision, machine learning, and computer graphics. His current research focuses on learning robust controllers for high-dimensional robot systems, human-robot interaction, and direct brain-computer interfaces.
The Genomics of Quiescent Cells:CS-driven Biological Discovery
Date: Tuesday, January 22, 2008
Time: 11 am — 12:15 pm
Place: ME 218
Maggie Werner-Washburne
Biology Department
University of New Mexico
Abstract:
Modern Biology is being revolutionized by genomic approaches, enabled by knowing the entire DNA sequence of an organism. I will talk about types of data, CS approaches, and how we have collaborated with computer scientists to understand more about quiescent yeast cells.
Bio:
Maggie Werner-Washburne received a bachelor's degree in English from Stanford in 1971, master's degree from the University of Hawaii in Botany in 1979, and a Ph.D. in Botany with a minor in Biochemistry from the University of Wisconsin in 1984. Her post-doctoral work was in yeast molecular genetics at the University of Wisconsin, where she discovered that heat shock proteins worked as chaperones. She is currently a Professor in the Department of Biology at UNM. Dr. Werner-Washburne has funded continuously since coming to UNM by NSF and NIH and is currently PI of several grants, including the NIH-funded IMSD program, which supports undergraduate and graduate research around campus. She has received a Presidential Young Investigator Award, a Presidential Award for Math, Engineering, and Science Mentoring, and is an AAAS Fellow. Dr. Werner-Washburne's work has been cited more than 3500 times in the literature. The focus of her research is genomic analysis of quiescence in yeast and development of new CS and statistical approaches for the evaluation, mining, and integration of different data types in genomics and proteomics.
Clustering and spatial data mining in computational biology
Date: Friday, December 7th, 2007
Time: 1 pm — 2:30 pm
Place: ME 218
Susan Bridges
Department of Computer Science and Engineering
Mississippi State University
Abstract:
Data mining algorithms can be applied to a wide variety of problems in computational biology ranging from text mining to genome mining. This talk will cover two such applications: co-clustering of heterogenous data sets and spatial data mining of the genome.
Traditional clustering is typically based on a single feature set. In some domains, several feature sets may be available to represent the same objects, but it may not be easy to compute a useful and effective integrated feature set. We have developed two classes of algorithms to address the problem of combining the results of clustering obtained from multiple related datasets where the datasets represent identical or overlapping sets of objects but use different feature sets. Our methods are shown to yield higher quality clusters than the baseline clustering schemes that include the clustering based on individual feature sets and clustering based on concatenated feature sets.
The vast majority of DNA research has focused on genes. However, eukaryotic genomes are characterized, and often dominated by repetitive, non-genic DNA sequences and experimental evidence has shown that repetitive regions influence expression of nearby genes, alter gene structure, may be instrumental in generation of new genes, and may be a means of rapidly increasing genetic diversity during stress. We present a new method for mining the output of ab initio repeat finders to identify spatial relationships that exist among repetitive elements. We demonstrate that this method can be used to .re-discover. known elements in the genome and to discover novel elements that have not previously been described.
Bio
Susan Bridges received a bachelor's degree in botany from the University of Arkansas in 1969, master's degree from the University of Mississippi in biology in 1975 and in computer science from Mississippi State University in 1983, and a Ph.D. in computer science from the University of Alabama in Huntsville in 1989. She is currently a Professor in the Department of Computer Science and Engineering at Mississippi State University and Co-Director of the Institute of Digital Biology at Mississippi State University. She is a co-PI on an NSF CyberTrust grant and PI or co-PI on several USDA grants. The focus of her research is the application of data mining in genomes and proteomes and in integrating multiple sources of data.
Zombies, ETs and Other Encounters with Dynamic Graph Algorithms
Date: Friday, November 16th, 2007
Time: 1 pm — 2:30 pm
Place: ME 218
Valerie King
University of Victoria
Abstract:
Given any graph problem, we may ask if it's possible to maintain a solution as an input graph changes incrementally, in time faster than recomputing from scratch for each update. The goal of a dynamic graph algorithm is to efficiently implement update and query operations in reasonable space.
This talk will be a whirlwind tour of fundamental ideas in dynamic graph algorithms and lower bounds, focusing mainly on the problems of connectivity and minimum spanning tree, transitive closure and shortest paths.
Bio:
Valerie King is Professor in the Computer Science Department at the University of Victoria. She received her A.B. in mathematics from Princeton, her J.D. and Ph.D. from Boalt Hall and the Computer Science Dept. at UC Berkeley. She held post-doctoral positions at Princeton University and the University of Toronto. She has been a research scientist at NECI in Princeton, and at Compaq SRC and HP Labs. She was a visiting researcher at Microsoft Research Labs in Silicon Valley, a visiting professor at Hebrew University and the University of Copenhagen and a visiting scholar at UC Berkeley. She has been a member of the California Bar since 1981. She has done extensive work in dynamic graph algorithms, and is currently working on algorithms with applications to networks, computational biology, and distributed computing, as well as sociology of the web.
Computational insights into the social life of zebras
Date: Friday, November 9th, 2007
Time: 1 pm — 2:30 pm
Place: ME 218
Tanya Berger-Wolf
University of Illinois at Chicago
Abstract:
Computation has fundamentally changed the way we study nature. Recent breakthroughs in data collection technology, such as GPS and other mobile sensors, are giving biologists access to data about wild populations that are orders of magnitude richer than any previously collected. Such data offer the promise of answering some of the big ecological questions about animal populations:
Unfortunately, in this domain, our ability to analyze data lags substantially behind our ability to collect it. In particular, interactions among individuals are often modeled as social networks where nodes represent individuals and an edge exists if the corresponding individuals have interacted during the observation period. The model is essentially static in that the interactions are aggregated over time and all information about the time and ordering of social interactions is discarded. We show that such traditional social network analysis methods may result in incorrect conclusions on dynamic data about the structure of interactions and the processes that spread over those interactions. We have extended computational methods for social network analysis to explicitly address the dynamic nature of interactions among individuals. We have developed techniques for identifying persistent communities, influential individuals, and extracting patterns of interactions in dynamic social networks. We will present our approach and demonstrate its applicability by analyzing interactions among zebra populations and identifying how the structure of interactions relates to demographic status.
Bio:
Dr. Tanya Berger-Wolf is an assistant professor at the Department of Computer Science at the University of Illinois at Chicago. Her research interests are in applications of combinatorial optimization analysis and algorithm design techniques to problems in population biology of plants, animals, and humans, from genetics to social interactions.
Dr. Berger-Wolf has received her B.Sc. in Computer Science and Mathematics from Hebrew University (Jerusalem, Israel) in 1995 and her Ph.D. in Computer Science from University of Illinois at Urbana-Champaign in 2002. She has spent two years as a postdoctoral fellow at the University of New Mexico working in computational phylogenetics and a year at the Center for Discrete Mathematics and Theoretical Computer Science (DIMACS) doing research in computational epidemiology.
The End Of Computers As We Know Them?
Date: Friday, October 26th, 2007
Time: 1 pm — 2:30 pm
Place: ME 218
Christof Teuscher
Los Alamos National Laboratory
Abstract:
Since the beginning of modern computer science some sixty years ago, we are building computers in more or less the same way. Silicon electronics serves as a physical substrate, the von Neumann architecture provides a computer design model, while the abstract Turing machine concept supports the theoretical foundations. Alternative computing paradigms and machines always played a rather marginal role in the past, essentially because there was little or no need to go beyond the horizon of silicon electronics. That is changing: in recent years, previously unseen computing substrates have seen the light, for example because of advances in synthetic biology, nanotechnology, material science, and neurobiology. A grand challenge in computer science consists in developing architectures, design methodologies, formal frameworks, and tools that allow to reliably compute and efficiently solve problems with such alternative devices. In this talk, I will first review some exemplary future and emerging computing devices and highlight the particular challenges that arise for performing computations with them. I will then delineate potential solutions on how these challenges might be addressed. Self-assembled nano-scale electronics will serve as a simple showcase. Without disruptive new technologies, it is expected that the ever-increasing computing performance and storage capacity achieved with existing technologies will eventually reach a plateau. Last, I will illustrate future challenges that need to be addressed in order to keep computer science going at the same pace.
Bio:
Christof Teuscher currently holds a Technical Staff Member position at Los Alamos National Laboratory (LANL). He obtained his M.Sc. and Ph.D. degree in computer science from the Swiss Federal Institute of Technology in Lausanne (EPFL) in 2000 and 2004 respectively. In 2004 he became a postdoctoral researcher at the University of California, San Diego (UCSD) and in 2005 a distinguished Director's Postdoctoral Fellow at the Los Alamos National Laboratory. His main research interests include biologically-inspired computing, emerging computing architectures and paradigms, complex & adaptive systems, and cognitive science. Teuscher has received several prestigious awards and fellowships. For more information visit: http://www.teuscher.ch/christof
Predictions and Machine Learning: A personal viewpoint
Date: Friday, October 19th, 2007
Time: 1 pm — 2:30 pm
Place: ME 218
Tudor I. Oprea
Professor of Biochemistry and Molecular Biology
and Chief of the Division of Biocomputing
UNM School of Medicine
Abstract:
From the perspective of molecular bioactivity, machine learning is a widely used tool that streamlines the R&D process and enables unexpected discoveries. This talk will first showcase the use of machine learning in small molecule discovery, at UNM Biocomputing. Then we discuss the impact of the "unknown unknown" in our ability to perform accurate predictions, which is inherently limited by the past. The Black Swan, a metaphor for highly improbable, high impact events, warns us to understand the limits of predictions and expect the "unknown unknowns" everywhere, from war to science and industry. We further argue that the structure of the pharmaceutical and flavor industries are driven by, and impacted by both positive and negative Black Swans.
Bio:
Tudor I. Oprea has an MD and a PhD from the Universityof Medicine and Pharmacy in Timisoara, Romania. His past appointments were with the University of Utrecht, the Netherlands, Washington University in St.Louis, MO, and Los Alamos National Lab, before spending 6 years at AstraZeneca, Sweden. Since 2002, Tudor is Professor of Biochemistry and Molecular Biology, and Chief of the Division of Biocomputing at the UNM School of Medicine. Tudor works in the areas of QSAR, lead identification and optimization, virtual screening and drug discovery. He has co-authored two books, seventeen book chapters, more than sixty peer-reviewed papers and several patent disclosures. He received the 2002 Hansch Award from the QSAR and Molecular Modeling Society and serves as the Chair of the Cheminformatics and QSAR Society. He is the founder and CEO of Sunset Molecular Discovery LLC.
Fusion of multi-task and multi-modal brain imaging and genetic data: An integrated approach and several examples
Date: Friday, October 5th, 2007
Time: 1 pm — 2:30 pm
Place: ME 218
Vince Calhoun
Department of Electrical and Computer Engineering
Univeristy of New Mexico
Abstract:
Brain imaging techniques available today provide multiple sources of complementary information. Most research studies do collect several types of brain images on the same individuals. However the vast majority of studies analyze each data set separately, and do not attempt to directly examine the inter-relationships between the different data types. Such approaches are not straightforward due to the high dimensionality of the data (tens of thousands of voxels or timepoints or genetic factors). In this talk I will present an approach for jointly analyzing brain imaging data using independent component analysis, and present examples from structural and functional MRI, Event-related potential data, and genetic data. The examples will illustrate situations in which we learn something new by combining multiple data types which would not have been revealed in a separate analysis.
Bio:
Vince Calhoun received a bachelor's degree in Electrical Engineering from the University of Kansas, Lawrence, Kansas, in 1991, master's degrees in Biomedical Engineering and Information Systems from Johns Hopkins University, Baltimore, in 1993 and 1996, respectively, and the Ph.D. degree in electrical engineering from the University of Maryland Baltimore County, Baltimore, in 2002. He is currently the Director of Image Analysis and MR Research at the MIND Institute and an Associate Professor in the Electrical and Computer Engineering Dept. at the University of New Mexico. He is the author of 69 full journal articles, over 30 technical reports, and over 110 abstracts and conference proceedings. Much of his career has been spent on the development of data driven approaches for the analysis of functional magnetic resonance imaging (fMRI) data. He has three R01 grants from the National Institute of Health on the development of various data driven methods and multimodal data fusion approaches applied to mental illness. In addition he has an NSF grant for developing methods to analyze complex-valued imaging data.
Computer Science Town Hall Meeting
Date: Friday, September 21st, 2007
Time: 1 pm — 2:30 pm
Place: ME 218
Stephanie Forrest
Department Chair
Department of Computer Science
Dept. Chair, Stephanie Forrest, will hold a "town hall" meeting with all graduate students (interested undergraduates are invited as well). This is your chance to learn more about what's going on in the department, ask questions, make suggestions, brainstorm about new ideas, and even complain about things that aren't working well for you. The meeting is intended to be highly interactive, but to kick things off Prof. Forrest will give a short overview of department activities and opportunities for students this year.
Network Scaling in Complex Systems: What happens when organisms, cities and computer chips get bigger?
Date: Friday, September 21st, 2007
Time: 1 pm — 2:30 pm
Place: ME 218
Melanie Moses
Department of Computer Science, UNM
Abstract:
Networks distribute energy, materials and information to the components of a
variety of natural and human-engineered systems, including organisms, cities,
and computer chips. Distribution networks enable integrated and
coordinated functioning of these complex systems, and they also
constrain their design. The behavior of these systems emerges from the
actions of the components and the ways in which those components are
networked together. The systems can vary enormously in size: a whale
has 1017 times more cells than a bacterium; the population of Los
Angeles is 4 orders of magnitude larger than Hatch, NM; a Dual Core
Itanium has 4 orders of magnitude more transistors than the 286.
Biologists have observed a set of nonlinear relationships between the
size of organisms and their behaviors (e.g. how much they eat, how
long they live, how dense their populations are). Here I ask whether
such regularities exist between size and behavior of other complex
systems. In other words, can bacteria and whales tell us something
about traffic in Hatch vs. Los Angeles or the power consumed by a 286
vs. an Itanium. I examine whether these apparently different systems
face similar network scaling constraints. I also discuss how different
scaling behaviors result from differences in the degree of
centralization in networks, constraints on the physical area or volume
of networks, and physical properties of components.
Bio:
Melanie received her Ph.D. in Biology from the University of New
Mexico in 2005. She received her B.S. from Stanford University in
Symbolic Systems, an interdisciplinary program in cognition and
computation. Melanie is currently an Assistant Professor in the
Department of Computer Science at the University of New Mexico where
she continues her interdisciplinary work in Computer Science and
Biology. Her research interests are in how properties of complex
systems change as they increase in size. Her work includes
mathematical models of organism growth and reproduction, models of
cardiovascular networks and the immune system, and the application of
biological network scaling models to engineered networks.
Randomized Rounding for Sensor Placement Problems
Date: Friday, September 14th, 2007
Time: 1 pm — 2:30 pm
Place: ME 218
Cynthia Phillips
Sandia National Laboratories
Abstract:
The recent emphasis on homeland security in the U. S. has led to a number of new applications involving sensor placement in physical networks. We will describe some of these sensor placement problems including sensor placement in municipal water networks to minimize health effects from contamination and sensor placement for intruder detection in transportation networks or buildings. We have addressed these problems using (parallel) integer programming. In this talk we present a parallelizable heuristic method for finding approximate solutions to general sensor placement problems.
An Integer program (IP) is the optimization (maximization or minimization) of a linear function subject to linear constraints and integrality constraints on some or all of the variables. IPs naturally model NP-hard combinatorial optimization problems. Thus integer programming is itself NP-complete, but one can frequently solve instances in practice using branch and bound via commercial or research solvers. Removing the integrality constraints gives the linear-programming relaxation of an integer program. This is tractible both theoretically and in practice.
In this talk, we consider finding heuristic solutions to sensor placement IP problems using randomized rounding. This technique was introduced by Raghavan and Thompson. In its simplest form, one selects binary variables randomly independently using the LP-relaxation value as a probability distribution. For an arbitrary integer program, this will rarely find a solution that satisfies the linear constraints. However, our sensor-placement problems have only one hard constraint: a bound on the number of sensors. Other constraints compute the objective value.
We will present a new randomized rounding strategy that satisfies a single cardinality constraint. It samples uniformly from the set of "lucky" randomized roundings that happen to select exactly the correct number of variables. The algorithm requires O(k(n-k)) deterministic serial preprocessing time. We can then select a sensor placement with a small number of random numbers and O(n + k) additional time.
We compare our algorithm to a randomized rounding algorithm due to Doerr. His algorithm is also fast, meets the cardinality constraint, and has sufficient independence to allow Chernoff bound analysis. However, its implementation presents some challenging numerical issues. Our algorithm has stronger independence and a fundamentally different probability distribution for individual variables.
(Joint work with Jonathan Berry, Sandia National Laboratories)
Bio:
Cindy Phillips is a Distinguished Member of the Technical Staff in the Algorithms and Discrete Math Department at Sandia National Laboratories. She received a B.A. in applied mathematics from Harvard University in 1983 and a PhD in computer science from MIT in 1990. She joined Sandia National Laboratories in 1990 where she has conducted research in combinatorial optimization, algorithm design and analysis, and parallel computation with applications to scheduling, network and infrastructure surety, integer programming, graph algorithms, vehicle routing, computational biology, and experimental algorithmics. She is one of three main developers of the PICO massively-parallel integer programming code. She was a member of the Compute Process Allocator (CPA) team that won an R&D 100 award in 2006.
ConceptDoppler: A Weather Tracker for Internet Censorship
Date: Friday, September 7th, 2007
Time: 1 pm — 2:30 pm
Place: ME 218
Jed Crandall
Department of Computer Science, UNM
Abstract:
Imagine you want to remove the history of the Wounded Knee massacre from the Library of Congress, two ways to do this are: 1) remove "Bury my Heart at Wounded Knee" and a few other selected books; or 2) remove every book in the entire library that contains the word "massacre" in its text. If you view the Internet as one large library, Chinese Internet censorship based on keyword filtering is the equivalent of the latter. In this talk I'll present results from a paper we recently published about China's keyword-based Internet censorship mechanism.
We present two sets of results: 1) Internet measurements of keyword filtering by the Great "Firewall of China (GFC); and 2) initial results of using latent semantic analysis as an efficient way to reproduce a blacklist of censored words via probing.
Our Internet measurements suggest that the GFC's keyword filtering is more a panopticon than a firewall, i.e., it need not block every illicit word, but only enough to promote self-censorship. China's largest ISP, ChinaNET, performed 83.3% of all filtering of our probes, and 99.1% of all filtering that occurred at the first hop past the Chinese border. Filtering occurred beyond the third hop for 11.8% of our probes, and there were sometimes as many as 13 hops past the border to a filtering router. Approximately 28.3% of the Chinese hosts we sent probes to were reachable along paths that were not filtered at all. While more tests are needed to provide a definitive picture of the GFC's implementation, our results disprove the notion that GFC keyword filtering is a firewall strictly at the border of China's Internet.
While evading a firewall a single time defeats its purpose, it would be necessary to evade a panopticon almost every time. Thus, in lieu of evasion, we propose ConceptDoppler, an architecture for maintaining a censorship "weather report about what keywords are filtered over time. Probing with potentially filtered keywords is arduous due to the GFC's complexity and can be invasive if not done efficiently. Just as an understanding of the mixing of gases preceded effective weather reporting, understanding of the relationship between keywords and concepts is essential for tracking Internet censorship. We show that LSA can effectively pare down a corpus of text and cluster filtered keywords for efficient probing, present 122 keywords we discovered by probing, and underscore the need for tracking and studying censorship blacklists by discovering some surprising blacklisted keywords such as (in Chinese) conversion rate, Mein Kampf, and International geological scientific federation (Beijing).
(Joint work with Daniel Zinn, Michael Byrd, Earl Barr, UC Davis and Rich East)
Bio:
Jed received his Ph.D. from the University of California at Davis and his B.S. from Embry-Riddle Aeronautical University in Prescott, Arizona.
He is currenly an Assistant Professor with the Department of Computer Science, Univeristy of New Mexico.
Jed's research area is computer security and privacy, with a background in computer architecture, ranging from architectural support for systems security to the capturing and analyzing of Internet worms. More recent work includes behavior-based analysis of malicious code, including using a new technique called temporal search to detect timebombs within computer viruses based on their use of hardware timers. He also studies the technical issues of government censorship.
Automated Deduction and Its Application to Mathematics
Date: Friday, August 31, 2007
Time: 1 pm — 2:30 pm
Place: ME 218
Robert Veroff
Department of Computer Science, UNM
Abstract:
One of the objectives of automated deduction is to develop tools that use
mathematical logic and deduction to solve, or help people solve, problems
coming from a wide variety of application domains. Such tools are being
used for research in mathematics and have led to the solution of numerous
open questions. In this talk, I will give a brief introduction to the
field and will describe some of the activities of our research group.
Bio:
Bob Veroff is professor emeritus in the UNM Computer Science Department.
