Found 29 results, listed from newest to oldest.
TR-CS-2011-02
Exploiting Task Relatedness for Multitask Learning of Bayesian Network Structures
Diane Oyen and Terran Lane
We address the problem of learning Bayesian networks for a collection of unsupervised tasks when limited data is available and a metric of the relatedness of tasks is given. We exploit this valuable information about task-relatedness to learn more robust structures for each task than those learned with a standard multitask learning algorithm. Our approach is the first network structure learning algorithm that addresses zero-data learning, where no training data is available for a task but data for related tasks and information about how the tasks are related are given. This paper describes our Task Relationship Aware Multitask (TRAM) algorithm for learning Bayesian networks. The inputs to the algorithm are task-specific sets of data and an undirected graph representing a metric of task-relatedness. TRAM regularizes over the task-relatedness graph to output a network for each task, each of which is biased toward the structures of related tasks. We empirically compare, using synthetic and neuroimaging data, TRAM with three baseline methods and show that task-relatedness knowledge impacts all learning methods; only TRAM effectively uses this knowledge to learn more robust structures.
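The regularization idea in this abstract can be illustrated with a small sketch: per-task edge scores are repeatedly smoothed toward the scores of related tasks on the task-relatedness graph before each task's structure is read off. The scores, the smoothing update, and the cut-off below are illustrative stand-ins, not the TRAM algorithm itself.

```python
# Illustrative sketch only: smoothing per-task edge scores over a
# task-relatedness graph, in the spirit of (but not identical to) TRAM.
import numpy as np

rng = np.random.default_rng(1)
n_tasks, n_vars, lam, n_iters = 3, 4, 1.0, 30

# Task-relatedness graph: tasks 0-1 related, 1-2 related.
relatedness = np.array([[0, 1, 0],
                        [1, 0, 1],
                        [0, 1, 0]], dtype=float)

# Stand-in local edge scores (e.g., computed from per-task data).
local = rng.random((n_tasks, n_vars, n_vars))
local[:, np.arange(n_vars), np.arange(n_vars)] = 0.0   # no self-loops
local[2] = 0.0            # "zero-data" task: only related tasks inform it

scores = local.copy()
for _ in range(n_iters):
    # Pull each task's edge scores toward the average of its neighbors.
    neighbor_avg = np.einsum('st,tij->sij', relatedness, scores)
    neighbor_avg /= relatedness.sum(axis=1)[:, None, None]
    scores = (local + lam * neighbor_avg) / (1.0 + lam)

# Keep each task's above-average edges (an arbitrary cut-off for the demo).
structures = scores > scores.mean(axis=(1, 2), keepdims=True)
print(structures.astype(int))
```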
TR-CS-2010-08
Exploiting Task Relatedness to Learn Multiple Bayesian Network Structures
Diane Oyen and Terran Lane
We address the problem of learning multiple Bayesian network structures for experimental data where the experimental conditions define relationships among datasets. A metric of the relatedness of these datasets, or tasks, contains valuable information that we exploit to learn more robust structures for each task. We represent the task-relatedness with an undirected graph. Our method uses a regularization framework over this task-relatedness graph to learn Bayesian network structures for each task that are smoothed toward the structures of related tasks. Experiments on synthetic data and real fMRI experiment data show that this method learns structures that are close to ground truth, when available, and that generalize to holdout data better than an existing multitask learning method, than networks learned independently, and than a single global network learned for all tasks.
TR-CS-2010-06
Enabling rational democratic decision-making with collective belief models and game theoretic analysis
Kshanti Greene, Joe Kniss and George Luger
We introduce a new approach to aggregating the beliefs and preferences of many individuals to form models for democratic decision-making. Traditional social choice functions used to aggregate beliefs and preferences attempt to find a single consensus model, but produce inconsistent results when a stalemate between opposing opinions occurs. Our approach combines the probabilistic beliefs of many individuals using Bayesian decision networks to form collectives such that the aggregate of each collective, or collective belief, has rational behavior. We first extract the symbolic preference order used in social choice theory from each individual's quantitative Bayesian beliefs. We then show that if a group of individuals share a preference order, their aggregate will uphold principles of rationality defined by social choice theorists. These groups form the collectives, from which we can extract the Pareto-optimal solutions. By representing the situation competitively rather than forcing cooperation, our approach identifies the situations for which no single consensus model exists and returns a rational set of models and their corresponding solutions. Using an election simulation, we demonstrate that our approach predicts the behavior of a large group of decision-makers more accurately than a single consensus approach, which exhibits unstable behavior.
TR-CS-2009-10
Knowledge-Based Probabilistic Reasoning from Expert Systems to Graphical Models
George F. Luger, Chayan Chakrabarti
An important research enterprise for the Artificial Intelligence community since the 1970s has been the design of expert or "knowledge-based" systems. These programs used explicitly encoded human knowledge, often in the form of a production rule system, to solve problems in the areas of diagnostics and prognostics. The earliest research/development program in expert systems was created by Professor Edward Feigenbaum at Stanford University (Buchanan and Shortliffe 1984). Because the expert system often addresses problems that are imprecise and not fully specified, with data sets that are often inexact and unclear, the role of various forms of probabilistic support for reasoning is important.
The 1990s saw radically new approaches to the design of automated reasoning/diagnostic systems. With the creation of graphical models, the explicit pieces of human knowledge (of the expert system) were encoded into causal networks, sometimes referred to as Bayesian belief networks (BBNs). The reasoning supporting these networks rests on two simplifying assumptions (that reasoning cannot be cyclic and that the causality supporting a child state is expressed in the links between it and its parent states), which make BBN reasoning computationally quite manageable. In recent years the use of graphical models has replaced the traditional expert system, especially in situations where reasoning is diagnostic and prognostic, i.e., extending from concrete situations to the best explanations for their occurrence. This type of reasoning is often termed abductive.
In this chapter we first (Section 1) present the technology supporting the traditional knowledge-based expert system, including the production system for reasoning with rules. Next (Section 2), we discuss Bayesian inference and the adoption of simplifying techniques such as the Stanford Certainty Factor Algebra. We then (Section 3) introduce graphical models, including the assumptions supporting the use of Bayesian belief networks (BBNs), and present an example of BBN reasoning. We conclude (Section 4) with a brief introduction to a next-generation system for diagnostic reasoning with more expressive forms of the BBN.
TR-CS-2009-08
Satisficing the masses: Applying game theory to large-scale, democratic decision problems
Kshanti A. Greene, Joseph M. Kniss, George F. Luger, Carl R. Stern (Management Sciences Inc.)
We present ongoing research on large-scale decision models in which there are many invested individuals. We apply our unique Bayesian belief aggregation approach to decision problems, taking into consideration the beliefs and utilities of each individual. Instead of averaging all beliefs to form a single consensus, our aggregation approach allows divergence in beliefs and utilities to emerge. In decision models this divergence has implications for game theory, enabling the competitive aspects of an apparently cooperative situation to emerge. Current approaches to belief aggregation assume cooperative situations by forming one consensus from diverse beliefs. However, many decision problems involve individuals and groups with opposing goals, so this forced consensus does not accurately represent the decision problem. By applying our approach to the topical issue of stem cell research using input from many diverse individuals, we analyze the behavior of a decision model, including the groups of agreement that emerge. We show how to find the Pareto-optimal solutions, which represent the decisions in which no group can do better without another group doing worse. We analyze a range of solutions, from attempting to "please everybody" with the solution that minimizes every emerging group's losses, to optimizing the outcome for a subset of individuals. Our approach has the long-reaching potential to help define policy and analyze the effect of policy change on individuals.
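The Pareto-optimal set mentioned in the abstract (decisions where no group can do better without another group doing worse) is easy to extract once each group's utility for each candidate decision is known. The utility matrix below is invented for illustration; it is not data from the stem cell study.

```python
# Hedged sketch: extracting Pareto-optimal decisions from a matrix of
# group utilities (rows = candidate decisions, columns = groups).
import numpy as np

def pareto_optimal(utilities):
    """Return indices of decisions not dominated by any other decision."""
    u = np.asarray(utilities, dtype=float)
    keep = []
    for i, row in enumerate(u):
        # row i is dominated if some other row is >= everywhere and > somewhere
        dominated = np.any(np.all(u >= row, axis=1) & np.any(u > row, axis=1))
        if not dominated:
            keep.append(i)
    return keep

# Three candidate policies evaluated by two opposing groups (toy numbers).
utilities = [[0.9, 0.1],   # favors group A
             [0.1, 0.9],   # favors group B
             [0.5, 0.5],   # compromise
             [0.4, 0.4]]   # dominated by the compromise
print(pareto_optimal(utilities))   # -> [0, 1, 2]
```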
TR-CS-2009-04
Fast Bayesian Network Structure Search Using Gaussian Processes
Blake Anderson and Terran Lane
In this paper we introduce two novel methods for performing Bayesian network structure search that make use of Gaussian Process regression. Using a relatively small number of samples from the posterior distribution of Bayesian networks, we are able to find an accurate function approximator based on Gaussian Processes. This allows us to remove our dependency on the data during the search and leads to massive speed improvements without sacrificing performance. We use our function approximator in the context of Hill Climbing, a local-score based search algorithm, and in the context of a global optimization technique based on response surfaces. We applied our methods to both synthetic and real data. Results show that we converge to networks of equal score to those found by traditional Hill Climbing, but at a fraction of the total time.
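A rough sketch of the surrogate idea described here, assuming a toy structure score, flattened adjacency matrices as features, and no acyclicity checks (all simplifications, not the authors' implementation): fit a Gaussian Process to a modest sample of scored structures, then hill climb against the cheap GP prediction instead of the data.

```python
# Hedged sketch: a Gaussian Process surrogate for an expensive structure score.
import itertools
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
n_nodes = 5
weights = rng.normal(size=(n_nodes, n_nodes))

def true_score(adj):
    # Stand-in for an expensive data-dependent score such as BDe or BIC.
    return float(np.sum(adj * weights) - 0.5 * adj.sum() ** 2 / n_nodes)

# 1) Score a modest sample of structures exactly, then fit the GP once.
X, y = [], []
for _ in range(200):
    adj = (rng.random((n_nodes, n_nodes)) < 0.2).astype(float)
    np.fill_diagonal(adj, 0)
    X.append(adj.ravel())
    y.append(true_score(adj))
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-3).fit(X, y)

# 2) Hill climb against the cheap GP prediction: flip one arc at a time.
adj = np.zeros((n_nodes, n_nodes))
best = gp.predict([adj.ravel()])[0]
improved = True
while improved:
    improved = False
    for i, j in itertools.permutations(range(n_nodes), 2):
        cand = adj.copy()
        cand[i, j] = 1 - cand[i, j]
        pred = gp.predict([cand.ravel()])[0]
        if pred > best:
            adj, best, improved = cand, pred, True
print("GP-predicted score:", round(best, 3), "true score:", round(true_score(adj), 3))
```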
TR-CS-2009-03
Approaches to the Network-Optimal Partition Problem for fMRI Data
Benjamin Yackley, Terran Lane
Although much MRI research has been performed using existing atlases of brain regions, it is possible that these regions, which are anatomically determined, are not the truly optimal way to divide brain activity into parts. The goal of this research is to discover a way to take existing fMRI data and use it as a basis for a clustering method that would divide the brain into functional partitions independent of anatomy. Since Bayesian networks over anatomical regions have been shown to be informative models of brain activity, this paper describes attempts to solve a mathematical problem which is an abstraction of the above goal: to partition a group of variables (the voxels of the MRI data) into clusters such that the optimal Bayesian network given the clustering is as "good" as possible.
TR-CS-2008-15
Agreeing to disagree: Leveraging consensus and divergence in Bayesian belief aggregation
Kshanti A. Greene and George F. Luger
We present a new approach for combining the beliefs of many individuals using graphical models. Existing Bayesian belief aggregation methods break several theoretical assumptions for Bayesian reasoning. More practically, existing opinion pool functions that compute a single value to represent the belief of all contributors do not represent reality well, especially in cases where there are many diverse opinions. Divergence is a natural result of combining opinions from individuals with different beliefs, backgrounds and experiences. Instead of forming a single consensus value that will average out this diversity, we find clusters of agreement for each probability distribution and propagate the cluster means throughout the network during inference. We utilize a social network that tracks the agreement between individuals and the normalized graph cut algorithm to find emerging groups of consensus in the agreement network. We leverage the agreement that occurs across multiple belief estimates to help reduce the complexity that may arise as the means are propagated throughout a belief network. By monitoring agreement over time we may also expose the variety of backgrounds that will help explain divergence in belief. This paper discusses the approach, background and our motives for ongoing research.
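The consensus-finding step can be illustrated with a minimal two-way normalized cut on a toy agreement matrix; the matrix, the sign-split heuristic, and the two-cluster assumption are illustrative simplifications, not the paper's full procedure.

```python
# Illustrative two-way normalized cut on a toy agreement matrix, using the
# sign of the second eigenvector of the normalized Laplacian.
import numpy as np

# Symmetric agreement weights among 6 individuals: two opinion blocks
# {0,1,2} and {3,4,5} with weak cross-block agreement.
W = np.array([[0, 5, 4, 1, 0, 0],
              [5, 0, 6, 0, 1, 0],
              [4, 6, 0, 0, 0, 1],
              [1, 0, 0, 0, 5, 4],
              [0, 1, 0, 5, 0, 6],
              [0, 0, 1, 4, 6, 0]], dtype=float)

d = W.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L_sym = np.eye(len(W)) - D_inv_sqrt @ W @ D_inv_sqrt   # normalized Laplacian

eigvals, eigvecs = np.linalg.eigh(L_sym)
fiedler = eigvecs[:, 1]                  # second-smallest eigenvector
clusters = (fiedler > 0).astype(int)     # sign split ~ two-way normalized cut
print(clusters)                          # e.g. [0 0 0 1 1 1]
```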
TR-CS-2008-14
Learning structurally consistent undirected probabilistic graphical models
Sushmita Roy, Terran Lane, Margaret Werner-Washburne
In many real-world domains, undirected graphical models such as Markov random fields provide a more natural representation of the dependency structure than directed graphical models. For example, Bayesian networks cannot explicitly capture cyclic dependencies, which occur commonly in real-world networks such as those in biology. Unfortunately, structure learning of undirected graphs using likelihood-based scores remains difficult because of the intractability of computing the partition function. We describe a new Markov random field structure learning algorithm motivated by the canonical parameterization of Abbeel et al. We improve on their parameterization by learning per-variable canonical factors, which makes our algorithm suitable for domains with hundreds of nodes. Our algorithm is similar to learning dependency networks, but the learned structure is guaranteed to be consistent and therefore represents a consistent joint probability distribution. We compare our algorithm against several algorithms for learning undirected and directed models on simulated and real datasets from the biology domain. Our algorithm frequently outperforms existing algorithms, producing higher-quality structures, suggesting that enforcing consistency during structure learning is beneficial for learning undirected graphs.
TR-CS-2008-07
Bayesian Network Score Approximation using a Metagraph Kernel
Benjamin Yackley, Eduardo Corona, and Terran Lane
Many interesting problems, including Bayesian network structure-search, can be cast in terms of finding the optimum value of a function over the space of graphs. However, this function is often expensive to compute exactly. We here present a method derived from the study of reproducing-kernel Hilbert spaces which takes advantage of the regular structure of the space of all graphs on a fixed number of nodes to obtain approximations to the desired function quickly and with reasonable accuracy. We then test this method on both a small testing set and a real-world Bayesian network; the results suggest that not only is this method reasonably accurate, but that the BDe score itself varies quadratically over the space of all graphs.
TR-CS-2008-06
Using Laplacian Methods, RKHS Smoothing Splines and Bayesian Estimation as a framework for Regression on Graph and Graph Related Domains
Eduardo Corona, Terran Lane, Curtis Storlie, Joshua Neil
We first present an overview of the Laplacian operator, its properties, and the methods that arise from analyzing its eigenvalues and eigenfunctions. We draw our material mainly from (BLS00) and (Chu97), trying to combine the linear algebra approach with the more analytic one, which suggests the analogy with the continuous Laplacian operator on manifolds.
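As a concrete, generic example of regression on a graph via the Laplacian, the standard Laplacian-regularized least-squares estimator solves min_f ||f - y||^2 + lam * f^T L f, giving f_hat = (I + lam*L)^{-1} y. The chain graph and noise model below are illustrative, not the authors' RKHS construction.

```python
# Generic sketch of Laplacian-regularized regression on a graph:
#   f_hat = argmin_f ||f - y||^2 + lam * f^T L f  =  (I + lam*L)^{-1} y
import numpy as np

n, lam = 10, 2.0
A = np.zeros((n, n))                 # chain graph 0-1-2-...-9
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A       # combinatorial Laplacian

rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, np.pi, n))   # smooth target on the chain
y = signal + 0.3 * rng.normal(size=n)       # noisy observations at the nodes

f_hat = np.linalg.solve(np.eye(n) + lam * L, y)   # smoothed estimate
print(np.round(f_hat, 2))
```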
TR-CS-2007-20
Machine Learning: Probabilistic
George F. Luger, Nikita A. Sakhanenko, Roshan R. Rammohan, Chayan Chakrabarti
In this chapter we introduce probabilistic concepts and present several important machine learning algorithms influenced by these stochastic methods. At the heart of the chapter lie Bayes' rule and Markov chains. Probabilistic models of machine learning use the Bayesian approach which allows for interpretation of new evidence based on prior knowledge and experience. Similarly, Markov chains describe the probability of a current event as a function of the probabilities of events at previous time steps. We present hidden Markov models and their variants and illustrate the Viterbi algorithm with a speech processing example. Dynamic Bayesian networks, a temporal extension to traditional Bayesian networks, are presented. We also discuss different learning techniques including Markov chain Monte-Carlo and Expectation Maximization algorithms. Finally, we show how probabilistic measures can be used to extend reinforcement learning and Markov decision processes. This chapter is published as chapter 13 in the book "Artificial Intelligence: Structures and Strategies for Complex Problem Solving" by G. F. Luger (Addison-Wesley 2009).
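The Viterbi recursion mentioned in the abstract is short enough to sketch directly. The two-state, three-symbol HMM below is a toy example, not the book's speech-processing illustration.

```python
# Viterbi decoding for a small discrete HMM (toy parameters, log domain).
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely state sequence for observations `obs`.
    pi: initial state probs (S,), A: transitions (S,S), B: emissions (S,O)."""
    S, T = len(pi), len(obs)
    logd = np.full((T, S), -np.inf)          # best log-prob ending in state s
    back = np.zeros((T, S), dtype=int)       # backpointers
    logd[0] = np.log(pi) + np.log(B[:, obs[0]])
    for t in range(1, T):
        for s in range(S):
            cand = logd[t - 1] + np.log(A[:, s]) + np.log(B[s, obs[t]])
            back[t, s] = np.argmax(cand)
            logd[t, s] = cand[back[t, s]]
    path = [int(np.argmax(logd[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
print(viterbi([0, 1, 2, 2], pi, A, B))       # -> [0, 0, 1, 1]
```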
TR-CS-2007-10
Selecting Bayesian Network Parameterizations for Generating Simulated Data
John Burge, Terran Lane
The correct solution for an unsupervised learning task is never known. Thus, experimental results for such tasks must be validated. This can be done by turning the task into a supervised learning problem by training on simulated data where the correct solution is known and controllable. While this is a common practice, little research is done on generating interesting classes of random data that can be applied to many domains. We present a method to select the parameters of a Bayesian network (BN) which can be used to generate simulated data. Our methods allow the magnitude of correlations among random variables in the data to be precisely controlled. We apply our methods to the validation of BN structure search techniques and demonstrate that by precisely controlling correlations in the simulated data, qualitative differences among structure learning algorithms can be observed.
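A minimal sketch of the underlying idea of dialing correlation strength through a network's parameters, using a two-node network where a single CPT offset controls the parent-child correlation (the paper's parameter-selection method is more general than this toy case):

```python
# Toy illustration: a single CPT parameter `delta` controls how strongly a
# child variable B is correlated with its parent A in data sampled from a BN.
import numpy as np

def empirical_corr(delta, n=100_000, seed=0):
    rng = np.random.default_rng(seed)
    a = rng.random(n) < 0.5                       # P(A=1) = 0.5
    p_b = np.where(a, 0.5 + delta, 0.5 - delta)   # P(B=1|A) shifted by delta
    b = rng.random(n) < p_b
    return np.corrcoef(a, b)[0, 1]

for delta in (0.0, 0.2, 0.4):
    print(f"delta={delta:.1f}  empirical corr(A,B)={empirical_corr(delta):.2f}")
```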
TR-CS-2007-04
Managing Dynamic Contexts Using Failure-Driven Stochastic Models
Nikita A. Sakhanenko, George F. Luger, Carl R. Stern
We describe an architecture for representing and managing context shifts that supports dynamic data interpretation. This architecture utilizes two layers of learning and three layers of control for adapting and evolving new stochastic models to accurately represent changing and evolving situations. At the core of this architecture is a form of probabilistic logic used to encode sets of recursive relationships defining Dynamic Bayesian Models. This logic is extended with a syntax for defining contextual restrictions on stochastic Horn clauses. EM parameter learning is used to calibrate models as well as to assess the quality of fit between the model and the data. Model failure, detected as a poor fit between model and data, triggers a model repair mechanism based on causally informed context splitting and context merging. An implementation of this architecture for distributed weather monitoring is currently under development.
TR-CS-2006-03
A Theoretical and Experimental Evaluation of Augmented Bayesian Classifiers
Vikas Hamine and Paul Helman
Naive Bayes is a simple Bayesian network classifier with strong independence assumptions among features. This classifier, despite its strong independence assumptions, often performs well in practice. It is believed that relaxing the independence assumptions of naive Bayes may improve the performance of the resulting structure. Augmented Bayesian Classifiers relax the independence assumptions of naive Bayes by adding, between the features of a naive Bayes classifier, augmenting arcs that obey certain structural restrictions. In this paper we present algorithms for learning Augmented Bayesian Classifiers with respect to the Minimum Description Length (MDL) and Bayesian-Dirichlet (BD) metrics. Experimental results on the performance of these algorithms on various datasets selected from the UCI Machine Learning Repository are presented. Finally, a comparison of the learning rates and accuracies of various Augmented Bayesian Classifiers on artificial data is presented.
TR-CS-2005-39
The Design and Testing of a First-Order Logic-Based Stochastic Modeling Language
Daniel J. Pless, Chayan Chakrabarti, Roshan Rammohan, and George F. Luger
We have created a logic-based, Turing-complete language for stochastic modeling. Since the inference scheme for this language is based on a variant of Pearl's loopy belief propagation algorithm, we call it Loopy Logic. Traditional Bayesian networks have limited expressive power, basically constrained to finite domains as in the propositional calculus. Our language contains variables that can capture general classes of situations, events and relationships. A first-order language is also able to reason about potentially infinite classes and situations using constructs such as hidden Markov models (HMMs). Our language uses Expectation-Maximization (EM)-type learning of parameters. This fits naturally with the loopy belief propagation used for inference, since both can be viewed as iterative message passing algorithms. We present the syntax and theoretical foundations for our Loopy Logic language. We then demonstrate three examples of stochastic modeling and diagnosis that explore the representational power of the language. A mechanical fault detection example displays how Loopy Logic can model time-series processes using an HMM variant. A digital circuit example exhibits the probabilistic modeling capabilities, and finally, a parameter fitting example demonstrates the power for learning unknown stochastic values.
TR-CS-2005-24
Dynamic Bayesian Networks: Class Discriminative Structure Search with an Application to Functional Magnetic Resonance Imaging Data
John Burge
The neuroanatomical behavior and underlying root causes of many mental illnesses are only moderately understood. To further understand these illnesses, the relationships among neuroanatomical regions of interest (ROIs) can be modeled. Neuroscientists have several standardized methods that have yielded much success; however, many of these methods make restrictive assumptions. On the other hand, the machine learning community has developed techniques that do not require many of these assumptions. I propose a method based on dynamic Bayesian networks (DBNs) that is capable of modeling the temporal relationships among ROIs. DBNs are graphical models that explicitly represent statistical correlations among random variables (RVs). Within the fMRI domain, it is not known a priori which ROIs are correlated. Thus, the DBN's structure modeling the ROI interactions must be learned. My research focuses on novel techniques capable of learning the structure of DBNs from a given set of data. First, I observe that under certain circumstances, learning multiple Bayesian networks, one for each possible class of data, is a better approach than the frequently employed method of learning a single BN with a class node. Second, I propose a method for determining which structures are representative of class-discriminating relationships. Third, I propose a structure search heuristic that takes advantage of the relationships among hierarchically related RVs. Finally, I propose to incorporate shrinkage (a statistical technique previously used in machine learning) in the calculation of the DBN's parameters. This proposal also summarizes preliminary work on the classification of functional magnetic resonance imaging data with my proposed scoring function.
TR-CS-2005-13
A First-Order Stochastic Prognostic System for the Diagnosis of Helicopter Rotor Systems for the US Navy
Chayan Chakrabarti, Roshan Rammohan and George F. Luger
We have created a diagnostic system for the US Navy to use in the analysis of the running health of helicopter rotor systems. Although our system is not yet deployed for real-time in-flight diagnosis, we have successfully analyzed the data sets of actual helicopter rotor failures supplied by the US Navy. We discuss both critical techniques supporting the design of our stochastic diagnostic system as well as issues related to full deployment.
Our diagnostic system, called DBAYES, is composed of a logic-based, first-order, and Turing-complete set of software tools for stochastic modeling. We use this language for modeling time-series data supplied by sensors on the mechanical system. The inference scheme for these software tools is based on a variant of Pearl's loopy belief propagation algorithm. Our language contains variables that can capture general classes of situations, events, and relationships. A Turing-complete language is able to reason about potentially infinite classes and situations, similar to the analysis of Dynamic Bayesian Networks. Since the inference algorithm is based on a variant of loopy belief propagation, the language includes Expectation Maximization-type learning of parameters in the modeled domain. In this paper we briefly present the theoretical foundations for our first-order stochastic language and then demonstrate time-series modeling and learning in the context of fault diagnosis.
TR-CS-2005-12
A First-Order Stochastic Modeling Language for Diagnosis
Chayan Chakrabarti and George F. Luger
We have created a logic-based, first-order, and Turing-complete set of software tools for stochastic modeling. Because the inference scheme for this language is based on a variant of Pearl's loopy belief propagation algorithm, we call it Loopy Logic. Traditional Bayesian belief networks have limited expressive power, basically constrained to that of atomic elements as in the propositional calculus. Our language contains variables that can capture general classes of situations, events, and relationships. A Turing-complete language is able to reason about potentially infinite classes and situations, with a Dynamic Bayesian Network. Since the inference algorithm for Loopy Logic is based on a variant of loopy belief propagation, the language includes an Expectation Maximization-type learning of parameters in the modeling domain. In this paper we briefly present the theoretical foundations for our loopy-logic language and then demonstrate several examples of stochastic modeling and diagnosis.
TR-CS-2004-28
Bayesian Classification of FMRI Data: Evidence for Altered Neural Networks in Dementia
John Burge, Vincent P. Clark, Terran Lane, Hamilton Link and Shibin Qiu
The alterations in functional relationships among brain regions associated with senile dementia are not well understood. We present a machine learning technique using dynamic Bayesian networks (DBNs) that extracts causal relationships from functional magnetic resonance imaging (fMRI) data. Based on these relationships, we build neural-anatomical networks that are used to classify patient data as belonging to healthy or demented subjects. Visual-motor reaction time task data from healthy young, healthy elderly, and demented elderly patients (Buckner et al. 2000) was obtained through the fMRI Data Center. To reduce the extremely large volume of data acquired and the high level of noise inherent in fMRI data, we averaged data over neuroanatomical regions of interest. The DBNs were able to correctly discriminate young vs. elderly subjects with 80% accuracy, and demented vs. healthy elderly subjects with 73% accuracy. In addition, the DBNs identified causal neural networks present in 93% of the healthy elderly studied. The classification efficacy of the DBN was similar to two other widely used machine learning classification techniques: support vector machines (SVMs) and Gaussian naïve Bayesian networks (GNBNs), with the important advantage that the DBNs provide candidate neural anatomical networks associated with dementia. Networks found in demented but not healthy elderly patients included substantial involvement of the amygdala, which may be related to the anxiety and agitation associated with dementia. DBNs may ultimately provide a biomarker for dementia in its early stages, and may be helpful for the diagnosis and treatment of other CNS disorders.
TR-CS-2004-13
Logic-Based First-Order Stochastic Language that Learns
Roshan Rammohan, Chayan Chakrabarti, Dan Pless, Kshanti Greene and George Luger
We have created a logic-based, first-order, and Turing-complete set of software tools for stochastic modeling. Because the inference scheme for this language is based on a variant of Pearl's loopy belief propagation algorithm, we call it Loopy Logic. Traditional Bayesian belief networks have limited expressive power, basically constrained to that of atomic elements as in the propositional calculus. Our language contains variables that can capture general classes of situations, events, and relationships. A Turing-complete language is able to reason about potentially infinite classes and situations.
Since the inference algorithm for Loopy Logic is based on a variant of loopy belief propagation, the language includes an Expectation Maximization-type learning of parameters in the modeling domain. In this paper we present the theoretical foundations for our loopy-logic language and then demonstrate several examples using the Loopy Logic system for stochastic modeling and diagnosis.
TR-CS-2004-11
Learning Optimal Augmented Bayes Networks
Vikas Hamine and Paul Helman
Naive Bayes is a simple Bayesian classifier with strong independence assumptions among the attributes. This classifier, despite its strong independence assumptions, often performs well in practice. It is believed that relaxing the independence assumptions of a naive Bayes classifier may improve the classification accuracy of the resulting structure. While finding an optimal unconstrained Bayesian Network (for most any reasonable scoring measure) is an NP-hard problem, it is possible to learn in polynomial time optimal networks obeying various structural restrictions. Several authors have examined the possibilities of adding augmenting arcs between attributes of a Naive Bayes classifier. Friedman, Geiger and Goldszmidt define the TAN structure in which the augmenting arcs form a tree on the attributes, and present a polynomial time algorithm that learns an optimal TAN with respect to MDL score. Keogh and Pazzani define Augmented Bayes Networks in which the augmenting arcs form a forest on the attributes (a collection of trees, hence a relaxation of the structural restriction of TAN), and present heuristic search methods for learning good, though not optimal, augmenting arc sets. The authors, however, evaluate the learned structure only in terms of observed misclassification error and not against a scoring metric, such as MDL. In this paper, we present a simple, polynomial time greedy algorithm for learning an optimal Augmented Bayes Network with respect to MDL score.
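For context, the classical TAN construction of Friedman, Geiger and Goldszmidt referenced above can be sketched as follows. This is the likelihood-motivated spanning-tree step over conditional mutual information, not the MDL-optimal greedy algorithm the paper proposes, and the data below are synthetic.

```python
# Sketch of the classical TAN structure step: weight attribute pairs by
# conditional mutual information I(Xi; Xj | C) and keep a maximum spanning
# tree as the augmenting arcs (illustrative only).
import itertools
import numpy as np

def cond_mutual_info(x, y, c):
    """I(X;Y|C) for small discrete arrays, by direct counting."""
    mi = 0.0
    for cv in np.unique(c):
        mask = c == cv
        pc = mask.mean()
        xs, ys = x[mask], y[mask]
        for xv in np.unique(xs):
            for yv in np.unique(ys):
                pxy = np.mean((xs == xv) & (ys == yv))
                px, py = np.mean(xs == xv), np.mean(ys == yv)
                if pxy > 0:
                    mi += pc * pxy * np.log(pxy / (px * py))
    return mi

rng = np.random.default_rng(0)
n, k = 500, 4                                   # samples, binary attributes
C = rng.integers(0, 2, n)                       # class labels
X = rng.integers(0, 2, size=(n, k))
flip = rng.random(n) < 0.1
X[:, 1] = np.where(flip, 1 - X[:, 0], X[:, 0])  # make X1 strongly depend on X0

cmi = np.zeros((k, k))
for i, j in itertools.combinations(range(k), 2):
    cmi[i, j] = cmi[j, i] = cond_mutual_info(X[:, i], X[:, j], C)

# Maximum spanning tree over the attributes (Prim's algorithm), rooted at X0;
# arcs are directed away from the root, and C becomes a parent of every Xi.
in_tree, arcs = {0}, []
while len(in_tree) < k:
    i, j = max(((a, b) for a in in_tree for b in range(k) if b not in in_tree),
               key=lambda e: cmi[e])
    arcs.append((i, j))
    in_tree.add(j)
print("augmenting arcs (parent -> child):", arcs)
```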
TR-CS-2004-10
Reconstructing Phylogenies from Gene-Content and Gene-Order Data
Bernard M.E. Moret, Jijun Tang, and Tandy Warnow
Gene-order data has been used successfully to reconstruct organellar phylogenies; it offers low error rates, the potential to reach farther back in time than through DNA sequences (because genome-level events are much rarer than DNA point mutations), and relative immunity from the so-called gene-tree vs. species-tree problem (caused by the fact that the evolutionary history of specific genes is not isomorphic to that of the organism as a whole). It has also provided deep mathematical and algorithmic results dealing with permutations and shortest sequences of operations on these permutations. Recent developments include generalizations to handle insertions, duplications, and deletions, scaling to large numbers of organisms, and, to a lesser extent, to larger genomes; and the first Bayesian approach to the reconstruction problem. We survey the state-of-the-art in using such data for phylogenetic reconstruction, focusing on recent work by our group that has enabled us to handle arbitrary insertions, duplications, and deletions of genes, as well as inversions of gene subsequences. We conclude with a list of research questions (mathematical, algorithmic, and biological) that will need to be addressed in order to realize the full potential of this type of data.
TR-CS-2003-04
Scaling Up Accurate Phylogenetic Reconstruction from Gene-Order Data
J. Tang and B.M.E. Moret
Phylogenetic reconstruction from gene-order data has attracted increasing attention from both biologists and computer scientists over the last few years. Methods used in reconstruction include distance-based methods (such as neighbor-joining), parsimony methods using sequence-based encodings, Bayesian approaches, and direct optimization. The latter, pioneered by Sankoff and extended by us with the software suite GRAPPA, is the most accurate approach, but cannot handle more than about 15 genomes. We report here on our successful efforts to scale up direct optimization through a two-step approach: the first step decomposes the dataset into smaller pieces and runs the direct optimization (GRAPPA) on the smaller pieces, while the second step builds a tree from the results obtained on the smaller pieces. We used the sophisticated disk-covering method (DCM) pioneered by Warnow and her group, suitably modified to take into account the computational limitations of GRAPPA. We find that DCM-GRAPPA scales gracefully to at least 1,000 genomes and retains surprisingly high accuracy throughout the range: in our experiments, the topological error rate rarely exceeded a few percent. Thus, reconstruction based on gene-order data can now be accomplished with high accuracy on datasets of significant size.
TR-CS-2003-01
EM Learning of Product Distributions in a First-Order Stochastic Logic Language
Daniel Pless and George Luger
We describe a new logic-based stochastic modeling language called Loopy Logic. It is an extension of the Bayesian logic programming approach of Kersting and De Raedt [2000]. We specialize the Kersting and De Raedt formalism by suggesting that product distributions are an effective combining rule for Horn clause heads. We use a refinement of Pearl's loopy belief propagation [Pearl, 1998] for the inference algorithm. We also extend the Kersting and De Raedt language by adding learnable distributions. We propose a message passing algorithm based on Expectation Maximization [Dempster et al., 1977] for estimating the learned parameters in the general case of models built in our system. We have also added some additional utilities to our logic language including second order unification and equality predicates.
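A compact sketch of the loopy belief propagation that underlies the inference algorithm, run on a small pairwise model with one cycle; the potentials and the three-variable cycle are illustrative, not a Loopy Logic program.

```python
# Sum-product loopy belief propagation on a three-variable cycle of binary
# variables with "agreement" potentials (illustrative, not Loopy Logic itself).
import numpy as np

edges = [(0, 1), (1, 2), (2, 0)]
psi = np.array([[2.0, 1.0],           # pairwise potential favoring agreement
                [1.0, 2.0]])
unary = [np.array([0.7, 0.3]),        # node 0 leans toward state 0
         np.array([0.5, 0.5]),
         np.array([0.5, 0.5])]

# messages[(i, j)] is the message from node i to node j over j's two states.
messages = {(i, j): np.ones(2) for a, b in edges for i, j in ((a, b), (b, a))}

for _ in range(50):                   # synchronous message updates
    new = {}
    for i, j in messages:
        incoming = unary[i].copy()    # local evidence at i ...
        for s, t in messages:         # ... times messages into i, except from j
            if t == i and s != j:
                incoming *= messages[(s, t)]
        m = psi.T @ incoming          # marginalize out node i's states
        new[(i, j)] = m / m.sum()
    messages = new

for v in range(3):                    # approximate marginals (beliefs)
    belief = unary[v].copy()
    for s, t in messages:
        if t == v:
            belief *= messages[(s, t)]
    belief /= belief.sum()
    print(f"node {v}: P(0)={belief[0]:.2f}  P(1)={belief[1]:.2f}")
```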
TR-CS-2002-33
An Investigation of Phylogenetic Likelihood Methods
Tiffani Williams and Bernard M.E. Moret
We analyze the performance of likelihood-based approaches used to reconstruct phylogenetic trees. Unlike other techniques such as Neighbor-Joining (NJ) and Maximum Parsimony (MP), relatively little is known regarding the behavior of algorithms founded on the principle of likelihood. We study the accuracy, speed, and likelihood scores of four representative likelihood-based methods (fastDNAml, MrBayes, PAUP*, and TREE-PUZZLE) that use either Maximum Likelihood (ML) or Bayesian inference to find the optimal tree. NJ is also studied to provide a baseline comparison. Our simulation study is based on random birth-death trees that deviate from ultrametricity, and it uses the Kimura 2-parameter + Gamma model of sequence evolution. We find that MrBayes (a Bayesian inference approach) consistently outperforms the other methods in terms of accuracy and running time.
TR-CS-2002-31
Scaling Up Accurate Phylogenetic Reconstruction from Gene-Order Data
J. Tang, T. Liu, and B.M.E. Moret
Phylogenetic reconstruction from gene-order data has attracted increasing attention from both biologists and computer scientists over the last few years. Methods used in reconstruction include distance-based methods (such as neighbor-joining), parsimony methods using sequence-based encodings, Bayesian approaches, and direct optimization. The latter, pioneered by Sankoff and extended by us with the software suite GRAPPA, is the most accurate approach, but cannot handle more than about 15 genomes. We report here on our successful efforts to scale up the direct approach through a two-step approach: the first step decomposes the dataset into smaller pieces and runs the direct optimization (GRAPPA) on the smaller pieces, while the second step builds a tree from the results obtained on the smaller pieces. We used both a quartet-based approach (the quartet-puzzling method) and the more sophisticated disk-covering method (DCM) pioneered by Warnow. We find that DCM-GRAPPA scales gracefully to 100 genomes and retains surprisingly high accuracy throughout the range: in our experiments, the topological error rate rarely exceeded a few percent. The quartet-based methods perform better than might be expected from their known accuracy on sequence data, but trail far behind DCM-GRAPPA in terms of both speed and accuracy. Overall, our results demonstrate that reconstruction based on gene-order data can now be accomplished with high accuracy on datasets of significant size.
TR-CS-2002-23
Loopy Logic
Dan Pless and George Luger
In this paper we describe a new logic-based stochastic modeling language. It is an extension of the Bayesian logic programming approach of Kersting and De Raedt. We specialize the Kersting and De Raedt formalism by suggesting that product distributions are an effective combining rule for Horn clause heads. We use a refinement of Pearl's loopy belief propagation for the inference algorithm. We also extend the Kersting and De Raedt language by adding learnable distributions. We propose a message passing algorithm based on Expectation Maximization for estimating the learned parameters in the general case of models built in our system. We have also added some additional utilities to our logic language, including second-order unification and equality predicates.
TR-CS-2002-18
A Bayesian Network Classification Methodology for Gene Expression Data
Paul Helman, Robert Veroff, Susan R. Atlas, and Cheryl Willman
We present new techniques for the application of the Bayesian network learning framework to the problem of classifying gene expression data. Our techniques address the complexities of learning Bayesian nets in several ways. First, we focus on classification and demonstrate how this reduces the Bayesian net learning problem to the problem of learning subnetworks consisting of a class label node and its set of parent genes. We then consider two different approaches to identifying parent sets which are supported by current evidence; one approach employs a simple greedy algorithm to search the universe of all genes, and a second approach develops and applies a gene selection algorithm whose results are incorporated as a prior to enable an exhaustive search for parent sets over a restricted universe of genes. Two other significant contributions are the construction of classifiers from multiple, competing Bayesian network hypotheses and algorithmic methods for normalizing and binning gene expression data in the absence of prior expert knowledge. Our classifiers are first developed under a cross validation regimen against two publicly available data sets and then validated on corresponding out-of-sample test sets. The classifiers attain a classification rate in excess of 90% on each of these out-of-sample test sets.
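The greedy search over parent sets described here can be illustrated with a small sketch that adds, one at a time, the binarized gene whose inclusion most improves a penalized class log-likelihood. The score, the binarization, and the synthetic data are stand-ins, not the paper's Bayesian scoring or preprocessing.

```python
# Illustrative greedy parent-set search for a class node: repeatedly add the
# binarized gene whose inclusion most improves a BIC-style penalized score.
import numpy as np

def class_score(parents, X, y):
    """Smoothed log-likelihood of y given the joint parent configuration,
    minus a BIC-style penalty on the number of parameters."""
    n = len(y)
    config = np.zeros(n, dtype=int)
    for col in parents:                       # encode configuration as an int
        config = config * 2 + X[:, col]
    ll = 0.0
    for cfg in np.unique(config):
        counts = np.bincount(y[config == cfg], minlength=2) + 0.5
        ll += np.sum(counts * np.log(counts / counts.sum()))
    return ll - 0.5 * (2 ** len(parents)) * np.log(n)

rng = np.random.default_rng(0)
n, genes = 400, 6
X = rng.integers(0, 2, size=(n, genes))       # binarized expression levels
p = 0.1 + 0.5 * X[:, 2] + 0.3 * X[:, 4]       # class driven by genes 2 and 4
y = (rng.random(n) < p).astype(int)

parents, best = [], class_score([], X, y)
improved = True
while improved and len(parents) < 3:
    improved = False
    for g in set(range(genes)) - set(parents):
        s = class_score(parents + [g], X, y)
        if s > best:
            best, best_gene, improved = s, g, True
    if improved:
        parents.append(best_gene)
print("selected parent genes:", parents)      # typically [2, 4]
```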