Homework 3

Due: Tues, Mar 20, 2012, start of class. (Tues after Spring Break)

turnin key: cs429-529.hw3

All students:

  • Write a maximum-likelihood estimator for the mean and covariance matrix of a Gaussian distribution in a space of arbitrary dimension. Apply it to the four data sets in hw3_q1_data.zip.

    For each data set, report the following:

    • Estimated mean vector for each class. For data set D (100 dims) report only the first five dimensions of each mean vector.
    • Eigenvalues of the estimated covariance matrix for each class. For data set D (100 dims) report only the five largest eigenvalues for each class.
    • Empirically estimated error rate (e.g., by cross-validation).
    • Exact Bayes-optimal error rate, according to the distribution models you learned for the two classes.
  • Given the following degenerate gamma distribution:
    f(x) = 1/(26! ⋅ β^27) ⋅ x^26 ⋅ exp(-x/β)
    for data x>0 and parameter β>0,
    1. Write down the likelihood function for an IID data sample X, L(β|X)
    2. Find the log-likelihood, l(β|X)
    3. Find the maximum-likelihood estimator for β
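
The two estimation tasks above can be sanity-checked numerically. The following is only a sketch, assuming NumPy is available and that each data set can be loaded as an N x d array with one sample per row; the function names (gaussian_mle, top_eigenvalues, gamma_log_likelihood) and the synthetic demo data are hypothetical and are not part of the assignment or of hw3_q1_data.zip. The grid search over β is a numerical check against which a closed-form estimator can be compared, not a replacement for the derivation.

```python
import math
import numpy as np

def gaussian_mle(X):
    """Maximum-likelihood mean and covariance for the rows of X (shape N x d)."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    centered = X - mu
    # The ML covariance divides by N (the unbiased estimate divides by N - 1).
    sigma = centered.T @ centered / X.shape[0]
    return mu, sigma

def top_eigenvalues(sigma, k=5):
    """The k largest eigenvalues of a symmetric matrix, in descending order."""
    return np.linalg.eigvalsh(sigma)[::-1][:k]

def gamma_log_likelihood(beta, x):
    """Log-likelihood of beta for f(x) = x^26 exp(-x/beta) / (26! beta^27)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # math.lgamma(27) equals log(26!).
    return (26.0 * np.log(x).sum() - x.sum() / beta
            - n * (math.lgamma(27.0) + 27.0 * math.log(beta)))

# Synthetic demo data (assumed values, not the homework data sets).
rng = np.random.default_rng(0)
X = rng.multivariate_normal([1.0, -2.0], [[2.0, 0.5], [0.5, 1.0]], size=5000)
mu_hat, sigma_hat = gaussian_mle(X)
print(mu_hat, top_eigenvalues(sigma_hat, k=2))

# Samples from the gamma density above with beta = 3, then a coarse grid search.
x = rng.gamma(shape=27.0, scale=3.0, size=2000)
betas = np.linspace(1.0, 6.0, 501)
beta_grid = betas[np.argmax([gamma_log_likelihood(b, x) for b in betas])]
print(beta_grid)
```

With enough samples, the printed mean and the grid maximizer should land close to the values used to generate the synthetic data, which gives a quick check on both the code and a hand-derived estimator.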

529 Section:

  1. Derive a Bayesian classification boundary for binary class data with an arbitrary cost matrix, C, where c_{i,j} gives the cost of misclassifying an item of class i as class j.
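
As a way to check a derived boundary numerically, the general minimum-expected-cost rule can be sketched as follows. This is an illustration under assumptions, not the requested derivation: it assumes class posteriors are already available, the function name min_risk_decision is hypothetical, and the cost matrix and posterior values are made up for the example.

```python
import numpy as np

def min_risk_decision(posteriors, C):
    """Pick, for each sample, the label with the lowest expected cost.

    posteriors: shape (N, K), row n holds P(class i | x_n) for i = 0..K-1.
    C: shape (K, K), C[i, j] = cost of labeling a true class-i item as class j.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    C = np.asarray(C, dtype=float)
    # risk[n, j] = sum_i P(class i | x_n) * C[i, j]
    risk = posteriors @ C
    return risk.argmin(axis=1)

# Illustrative (assumed) costs: mistaking class 1 for class 0 costs 5,
# mistaking class 0 for class 1 costs 1, correct decisions cost 0.
C = np.array([[0.0, 1.0],
              [5.0, 0.0]])
post = np.array([[0.9, 0.1],
                 [0.8, 0.2],
                 [0.5, 0.5]])
print(min_risk_decision(post, C))  # prints [0 1 1]
```

Note that the second sample is assigned to class 1 even though its posterior favors class 0, because the asymmetric costs shift the decision boundary away from the equal-posterior point.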