Homework 1

Due: Thursday, Feb 4, 2010, at the start of class.

All students:

  1. Devise a decision tree learning algorithm that can accept an arbitrary loss (cost) matrix for its loss function. (One possible shape is sketched below.)
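
For concreteness, here is a minimal sketch of one way such an algorithm can look, not a required design: grow the tree greedily, define a node's impurity as the minimum expected loss achievable by a single prediction under the cost matrix, and choose the split that most reduces it (the loss-matrix analogue of information gain). Everything below is an assumption, not part of the assignment: the dict-of-dicts cost-matrix format, binary equality splits on categorical features, and hypothetical names such as fit_tree.

    from collections import Counter

    def leaf_label(labels, loss):
        """Prediction minimizing expected loss at a leaf; loss[true][pred] is the cost."""
        counts, n = Counter(labels), len(labels)
        return min(loss, key=lambda pred: sum(counts[c] / n * loss[c][pred] for c in counts))

    def node_loss(labels, loss):
        """Impurity of a node: expected loss of its best single prediction."""
        counts, n = Counter(labels), len(labels)
        best = leaf_label(labels, loss)
        return sum(counts[c] / n * loss[c][best] for c in counts)

    def fit_tree(rows, labels, loss, depth=0, max_depth=10):
        """Greedy recursive growth; splits maximize expected-loss reduction."""
        base = node_loss(labels, loss)
        if base == 0 or depth == max_depth:
            return ("leaf", leaf_label(labels, loss))
        n, best = len(labels), None
        for f in range(len(rows[0])):                      # every feature...
            for v in {row[f] for row in rows}:             # ...and every value
                yes = [i for i in range(n) if rows[i][f] == v]
                no = [i for i in range(n) if rows[i][f] != v]
                if not yes or not no:
                    continue
                child = sum(len(part) / n * node_loss([labels[i] for i in part], loss)
                            for part in (yes, no))
                if best is None or base - child > best[0]:
                    best = (base - child, f, v, yes, no)
        if best is None or best[0] <= 0:                   # no split helps: stop
            return ("leaf", leaf_label(labels, loss))
        _, f, v, yes, no = best
        def grow(idx):
            return fit_tree([rows[i] for i in idx], [labels[i] for i in idx],
                            loss, depth + 1, max_depth)
        return ("split", f, v, grow(yes), grow(no))

    def predict(tree, row):
        """Route a row down the tree to its leaf's prediction."""
        while tree[0] == "split":
            _, f, v, yes, no = tree
            tree = yes if row[f] == v else no
        return tree[1]

A quick check with an asymmetric cost matrix (hypothetical data):

    loss = {"spam": {"spam": 0, "ham": 5},   # missing spam is expensive
            "ham":  {"spam": 1, "ham": 0}}   # a false alarm is cheap
    rows = [("offer", "html"), ("offer", "text"), ("meeting", "text")]
    labels = ["spam", "spam", "ham"]
    tree = fit_tree(rows, labels, loss)
    print(predict(tree, ("offer", "text")))  # -> spam

With a 0-1 cost matrix, node_loss reduces to misclassification impurity and leaf_label to majority vote, which is a handy sanity check for whatever design you submit.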

Students enrolled in the 529 section must also do the following (the relevant definitions are restated after the list):

  1. Show that entropy is concave (i.e., anti-convex) as a function of the class distribution.
  2. Show that a binary, categorical decision tree, using information gain as its splitting criterion, never decreases purity. That is, information gain is non-negative for every possible split, and is zero only when the split leaves the class distribution unchanged in both children.
  3. Prove that the basic decision tree learning algorithm (i.e., the greedy, recursive, tree-growing algorithm from class, with no pruning), using the information gain splitting criterion, halts.
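
For reference, all three items lean on the same definitions from class; they are restated here in standard notation (an assumption; match it to the lecture notes). Let S be the sample at a node, p_c the fraction of S with class label c, and S_v the subset of S on which the split attribute A takes value v. Then

    H(S) = -\sum_{c} p_c \log_2 p_c

    IG(S, A) = H(S) - \sum_{v \in \mathrm{Vals}(A)} \frac{|S_v|}{|S|} \, H(S_v)

Concavity of H, applied through Jensen's inequality, is the standard bridge from item 1 to the non-negativity claimed in item 2.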