CS591 S'06 Class Readings
All members of the class (including auditors) are to read all of the assigned papers and meet with your group members to discuss each paper before class. The goal is for each group member to contribute her or his own insights and background knowledge to the small group discussions, and hopefully to clear up many confusions before we get to class. Topics that I would like you to cover in this discussion include some subset of:
- Does everybody understand the content of the paper? If not, what issues need to be clarified to improve understanding? Does the group have the collective knowledge to answer these questions, or does it require outside input? (E.g., from me, your other classmates, etc.) Feel free to come see me in office hours or to send mail to ml-class.
- What machine learning techniques are under investigation? Are these innovative ML algorithms, variants on existing algorithms, or well understood algorithms applied in novel ways or tested on novel data?
- What problem domain is under investigation? Did it benefit from the application of ML techniques? Did this approach solve the problem completely, or is further work necessary?
- How would you extend/improve this work? Did the authors make mistakes or oversimplifications? Is the proposal an approximation that could be improved? Could the ML algorithm be extended or a more sophisticated algorithm used in its place? Is there another (better, more interesting, or just novel) problem domain that this approach would apply to? (If so, what modifications to the approach would be necessary to apply it?)
- Was the experimental work rigorous? Did the authors use interesting or trivial data sets? How did they validate their methods? Do their experimental results fully support their claims? If not, what experimental work should they have done?
Note: You don't have to answer all of these questions -- only address those that are relevant to the paper at hand. These questions are suggestions of things to keep your eye out for as you discuss the paper. What I'm mostly looking for is your thoughts on the paper and how you would go beyond what these authors did.
Deliverables
Each group should turn in (at the beginning of class) a short (1-2 pages), typewritten summary of their discussion. Specifically, your writeup should include:
- A summary of the content of the paper (1-2 paragraphs). Don't simply copy the abstract -- formulate your own summary of the paper. This should describe the domain that was studied, the ML algorithms that were used, and the results of the study. (Hint: this is good practice at writing your own abstracts, for those who haven't done this much yet.)
- A description of how you would extend/improve this work (1-3 paragraphs). Again, please don't just take the authors' "future work" -- formulate your own thoughts about where to take this work. See the discussion points above for some starting places on this.
Finally, I want to encourage you to have fun with these papers. They sometimes seem pretty dry, but they're discussing some fascinating things and you'll really learn far more about ML in practice through these than through the high-level presentations that you get in lecture or the book.
Enjoy!
The Papers
- Feb 21
- Dietterich, T. G. "An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization." Machine Learning, 40(2), 139-157, 2000. (Available in full text via UNM subscription; either retrieve it from a campus URL or use the library's log-in service.)
Some possibly useful background materials:
- Dietterich, T. G. "Ensemble Learning". In The Handbook of Brain Theory and Neural Networks, Second edition, (M.A. Arbib, Ed.), Cambridge, MA: The MIT Press, 2002.
- DH&S Ch 9.5.1-2
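As a warm-up before the discussion, it may help to see the three ensemble methods the paper compares running side by side. The sketch below uses modern scikit-learn (not the paper's original code or data); `ExtraTreesClassifier` is used here as a stand-in for Dietterich's "randomization" (random choice among candidate splits), which is an assumption of convenience, not an exact reimplementation.

```python
# Hedged sketch: bagging, boosting, and randomized tree construction,
# the three ensemble methods compared in Dietterich (2000).
# Uses scikit-learn on synthetic data -- NOT the paper's own experiments.
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              ExtraTreesClassifier)
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# A small synthetic classification problem (placeholder for the
# UCI data sets used in the paper).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

ensembles = {
    # Bagging: each tree trained on a bootstrap resample of the data.
    "bagging": BaggingClassifier(DecisionTreeClassifier(),
                                 n_estimators=50, random_state=0),
    # Boosting: trees trained sequentially, reweighting hard examples.
    "boosting": AdaBoostClassifier(n_estimators=50, random_state=0),
    # "Randomization" stand-in: trees with randomly chosen split points.
    "randomized": ExtraTreesClassifier(n_estimators=50, random_state=0),
}

for name, clf in ensembles.items():
    score = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: {score:.3f}")
```

Comparing the cross-validated accuracies on your own machine is a cheap way to form intuitions to test against the paper's much more careful experimental methodology.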
