Possibly Useful Resources
Other useful machine learning reference books
- Duda, R. O., Hart, P. E., and Stork, D. G., Pattern Classification, 2nd ed., John Wiley & Sons, 2001. (A recent revision of a classic in the field. Fairly mathematically dense, but chock-full o' ML goodness.)
- Mitchell, T. M., Machine Learning, McGraw-Hill, 1997. (Another canonical text. Very accessible, but not as unified as the DHS text.)
- Witten, I. H. and Frank, E., Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann, 2000. (Much less technical, but gives some good, practical, real-world examples. And code.)
- Hastie, T., Tibshirani, R., and Friedman, J., The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, 2001. (Very, very technical. Excellent and thorough, but can be difficult to follow.)
- Sutton, R. and Barto, A., Reinforcement Learning: An Introduction, MIT Press, 1998. (The only text that really covers reinforcement learning in any depth.)
- Hand, D., Mannila, H., and Smyth, P., Principles of Data Mining, MIT Press, 2001. (A good, relatively in-depth text, focusing on the theory and practice of data mining specifically.)
- Rasmussen, C. E. and Williams, C. K. I., Gaussian Processes for Machine Learning, MIT Press, 2006. (A more specialized book, on a single, but important, category of learning machines.)
- Schölkopf, B. and Smola, A. J., Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, MIT Press, 2001. (A surprisingly accessible text on support vector machines, kernel methods, and the related computational learning theory.)
Software
- Weka: Machine learning software suite. Not really industrial scale, but supports a wide range of methods and makes experimentation fairly convenient. Also provides an archive of nicely formatted "benchmark" data sets, including many of the UCI data sets.
- ML Demos: A suite for interactive experimentation with ML algorithms. Currently only supports a handful of methods, and is restricted to 2D data visualization, but provides some nice intuition about what is happening and how different hyperparameters affect the learned model.
