CS 533 - Experimental Methods in Computer Science

Course description:

This course explores the design, experimentation, testing, and pitfalls of empirical research in Computer Science. In particular, students will learn how to use a data-driven approach to understand computing phenomena, formulate hypotheses, design computing experiments to test and validate or refute those hypotheses, and evaluate and interpret empirical results. Overall, the goal of this course is to provide students with the foundations of rigorous empirical research.


Most lectures will be loosely based on the following optional textbooks and additional reading material:

Empirical Methods for Artificial Intelligence

  • Author: Paul Cohen
  • Publisher: A Bradford Book (August 3, 1995)
  • ISBN-10: 0262032252
  • ISBN-13: 978-0262032254

A Guide to Experimental Algorithmics

  • Author: Catherine McGeoch
  • Publisher: Cambridge University Press; 1 edition (January 30, 2012)
  • ISBN-10: 0521173019
  • ISBN-13: 978-0521173018

Measuring Computer Performance: A Practitioner's Guide

  • Author: David J. Lilja
  • Publisher: Cambridge University Press; 1 edition (September 8, 2005)
  • ISBN-10: 0521646707
  • ISBN-13: 978-0521646703


Academic honesty:

Unless otherwise specified, you must write/code your own homework assignments. You cannot use the web to find answers to any assignment. If you do not have time to complete an assignment, it is better to submit your partial solutions than to get answers from someone else. Cheating students will be disciplined according to University guidelines. Students should become acquainted with their rights and responsibilities as explained in the Student Code of Conduct.

Any and all acts of plagiarism will result in an immediate dismissal from the course and an official report to the dean of students.

Instances of plagiarism include, but are not limited to: downloading code and snippets from the Internet without explicit permission from the instructor and/or without proper acknowledgment, citation, or licensing; using code from a classmate or any other past or present student; quoting text directly or paraphrasing closely from a source without proper reference; any other act of copying material and presenting it as your own.

Note that dismissal from the class means that the student will be dropped from the course with an F.

The best way of avoiding plagiarism is to start your assignments early. Whenever you feel like you cannot keep up with the course material, your instructor is happy to find a way to help you. Make an appointment or come to office hours, but DO NOT plagiarize; it is not worth it!

Class attendance:

Class attendance is expected (read: mandatory) and note taking is encouraged. Important information (about exams, assignments, projects, and policies) may be communicated only in the lectures. We may also cover additional material (not available in the books) during lecture. If you miss a lecture, you should find out what material was covered and whether any announcements were made.



Homework:

Homework will be assigned to reinforce concepts covered in class and may include exercises, coding, or data analysis. Homework accounts for 10% of your final grade, and no late homework will be accepted.

Paper discussions

Papers will be discussed every week. Students are required to read each paper and prepare a one-page review. Each week, one student will act as the discussion leader and is expected to prepare slides or other adequate material to present the main points of the paper. Depending on the topic, the rest of the class will participate in an open discussion or a debate of the paper.

  • Presentation and discussion of the paper will be done during class time and account for 15% of your final grade.
  • Paper reviews are due at 8 am on the day of the paper discussion and account for 25% of your final grade. Late paper reviews won't be accepted.

Daily assignments and participation

You can expect simple exercises every meeting. These daily assignments will be done in groups specified by the instructor and will count toward your participation grade (10% of your final grade).


Exams:

Exams are our formal evaluation tool. In the exams you will be tested on the learning goals of this course (see the schedule below for the list of learning goals). Exams will comprise a mix of practical exercises and concepts. I do not encourage you to learn concepts and definitions by heart, but to be able to explain them in your own words and to place them into the broader context to which they belong. There will be one midterm exam on TBD and one final exam at the end of April.

The exams are open notes, but only personal, hand-written notes are accepted. Restrictions include (but are not limited to): you cannot download notes from the Internet, you cannot use the electronic notes of the course, and you cannot photocopy notes from your classmates. The key point is that they must be your own hand-written notes, because I expect you to reinforce what you learned in class by writing down key concepts.


Grading:

  • Participation 15 pts
  • Homework 10 pts
  • Paper reviews 25 pts
  • Paper presentation 10 pts
  • Exams 40 pts

Grades will be based on your earned points, following the grade scale below. You need to earn at least the number of points shown to obtain the grade in the same column. Scores will be rounded to the closest integer value.

A	A-	B+	B	B-	C+	C	F
95	93	90	85	83	80	75	<75
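As an illustration only, the scale above can be read as a simple threshold lookup. This hypothetical snippet (not course-provided code) maps earned points to letter grades, rounding to the nearest integer as described above:

```python
def letter_grade(points):
    """Map earned points to a letter grade using the course scale.

    Scores are rounded to the nearest integer first; anything below
    75 (after rounding) is an F.
    """
    scale = [(95, "A"), (93, "A-"), (90, "B+"), (85, "B"),
             (83, "B-"), (80, "C+"), (75, "C")]
    pts = round(points)
    for cutoff, grade in scale:
        if pts >= cutoff:
            return grade
    return "F"

print(letter_grade(94.6))  # rounds to 95 -> "A"
print(letter_grade(74))    # below every cutoff -> "F"
```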

Please note:

  • An Incomplete can be assigned only for a documented medical reason.
  • Change of grade to CR/NC after the semester deadline will be granted ONLY under special, documented extenuating circumstances.


Student feedback:

I value students' opinions regarding the course and will take them into consideration to make this course as exciting and engaging as possible. Thus, throughout the semester I will ask students for formal and informal feedback. Formal feedback includes short surveys on my teaching effectiveness, preferred teaching methods, and the pace of the class. Informal feedback will take the form of polls or in-class questions regarding learning preferences. You can also leave anonymous feedback as a note in my departmental mailbox, under my office door, or using this form. Remember that it is in the best interest of the class to bring to my attention anything that is not working properly (e.g., the pace of the class is too slow, the projects are boring, my teaching style is not effective) so that I can take corrective steps.


Accommodations:

In accordance with University Policy 2310 and the Americans with Disabilities Act (ADA), academic accommodations may be made for any student who notifies the instructor of the need for an accommodation. If you have a disability, either permanent or temporary, contact Accessibility Resource Center at 277-3506 for additional information.



Schedule:

  • Empirical research
  • Types of empirical studies


  • Matti Tedre. Computing as a Science: A Survey of Competing Viewpoints. Minds and Machines 21(3):361–387, August 2011.


  • Data
  • Diagnosing data
  • Data Cleaning
  • Exploratory data analysis
  • Data Visualization


  • Wei Luo, Marcus Gallagher, Di O'Kane, Jason Connor, Mark Dooris, Col Roberts, Lachlan Mortimer, and Janet Wiles. Visualising a state-wide patient data collection: a case study to expand the audience for healthcare data. In Proceedings of the Fourth Australasian Workshop on Health Informatics and Knowledge Management (HIKM '10), Vol. 108. Australian Computer Society, Darlinghurst, Australia, 2010.


  • Statistical inference
  • Hypothesis testing
  • Sampling distributions
  • Parameter estimation and confidence intervals
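To give a flavor of these topics, here is a minimal sketch of parameter estimation with a confidence interval using only the Python standard library. The latency sample is hypothetical, and for a sample this small a t-interval would be more appropriate than the normal approximation used here:

```python
import math
import statistics

# Hypothetical request latencies (microseconds) from a pilot run.
sample = [12.1, 11.8, 12.5, 12.0, 11.6, 12.3, 12.2, 11.9, 12.4, 12.0]

n = len(sample)
mean = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean

# 95% CI under the normal approximation: mean +/- z * se, with z ~ 1.96.
z = statistics.NormalDist().inv_cdf(0.975)
ci = (mean - z * se, mean + z * se)
print(f"mean = {mean:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```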


  • Guimei Liu, Haojun Zhang, Mengling Feng, Limsoon Wong, and See-Kiong Ng. Supporting Exploratory Hypothesis Testing and Analysis. ACM Trans. Knowl. Discov. Data 9(4), Article 31 (24 pages), June 2015.
  • Geoffrey I. Webb and François Petitjean. A Multiple Test Correction for Streams and Cascades of Statistical Hypothesis Tests. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '16), 1255–1264. ACM, New York, NY, USA, 2016.



  • Experimental design
  • Control
  • Sampling bias
  • Pilot experiments
  • Factorial design
  • Challenges and opportunities of experimental design in CS


Measuring and Testing

Measuring computer performance


  • Statistical methods
  • Monte Carlo testing
  • Bootstrap
  • Randomization tests
  • Jackknife and cross validation
  • Non-parametric tests
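As a taste of the resampling methods listed above, here is a minimal sketch of a percentile bootstrap confidence interval. The running times and the `bootstrap_ci` helper are hypothetical illustrations, not course-provided code:

```python
import random
import statistics

random.seed(0)

# Hypothetical running times (seconds) of an algorithm on 12 inputs.
times = [3.1, 2.8, 3.4, 2.9, 3.0, 3.7, 2.6, 3.2, 3.3, 2.7, 3.5, 3.0]

def bootstrap_ci(data, stat=statistics.mean, reps=10_000, alpha=0.05):
    """Percentile bootstrap: resample with replacement, recompute the
    statistic on each resample, and take empirical quantiles."""
    boot = sorted(
        stat([random.choice(data) for _ in range(len(data))])
        for _ in range(reps)
    )
    return boot[int(reps * alpha / 2)], boot[int(reps * (1 - alpha / 2))]

low, high = bootstrap_ci(times)
print(f"mean = {statistics.mean(times):.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

The appeal of the bootstrap is that the same recipe works for statistics (medians, ratios, percentiles) whose sampling distributions have no convenient closed form.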


  • T. Hoefler and R. Belli. Scientific Benchmarking of Parallel Computing Systems. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC15). ACM, November 2015.
  • Fabrizio Petrini, Darren J. Kerbyson, and Scott Pakin. The Case of the Missing Supercomputer Performance: Achieving Optimal Performance on the 8,192 Processors of ASCI Q. In Proceedings of the 2003 ACM/IEEE Conference on Supercomputing (SC '03), 2003.



  • Linear models
  • Probabilistic models
  • Stochastic models


  • Simulation of stochastic processes
  • Monte Carlo simulation
  • Discrete event simulation
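As a quick illustration of the Monte Carlo idea listed above, this sketch estimates pi from random points in the unit square (the seed and sample size are arbitrary choices for the example):

```python
import random

random.seed(42)

def estimate_pi(n):
    """Monte Carlo estimate of pi: the fraction of uniform random points
    falling inside the unit quarter-circle, multiplied by 4."""
    inside = sum(
        1 for _ in range(n)
        if random.random() ** 2 + random.random() ** 2 <= 1.0
    )
    return 4 * inside / n

est = estimate_pi(100_000)
print(f"pi is approximately {est:.3f}")
```

The standard error of such an estimate shrinks like 1/sqrt(n), so each extra digit of accuracy costs roughly 100 times more samples.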

Experimental algorithmics

  • Tuning algorithms


  • Bernard M. E. Moret. Towards a Discipline of Experimental Algorithmics. Communications of the ACM, 1999.

Analysis and Interpretation

Descriptive, diagnostic, predictive, and prescriptive analytics

  • Modern statistical analysis
  • Regression analysis
  • Bayesian analysis
  • Power analysis
  • Sensitivity analysis
  • Interpretation of results


  • David D. Jensen and Paul R. Cohen. Multiple Comparisons in Induction Algorithms. Machine Learning 38:309, 2000.
  • William J. Sutherland, David Spiegelhalter, and Mark Burgman. Twenty tips for interpreting scientific claims. Nature 503:335–337, 21 November 2013.

Final Considerations

  • Reproducibility
  • Ethical issues


  • Dror G. Feitelson. From Repeatability to Reproducibility and Corroboration. SIGOPS Oper. Syst. Rev. 49(1):3–11, January 2015.
  • Janice Singer and Norman G. Vinson. Ethical Issues in Empirical Studies of Software Engineering. IEEE Trans. Softw. Eng. 28(12):1171–1180, December 2002.