CS 533 - Experimental Methods in Computer Science

Course description:

This course explores the design, experimentation, testing, and pitfalls of empirical research in Computer Science. In particular, students will learn how to use a data-driven approach to understand computing phenomena, formulate hypotheses, design computing experiments to test and validate or refute those hypotheses, and evaluate and interpret empirical results. Overall, the goal of this course is to provide students with the foundations of rigorous empirical research.

Textbooks

Most lectures will be loosely based on the following optional textbooks and additional reading material:

Empirical Methods for Artificial Intelligence, by Paul R. Cohen

A Guide to Experimental Algorithmics, by Catherine C. McGeoch

Measuring Computer Performance: A Practitioner's Guide, by David J. Lilja


POLICIES

Academic honesty:

Unless otherwise specified, you must write/code your own homework assignments. You may not use the web to find answers to any assignment. If you do not have time to complete an assignment, it is better to submit your partial solutions than to get answers from someone else. Cheating students will be prosecuted according to University guidelines. Students should get acquainted with their rights and responsibilities as explained in the Student Code of Conduct.

Any and all acts of plagiarism will result in immediate dismissal from the course and an official report to the Dean of Students.

Instances of plagiarism include, but are not limited to: downloading code or snippets from the Internet without explicit permission from the instructor and/or without proper acknowledgment, citation, or adherence to the license; using code from a classmate or any other past or present student; quoting text directly or slightly paraphrasing it from a source without proper reference; and any other act of copying material and trying to pass it off as your own.

Note that dismissal from the class means that the student will be dropped with an F from the course.

The best way of avoiding plagiarism is to start your assignments early. Whenever you feel like you cannot keep up with the course material, your instructor is happy to find a way to help you. Make an appointment or come to office hours, but DO NOT plagiarize; it is not worth it!

Class attendance:

Class attendance is expected (read: mandatory) and note-taking is encouraged. Important information (about exams, assignments, projects, and policies) may be communicated only in the lectures. We may also cover additional material (not available in the books) during lecture. If you miss a lecture, you should find out what material was covered and whether any announcements were made.

ASSIGNMENTS

Homework

Homework will be assigned to reinforce concepts covered in class. Homework may include exercises, coding, or data analysis. Homework accounts for 10% of your final grade, and no late homework will be accepted.

Paper discussions

Papers will be discussed every week. Students are required to read each paper and prepare a one-page review of it. Each week, one student will act as the discussion leader and is expected to prepare slides or other adequate material to present the paper's main points. Depending on the topic, the rest of the class will participate in an open discussion or a debate of the paper.

Daily assignments and participation

You can expect simple exercises at every meeting. These daily assignments will be done in groups specified by the instructor, and they will count toward your participation grade (10% of your final grade).

EXAMS

Exams are our formal evaluation tool. In the exams, you will be tested on the learning goals of this course (see the schedule below for the list of learning goals). Exams will comprise a mix of practical exercises and concepts. I don't encourage you to learn concepts and definitions by heart, but to be able to explain them in your own words and to place them in the broader context to which they belong. There will be one midterm exam (date TBD) and one final exam at the end of April.

The exams are open notes, but only personal, hand-written notes are accepted. Restrictions include (but are not limited to): you cannot download notes from the Internet, you cannot use the electronic notes of the course, and you cannot photocopy notes from your classmates. In fact, the key point is that they must be your own hand-written notes, because I expect you to reinforce what you learned in class by writing down key concepts.

GRADING

Grades will be based on your earned points, following the grade scale below. You need to earn at least the number of points shown in a column to obtain that column's grade. Scores will be rounded to the closest integer value.

Grade:	A	A-	B+	B	B-	C+	C	F
Min. points:	95	93	90	85	83	80	75	<75
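To make the scale concrete, here is a minimal sketch (in Python) of how the thresholds and the rounding rule combine; the function name and data layout are illustrative assumptions, not official grading code:

    # Illustrative sketch of the grade scale above; not official grading code.
    # Note: Python's round() rounds halves to the nearest even integer, which
    # may differ from "closest integer" for scores ending in exactly .5.
    GRADE_SCALE = [(95, "A"), (93, "A-"), (90, "B+"), (85, "B"),
                   (83, "B-"), (80, "C+"), (75, "C")]

    def letter_grade(score: float) -> str:
        points = round(score)  # "rounded to the closest integer value"
        for threshold, grade in GRADE_SCALE:
            if points >= threshold:  # meet or exceed a column's points
                return grade
        return "F"  # fewer than 75 points

    print(letter_grade(94.6))  # rounds to 95 -> A
    print(letter_grade(84.2))  # rounds to 84 -> B-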

FEEDBACK

I value students' opinions regarding the course, and I will take them into consideration to make this course as exciting and engaging as possible. Thus, throughout the semester I will ask students for formal and informal feedback. Formal feedback includes short surveys on my teaching effectiveness, preferred teaching methods, and the pace of the class. Informal feedback will be in the form of polls or in-class questions regarding learning preferences. You can also leave anonymous feedback in the form of a note in my departmental mailbox, under my office door, or using this form. Remember that it is in the best interest of the class to bring to my attention anything that is not working properly (e.g., the pace of the class is too slow, the projects are boring, my teaching style is not effective) so that I can take corrective steps.

ADA:

In accordance with University Policy 2310 and the Americans with Disabilities Act (ADA), academic accommodations may be made for any student who notifies the instructor of the need for an accommodation. If you have a disability, either permanent or temporary, contact the Accessibility Resource Center at 277-3506 for additional information.


SCHEDULE

Introduction

Readings:

Matti Tedre. Computing as a Science: A Survey of Competing Viewpoints. Minds and Machines 21(3):361–387, August 2011.
Dror G. Feitelson. Experimental Computer Science: The Need for a Cultural Change. 2006.

Observation

Readings:

Dror G. Feitelson. Looking at Data. In IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2008.
Wei Luo, Marcus Gallagher, Di O'Kane, Jason Connor, Mark Dooris, Col Roberts, Lachlan Mortimer, and Janet Wiles. Visualising a State-Wide Patient Data Collection: A Case Study to Expand the Audience for Healthcare Data. In Proceedings of the Fourth Australasian Workshop on Health Informatics and Knowledge Management (HIKM '10), Vol. 108. Australian Computer Society, Darlinghurst, Australia, 2010.

Hypothesis

Readings:

Guimei Liu, Haojun Zhang, Mengling Feng, Limsoon Wong, and See-Kiong Ng. 2015. Supporting Exploratory Hypothesis Testing and Analysis. ACM Trans. Knowl. Discov. Data 9, 4, Article 31 (June 2015), 24 pages.
Geoffrey I. Webb and François Petitjean. 2016. A Multiple Test Correction for Streams and Cascades of Statistical Hypothesis Tests. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '16). ACM, New York, NY, USA, 1255-1264.

Design

Reading:

Sean Peisert and Matt Bishop. How to Design Computer Security Experiments. In Fifth World Conference on Information Security Education, IFIP Vol. 237. Springer, 2007.

Measuring and Testing

Measuring computer performance

Testing

Readings:

Fabrizio Petrini, Darren J. Kerbyson, and Scott Pakin. 2003. The Case of the Missing Supercomputer Performance: Achieving Optimal Performance on the 8,192 Processors of ASCI Q. In Proceedings of the 2003 ACM/IEEE conference on Supercomputing (SC '03).
T. Hoefler and R. Belli. Scientific Benchmarking of Parallel Computing Systems: Twelve Ways to Tell the Masses when Reporting Performance Results. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC15). ACM, Nov. 2015.

Experimentation

Modeling

Simulation

Experimental algorithmics

Readings:

Edi Shmueli and Dror G. Feitelson. On Simulation and Design of Parallel-Systems Schedulers: Are We Doing the Right Thing? IEEE Transactions on Parallel and Distributed Systems 20(7), 2009.
Bernard M. E. Moret. Towards a Discipline of Experimental Algorithmics. Communications of the ACM, 1999.

Analysis and Interpretation

Descriptive, diagnostic, predictive, and prescriptive analytics

Readings:

David D. Jensen and Paul R. Cohen. Multiple Comparisons in Induction Algorithms. Machine Learning 38(3):309–338, 2000.
William J. Sutherland, David Spiegelhalter, and Mark Burgman. Twenty Tips for Interpreting Scientific Claims. Nature 503:335–337, 21 November 2013.

Final Considerations

Readings:

Dror G. Feitelson. 2015. From Repeatability to Reproducibility and Corroboration. SIGOPS Oper. Syst. Rev. 49, 1 (January 2015), 3-11.
Janice Singer and Norman G. Vinson. 2002. Ethical Issues in Empirical Studies of Software Engineering. IEEE Trans. Softw. Eng. 28, 12 (December 2002), 1171-1180.