
August 22, 2011

Data Analysis and Models for Complex Systems (CSCI 7000)

This semester, I am again teaching my topics course on data analysis and models for complex systems. The course content is a bit of a mash-up, intended to give students a crash course in probability theory, complex networks, model estimation, comparison and testing, random walks, a few stochastic processes, simulation techniques and lots of data. I'm continuing to require students both to give a technical lecture on a topic related to inference and models and to do a small independent project.

One goal is that, by the end of the course, students will be able to take a new data set, identify its interesting structure, and then develop, simulate and test probabilistic models of that structure.
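As a rough illustration of that develop, simulate and test loop (a minimal sketch, not taken from the course materials), the Python below simulates waiting times from an exponential model, recovers the rate by maximum likelihood, and checks the fit with a Kolmogorov-Smirnov distance. The function names, the true rate of 2.0, the sample size and the seed are all assumptions chosen for the example.

```python
import math
import random


def simulate_waiting_times(rate, n, seed=0):
    """Draw n exponential waiting times (hypothetical stand-in for real data)."""
    rng = random.Random(seed)
    return [rng.expovariate(rate) for _ in range(n)]


def exponential_mle(samples):
    """Maximum-likelihood estimate of the exponential rate: n / sum(x)."""
    return len(samples) / sum(samples)


def exponential_log_likelihood(samples, rate):
    """Log-likelihood of the samples under an exponential model with this rate."""
    return len(samples) * math.log(rate) - rate * sum(samples)


def ks_distance(samples, rate):
    """Largest gap between the empirical CDF and the model CDF 1 - exp(-rate * x)."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = 1.0 - math.exp(-rate * x)
        # Compare the model CDF to the empirical CDF just before and at x.
        d = max(d, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return d


# Simulate data from a known model, then fit and test as if it were unknown.
data = simulate_waiting_times(rate=2.0, n=5000)
rate_hat = exponential_mle(data)
fit_distance = ks_distance(data, rate_hat)
```

With real data, the simulated sample would be replaced by observed values, and the distance would be compared against distances from synthetic data sets drawn from the fitted model to judge whether the model is plausible.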

For those of you interested in following along, here's a list of lecture topics, which I'll update with links to lecture notes as they pass by.

CSCI 7000 Inference, Models and Simulation for Complex Systems

Computers are fundamentally changing how science is done by allowing us to record enormous amounts of data on social, biological, technological and physical systems. This data glut should allow us both to address old questions about complex systems in new ways and to answer fundamentally new kinds of questions. But our ability to do this relies on the development of new algorithms to automatically extract information about patterns in these huge data sets and to reliably test complex scientific hypotheses. Examples include understanding the evolution of the Internet (at various levels), social behavior on the Web (Facebook, Wikipedia, Twitter, etc.), the role of networks in structuring complex social behaviors and biological functions, the emergence of population-scale patterns from seemingly random individual-level behavior, and forecasting rare but important events.

This graduate-level topics course will cover a selection of recent developments in computational approaches to doing science with complex systems. It is not a scientific computing course. Topics will include statistical inference, the structure of complex networks, macro-phenomena in biological evolution and in wars and terrorism, simple mathematical models, and simulation techniques for more complicated models. The focus will be on using computational tools (algorithms) to do science (work with data; test hypotheses; build understanding; make predictions).

Lecture 0: Overview, probability distributions (lecture notes)
Lecture 1: Poisson process, exponential distribution, likelihood functions (lecture notes)
Lecture 2: Power-law distributions (mathematics and properties) (lecture notes)
Lecture 3: Power-law distributions (empirical data and mechanisms) (lecture notes)
Lecture 4: Testing models (hypothesis tests and model plausibility) (lecture notes)
Lecture 5: Comparing models (marginalization, likelihood ratios, etc.) (lecture notes)
Lecture 6: Time series analysis, random walks (lecture notes)
Lecture 7: More time series analysis, random walks (lecture notes)
Lecture 8: Random walk models of macroevolution (lecture notes)
Lecture 9: More macroevolution (lecture notes)
Lecture 10: Probabilistic models of terrorism and wars
Lecture 11: More terrorism and wars
Lecture 12: Introduction to complex networks (lecture notes)
Lecture 13: More networks (lecture notes)
Lecture 14: Random graph models (lecture notes)
Lecture 15: Citation networks (lecture notes)
Lecture 16: Modular and hierarchical network structures (lecture notes)
Lecture 17: More modules and hierarchies (lecture notes)
Lecture 18: (student lecture) The bootstrap
Lecture 19: (student lecture) Duplication-mutation models
Lecture 20: (student lecture) Privacy and de-anonymizing human data
Lecture 21: (student lecture) Levy flights
Lecture 22: (student lecture) Regression models
Lecture 23: (student lecture) Metabolic networks
Lecture 24: (student lecture) Gaussian mixture models
Lecture 25: (student lecture) People are not? particles
Lecture 26+: Project presentations

Posted August 22, 2011 09:08 AM in Teaching


For those of us who are too lazy to look up last year's version of the course and make a comparison, what are you changing (or at least have already decided to change) between last year and this year?

Posted by: Mason Porter at August 22, 2011 10:50 AM

Very little, to be honest. I've reorganized the lectures a little, to put the macroevolutionary and terrorism topics next to the random walks and probability distributions material, and I'm going to revise the way I present certain topics. I also picked the topics for the student lectures this year (last time the students picked, with input from me). And, I'm putting a little more emphasis on data analysis from a computational statistics point of view. Finally, this version of the course is also cross-listed for upper-level undergraduates.

That's about it!

Posted by: Aaron at August 22, 2011 08:54 PM