February 02, 2011

Whither computational paleobiology?

This week I'm in Washington DC, hanging out at the Smithsonian (aka the National Museum of Natural History) with paleontologists, paleobiologists, paleobotanists, paleoentomologists, geochronographers, geochemists, macrostratigraphers and other types of rock scientists. The meeting is an NSF-sponsored workshop on the Deep Time Earth-Life Observatory Network (DETELON) project, which is a community effort to persuade NSF to fund large-ish interdisciplinary groups of scientists exploring questions about earth-life system dynamics and processes using data drawn from deep time (i.e., the fossil record).

One of the motivations here is the possibility of drawing many different skill sets and data sources together in a synergistic way to shed new light on fundamental questions about how the biosphere interacts with (i.e., drives and is driven by) geological processes and how it works at the large scale, potentially in ways that might be relevant to understanding the changing biosphere today. My role in all this is to represent the potential of mathematical and computational modeling, especially of biotic processes.

I like this idea. Paleobiology is a wonderfully fascinating field, not just because it involves studying fossils (and dinosaurs; who doesn't love dinosaurs?), but also because it's a field rich with interesting puzzles. Surprisingly, the fossil record, or rather, the geological record (which includes things that are not strictly fossils), is incredibly rich, and the paleo folks have become very sophisticated in extracting information from it. Like many other sciences, paleobiology is now experiencing a data glut, brought on by the combination of several hundred years of hard work, a large community of data collectors and museums, and computers and other modern technologies that make extracting, measuring and storing the data easier to do at scale. And, they're building large, central data repositories (for instance, this one and this one), which span the entire globe and all of time. What's lacking in many ways is the set of tools that can allow the field to automatically extract knowledge and build models around these big databases in novel ways.

Enter "computational paleobiology", which draws on the tools and methods of computer science (and statistics and machine learning and physics) and the questions of paleobiology, ecology, macroevolution, etc. At this point, there aren't many people who would call themselves computational paleobiologists (or computational paleo-anythings), which is unfortunate. But, if you think evolution and fossils are cool, if you like data with interesting stories, if you like developing clever algorithms for hard inference problems or if you like developing mathematical or computational models for complex systems, if you like having an impact on real scientific questions, and if you like a wide-open field, then I think this might be the field for you.

posted February 2, 2011 08:26 PM in Interdisciplinarity


I need to add this to my list. It sounds awesome, and dinosaurs were in fact one of my major childhood fascinations.

As usual, the big problem is having time to actually do it...

Posted by: Mason Porter at February 3, 2011 02:34 AM

Hi Aaron,
Thanks for the neat post. I wanted to bring your attention to some very neat computational paleobiology work that was done in the 1960s by David Raup, which started the field of theoretical morphospaces. It is important work in macroevolution (though maybe not in geology, which is what you are interested in modeling?). Anyway, I thought you might be interested:


(CSSS Beijing 2008)

Posted by: Viviane at February 3, 2011 07:58 AM


Actually, I'm very interested in morphology and macroevolution, and Raup's work is a classic antecedent of the kind of thing I mean by computational paleobiology. There are many others, too, and like all other sciences, paleo has been incorporating computers into the workflow in various ways for many years. But, I think computers, and in particular smart algorithms and statistical models, can contribute much, much more. Scale makes a big difference, that's true, both in terms of the amount of data now available and the speed of the calculations computers can now make, but I think there are genuinely novel questions that computers can help answer and genuinely novel problems they can help solve, perhaps in a similar way that "computational biology" (which is probably better called computational genetics) has helped change the way microbiology is done.


p.s. Mason, it is awesome, and yeah, the time is the issue. Maybe we could do something together (read: have our students do something together).

Posted by: Aaron at February 3, 2011 11:35 AM

It would be great to do that! And even though my students (and occasionally even postdocs) do the heavy lifting these days, I still get to have lots of fascinating scientific discussions with them in the process. I will be talking to a prospective Master's student on Monday and need to replace a project (because somebody else is now already doing that with me), so if you have any specific ideas now, I could tell the student about them. Otherwise, we could brainstorm and come up with something nice. One of my PhD students has applied to be at SFI this summer, so that is one possible mechanism if he's interested in the topic.

The term "computational biology" is often used more broadly than that, especially when it comes to the faculty hired in a program.

Of course, we have a Systems Biology Doctoral Training Centre (using UK spelling because it's a proper name) that defines "systems biology" to only be at the cellular scale and below or something like that (which constrains the PhD projects of its students). In particular, my understanding is that the first research in systems biology was in ecology, and that is disallowed by this definition. This becomes even stranger given that Bob May is a local. And for some reason, my colleagues seem to think this definition of systems biology actually makes sense. Ah well. There's always our Life Sciences Interface Doctoral Training Centre, which happens to have a one-sided interface. :)

Posted by: Mason Porter at February 4, 2011 04:43 PM

hi Aaron,

I hope you're not predicting the premature demise of Computational Paleobiology? - "Withering" on the vine! On the other hand, Whither Computational Paleobiology? with its emphasis on future directions, seems admirably positive!

Posted by: Richard Overill at April 11, 2011 07:58 AM