
May 13, 2010

Nature's Journal Club

A few months ago, I was invited to write a column for Nature's Journal Club. This series appears every week in their print edition and the text is also posted on the Journal Club's blog. The instructions were pretty simple: pick a paper with broad appeal, which has been published sometime in the last year or so and which has not appeared in Nature [1], and write about 260 words summarizing the results and describing why I like it. I was encouraged to be provocative, too, but I'll leave it to you to decide if I was that bold.

After mulling it over, I agreed to write a column on a paper by James O'Dwyer and Jessica Green, both at the University of Oregon in Eugene [2,3]. You can read the blog version of the column here or the print version here. Here's the setup:

Many species are concentrated in biodiversity hot spots such as tropical rainforests and coral reefs. But our estimates of how many species these and other ecosystems contain are very rough. Conservation efforts and ecological theories would be better served by a more accurate picture.

Our best guesses come from empirical species–area relationships, which count the number of species observed as a function of geographical area. These relationships show sharp increases at local and continental scales, but slow growth at intermediate scales. Despite decades of study, ecologists have no clear explanation of this pattern's origins or what causes deviations from it.

These species-area relationships (SARs) are ubiquitous in ecology largely because ecological survey practices have long focused on counting species within a specific study region. Most such data are collected from small survey areas and then combined in a meta-study to reach regional or continental scales. Perhaps because of the ease of constructing SARs, much ink has been spilled over their structure. They're also our only reliable tool for estimating how many species live in places like the Amazon or the Great Barrier Reef, which are too large to survey completely.

What's clear from all this work is that there are some general patterns in SARs, and that if we want to use them in unconventional places, such as in estimating the number of microbial species in the world (or in smaller regions, like your gut), then we need a good theoretical explanation of where those patterns come from and what processes cause deviations from them at different length scales. That is, we need a good null model.

Creating one is largely what O'Dwyer and Green have done. There have, of course, been previous explanations of parts of the SAR pattern, with varying degrees of biological realism. On the more unrealistic side, simple iid sampling from a sufficiently heavy-tailed distribution can generate SARs with power-law slopes in the right neighborhood. But this kind of explanation ignores biological processes like speciation, extinction, dispersal in space, competition, etc., not to mention abiotic factors like geography and climate.
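To make the iid-sampling idea concrete, here is a minimal Python sketch. Everything in it is an assumption chosen for illustration rather than something taken from the literature: the Zipf-like abundance weights, the parameters `exponent`, `density` and `n_species`, and the simplification that the number of individuals sampled is proportional to area. It shows only that blind sampling from a heavy-tailed distribution yields a roughly straight line on a log-log plot; the slope you get depends on the tail exponent you assume.

```python
import itertools
import math
import random


def sample_species_counts(areas, density=50, n_species=5000, exponent=3.0, seed=0):
    """Count distinct species seen in nested iid samples of increasing area.

    Assumptions (illustrative only): species abundances follow Zipf-like
    weights 1 / rank**exponent, and a sample of area A holds A * density
    individuals drawn independently from that distribution.
    """
    rng = random.Random(seed)
    weights = [1.0 / rank ** exponent for rank in range(1, n_species + 1)]
    cum_weights = list(itertools.accumulate(weights))
    seen = set()
    drawn = 0
    counts = []
    for area in sorted(areas):
        target = int(area * density)
        # Draw only the extra individuals needed, so samples are nested.
        extra = rng.choices(range(n_species), cum_weights=cum_weights, k=target - drawn)
        seen.update(extra)
        drawn = target
        counts.append(len(seen))
    return counts


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den


areas = [2 ** k for k in range(11)]  # 1, 2, 4, ..., 1024 (arbitrary units)
counts = sample_species_counts(areas)
z = loglog_slope(areas, counts)
print("species counts:", counts)
print("fitted power-law slope z:", round(z, 3))
```

Note what is missing: no birth, no death, no movement, no space beyond a sample size. That is exactly the objection raised above.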

Building on previous work on neutral processes to explain biodiversity patterns, O'Dwyer and Green built a null model containing only the neutral processes of birth, death and dispersal. What makes this model different from, and better than, previous efforts is that it explicitly incorporates a notion of spatial structure by embedding species in space and allowing them to move around. This is helpful because it gets directly at the SAR. The problem, however, is that spatially explicit stochastic processes can be difficult to solve mathematically.
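For a feel of what a spatially explicit neutral process looks like, here is a toy Python sketch. It is not O'Dwyer and Green's field-theory model: it is a simple lattice simulation (a voter-model-style process), and the grid size, step count, nearest-neighbor dispersal and `speciation` rate are all assumptions made for illustration. Speciation is included only so that diversity does not collapse to a single species. Each step, one individual dies and is replaced either by the offspring of a random neighbor or, rarely, by a brand-new species. No species has any advantage over another, which is what makes the model neutral.

```python
import random


def simulate_neutral_lattice(size=32, steps=200_000, speciation=0.01, seed=1):
    """Toy neutral model on a size x size torus (illustrative assumptions only)."""
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]  # start with a single species, id 0
    next_species = 1
    neighbors = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    for _ in range(steps):
        # Death: pick a random site whose occupant dies.
        r, c = rng.randrange(size), rng.randrange(size)
        if rng.random() < speciation:
            # Speciation: the empty site is filled by a new species.
            grid[r][c] = next_species
            next_species += 1
        else:
            # Birth and dispersal: a random neighbor's offspring takes the site.
            dr, dc = rng.choice(neighbors)
            grid[r][c] = grid[(r + dr) % size][(c + dc) % size]
    return grid


def species_area_curve(grid):
    """Count distinct species in nested square windows anchored at one corner."""
    size = len(grid)
    curve = []
    side = 1
    while side <= size:
        species = {grid[r][c] for r in range(side) for c in range(side)}
        curve.append((side * side, len(species)))
        side *= 2
    return curve


curve = species_area_curve(simulate_neutral_lattice())
for area, n_species in curve:
    print(area, n_species)
```

Even this crude version produces a species count that grows with area. The hard part, and the contribution of the paper, is getting an analytical prediction out of such a process instead of simulating it.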

Fortunately, O'Dwyer and Green could use tools from quantum field theory, which is well suited to solving this kind of spatial stochastic model. Aside from the coolness of using quantum field theory in ecology and the fact that it predicts an SAR that agrees with decades of data, what I like about this result is that it illustrates two things close to my heart. First, it's a beautiful example of a null model. Because it includes only boring, neutral processes in generating its prediction for the SAR, when empirical data deviate from the model's prediction, those deviations can be interpreted as model misspecification errors. In this case, misspecification points to interesting, non-neutral, ecologically significant processes like competition, predation, habitat, climate, etc. In this way, the model can generate new, specific hypotheses about what to test next.

The second is that this approach to model building moves the emphasis of the science away from small-scale (in time or space), context-dependent processes and towards more large-scale (in time and space) neutral dynamics and principles. This kind of perspective is currently more common in the physical sciences than in the biological ones [4], but I hope to see more of it in biology in the future [5], and it's one of the things I think physics has to offer ecology [6].

This aspect of O'Dwyer and Green's work fits nicely with my own on explaining why we see such huge morphological diversity in living and extinct species, and how whales got so much bigger than mice. In a way, the model I've been using is pretty similar to O'Dwyer and Green's: it omits all ecological processes, climate and geography, but includes neutral processes representing species birth, death (extinction), and dispersal (changes in body mass). The fact that both our neutral models do pretty well at correctly predicting the observed empirical data suggests that perhaps randomness, structured by a few relatively banal processes, might be a pretty good general explanation of how the biological world works at these very large scales. I also suspect that similar models, expanded to include some role for social institutions, will work well to explain how societies work at large spatial and temporal scales. Finding out if this is true is something I hope to be around for.

-----

[1] This criterion was a pleasant surprise. As much as I dislike Nature's outsized status and influence in science, I've been pleasantly surprised on several occasions by some of their policies. Someone there genuinely seems to care about the integrity of the scientific process.

[2] Full disclosure: Jessica recently joined the external faculty here at SFI and James will be starting as a postdoc at SFI in the Fall. That being said, I haven't really interacted much with either of them.

[3] O'Dwyer and Green, "Field theory for biogeography: a spatially explicit model for predicting patterns of biodiversity." Ecology Letters 13, 87-95 (2010).

[4] It's uncommon in the biological sciences, but not unknown. Mathematical evolutionary theory and population genetics are good examples of communities that frequently use null models in this way [7]. I think the reason such an approach is more common in the physical sciences today is that we actually understand a great deal about the fundamental processes there, and what things can and should vary in different contexts, while we're still sorting those things out in biology. For sure, we're making progress, but it's slow going.

[5] It would be good for other fields, too, such as sociology and political science. The issue is, I think, that scientific progress toward general principles is always limited by the availability of data that reveal those principles. When scientists of any kind are restricted to having either rich data on a small number of examples (think of alchemy), or poor data on a large number of examples (think of polling data), it's hard to make real progress. In both cases, there is typically an embarrassment of reasonable explanations for the observed patterns and it's difficult to distinguish them with the crappy data we can get. This is partly why I'm excited about the increasing availability of "big data" on social behavior, largely coming out of digital systems like email, Facebook, Twitter, etc. These data are not a panacea for social science, since they have their own weird biases and pathologies, but they're rich data on huge samples of individuals, which is qualitatively different from what was available to social scientists in the past. Perhaps we can answer old questions using these new data, and perhaps we can even ask some new questions such as, Are the behavioral patterns at the population scale simply scaled up versions of the behavioral patterns at the individual scale?

[6] To summarize: what I think physics has to offer ecology, among other fields, is (i) a very impressive and useful set of mathematical tools and models, and (ii) a valuable shift in perspective, away from small-scale processes and toward large-scale processes and general principles. I'm not advocating that we replace ecologists with physicists, but rather that we encourage physicists to train and work with ecologists, and vice versa. Biology will always need scientists focused on understanding specific contexts, but it also needs scientists focused on synthesizing those context-specific results into more general theories, as I think O'Dwyer and Green have done. Physicists often have a good intuition about which details will be important at large scales, and they often have good mathematical tools for working out whether that intuition is right.

[7] The statistical models that underlie most statistical hypothesis tests, which are ubiquitous in the biological and social sciences, are technically null models, too. But in many cases, these are wholly inappropriate since their iid assumptions are grossly violated by the mechanistic processes actually at play. That being said, it can be hard to come up with a good null model because often we don't know which processes are the important ones to include. A topic for another day, I think.

posted May 13, 2010 08:05 AM in Interdisciplinarity

Comments

Good discussion, particularly on construction and use of null models. Must make my students read this. :)

Posted by: Matthew Berryman at June 8, 2010 08:34 PM