May 21, 2012

If it disagrees with experiment, it is wrong

The eloquent Feynman on the essential nature of science. And, he nails it exactly: science is a process of making certain types of guesses about the world around us (what we call "theories" or hypotheses), deriving their consequences (what we call "predictions") and then comparing those consequences with experiment (what we call "validation" or "testing").

Although he doesn't elaborate them, the two transitions around the middle step are, I think, quite crucial.

First, how do we derive the consequences of our theories? It depends, in fact, on what kind of theory it is. Classically, these were mathematical equations and deriving the consequences meant deriving mathematical structures from them, in the form of relationships between variables. Today, with the rise of computational science, theories can be algorithmic and stochastic in nature, and this makes the derivation of their consequences trickier. The degree to which we can derive clean consequences from our theory is a measure of how well we have done in specifying our guess, but not a measure of how likely our guess is to be true. If a theory makes no measurable predictions, i.e., if there are no consequences that can be compared with experiment or if there is no experiment that could disagree with the theory, then it is not a scientific theory. Science is the process of learning about the world around us and measuring our mistakes relative to our expectations is how we learn. Thus, a theory that can make no mistakes teaches us nothing. [1]

Second, how do we compare our predictions with experiment? Here, the answer is clear: using statistics. This part remains true regardless of what kind of theory we have. If the theory predicts that two variables should be related in a certain way, when we measure those variables, we must decide to what degree the data support that relation. This is a subtle point even for experienced scientists because it requires making specific but defensible choices about what constitutes a genuine deviation from the target pattern and how large a deviation is allowed before the theory is discarded (or must be modified) [2]. Choosing an optimistic threshold is what gets many papers into trouble.

For experimental science, designing a better experiment can make it easier to compare predictions with data [3], although complicated experiments necessarily require sensitive comparisons, i.e., statistics. For observational science (which includes astronomy, paleontology, as well as many social and biological questions), we are often stuck with the data we can get rather than the data we want, and here careful statistics is the only path forward. The difficulty is knowing just how small or large a deviation is allowed by your theory. Here again, Feynman has something to say about what is required to be a good scientist:

I'm talking about a specific, extra type of integrity that is not lying, but bending over backwards to show how you are maybe wrong, that you ought to have when acting as a scientist. And this is our responsibility as scientists...

This is a very high standard to meet. But, that is the point. Most people, scientists included, find it difficult to be proven wrong, and Feynman is advocating the active self-infliction of these psychic wounds. I've heard a number of (sometimes quite eminent) scientists argue that it is not their responsibility to disprove their theories, only to show that their theory is plausible. Other people can repeat the experiments and check if it holds under other conditions. At its extreme, this is a very cynical perspective, because it can take a very long time for the self-corrective nature of science to get around to disproving celebrated but ultimately false ideas.

The problem, I think, is that externalizing the validation step in science, i.e., lowering the bar of what qualifies as a "plausible" claim, assumes that other scientists will actually check the work. But that's not really the case since almost all the glory and priority goes to the proposer not the tester. There's little incentive to close that loop. Further, we teach the next generation of scientists to hold themselves to a lower standard of evidence, and this almost surely limits the forward progress of science. The solution is to strive for that high standard of truth, to torture our pet theories until the false ones confess and we are left with the good ideas [4].


[1] Duty obliges me to recommend Mayo's classic book "Error and the Growth of Experimental Knowledge." If you read one book on the role of statistics in science, read this one.

[2] A good historical example of precisely this problem is Millikan's efforts to measure the charge on an electron. The most insightful analysis I know of is the second chapter of Goodstein's "Fact or Fraud". The key point is that Millikan had a very specific and scientifically grounded notion of what constituted a real deviation from his theory, and he used this notion to guide his data collection efforts. Fundamentally, the "controversy" over his results is about this specific question.

[3] I imagine Lord Rutherford would not be pleased with his disciples in high energy physics.

[4] There's a soft middle ground here as to how much should be done by the investigator and how much should be done by others. Fights with referees during the peer-review process often seem to come down to a disagreement about how much and what kind of torture methods should be used before publication. This kind of adversarial relationship with referees is also a problem, I think, as it encourages authors to do what is conventional rather than what is right, and it encourages referees to nitpick or to move the goal posts.

posted May 21, 2012 04:29 PM in Scientifically Speaking | permalink | Comments (1)

December 19, 2007

Just in time for the holidays

There's excellent coverage elsewhere of the recent trashing of science funding by Congress and the Whitehouse, for instance, Cosmic Variance, Science Magazine (with gruesome details about who didn't get what), and Computer Research Policy Blog (with details on all the devious tricks pulled to make the funding changes look less terrible than they actually are).

The past few years have seen support for basic research decline quite a bit at the federal level (for instance, the DoD basically eliminated all of its basic research funding, which forced many of its previous researchers to go instead to the already overburdened NSF for funds). The funding decrease this year would have only produced a lot of grumbling and complaining, and probably even a lot of upset letters had it not been for the promises by Congress and the Whitehouse for funding increases this year (via the America COMPETES initiative ( link).

In other, more happy news, I just had my first paper accepted at Nature. The review process was considerably more painful than I expected. The final product is certainly improved over the initial one, on account of changes we made to address some of the questions by the reviewers (along with changes based on feedback from many talks I've given on the topic over the past year). It's comforting to know that good, interdisciplinary work can actually get published in a vanity journal like Nature. That being said, I really prefer to write papers in which I can actually discuss the technical details in the main body of the paper, rather than hiding them all in the appendices that no one actually reads. I also like writing papers longer than 15 paragraphs.

With that, happy holidays to you all. I head east in the morning for the usual family festivities, which will induce yet another hiatus from blogging for me. I'll return to beautiful New Mexico just in time to do my usual year-end blog wrap up here.

posted December 19, 2007 09:05 PM in Rant | permalink | Comments (3)

February 21, 2006

Pirates off the Coast of Paradise

At the beginning of graduate school, few people have a clear idea of what area of research they ultimately want to get into. Many come in with vague or ill-informed notions of their likes and dislikes, most of which are due to the idiosyncrasies of their undergraduate major's curriculum, and perhaps scraps of advice from busy professors. For Computer Science, it seems that most undergraduate curricula emphasize the physical computer, i.e., the programming, the operating system and basic algorithm analysis, over the science, let alone the underlying theory that makes computing itself understandable. For instance, as a teaching assistant for an algorithms course during my first semester in grad school, I was disabused of any preconceptions when many students had trouble designing, carrying-out, and writing-up a simple numerical experiment to measure the running time of an algorithm as a function of its input size, and I distinctly remember seeing several minds explode (and, not in the Eureka! sense) during a sketch of Cantor's diagonalization argument. When you consider these anecdotes along with the flat or declining numbers of students enrolling in computer science, we have a grim picture of both the value that society attributes to Computer Science and the future of the discipline.

The naive inference here would be that students are (rightly) shying away from a field that serves little purpose to society, or to them, beyond providing programming talent for other fields (e.g., the various biological or medical sciences, or IT departments, which have a bottomless appetite for people who can manage information with a computer). And, with programming jobs being outsourced to India and China, one might wonder if the future holds anything but an increasing Dilbert-ization of Computer Science.

This brings us to a recent talk delivered by Prof. Bernard Chazelle (CS, Princeton) at the AAAS Annual Meeting about the relevance of the Theory of Computer Science (TCS for short). Chazelle's talk was covered briefly by PhysOrg, although his separate and longer essay really does a better job of making the point,

Moore's Law has fueled computer science's sizzle and sparkle, but it may have obscured its uncanny resemblance to pre-Einstein physics: healthy and plump and ripe for a revolution. Computing promises to be the most disruptive scientific paradigm since quantum mechanics. Unfortunately, it is the proverbial riddle wrapped in a mystery inside an enigma. The stakes are high, for our inability to “get” what computing is all about may well play iceberg to the Titanic of modern science.

He means that behind the glitz and glam of iPods, Internet porn, and unmanned autonomous vehicles armed with GPS-guided missles, TCS has been drawing fundamental connections, through the paradigm of abstract computation, between previously disparate areas throughout science. Suresh Venkatasubramanian (see also Jeff Erickson and Lance Fortnow) phrases it in the form of something like a Buddhist koan,

Theoretical computer science would exist even if there were no computers.

Scott Aaronson, in his inimitable style, puts it more directly and draws an important connection with physics,

The first lesson is that computational complexity theory is really, really, really not about computers. Computers play the same role in complexity that clocks, trains, and elevators play in relativity. They're a great way to illustrate the point, they were probably essential for discovering the point, but they're not the point. The best definition of complexity theory I can think of is that it's quantitative theology: the mathematical study of hypothetical superintelligent beings such as gods.

Actually, that last bit may be overstating things a little, but the idea is fair. Just as theoretical physics describes the physical limits of reality, theoretical computer science describes both the limits of what can be computed and how. But, what is physically possible is tightly related to what is computationally possible; physics is a certain kind of computation. For instance, a guiding principle of physics is that of energy minimization, which is a specific kind of search problem, and search problems are the hallmark of CS.

The Theory of Computer Science is, quite to the contrary of the impression with which I was left after my several TCS courses in graduate school, much more than proving that certain problems are "hard" (NP-complete) or "easy" (in P), or that we can sometimes get "close" to the best much more easily than we can find the best itself (approximation algorithms), or especially that working in TCS requires learning a host of seemingly unrelated tricks, hacks and gimmicks. Were it only these, TCS would be interesting in the same way that Sudoku puzzles are interesting - mildly diverting for some time, but eventually you get tired of doing the same thing over and over.

Fortunately, TCS is much more than these things. It is the thin filament that connects the mathematics of every natural science, touching at once game theory, information theory, learning theory, search and optimization, number theory, and many more. Results in TCS, and in complexity theory specifically, have deep and profound implications for what the future will look like. (E.g., do we live in a world where no secret can actually be kept hidden from a nosey third party?) A few TCS-related topics that John Baez, a mathematical physicist at UC Riverside who's become a promoter of TCS, pointed to recently include "cryptographic hash functions, pseudo-random number generators, and the amazing theorem of Razborov and Rudich which says roughly that if P is not equal to NP, then this fact is hard to prove." (If you know what P and NP mean, then this last one probably doesn't seem that surprising, but that means you're thinking about it in the wrong direction!) In fact, the question of P versus NP may even have something to say about the kind of self-consistency we can expect in the laws of physics, and whether we can ever hope to find a Grand Unified Theory. (For those of you hoping for worm-hole-based FTL travel in the future, P vs. NP now concerns you, too.)

Alas my enthusiasm for these implications and connections is stunted by a developing cynicism, not because of a failure to deliver on its founding promises (as, for instance, was the problem that ultimately toppled artificial intelligence), but rather because of its inability to convince not just the funding agencies like NSF that it matters, but its inability to convince the rest of Computer Science that it matters. That is, TCS is a vitally important, but a needlessly remote, field of CS, and is valued by the rest of CS for reasons analogous to those for which CS is valued by other disciplines: its ability to get things done, i.e., actual algorithms. This problem is aggravated by the fact that the mathematical training necessary to build toward a career in TCS is not a part of the standard CS curriculum (I mean at the undergraduate level, but the graduate one seems equally faulted). Instead, you acquire that knowledge by either working with the luminaries of the field (if you end up at the right school), or by essentially picking up the equivalent of a degree in higher mathematics (e.g., analysis, measure theory, abstract algebra, group theory, etc.). As Chazelle puts it in his pre-talk interview, "Computer science ... is messy and infuriatingly complex." I argue that this complexity is what makes CS, and particularly TCS, inaccessible and hard-to-appreciated. If Computer Science as a discipline wants to survive to see the "revolution" Chazelle forecasts, it needs to reevaluate how it trains its future members, what it means to have a science of computers, and even further, what it means to have a theory of computers (a point CS does abysmally on). No computer scientist likes to be told her particular area of study is glorified programming, but without significant internal and external restructuring, that is all Computer Science will be to the rest of the world.

posted February 21, 2006 12:06 AM in Scientifically Speaking | permalink | Comments (0)

February 01, 2006

Defending academic freedom

Michael Bérubé, a literature and culture studies professor at Penn. State University, has written a lecture (now an essay) on the academic freedom of the professoriat and the demands by (radical right) conservatives to demolish it, through state-oversight, in the name of... academic freedom. The Medium Lobster would indeed be proud.

As someone who believes deeply in the importance of the free pursuit of intellectual endeavors, and who has a strong interest in the institutions that facilitate that path (understandable given my current choice of careers), Bérubé's commentary resonated strongly with me. Primarily, I just want to advertise Bérubé's essay, but I can't help but editorialize a little. Let's start with the late Sidney Hook, a liberal who turned staunchly conservative as a result of pondering the threat of Communism, who wrote in his 1970 book Academic Freedom and Academic Anarchy that

The qualified teacher, whose qualifications may be inferred from his acquisition of tenure, has the right honestly to reach, and hold, and proclaim any conclusion in the field of his competence. In other words, academic freedom carries with it the right to heresy as well as the right to restate and defend the traditional views. This takes in considerable ground. If a teacher in honest pursuit of an inquiry or argument comes to a conclusion that appears fascist or communist or racist or what-not in the eyes of others, once he has been certified as professionally competent in the eyes of his peers, then those who believe in academic freedom must defend his right to be wrong—if they consider him wrong—whatever their orthodoxy may be.

That is, it doesn't matter what your political or religious stripes may be, academic freedom is a foundational part of having a free society. At it's heart, Hook's statement is simply a more academic restatement of Voltaire's famous assertion: "I disapprove of what you say, but I will defend to the death your right to say it." In today's age of unblinking irony (e.g., Bush's "Healthy Forests" initiative) for formerly shameful acts of corruption, cronyism and outright greed, such sentiments are depressingly rare.

Although I had read a little about the radical right's effort to install affirmative action for conservative professors in public universities (again, these people have no sense of irony), what I didn't know about is the national effort to introduce legislation (passed into law in Pennsylvania and pending in more than twenty other states) that gives the state oversight ability of the contents of the classroom, mostly by allowing students (non-experts) to sue professors (experts) for introducing controversial material in the classroom. Thus, the legislature and the courts (non-experts) would be able to define what is legally permissible classroom content, by clarifying the legal term "controversial", rather than professors (experts). Bérubé:

When [Ohio state senator Larry Mumper] introduced Senate Bill 24 [which allows students to sue professors, as described above] last year, he was asked by a Columbus Dispatch reporter what he would consider 'controversial matter' that should be barred from the classroom. "Religion and politics, those are the main things," he replied.

All I can say in response is that college is not a kind of dinner party. It can indeed be rude to bring up religion or politics at a dinner party, particularly if you are not familiar with all the guests. But at American universities, religion and politics are two of the hundreds of things we discuss on a daily basis. It really is part of our job, even — or especially — if some of us have unpopular opinions on those subjects.

How else do we learn but by having our pre- and misconceptions challenged by those people who have studied these things, been trained by other experts and been recognized by their peers as an authority? Without academic freedom as defined by Hook and defended by Bérubé, a university degree will signify nothing more than having received the official State-sanctioned version of truth. Few things would be more toxic to freedom and democracy.

posted February 1, 2006 10:45 PM in Simply Academic | permalink | Comments (0)

January 24, 2005

The Dark Underbelly

Fear and Loathing are not words that you typically associate with people engaged in research. Things like Serious and Measured, or even, for some people, Creative and Dramatic. I recently had a pair of extremely unpleasant experiences, in which the guilty, who shall remain nameless, exhibited all the open-mindedness and aplomb of a jealous and insecure thirteen year old. What on earth causes grown men, established academics no less, to behave like this?

Academic research, although it pretends to be a meritocracy, uses social constructs like reputation, affiliation and social-circles as a short-hand for quality. This is the heart of how we can avoid reading every paper or listening to every presentation with a totally open mind - after all, if someone has produced a lot of good work before, that's probably a pretty good indicator that they'll do it again. "The best predictor of the future is past behavior." Unfortunately, these social constructs eventually become themselves elements of optimization in a competitive system, and some people focus on them in lieu of doing good work. This, I believe, was the root cause of the overt and insulting hostility I experienced.

Ultimately, because everyone has a finite amount of time and energy, you do have to become more choosy about whom you collaborate with and what ideas you push on. But if everyone only ever did things that moved them "up" in these constructs, no one would ever work with anyone else. What's the point of being an intellectual if it's all turf wars and hostility? Shouldn't one work on things that bring pleasure instead of a constant stream of frustration over poor prestige or paranoia over being scooped? Shouldn't the whole point of being supported by the largess of society be to give as much back as possible, even if this means occasionally not being the most famous or not the guy who breaks the big news?

Maybe these guys don't, but I sure do.

posted January 24, 2005 10:58 AM in Rant | permalink | Comments (0)