mediocre lab

The Information Content of an Ebert Review

Roger Ebert uses two systems to rate movies: thumbs up or down, and 0 to 4 stars in half-star increments. The thumb system provides at most one bit of information, while the star system, with its nine possible ratings, gives a maximum of log2(9), or approximately 3.17 bits. He uses the thumb system for his television program and the star system for his reviews in the Chicago Sun-Times. My colleagues and I here at the mediocre lab at the University of New Mexico noticed, however, that Ebert gives many movies three stars, and we wondered whether the information content of his Sun-Times reviews was really closer to one bit. We speculated that he had indeed noticed this himself and so did away with the star system for his television program.

I therefore set out to test this hypothesis. I wrote a script that counts the reviews at each half-star rating, using the search function on the Ebert archive page, and used those counts to calculate the information content (the entropy of the rating distribution). How many bits does an Ebert review provide?

Approximately 2.71 bits.

This, of course, was much higher than expected.
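For anyone who wants to repeat the calculation, here is a minimal sketch of the entropy step, assuming the review counts have already been collected. It is not the original script: the function name entropy_bits is made up for illustration, and the sample counts are sanity checks rather than Ebert data.

```python
import math

def entropy_bits(counts):
    """Shannon entropy, in bits, of the distribution implied by raw counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("need at least one review")
    # Ratings with zero reviews are skipped: p * log2(p) tends to 0 as p -> 0.
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# Sanity checks on made-up counts (hypothetical, not taken from the archive):
print(entropy_bits([1, 1]))             # thumbs, equally likely: 1.0
print(round(entropy_bits([1] * 9), 2))  # nine ratings, equally likely: 3.17
print(entropy_bits([2, 1, 1]))          # a lopsided three-way split: 1.5
```

The more the counts pile up on a single rating, the further the result falls below the 3.17-bit ceiling, which is why a heap of three-star reviews looked like it might drag the figure down toward one bit.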

Here is the raw data:

Stars Count
0 28
1/2 65
1 198
1 1/2 213
2 634
2 1/2 356
3 1124
4 463

31% of all reviews are 3 stars. 45% of reviews are 3 or 3 1/2 stars. By year:

Year Entropy (bits) Average (stars)
1985 2.75 2.87
1986 2.84 2.59
1987 2.91 2.5
1988 2.7 2.65
1989 2.83 2.54
1990 2.65 2.67
1991 2.61 2.71
1992 2.6 2.72
1993 2.72 2.72
1994 2.76 2.65
1995 2.63 2.67
1996 2.63 2.68
1997 2.53 2.7
1998 2.63 2.63
1999 2.59 2.96
2000 2.56 2.78
2001 2.65 2.79
2002 2.62 2.88
2003 2.75 2.81

Apparently 1999 was a terrific year for movies (at least for Roger Ebert).