Pythagoras, Bill James, and the Mets

Yesterday I used a simple linear regression to estimate expected wins from run differential (runs scored minus runs allowed). What about Bill James’ Pythagorean expectation? So, just to be thorough (sort of), I went ahead and looked at the difference between what the ’93 Mets should have won based on James’ formula and the 59 games the team actually won.

Oy vey.

By Pythagorean expectation, the ’93 team drops from the 36th-worst team in Major League history¹ to the 12th-worst, even accounting for all the number wonkiness from the 19th century clubs. The good news is that two teams from the 20th century fare worse in this regard: the 1905 Chicago Cubs, who won 92 games, ranked seventh, and the 1911 Pittsburgh Pirates, who won 85 games, also landed higher on the list. Of course, those are the Cubs that featured the famed trio of Tinker, Evers, and Chance along with Mordecai Brown. The Pirates featured Honus Wagner, Max Carey, and Fred Clarke.

Either by linear regression or Pythagorean expectation (PE), the ’93 team should have won 73 games. The residual (recall that the residual is the gap between what actually occurred and the expected result based on run differential, measured here in winning percentage) grew slightly worse under Pythagorean expectation, however, dropping from -0.0849 to -0.0851.
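For anyone who wants to check the arithmetic, here is a minimal Python sketch of the Pythagorean calculation and the residual, using James’ classic exponent of 2. The run totals for the ’93 Mets (672 scored, 744 allowed) and the 162-game schedule are not stated in this post; they are filled in here as an assumption, and the function names are made up for illustration.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def residual(wins, games, runs_scored, runs_allowed, exponent=2.0):
    """Actual winning percentage minus the Pythagorean expectation."""
    return wins / games - pythagorean_pct(runs_scored, runs_allowed, exponent)


# Assumed 1993 Mets totals (not given in the post): 59 wins in 162 games,
# 672 runs scored, 744 runs allowed.
expected = pythagorean_pct(672, 744)
mets_residual = residual(59, 162, 672, 744)

print(round(expected * 162))    # expected wins over a 162-game season
print(round(mets_residual, 4))  # residual in winning-percentage terms
```

With those assumed totals, the sketch lands on the same 73 expected wins and -0.0851 residual quoted above. A negative residual means the team won less often than its runs say it should have.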

Here’s an updated scatterplot to be thorough:

[Figure: Pyth_93 scatterplot]

I even took this a step further and determined a new exponent to use in James’ formula that would more closely align with the actual winning percentages over the years². For this pass, I eliminated all the teams prior to 1900 since there were fewer games, too much data needed to be cleaned, and honestly, I figured 19th century teams wouldn’t really be of much value. Anyway, after running a linear regression to determine the new exponent, I came up with 1.861.
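The post doesn’t spell out the regression itself, so treat the following as one plausible version rather than the exact steps taken. A common trick is to rewrite the formula as W/L = (RS/RA)^k, take logs of both sides, and fit a line through the origin; the slope is the exponent. The Python sketch below does that with plain arithmetic. The team rows are made up purely for illustration (they are not real seasons), and `fit_exponent` is a hypothetical helper name.

```python
import math


def fit_exponent(teams):
    """Least-squares slope through the origin for log(W/L) vs. log(RS/RA).

    Each row is (wins, losses, runs_scored, runs_allowed).
    """
    xs = [math.log(rs / ra) for _, _, rs, ra in teams]
    ys = [math.log(w / l) for w, l, _, _ in teams]
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Made-up rows for illustration only; swap in real team-season data.
toy_teams = [
    (95, 67, 800, 680),
    (81, 81, 700, 705),
    (70, 92, 650, 760),
    (88, 74, 750, 690),
]

k = fit_exponent(toy_teams)
print(round(k, 3))
```

Run against every team-season since 1900, a fit along these lines is the sort of thing that would produce a number like 1.861, though the exact value depends on which seasons go in.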

So, my new formula looks like this:

\frac{\text{runs scored}^{1.861}}{\text{runs scored}^{1.861} + \text{runs allowed}^{1.861}}

How did the ’93 Mets do? Well, since this is a post about the 1993 Mets and I’m all about schadenfreude, the 1993 team bottomed out. Based on the difference between what they should have won and what they actually won, they were the worst team since the turn of the 20th century.

And . . . the scatterplot:

[Figure: Updated Pythagorean Expectation]

Another way of looking at this information is that the ’93 Mets were the unluckiest team over the last 114 years. Sure. We’ll call that a silver lining. I like to think the unluckiest ones were those of us who rooted for this team back then. I do miss Howard Johnson, though. I loved Ho-Jo when I was a kid.

Being the inquisitive sort, you’re probably wondering which teams exceeded their expected win totals the most over the last 114 years. Since you asked, here’s a top ten list (expected wins are rounded up):

Year  Team                    Wins  Expected Wins  Residual
1905  Detroit Tigers            79             65     0.091
1981  Cincinnati Reds           66             57     0.086
2004  NY Yankees               101             89     0.075
1954  Brooklyn Dodgers          92             81     0.074
2008  LA Angels                100             88     0.074
1972  NY Mets                   83             71     0.074
1984  NY Mets                   90             78     0.072
1981  Baltimore Orioles         59             52     0.071
2005  Arizona Diamondbacks      77             66     0.070
1917  St. Louis Cardinals       82             71     0.069

 Great Results – Not So Great Expectations
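Since the residuals are in winning-percentage terms, multiplying one by the number of games played should roughly give back the gap between actual and expected wins. Here is a quick Python sanity check on three rows from the table. The 162-game schedule is an assumption added here (it isn’t in the table), and `wins_above_expected` is a made-up helper name.

```python
# (year, team, wins, expected wins, residual) copied from the table above.
rows = [
    (2004, "NY Yankees", 101, 89, 0.075),
    (2008, "LA Angels", 100, 88, 0.074),
    (1984, "NY Mets", 90, 78, 0.072),
]

# Assumed schedule length for these three seasons (not listed in the table).
GAMES = 162


def wins_above_expected(residual, games=GAMES):
    """Convert a winning-percentage residual into wins over a season."""
    return residual * games


for year, team, wins, expected, res in rows:
    implied = wins_above_expected(res)
    print(f"{year} {team}: table gap {wins - expected}, residual implies {implied:.1f}")
```

The two numbers won’t match exactly because the table rounds both the expected wins and the residuals, but they should land within a win of each other.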

Two Mets teams made this list. Redemption! See, book learnin’ is fun.

  1. Technically, we’re discussing professional baseball since the years begin in 1871 and the National League we all know and love wasn’t founded until 1876, but I’ll use Majors here for simplicity’s sake.
  2. All of this can be found in the book Analyzing Baseball Data with R by the way, so it’s not like I’m some math wizard breaking new ground.

