Introducing SIERA

My FG articles · by **MattS** » Mon Feb 08, 2010 13:52:30

Eric Seidman and I have started our five-part series about SIERA today: link

It will be explained over the course of the week, with an article each day, Monday through Friday, breaking it down. The basics are that it's a new ERA estimator that consistently beats FIP, xFIP, tRA, and QERA, and it's luck-neutral, defense-neutral, park-neutral, etc. It adjusts for three things: pitchers who allow more walks and get fewer strikeouts need ground balls more (for more GIDPs and fewer multi-run HR); ground balls have an accelerating run-prevention effect as you get more and more of them, because you get more double plays after singles; and strikeouts have a decelerating effect on run prevention, because you don't have that many runners on base in the first place. It's free content this week, too.

by **jeff2sf** » Mon Feb 08, 2010 14:01:50

Which Phils will look better than other advanced pitching stats would have you think, which ones will look worse?

by **TenuredVulture** » Mon Feb 08, 2010 14:16:18

For the slow class--SIERA is used to predict ERA? That is, old fashioned ERA is your dependent variable? That doesn't exactly sound right to me.

I get the idea that you consider a metric better if it correlates well with itself year to year. But there has to be more than that to determine whether a metric is useful or not, right?

My FG articles · by **MattS** » Mon Feb 08, 2010 14:17:31

For 2009:

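| Pitcher | ERA | SIERA | FIP | xFIP |
|---|---|---|---|---|
| Halladay | 2.79 | 3.09 | 3.06 | 3.05 |
| Hamels | 4.32 | 3.55 | 3.72 | 3.69 |
| Blanton | 4.05 | 3.92 | 4.45 | 4.07 |
| Happ | 2.93 | 4.37 | 4.33 | 4.49 |
| Moyer | 4.89 | 4.68 | 4.94 | 4.74 |
| Lidge | 7.21 | 4.20 | 5.45 | 4.76 |
| Madson | 3.26 | 3.18 | 3.23 | 3.25 |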

Hamels is a good example of someone who looks better using SIERA by about 0.15-0.20 because he gives up tons of solo HR. On Friday, we'll talk about Johan Santana and how badly FIP, xFIP, QERA, and tRA treat pitchers who have very low BB% and very high K% but mediocre GB%. The issue is that they give up way fewer multi-run HR. This is a subset of pitchers that the other estimators miss badly.

Lidge is tricky because he was throwing hurt, so his HR/FB and LD% weren't really bad luck so much as a result of the injury. SIERA will do better at pinning him where he should be for 2010 but misses on 2009.

It does a little better on groundball pitchers like Halladay but here FIP and xFIP did pretty well with him. Really extreme ground ball pitchers and really extreme contact pitchers are really good examples in general.

My FG articles · by **MattS** » Mon Feb 08, 2010 14:19:27

TenuredVulture wrote:For the slow class--SIERA is used to predict ERA? That is, old fashioned ERA is your dependent variable? That doesn't exactly sound right to me.

I get the idea that you consider a metric better if it correlates well with itself year to year. But there has to be more than that to determine whether a metric is useful or not, right?

We checked it two ways. First, we checked how well it predicted same-year ERA versus xFIP and QERA, which both treat HR/FB as luck (obviously the more luck stats you treat as skill, the better you do, so FIP and tRA do better with same-year). Second, we checked how well it predicted ERA the following year, where it was consistently ahead of all four on tons of different subgroups of pitchers.
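
The comparison described above can be sketched in a few lines. This is only an illustration of the mechanics, not the authors' actual test: the `rmse` helper is a hypothetical name, and the numbers are the seven 2009 Phillies lines from the table earlier in the thread, which is far too small a sample to rank estimators. A next-year test would pair each estimator with the following season's ERA over many pitchers.

```python
import math

def rmse(estimates, actuals):
    """Root-mean-square error between an estimator and actual ERA."""
    if len(estimates) != len(actuals) or not estimates:
        raise ValueError("need two equal-length, non-empty sequences")
    squared = [(e - a) ** 2 for e, a in zip(estimates, actuals)]
    return math.sqrt(sum(squared) / len(squared))

# 2009 values copied from the table posted earlier in this thread
# (Halladay, Hamels, Blanton, Happ, Moyer, Lidge, Madson).
era_2009 = [2.79, 4.32, 4.05, 2.93, 4.89, 7.21, 3.26]
estimators = {
    "SIERA": [3.09, 3.55, 3.92, 4.37, 4.68, 4.20, 3.18],
    "FIP":   [3.06, 3.72, 4.45, 4.33, 4.94, 5.45, 3.23],
    "xFIP":  [3.05, 3.69, 4.07, 4.49, 4.74, 4.76, 3.25],
}

# Same-year error only; a next-year test would swap in the following season's ERA.
for name, values in estimators.items():
    print(f"{name}: RMSE vs same-year ERA = {rmse(values, era_2009):.2f}")
```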

by **BigEd76** » Mon Feb 08, 2010 14:20:28

So a guy that had one of the worst seasons in closer history has a better rating than the guy that should've won Rookie of the Year?

by **joe table** » Mon Feb 08, 2010 14:21:01

Very interesting work.

The following questions probably aren't even applicable to SIERA since HR/FB is not an input in the equation, but I was wondering if you considered looking at the impact of ballpark dimensions/trends to give HR/FB rates "context," if you will

By that I mean, for the park adjustments, did you look at a "baseline/neutral" HR/FB ratio based on past data, or is the ballpark adjustment factor for SIERA derived from more descriptive run-based park factors?

Did you get into analyzing different baseline HR/FB rates for certain areas of the field at ballparks? For example CBP seems to play more to its bandbox rep on flyballs to left, and less so on flyballs to right. I know Safeco is supposedly absolute death to RH pull power and less so to LH power. I always have wondered about this when considering the methodology behind "park-adjusted" stats

Not trying to hijack the thread into an obscure tangent here, but I was just wondering whether you attempted to isolate specific ballpark effects specifically on HR/FB or simply took HR/FB out of consideration altogether because as you stated, it doesn't seem to indicate any pitcher "skill"

My FG articles · by **MattS** » Mon Feb 08, 2010 14:22:32

BigEd76 wrote:So a guy that had one of the worst seasons in closer history has a better rating than the guy that should've won Rookie of the Year?

1) A 4.20 ERA as a reliever is much worse than a 4.37 ERA as a starter. Typically a 4.20 ERA as a reliever takes as much skill as a 5.20 ERA as a starter.
2) Happ was very lucky this year.
3) Lidge was injured which is why his ERA estimators are all underrating him.
4) I'd bet they have similar ERAs this year if Lidge is healthy, with Happ's obviously being more valuable because it's easier to relieve.

My FG articles · by **MattS** » Mon Feb 08, 2010 14:24:32

joe table wrote:Very interesting work.

The following questions probably aren't even applicable to SIERA since HR/FB is not an input in the equation, but I was wondering if you considered looking at the impact of ballpark dimensions/trends to give HR/FB rates "context," if you will

By that I mean, for the park adjustments, did you look at a "baseline/neutral" HR/FB ratio based on past data, or is the ballpark adjustment factor for SIERA derived from more descriptive run-based park factors?

Did you get into analyzing different baseline HR/FB rates for certain areas of the field at ballparks? For example CBP seems to play more to its bandbox rep on flyballs to left, and less so on flyballs to right. I know Safeco is supposedly absolute death to RH pull power and less so to LH power. I always have wondered about this when considering the methodology behind "park-adjusted" stats

Not trying to hijack the thread into an obscure tangent here, but I was just wondering whether you attempted to isolate specific ballpark effects specifically on HR/FB or simply took HR/FB out of consideration altogether because as you stated, it doesn't seem to indicate any pitcher "skill"

I definitely would like to work towards this, but we started with just run-based park factors. Because we're using batted ball rates in a regression, we're going to pick up some BABIP skill this way too (if certain pitchers are prone to low BABIPs, SIERA will pick that up while ignoring the luck in BABIP), so we couldn't just use HR/FB park factors themselves. Once there is more data out there on batted ball rates, we'll look into this more. Very good point, though.

by **TenuredVulture** » Mon Feb 08, 2010 14:30:01

MattS wrote:
TenuredVulture wrote:For the slow class--SIERA is used to predict ERA? That is, old fashioned ERA is your dependent variable? That doesn't exactly sound right to me.

I get the idea that you consider a metric better if it correlates well with itself year to year. But there has to be more than that to determine whether a metric is useful or not, right?

We checked it two ways. First, we checked how well it predicted same-year ERA versus xFIP and QERA, which both treat HR/FB as luck (obviously the more luck stats you treat as skill, the better you do, so FIP and tRA do better with same-year). Second, we checked how well it predicted ERA the following year, where it was consistently ahead of all four on tons of different subgroups of pitchers.

That's interesting. So, again, for the slow class--SIERA (lifetime?) predicts future ERA better than any other (lifetime?) pitching metric.

What about predicting something better than ERA, like WHIP? Or will that not work so well?

I like WHIP, because it seems to have face validity much like OBP and SLG. Or, alternatively, you could use OPS against as a dependent variable. Obviously, neither is defense neutral.

My FG articles · by **MattS** » Mon Feb 08, 2010 14:37:43

TenuredVulture wrote:
MattS wrote:
TenuredVulture wrote:For the slow class--SIERA is used to predict ERA? That is, old fashioned ERA is your dependent variable? That doesn't exactly sound right to me.

I get the idea that you consider a metric better if it correlates well with itself year to year. But there has to be more than that to determine whether a metric is useful or not, right?

We checked it two ways. First, we checked how well it predicted same-year ERA versus xFIP and QERA, which both treat HR/FB as luck (obviously the more luck stats you treat as skill, the better you do, so FIP and tRA do better with same-year). Second, we checked how well it predicted ERA the following year, where it was consistently ahead of all four on tons of different subgroups of pitchers.

That's interesting. So, again, for the slow class--SIERA (lifetime?) predicts future ERA better than any other (lifetime?) pitching metric.

What about predicting something better than ERA, like WHIP? Or will that not work so well?

I like WHIP, because it seems to have face validity much like OBP and SLG. Or, alternatively, you could use OPS against as a dependent variable. Obviously, neither is defense neutral.

SIERA predicts ERA. I wouldn't say lifetime, because it basically predicts what ERA should have been with neutral luck, defense, park, and timing of events. It will do well with lifetime ERAs for pitchers who don't change their repertoire significantly over their careers, though, since a larger sample gives SIERA more of a chance to look good.

WHIP is really pitcher OBP without GDP, outs on the base paths, and HBP. It's just the ratio of (baserunners)/(outs/3), so it's the same two variables as OBP. I hadn't really thought of trying to estimate that, but I guess that is similar to looking at BABIP luck in a way, and then approximating #baserunners and #outs accordingly.
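
To make the WHIP/OBP relationship concrete, here is a small sketch. It assumes a simplified on-base rate that, as described above, ignores HBP, double plays, and outs on the bases; the function names and the stat line are made up for illustration.

```python
def whip(hits, walks, outs):
    """Walks plus hits per inning pitched, with innings expressed as outs / 3."""
    return (hits + walks) / (outs / 3)

def simple_obp_against(hits, walks, outs):
    """Simplified on-base rate: baserunners / (baserunners + outs).

    Assumption: ignores HBP, double plays, and outs made on the bases.
    """
    baserunners = hits + walks
    return baserunners / (baserunners + outs)

# Hypothetical stat line: 180 H, 60 BB, 600 outs (200 IP).
h, bb, outs = 180, 60, 600
w = whip(h, bb, outs)
p = simple_obp_against(h, bb, outs)

# Under the simplification, WHIP is a one-to-one transform of the on-base rate.
print(round(w, 3), round(3 * p / (1 - p), 3))
```

Because the two numbers are built from the same baserunner and out counts, estimating one is close to estimating the other.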

by **Woody** » Mon Feb 08, 2010 14:38:28

Matt if you could just go ahead and tell me how the 2010 season is going to end up for the Phils so I don't have to dedicate hundreds of hours of my life to it, that'd be okay with me.

by **Werthless** » Mon Feb 08, 2010 14:38:50

pics of sierra?

by **joe table** » Mon Feb 08, 2010 14:41:06

For what category of pitcher (ie GB, contact, control, strikeout, etc) will SIERA generally differ from xFIP most/least?

by **BigEd76** » Mon Feb 08, 2010 14:42:25

Werthless wrote:pics of sierra?

My FG articles · by **MattS** » Mon Feb 08, 2010 14:46:48

joe table wrote:For what category of pitcher (ie GB, contact, control, strikeout, etc) will SIERA generally differ from xFIP most/least?

Well, the main problem with xFIP is that it uses rate stats per IP instead of per PA and that it treats all HR as the same. So for pitchers with really low or really high WHIPs, it's going to do badly, and it's going to get less useful at neutralizing BABIP luck the further from average a guy's BABIP is.

Ground ball pitchers are going to be a major area where SIERA looks best, because ground balls have that accelerating effect on lowering ERAs. xFIP won't notice that those pitchers get a lot of double plays.

So basically it's very good pitchers where xFIP will do worse: if they have really high K/BB ratios, xFIP won't know that their HR are solo shots, and really high GB% pitchers won't get credit for the extra GIDPs they create.

by **bleh** » Mon Feb 08, 2010 15:34:33

SIERA = 6.262 - 18.055*(SO/PA) + 11.292*(BB/PA) - 1.721*((GB-FB-PU)/PA) + 10.169*((SO/PA)^2) - 7.069*(((GB-FB-PU)/PA)^2) + 9.561*(SO/PA)*((GB-FB-PU)/PA) - 4.027*(BB/PA)*((GB-FB-PU)/PA)

The beauty of it is its simplicity
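
Reading the formula as posted, with k = SO/PA, b = BB/PA and g = (GB-FB-PU)/PA (shorthand introduced here, not the authors' notation), the partial derivatives suggest where the behavior described earlier in the thread comes from:

```latex
\begin{aligned}
\frac{\partial\,\mathrm{SIERA}}{\partial g} &= -1.721 - 14.138\,g + 9.561\,k - 4.027\,b, &
\frac{\partial^2\,\mathrm{SIERA}}{\partial g^2} &= -14.138 < 0,\\
\frac{\partial\,\mathrm{SIERA}}{\partial k} &= -18.055 + 20.338\,k + 9.561\,g, &
\frac{\partial^2\,\mathrm{SIERA}}{\partial k^2} &= 20.338 > 0.
\end{aligned}
```

The negative second derivative in g is the accelerating ground-ball effect, and the positive one in k is the decelerating strikeout effect. The cross terms say that an extra ground ball lowers SIERA more when b is high (-4.027 b) and less when k is high (+9.561 k), which lines up with the claim that high-walk, low-strikeout pitchers need ground balls more. This reading applies only to the coefficients as posted here.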

My FG articles · by **MattS** » Mon Feb 08, 2010 15:53:40

bleh wrote:SIERA = 6.262 - 18.055*(SO/PA) + 11.292*(BB/PA) - 1.721*((GB-FB-PU)/PA) + 10.169*((SO/PA)^2) - 7.069*(((GB-FB-PU)/PA)^2) + 9.561*(SO/PA)*((GB-FB-PU)/PA) - 4.027*(BB/PA)*((GB-FB-PU)/PA)

The beauty of it is its simplicity

I mean, it's something that you need to put into Excel or go to BP to get like everything else. How many people calculate xFIP in their head or by hand? The best thing about it, in reality, is that it gets rid of a lot of the nonsense simplification in other estimators. What good is an ERA estimator that misses badly for anything but reasonably average or somewhat above average pitchers? I think it's more useful to be able to look at elite pitchers and figure out how good they really might be.

It'll all be on the stats page very soon, so it's not like anyone will need to calculate by hand.
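
For anyone who would rather script it than use Excel, here is a minimal sketch of the formula exactly as posted in this thread. The function name, argument names, and the sample stat line are assumptions for illustration; the coefficients are copied from the post above and may not match what ends up on the stats page.

```python
def siera(so, bb, gb, fb, pu, pa):
    """SIERA from raw counts, using the coefficients as posted in this thread."""
    if pa <= 0:
        raise ValueError("pa must be positive")
    k = so / pa              # strikeouts per plate appearance
    b = bb / pa              # walks per plate appearance
    g = (gb - fb - pu) / pa  # net ground balls per plate appearance
    return (6.262
            - 18.055 * k
            + 11.292 * b
            - 1.721 * g
            + 10.169 * k ** 2
            - 7.069 * g ** 2
            + 9.561 * k * g
            - 4.027 * b * g)

# Hypothetical line per 100 PA: 20 SO, 8 BB, 30 GB, 15 FB, 5 PU.
print(round(siera(so=20, bb=8, gb=30, fb=15, pu=5, pa=100), 2))  # 3.88
```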

by **Shore** » Mon Feb 08, 2010 16:54:10

MattS wrote:For 2009:

Pitcher ERA SIERA FIP xFIP
Halladay 2.79 3.09 3.06 3.05
Hamels 4.32 3.55 3.72 3.69
Blanton 4.05 3.92 4.45 4.07
Happ 2.93 4.37 4.33 4.49
Moyer 4.89 4.68 4.94 4.74
Lidge 7.21 4.20 5.45 4.76
Madson 3.26 3.18 3.23 3.25

For the Phillies, then, the difference (between xFIP and SIERA) is somewhere between 0 and 3 runs for each starter... is this "typical", or are there other groups more sensitive to the change in calculation?

What I'm asking, really, is will a "projected standings" using SIERA be any more than about a game different than one using xFIP? Or are we at a point where successive enhancements to predictive-ERA models aren't going to have a delta of more than 10-15 runs for a full pitching staff?
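
For scale, an ERA gap converts to earned runs as (difference) * IP / 9. A minimal sketch, assuming a 200-inning starter (the innings total is an assumption, not a figure from the thread) and using Hamels's xFIP and SIERA from the table; the function name is hypothetical:

```python
def era_gap_to_runs(era_a, era_b, innings):
    """Earned runs implied by the gap between two ERA figures over `innings`."""
    return (era_a - era_b) * innings / 9

# Hamels, 2009: xFIP 3.69 vs SIERA 3.55, over an assumed 200 IP.
print(round(era_gap_to_runs(3.69, 3.55, 200), 1))  # 3.1
```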

My FG articles · by **MattS** » Mon Feb 08, 2010 17:03:35

Shore wrote:
MattS wrote:For 2009:

Pitcher ERA SIERA FIP xFIP
Halladay 2.79 3.09 3.06 3.05
Hamels 4.32 3.55 3.72 3.69
Blanton 4.05 3.92 4.45 4.07
Happ 2.93 4.37 4.33 4.49
Moyer 4.89 4.68 4.94 4.74
Lidge 7.21 4.20 5.45 4.76
Madson 3.26 3.18 3.23 3.25

For the Phillies, then, the difference (between xFIP and SIERA) is somewhere between 0 and 3 runs for each starter... is this "typical", or are there other groups more sensitive to the change in calculation?

What I'm asking, really, is will a "projected standings" using SIERA be any more than about a game different than one using xFIP? Or are we at a point where successive enhancements to predictive-ERA models aren't going to have a delta of more than 10-15 runs for a full pitching staff?

Thursday's article will actually break down the RMSE in more detail. I suspect that doing projections based on xFIP might change the projected standings by a couple of games, but I'm not sure. The difference between SIERA and the other estimators is 0.30-0.50 of ERA on average. I think the Phillies are a staff without major ground ball pitchers, so the effect is probably not as clear as with other teams.
