Jan 30, 2014

Who Had The Best Projections in 2013? – Part 1: Hitters




The list of baseball projection systems seems to grow each year as more people think they can predict the future. It's important to know which of those projections are the best because you can win or lose your fantasy baseball league depending on which projection system you use. This website gives you cheatsheets and lets you choose which projection system to use, and that is a weighty decision. To help you understand which projections are best, I'll look at the 2013 hitter projections and see which would have helped you most in your fantasy leagues last year.


The Method

If you're using one projection system for your fantasy baseball draft analysis, you're actually not totally focused on whether a prediction of 40 HR, for instance, ends up being correct. After all, what if that projection system had predicted fifty guys to hit 40+ HR? In that case, you may not have paid as much attention to the one guy from that group whose projection turned out to be accurate. What you want to be correct is how far above or below average the system projects a player to be in each stat.

With that in mind, I standardize each system's projections for each statistic so that I am looking at each player's predicted z-score in that stat (the z-score being how many standard deviations above or below the mean that projection was).
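As a rough illustration of that standardization, here is a minimal Python sketch. It is not the exact code behind this analysis: the function name, the sample players, and the choice of the population standard deviation are all assumptions made for the example.

```python
from statistics import mean, pstdev

def z_scores(projections):
    """Convert one system's projections for one stat into z-scores.

    `projections` maps player name -> projected value (e.g. HR).
    Assumption: population standard deviation is used; a sample
    standard deviation would work similarly.
    """
    values = list(projections.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Every player has the same projection, so nobody is above average.
        return {player: 0.0 for player in projections}
    return {player: (value - mu) / sigma for player, value in projections.items()}

# Hypothetical HR projections from a single system.
hr_projections = {"Player A": 40, "Player B": 25, "Player C": 10}
hr_z = z_scores(hr_projections)
```

Here Player B sits exactly at the mean (z-score of 0), while Player A and Player C are the same distance above and below it.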

At that point, there are a variety of statistical methods for analyzing whether the projections were accurate in projecting a player to be above or below average. I chose to use Mean Absolute Error (MAE) for this analysis. Compared to Root Mean Squared Error (RMSE), it doesn't penalize large mistakes quite as much, which I find to be a good thing because in fantasy baseball you have the option to bench a player who is performing vastly differently than expected. Mean Absolute Error, in this case, is the absolute difference between the projected z-score and the actual z-score from the 2013 results, averaged over the whole pool of players we are comparing.
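To make the MAE versus RMSE difference concrete, here is a small Python sketch. The function names and the four made-up players are assumptions for illustration only, not data from this study.

```python
from math import sqrt

def _errors(projected_z, actual_z):
    """Projected minus actual z-score for every player present in both dicts."""
    players = projected_z.keys() & actual_z.keys()
    if not players:
        raise ValueError("no players in common")
    return [projected_z[p] - actual_z[p] for p in players]

def mean_absolute_error(projected_z, actual_z):
    errors = _errors(projected_z, actual_z)
    return sum(abs(e) for e in errors) / len(errors)

def root_mean_squared_error(projected_z, actual_z):
    errors = _errors(projected_z, actual_z)
    return sqrt(sum(e * e for e in errors) / len(errors))

# Hypothetical z-scores: three perfect projections and one big miss.
projected = {"A": 1.0, "B": 0.0, "C": -1.0, "D": 0.0}
actual = {"A": 1.0, "B": 0.0, "C": -1.0, "D": -3.0}

mae = mean_absolute_error(projected, actual)       # 0.75
rmse = root_mean_squared_error(projected, actual)  # 1.5
```

The single large miss on player D doubles the RMSE relative to the MAE, which is the extra penalty on big mistakes described above.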

I only included players in this analysis who were being drafted last preseason and who were projected by all of the analyzed projection systems. Though some players who weren't drafted did well, this is about who helped you the most on draft day. I also removed players who ended up not playing last season or who played in an extremely limited capacity. This left 259 hitters in my pool of players from last season.
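The filtering described above could be sketched roughly like this in Python. The function name, the data layout, and especially the `min_pa` cutoff are assumptions for the example; the post does not state an exact playing-time threshold.

```python
def shared_player_pool(systems, drafted, actual_pa, min_pa=100):
    """Return the players to compare across projection systems.

    systems   : dict of system name -> dict of player -> projection
    drafted   : iterable of players who were being drafted in the preseason
    actual_pa : dict of player -> actual plate appearances last season
    min_pa    : hypothetical cutoff for "extremely limited" playing time
    """
    pool = set(drafted)
    for projections in systems.values():
        # Keep only players that this system also projected.
        pool &= set(projections)
    # Drop players who did not play or barely played.
    return {p for p in pool if actual_pa.get(p, 0) >= min_pa}

# Hypothetical inputs.
systems = {
    "SysA": {"A": 30, "B": 20, "C": 15},
    "SysB": {"A": 28, "B": 22, "D": 10},
}
drafted = ["A", "B", "C", "D"]
actual_pa = {"A": 600, "B": 20, "C": 500}

pool = shared_player_pool(systems, drafted, actual_pa)
```

Only player A survives here: C and D are missing from one system, and B falls under the assumed playing-time cutoff.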


The 2013 Rankings

For a full rundown of the competitors, see my post that introduces the baseball projections.

After performing the analysis as stated above, here are the rankings for how well each projection system performed:

System           HR  AVG    R  RBI   SB  WERTH*
CAIRO             4    7    2    4    9       3
Steamer           8    2    1    2    2       2
Fangraphs         1    4    6    7    4       5
ZiPS              3    6   10    6    5       8
Oliver           10    1    8    8    3      10
Marcel            9    9    9   10    7       6
MORPS             5   10    7    3    8       9
Clay              7    8    5    9   10       7
Steamer-Fans      6    5    4    5    6       4
Special Blend     2    3    3    1    1       1

*WERTH is the value of adding the five roto z-score values together for a player to get their total projected rotisserie value.
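For reference, the WERTH definition in the footnote amounts to a simple sum. Here is a minimal Python sketch, where the function name and the sample z-scores are made up for illustration.

```python
CATEGORIES = ("HR", "AVG", "R", "RBI", "SB")

def werth(player_z):
    """Sum a player's z-scores across the five roto hitting categories."""
    return sum(player_z[cat] for cat in CATEGORIES)

# Hypothetical z-scores for one player.
example_player = {"HR": 1.5, "AVG": 0.5, "R": 1.0, "RBI": 1.0, "SB": -0.5}
total = werth(example_player)  # 3.5
```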
 
The rankings tell part of the story, but sometimes the gap between first and second place is larger than we think. Here's a visual representation of the rankings that shows how far above or below average each system was in comparison to the others.
 
 


   

Trends or Anomalies

I'm not surprised that my optimized combination of the projections performed the best in a majority of categories. I am actually somewhat surprised it didn't perform even better, so I'll need to do some more tweaking to the weights now that I have more data to analyze. This sort of projection combination is a bit of a cop-out as it minimizes outliers and brings everyone closer to a safe middle.

My biggest surprise is the weak performance by Steamer in the HR category. Otherwise, Steamer nearly swept the rest of the race. The Fangraphs Fans seem to have done a phenomenal job at predicting HR totals, so I went back and looked at past years. This appears to be a trend, as the Fans had the best HR projections last season (and a 2nd place finish in 2011).

CAIRO also had a very good performance across the board with the exception of the AVG and SB categories. Somehow the marriage of the Fangraphs Fans and Steamer projections (Steamer-Fans) did not go as well as expected for the hitting projections. We'll have to see if that trend continues for the pitching projections when we look at those next.

The Awards


Well, it's clear that Steamer deserves the gold medal for the third straight year here. If you want projections for your fantasy baseball hitters that will be most closely aligned with actual results, Steamer is the clear choice for you. The only concerning part of their projections in 2013 was the very poor performance in projecting HRs. Aside from that, they "steamed" the competition (puns are fun?).

CAIRO deserves the silver medal for a strong but inconsistent performance. Despite being a bit all over the map, they still did a great job at predicting the overall WERTH value for each player and that's certainly a good thing.

In addition, the non-scientific method of just letting the Fans do the picking has shown some value, especially when it comes to projecting HR totals. It's a bit hit-or-miss beyond that but we have to start to acknowledge the power of the fan HR projection.


8 comments:

  1. Would love to see the same analysis done on a per-PA basis, so we know who is best predicting skills vs. predicting playing time. Similarly, a pure assessment of who predicts playing time most accurately would be interesting. Thanks.

    1. MP, I get where you are coming from, but from a comparison standpoint I think the fields listed above are the only ones really important to people when drafting. I believe playing time is calculated into projections in some cases, but in reality playing time is a crapshoot. Injuries (or suspensions!) always derail your best laid plans and if you are playing poorly, you will play less. While projections are difficult to predict as well, at least they are based upon some mathematical formula meant to lend some science to the gamble. In my humble opinion, anyway!

    2. The results certainly are a bit different when you look at the Per-PA results and I do agree with Hogtown about the PA being wrapped into the projections themselves. However, I do plan on including a bit of "other analysis" such as Per-PA in a Part 3 after I post the pitcher results soon.

    3. @HH -- It's definitely not a "crapshoot" -- and in this day and age of parity in projection quality, getting playing time right is one of the largest sources of edge out there.

  2. Can I just second the request for an analysis of per-pa performance and then a second analysis of who did a good job predicting playing time? If steamer's high rank is largely because of good playing time estimates (which I doubt it is) and cairo has amazing per-pa predictions but is terrible at Playing Time estimates you could combine the two into a very strong projection.

  3. I'd be curious how the FantasyPros composite projections compare to the others.
    DJ

  4. To piggyback on someone else's mention of the per-PA performance of each projection system, something I would really be interested in seeing (if only I had the time and patience to perform the study myself) is a rate-based comparison of each system. For example, break every category into a rate per plate appearance (e.g. HR per PA, RBI per PA, etc.) and see how Steamer compares to Fangraphs, to ZiPS, and so on for each individual category. Then create a blended projection that combines each system's strengths into one number. Not sure if this is clear, but essentially: say it's determined that Fangraphs is best at projecting HR per PA, Steamer at RBI and runs per PA, and PECOTA at average. One new category to add to the research is which projection system is best at projecting playing time overall. Then create a master projection that uses, say, the Fangraphs plate appearances, if that system has the best playing time projections, and multiplies that plate appearance number by the rates from whichever system is best per plate appearance in each category, giving one master number that uses the strengths of each projection system. It would be interesting to see how that fares against a blend that just mixes a percentage of each system's overall numbers, when each projection system has flaws in certain statistical areas.

    Again, it's a crazy idea, and I'm still not sure if I'm being clear, so maybe one last example. Say Fangraphs is deemed best at predicting playing time and HR, and Steamer rates highest for RBI, runs and steals. Then create one final projection for each player using the plate appearance projection from Fangraphs: the HR are the HR/PA rate from Fangraphs times that plate appearance projection; the RBI, runs and steals come from Steamer's per-PA rates multiplied by the Fangraphs PA number; and the hits come from PECOTA's hits per plate appearance multiplied by the Fangraphs PA. All of this would be mirrored for pitchers as well. Would love to see if this achieved good results.

  5. Bummer that I didn't do better in your rankings. I've adjusted some of my calculations. Let's see what this year holds. Also, you may want to consider comparing final player rankings at the end of the year to each system's player rankings in lieu of actual statistical numbers. If someone can take my list and pick the best players, that is more meaningful to a fantasy manager than getting the actual number exactly right. Systems that use heavy regression may never compare favorably in Las Vegas, but I would contend that they will stack up favorably during a fantasy draft. Thanks for your work. Obie of MORPS
