Let's talk about sample size

Remember the April concerns about Edwin Encarnacion? (USATSI)

It's time to talk about sample size. Every year, Fantasy owners rush to determine whether a player's performance is legitimate early in the year. This turns into a dangerous game where owners look at just a few games' worth of stats to make their decisions. Bias plays a role too. If a Fantasy owner wants to like a certain player, they may focus only on the positive trends they see with that player, ignoring the larger issues at hand. When it comes to player performance, the larger the sample, the more accurate the assessment.

I get why Fantasy owners try to break down performances early. In some cases, it can lead to an early waiver wire pickup -- think Chris Davis in 2013 -- who can lead your team to victory. More often than not, however, it leads to bad analysis or poor roster moves. 

Remember how Fantasy owners were nervous about Edwin Encarnacion 20 games into the year? That's proof that small samples are extremely problematic. Thankfully, there's an article I rely on when it comes to taking samples seriously. Russell A. Carleton of Baseball Prospectus put it together a few years ago, and much of it is still accurate now. The article discusses the points at which stats become reliable. As an example, Carleton says we can get an accurate gauge on a player's strikeout rate after just 60 plate appearances. I will always be an advocate for waiting on more data, but that's the point where I think Fantasy owners can start to show concern about a certain player. You can apply his work to a number of relevant Fantasy stats, making it extremely convenient.
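If you want to see for yourself how much a small sample can move a rate stat, here is a rough sketch. To be clear, this is not Carleton's method. It is a simple binomial approximation, and the 25 percent strikeout rate, the function name `rate_margin` and the plate-appearance totals are all made up for illustration. It estimates how far an observed rate could plausibly sit from a hitter's true rate at different sample sizes.

```python
import math


def rate_margin(rate, plate_appearances, z=1.96):
    """Approximate margin of error for an observed rate.

    Uses the normal approximation to the binomial; z=1.96 corresponds
    to a roughly 95 percent interval. This is an illustrative
    assumption, not the reliability method from Carleton's article.
    """
    return z * math.sqrt(rate * (1 - rate) / plate_appearances)


# Hypothetical hitter who has struck out in 25 percent of his plate appearances.
observed_rate = 0.25

for pa in (20, 60, 150, 600):
    margin = rate_margin(observed_rate, pa)
    print(f"{pa:>3} PA: {observed_rate:.0%} give or take {margin * 100:.1f} points")
```

The margin shrinks with the square root of the sample, so it takes four times as many plate appearances to cut the uncertainty in half. That's the math behind waiting on more data.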

But let's say you aren't the type of person who likes advanced stats. Well, think of things this way. If it were mid-August, would you notice a 10-game slump? Would you act on it? The answer is probably not.

Context matters greatly here. In the case of Encarnacion, we now have a couple years of data to suggest he's an elite power-hitting first baseman. Until he proves otherwise, he should be expected to continue producing that way. Players generally perform to their career norms, which is why it's easier to preach patience with veterans. Carlos Santana's stats might look awful right now, but history says he'll get to .245/.364/.433 by the end of the year.

Let's discuss a guy like George Springer. I wrote about him yesterday, and there were some mixed comments on the piece. The main criticism seemed to be that Springer had cut down on his strikeouts over the past 12 games or so. That's not a big enough sample for me to trust. You can attempt to make the argument that Springer is adjusting to the league and won't strike out as much, but his history of high strikeout rates in the minors says the opposite. Given that those numbers are a much larger sample, I'm going to trust them. Plus, I would be a terrible analyst if I ignored years of numbers based on just 12 games. The thing is, I don't hate Springer. I just think he'll hit .230 this year due to his high strikeout rate. That doesn't mean it will always be this way. Springer has immense talent and can improve eventually, but these things take time.

Where does that leave us? For one, check out Carleton's article. It's a good starting place if you want to start evaluating your teams. Be patient with established players; there's plenty of evidence they'll return to form. Don't overreact to small samples, and don't let your biases enter your research. Just because you want to like a player doesn't mean you can ignore their faults. I know, I loved Danny Salazar this year.

It may not be fun to sit back and be patient, but it will lead to far more success in your league.
