The madness of ignorance

I don't know if he's trying, and there's no definitive way to figure it out. This is a problem.

When I started this piece, it was going to be my last shot at hunting down the white whale of my football fandom. As you've gathered from listening to Fusillade and probably reading my posts and comments on here, I'm somewhat of an Andrei Arshavin fan. It's a lonely place these days, as he's kind of crap. I acknowledge that and do my best not to get too upset when he screws something up and the match thread here explodes in Meerkat rage. He misplaces passes, he dribbles into a stout defense when he shouldn't (which is most of the time), and oftentimes he just falls over, which looks silly and isn't altogether helpful. I get all of that, I see it, and I'm not ignoring it.

What I don't get is the accusations of laziness - I just don't see it. In the pilot episode of Fusillade, Aidan and I compared him to Carlos Beltran, who I think suffers from a similar malady. It doesn't matter how hard he tries, it always looks like he's gliding, bored with whatever he's doing. It's not quite the same with Arshavin (mostly because he doesn't succeed like Beltran usually does/did), but close enough that the comparison is apt, I think.

If you've been paying attention there's another thing you've likely picked up on about me: I'm a huge baseball fan, particularly of the Mets. A big part of why I found The Short Fuse is because I have long been an avid reader (and commenter) of Amazin' Avenue, SB Nation's Mets blog. The community there is heavily skewed toward sabermetric analysis, which if you aren't a baseball fan basically means that they like using evidence (typically statistical evidence, and often advanced stats at that) to prove points rather than just making things up.

While it's not relevant in all cases in all sports, I've tried to carry that philosophy over to my other spheres of influence as often as I possibly can - if I'm going to make a claim, if at all possible, I try to make sure that it can be backed up with some kind of evidence or experience or something other than saying "well, I think that's what happened, so it's what happened."

I know that the hypothesis "Arshavin does work hard" would be tough to definitively prove or disprove without analyzing statistics and actually spending some time with the club and with him specifically, but I thought that there would be enough statistical evidence one way or the other to at least start a reasonable conversation, if not enough to definitively prove anything. And that is where I was wrong.

I'll lay out my methodology for you, first of all. I figured that the technical opposite of "laziness" was "involvement in the game," which I figured I'd measure by tracking various stats and comparing them with those of some of his competitors for playing time. The idea was, if he was moving around and playing on and off the ball, that would be a good way to determine an effort level. It wasn't going to be a scientifically rigorous thing or anything like that, but I figured it would be better than what we'd done before. I planned on looking at things like passes attempted, dribbles attempted, tackles attempted, and ground covered - things that would show what he tried to do in a game rather than what he successfully did. Remember, the hypothesis wasn't that Arshavin is underrated, it's that Arshavin tries and fails because he has lost his ability to affect games. Ground covered was particularly important because it could be looked at as evidence of what he did off the ball - anyone can pass, but movement when you don't have the ball in an effort to get in position to get the ball shows an effort.
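For the curious, here is a rough sketch of how the "attempts" half of that idea could be tallied. It is only an illustration under my own assumptions: the `MatchLine` record and the `attempts_per_90` helper are hypothetical names, the decision to lump passes, dribbles, and tackles into one count and scale it per 90 minutes is a simplification, and ground covered is left out entirely because that data isn't available.

```python
from dataclasses import dataclass


@dataclass
class MatchLine:
    """One player's attempt counts in one match (hypothetical record layout)."""
    minutes: int
    passes_attempted: int
    dribbles_attempted: int
    tackles_attempted: int


def attempts_per_90(lines):
    """Total attempted actions per 90 minutes played.

    Counts what a player tried to do, not what came off, which is
    the point of the hypothesis. Weighting every action equally is
    an assumption, not an established metric.
    """
    total_minutes = sum(line.minutes for line in lines)
    if total_minutes == 0:
        raise ValueError("player has no minutes to measure")
    total_attempts = sum(
        line.passes_attempted + line.dribbles_attempted + line.tackles_attempted
        for line in lines
    )
    return 90 * total_attempts / total_minutes
```

You would feed it one `MatchLine` per appearance for each player and compare the resulting rates across the players competing for the same spot. Scaling by minutes matters because a substitute who plays 20 minutes shouldn't look lazy next to someone who plays the full match.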

The problem is that while I could find a bunch of that data - passes and dribbles and tackles are pretty easily located - and while that's all helpful and could function as a part of an argument one way or the other, the main part of it was going to be based around ground covered during games. Like I said, what better argument for effort than the old standard, running about like a human dynamo! Does Arshavin do this? Who knows!

I asked Orbinho, everyone's favorite Arsenal-fan-cum-statistician, about it, and according to him Opta doesn't do distance stats, but clubs do (secretly, of course). It is very occasionally published. A couple of times in the past, I've come across people on Twitter who somehow have access to these numbers, and I've brought them up before, so I know they exist. I think, based on what little I've heard, that they may back me up, but I would at the very least like to be well-informed so that if I'm wrong, I can shut up and move on with my life.

*A case in my favor: after finally finding the UEFA press kits (well, Ted found them, but still), I scoured them for the coveted "ground covered" statistics. They're not found there. The only place you can get them is on the MatchCentre of individual Champions League matches. You can compare one player on one team to one on the opposition. You can't compare a player to a teammate, nor to himself or a teammate in a different game. So in short, the information is difficult-to-impossible to find, and nearly impossible to use for anything productive. It wouldn't be out of bounds for the less bulldoggish among us to think they are mythical lost objects, like the Holy Grail, the Ark of the Covenant, or Carlos Vela. Since half the battle with stats is presenting them in a way that people can understand, this isn't a lot better than having nothing at all.

Obviously all teams have the right to have their own statistical models and keep them secret. That happens in all sports; for example I know that the Red Sox have had proprietary defensive statistics for years that they do not share (except that now, presumably, Theo Epstein has taken them to the Cubs, but that's neither here nor there). What annoys me is that there's nothing but proprietary stats in football, it seems like. Sure we get the simple stuff and maybe a little more than that, but we don't have anything available to the fans that's really that advanced. Opta has stuff that they leak out little by little, but there's not really a database that we (well, in this case I) would have access to.

I'll compare it to baseball - admittedly that's a tad unfair, as it's statistically probably the most advanced sport for various reasons that aren't important here, but it's the one outside of football that I know the best. In football we have goals, assists, saves, save percentage, and things like that. That's the approximate equivalent of home runs, RBIs, errors, and fielding percentage. Those are the least useful predictive and analytical statistics in all of baseball, and most of the time if people try to make points only using these statistics, you have my permission to laugh at them (particularly errors. And don't get me started on pitcher wins). Then we have things like what I mentioned above - passing attempts, tackle attempts, interceptions, stuff like that. I think the best comparison there is the on-base percentage, slugging percentage, ERA, WHIP family; it tells you a bit more than the lowest level statistics, but not quite enough still to be able to cogently predict future outcomes or really deeply analyze what's happening to a team or player.

The big problem is that in baseball, there's another level above that, with metrics like WAR (wins above replacement) and xFIP and wOBA that most non-baseball fans haven't even heard of, and that even many fans don't understand. They're not perfect, obviously, because nothing that I'm aware of is perfect, but judging hitters based on wOBA (weighted on-base average, if you were wondering) or pitchers based on xFIP (expected fielding-independent pitching) is in almost every situation better than batting average or ERA, and degrees better than simply using home runs and wins. There's nothing like that in football, at least not that we the fans know about.

*There are good reasons for that; namely, the fact that baseball's easier to analyze because it's a series of discrete events involving individual matchups, while football is a constantly moving game involving interplay of multiple actors performing simultaneously. On the other hand, there's a fairly healthy advanced statistical community in basketball (a usually-moving game involving interplay of multiple actors performing simultaneously), which leads me to believe that it's possible to do it in football as well.

My point is not that Arsene Wenger should hold a series of seminars detailing his methods of analysis. Clubs should keep their business their own, and not risk giving other clubs a leg up. That's fine. But there ought to be a Fangraphs or a Baseball Reference for football, a free resource that all people can use to get access to the best information possible. More importantly, though, those sites serve as forums where new ideas can be presented to an educated public and debated, and knowledge of the game can grow. They don't just have spreadsheets full of stats - though the glories of the Fangraphs player pages must be seen to be believed - Fangraphs also has a host of regular writers who do some great analytical work. Fangraphs analyzes individual players, the entire range of major leaguers, and even the economy. The closest thing to that for football is Who Scored, and surely the other baseball fans here will back me up on this - it's not that close (for example I just tried to look up some statistics from Arsenal 1-0 Leeds to make a point on another SB Nation post, and WHOSCORED DOESN'T EVEN HAVE FA CUP STATS). There's also, but as far as I can tell, you have to pay a subscription fee to gain access to any of the statistical data they have. They do have a blog somewhat similar to what Fangraphs has, but it's nowhere near the quality.

I guess my point is this: there are football statistics, and there are fledgling communities for them, but we can do a lot better than we are.


The title of this piece is "The madness of ignorance," and I think I should explain that. When I say "ignorance" I don't mean it in the way it's typically used (in other words, stupidity). What I mean is simply a lack of knowledge, which is a wholly different thing. If an incredibly intelligent person speaks without information, he is operating in a state of ignorance despite that personal intelligence. We're all pretty ignorant when it comes to football whether or not we're willing to admit it, because of this dearth of objective data. We've all got opinions, and we make do with what we have - here, I think Aidan does a particularly good job of analyzing tactics with the tools available - but without a significant amount of legitimate data, it's really hard to tell what's accurate and what's not. Without people publicly working on this stuff, with an informal (or even formal) peer-review process like baseball has developed, it's really hard to develop good analytical tools, and without those, it's hard to know what's going on, really.

And honestly that upsets me. I hate seeing an argument that I think I could easily disprove, if only I knew this, this and this, and then finding when I go looking for that information that it's harder to find than Wellington Silva (sidebar: I forgot he existed until this morning). Not having data makes smart people say and think things that might be crazy, and it really annoys me to see that. Sources do exist, but I really wish that we, as a football community, had a better way to develop, discuss, and present that knowledge publicly, in the open. There would be a little less ignorance - and madness - from all of us.