This article on the Arsenal official website earlier this week prompted some thinking in the comments on this blog about the value and usage of statistical data in the world of football. Arsenal are planning on using GPS tracking to build up a database of motion for each individual player on the team in the hopes that analysis of the numbers will help prevent overworking them and thus lessen the frequency of injury. Given Arsenal's focus on staying on the cutting edge of technology, fitness, and nutrition, it is no surprise that the club has taken this step. What remains to be seen, though, is whether this data will actually reduce the number of injuries that players suffer.
This question ties in with the nature of football and with its natural resistance to statistical analysis. There is no sport, and probably no game of any kind, that is as "analog" as football is. (Very) generally speaking, nearly every other sport or game is broken up into discrete packets of some kind of information; think of one pitch in baseball, one down in American football, one possession in basketball, one move in chess. Ice hockey approaches football in its flow, but has nonetheless developed a somewhat more robust statistical analysis, perhaps due to the higher incidence of scoring. These sports, with their quantifiable events that accumulate into large, meaningful sample sizes across players, seasons, and teams, lend themselves to math, allowing for a kind of evaluation and prediction that football seems to defy.
Football statistics have traditionally been limited to goals and games played. Recently, with the advent of Opta stats and things such as the Guardian Chalkboards, more numbers have become available, such as distance traveled during a match. Systems now track every pass, every tackle, and every shot vector during matches, along with average pitch position. This data is invaluable when analyzing a single match, but less valuable when evaluating a single player or predicting how he will play in the future, as the data relates not only to his ability but also to the tactics of his team, the tactics of the other team, and the general flow of play. Unlike baseball, football is not a matter of a player confronted with the same situation thousands of times in his career with quantifiable variables, but rather a player faced with thousands of similar situations, each differing in almost infinitely small ways that make comparison tricky over the long haul.
Thus, the GPS injury system faces a number of challenges. While tracking distance, load, and speed across both matches and training for all players will provide a lot of data, it cannot account for the fact that the injuries that really knocked Arsenal's chances this year had little to do with fatigue or wear, and more to do with Giorgio Chiellini and Ryan Shawcross playing recklessly.
The injury to Cesc Fábregas, on the other hand, may have more to do with fatigue, but this raises another question. Cesc played only 36 matches this year, missing time in December both before and after the Villa match, and again in April and May following the first leg of the Barcelona tie. Thirty-six matches may not seem like many, but one must consider that Cesc runs as much as anyone on the team, gives his all every time he plays, and takes a lot of physical punishment from opponents trying to stop him playing. If GPS data collected over a number of years were to show that Cesc needed a rest, though, would Wenger rest him? Does Wenger not rest him every chance he gets already?
I'm not against Arsenal using the GPS system, for the same reasons that I am not against pitch counts in baseball (although, of course, pitching is a repetitive stress situation, and football is not, at least not in the same way). But I do wonder, given opponents' approaches to stopping Arsenal and the sheer number of matches that Arsenal play every year, if this data is the answer, or if it is simply another layer on top of the mystery of why Arsenal suffer more injuries than other clubs.