I am always trying to improve my xG model. Today I am introducing the latest updates: defensive pressure, the number of defenders between the shooter and the goal, and a finishing ability measure.
The data for this comes from StrataBet. I have used this data in my projection model since the start of the season, but I had not used it elsewhere because syncing the two data sets was a nice challenge that stretched my data wrangling skills.
Before digging into the details of the model, let’s go over what each of these terms means.
Defensive Pressure Definition:
Defensive pressure is measured on a scale of 0 – 5 (5 being the highest). The value is assigned by the StrataBet data analyst, who judges how much pressure the shooter is under.
0 - No defensive players around, nobody blocking the shot
1 - Light defensive pressure, no direct tackle but a player stood a few yards away causing some part of the goal to be blocked
2 - Low defensive pressure, a player a few yards away but could be sticking a leg out looking to make a block
3 - Medium defensive pressure - Close contact with a defender, a player blocking the ball from close range, a player holding onto the shirt but behind the man
4 - High defensive pressure - Many defenders crowding around the shooting player, tackles being made as the shot is taken, very close contact when jumping to meet a header
5 - Intense defensive pressure - a player being held while taking a shot, many players all making tackles together giving very little room for a strike, a player crowded out when challenging for a header
The average defensive pressure this season is 2.18, with the following distribution:
Number of Defenders Definition:
This one is pretty self-explanatory: it counts the number of defenders between the shooter and the goal. The goalkeeper is included as a defender, so a one-on-one with the goalkeeper would have one defender and an empty net would have zero.
The average number of defenders per shot this season is 2.6, with the following distribution:
To create a measure of finishing skill, I used StrataBet's shot quality measure and compared it to the league average.
Here is the definition of what shot quality measures with some examples:
This is a subjective measure on the quality of the shot. If the shot is particularly well struck and gave the keeper no chance then this would be rated as 5, while a scuffed effort that was poorly hit, or a wild swing that goes well wide would be rated as 1.
For example a free kick will almost always be a Poor/Fairly Good Chance but the Shot Quality could be rated as 5 if the player bends it into the top corner away from the goalkeeper completely out of his reach.
A standard shot would be rated as 3, so for those times when an open net happens then that is the benchmark.
Shot Quality 0 is for occasions such as a player blocking the ball and it going into the goal or a cross that is not intended as a shot drifting all the way in, basically any accidental shot.
The average shot quality this season is 2.26, with the following distribution:
To create the finishing skill metric, I took each player's average shot quality, regressed it toward the mean by adding 50 shots of league average shot quality, and then compared the result to the league average to see how far above or below it the player sits.
Here is the distribution of estimated finishing skill:
The vast majority of players sit at 1, which makes sense given the added regression. As more data is added to the model the distribution will probably flatten out, but to me this passes the sanity test.
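As a rough illustration of the regression step, here is a minimal Python sketch. It assumes the 50 shots of league average quality are simply added to the player's own shots before averaging, and that the comparison to the league average is a ratio (so 1.0 means league average). The function name, parameter names and the sample shot ratings are hypothetical and are not taken from the actual model.

```python
def estimated_finishing_skill(shot_qualities, league_avg, prior_shots=50):
    """Regressed average shot quality relative to the league (1.0 = average).

    Assumption: the regression is done by padding the player's shots with
    `prior_shots` imaginary shots of league-average quality.
    """
    n = len(shot_qualities)
    regressed_avg = (sum(shot_qualities) + prior_shots * league_avg) / (n + prior_shots)
    return regressed_avg / league_avg


LEAGUE_AVG = 2.26  # league-average shot quality quoted in this post

# Hypothetical players: no shots, a few well-struck shots, many well-struck shots.
no_shots = estimated_finishing_skill([], LEAGUE_AVG)
few_good = estimated_finishing_skill([4, 5, 4], LEAGUE_AVG)
many_good = estimated_finishing_skill([4, 5, 4] * 20, LEAGUE_AVG)

print(no_shots, few_good, many_good)
```

Under these assumptions a player with no shots comes out at exactly league average, and the same shot ratings only pull a player away from 1 once there are enough shots to outweigh the 50 added ones, which is consistent with most players clustering at 1.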
Estimated Finishing Skill Top 5
|Player||Estimated Finishing Skill|
I’m a little surprised that the top five includes Xherdan Shaqiri and Jermain Defoe, but I guess Shaqiri has the pedigree of a top prospect and Defoe is a classic finisher if nothing else.
Incorporating Defensive Pressure, Number of Defenders and Finishing Quality into xG Model
To incorporate these into my xG model I ran a logistic regression with these added variables. The full equation is:
xG = 1 - (1 / (1 + EXP(z))), where z = -1.21607045725358 + (Feet * 0.912665835348231) + (Header * 0.1*7068942017931) + (Center of the box * 0.262249804783672) + (Six Yard Box * 0.249332579010623) + (Very Close Range * 0.860952204760409) + (Left Box * 0.513867909742021) + (Right Box * 0.420783180649214) + (Outside of the Box * -0.334599640501185) + (Long Range * 2.47764497413238) + (more than 35 yards * 0.902831454617313) + (Difficult angle * 0.0561404821804883) + (Cross * -0.182266350084979) + (Set Piece * 0.746331864881083) + (Free Kick * 1.33776813057468) + (following a corner * 0.762567501536443) + (Through Ball * 0.462822354397946) + (Headed Pass * -0.0286463733816039) + (Big Chance * 1.4090869533203) + (Fast Break * 0.834840794063029) + (Meters to Goal * -0.125010966131251) + (Def Pressure * -0.206002245965051) + (Num Def Players * -0.355069759206502)
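To show how an equation of this shape turns shot attributes into a probability, here is a minimal Python sketch. It deliberately uses only the intercept and a handful of the coefficients from the equation, so its output is not the model's real xG for any shot. The assumption that the categorical terms (such as Feet or Big Chance) are 0/1 flags, the function name and the example shot are all mine and purely illustrative.

```python
import math

# Intercept and a SUBSET of coefficients copied from the equation above.
# The remaining terms are omitted here purely to keep the sketch short.
INTERCEPT = -1.21607045725358
COEFFICIENTS = {
    "Feet": 0.912665835348231,
    "Big Chance": 1.4090869533203,
    "Meters to Goal": -0.125010966131251,
    "Def Pressure": -0.206002245965051,
    "Num Def Players": -0.355069759206502,
}


def xg_from_features(features):
    """Apply 1 - 1 / (1 + exp(linear sum)), the form used in the equation.

    `features` maps a term name to its value; terms left out count as zero.
    An unknown term name raises KeyError.
    """
    linear_sum = INTERCEPT + sum(
        COEFFICIENTS[name] * value for name, value in features.items()
    )
    return 1 - 1 / (1 + math.exp(linear_sum))


# Hypothetical shot: struck with the foot from 10 meters, two defenders in the way.
base_shot = {"Feet": 1, "Meters to Goal": 10, "Num Def Players": 2}
light_pressure = xg_from_features({**base_shot, "Def Pressure": 1})
heavy_pressure = xg_from_features({**base_shot, "Def Pressure": 4})

print(light_pressure, heavy_pressure)
```

Because the Def Pressure and Num Def Players coefficients are negative, raising either one lowers the resulting probability, which is the behaviour the new variables are meant to capture.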
Overall this regression has an r squared of 0.38, which in my opinion is very good for an outcome that is so difficult to measure.
To add in finishing skill the xG is simply multiplied by the estimated finishing skill from above. So if a player had an estimated finishing skill of 1.107 like Shaqiri and he took a 0.09 xG shot that would become 0.0996 xG.
For most individual shots the change isn’t big but over a large number of shots the numbers do converge.
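The adjustment can be sketched in a few lines of Python. The single-shot numbers are the Shaqiri example from above; the list of shots is hypothetical and only there to show how small per-shot changes add up over several attempts.

```python
def finishing_adjusted_xg(xg, finishing_skill):
    """Scale a shot's xG by the shooter's estimated finishing skill."""
    return xg * finishing_skill


# The worked example from the text: a 0.09 xG shot with a skill of 1.107.
single = finishing_adjusted_xg(0.09, 1.107)
print(round(single, 4))  # 0.0996

# Hypothetical set of shots taken by the same player.
shots = [0.09, 0.31, 0.05, 0.12]
base_total = sum(shots)
adjusted_total = sum(finishing_adjusted_xg(x, 1.107) for x in shots)
print(base_total, adjusted_total)
```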
These numbers are currently only updated for the Premier League and are on my Tableau page as Def Pressure xG and Finishing Skill xG.
This article was written with the aid of StrataData, which is property of Stratagem Technologies. StrataData powers the StrataBet Sports Trading Platform, in addition to StrataBet Premium Recommendations.