A new low-cost, time-efficient, and highly accurate xG prediction tool for soccer has been built using open-source data.

Berkeley, CA, November 2020 – A team of five undergraduates at the University of California, Berkeley, with guidance from the Sutardja Center for Entrepreneurship, has released a new expected goals prediction tool for soccer. Prompted by the lack of low-cost options for soccer analytics, current Cal Women’s Soccer player Emma Westin mobilized this group of students to create a simple website that predicts scoring patterns. The expected goals value, also referred to as xG, measures the probability that a shot results in a goal, given the conditions under which it is taken.

The intersection of sport and technology has become increasingly important to improving performance in soccer leagues across the globe. Professional leagues have long been able to hire their own data scientists to perform team-specific analysis of matches. Teams with smaller budgets may be able to outsource their film to third-party companies for this analysis. Many youth, club, and college teams, however, lack the funds for either option. This product attempts to bridge that gap. The team used data from hundreds of soccer games at a variety of levels, split the data by gender, and chose the most relevant and user-friendly parameters to create a predictive model. The result is a basic analytics tool that any team can use to analyze performance and goal-scoring proficiency, and to build game strategies that optimize scoring opportunities.

“As a member of the Cal Women’s Soccer Team, as well as the professional Swedish club Hammarby IF, I have been able to see firsthand how influential game analytics using the xG value can be for increasing performance,” says team leader Emma Westin. “We have worked hard to make this type of analytics available to any team, regardless of their budget.”

Users mark the position of the shot on the field, then select from drop-down menus the shot type, the number of players between the ball and the goal, and whether the shot was taken first time. The tool then returns the predicted xG value for that shot. These simple parameters can be entered while watching a game live, removing even the cost of hiring videographers for film. Preliminary results yielded a ROC-AUC score of 0.74.
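
For readers curious how inputs like these can become a probability, the following is a minimal Python sketch, not the team's actual model. It assumes a logistic-regression form, and every coefficient, the shot-type categories, and the function names `shot_geometry`, `predict_xg`, and `roc_auc` are hypothetical, chosen only for illustration. It also shows how a ROC-AUC score can be computed from predictions and outcomes.

```python
import math

# All coefficients below are made-up placeholders, NOT fitted values
# from the team's model. They only illustrate the shape of the calculation.
INTERCEPT = -0.5
COEF_DISTANCE = -0.10    # per metre from the centre of the goal
COEF_ANGLE = 1.2         # per radian of goal mouth visible to the shooter
COEF_DEFENDERS = -0.35   # per player between the ball and the goal
COEF_FIRST_TIME = -0.15  # applied when the shot is taken first time
SHOT_TYPE_OFFSET = {"foot": 0.0, "header": -0.6}  # assumed categories

GOAL_WIDTH_M = 7.32  # standard goal width in metres


def shot_geometry(x, y):
    """Return (distance, angle) for a shot taken x metres out from the
    goal line and y metres sideways from the centre of the goal."""
    distance = math.hypot(x, y)
    half = GOAL_WIDTH_M / 2
    # Angle between the lines from the ball to each goalpost.
    angle = math.atan2(half - y, x) + math.atan2(half + y, x)
    return distance, angle


def predict_xg(x, y, shot_type, defenders, first_time):
    """Map the inputs to a probability with a logistic function."""
    distance, angle = shot_geometry(x, y)
    z = (INTERCEPT
         + COEF_DISTANCE * distance
         + COEF_ANGLE * angle
         + COEF_DEFENDERS * defenders
         + COEF_FIRST_TIME * (1 if first_time else 0)
         + SHOT_TYPE_OFFSET[shot_type])
    return 1.0 / (1.0 + math.exp(-z))


def roc_auc(outcomes, scores):
    """ROC-AUC as the share of (goal, miss) pairs in which the goal
    received the higher score; ties count as half."""
    goals = [s for o, s in zip(outcomes, scores) if o == 1]
    misses = [s for o, s in zip(outcomes, scores) if o == 0]
    wins = 0.0
    for g in goals:
        for m in misses:
            if g > m:
                wins += 1.0
            elif g == m:
                wins += 0.5
    return wins / (len(goals) * len(misses))


# Example: a close, unobstructed shot versus a long, crowded one.
close_shot = predict_xg(6, 0, "foot", defenders=1, first_time=False)
long_shot = predict_xg(25, 5, "foot", defenders=4, first_time=True)
```

Under these placeholder weights, the close shot receives a higher xG than the long one. A ROC-AUC of 1.0 would mean every goal was scored above every miss, while 0.5 is no better than chance.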

Press Contact: Emma Westin
Public Code Review: DataX2020_github