Berkeley Jockey Club was founded in January 2020 by four UC Berkeley students – EJ Gardiner, Ethan Dixon, Tyler Umansky and Cole Striler – inspired by the story of Bill Benter from the 1980s. They sought to mirror and improve on the algorithms produced by the billionaire, using data from racetracks in Hong Kong. They built a web scraper to pull data from the Hong Kong Jockey Club website and supplemented it with public data from Kaggle. Once the data was compiled and cleaned, the students produced three different models: a linear regression model, a random forest model, and an XGBoost model.

The Berkeley Jockey Club

The linear regression model used each horse's finishing time to predict its finishing position in a given race. The random forest model used multi-class classification to output ranked class-1 probabilities. The XGBoost model used binary (0/1) probabilities to output ranked probabilities for the results of a given race. The random forest model reached 0.998 training accuracy and 0.30 testing accuracy. Betting according to public odds, the students found that a $500 investment would return $200 over 500 races; betting according to the model, the expected return rose to $1,100. The linear regression model showed 0.295 training accuracy and 0.325 testing accuracy.
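The article does not show the club's code or its betting rule, so the following is only a minimal sketch of how ranked win probabilities could be turned into a return figure. It assumes each model yields one win probability per runner, that a flat stake goes on the top-ranked runner in every race, and that odds are quoted as the total payout per $1 staked. The `Runner` class, both function names, and the toy numbers are hypothetical and are not the club's data.

```python
from dataclasses import dataclass


@dataclass
class Runner:
    horse: str
    win_prob: float      # hypothetical model output: probability of winning
    decimal_odds: float  # assumed odds format: total payout per $1 staked
    won: bool            # whether this runner actually won the race


def rank_runners(race):
    """Order one race's runners from most to least likely winner."""
    return sorted(race, key=lambda r: r.win_prob, reverse=True)


def flat_bet_profit(races, stake=1.0):
    """Net profit from staking `stake` on the top-ranked runner in each race."""
    profit = 0.0
    for race in races:
        pick = rank_runners(race)[0]
        profit -= stake                      # pay the stake up front
        if pick.won:
            profit += stake * pick.decimal_odds  # collect the payout on a win
    return profit


# Toy data for illustration only (two races, three runners each).
races = [
    [Runner("A", 0.50, 2.5, True), Runner("B", 0.30, 4.0, False), Runner("C", 0.20, 6.0, False)],
    [Runner("D", 0.25, 5.0, False), Runner("E", 0.45, 3.0, False), Runner("F", 0.30, 4.0, True)],
]

print(flat_bet_profit(races))  # 0.5: the first pick wins (+1.5), the second loses (-1.0)
```

Running the same function once with picks taken from the public odds and once with picks taken from a model would give the kind of side-by-side comparison reported here, under the flat-stake assumption.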

For the linear regression model, the same comparison showed a $500 investment returning $300 over 500 races under public odds, against an expected return of $5,500 using the model. The Berkeley Jockey Club believes there is considerable room to grow this project. Its members are excited to test their models on more recent data, vary their model output type, expand beyond Hong Kong, and explore exotic bets and optimized bet placement. Club member Ethan Dixon says, “This was a small test, which proves that a larger, more comprehensive system is possible.” We look forward to tracking the progress of this group as they expand their reach in the horse betting field.

Built by: Ethan Dixon, EJ Gardiner, Tyler Umansky, Cole Striler