It is a pleasure to welcome Ashmeet Anand to this guest post on data visualization and sports science. Ash has worked as a Sports Scientist across multiple US Collegiate and Major League Soccer (MLS) teams and is now working as a Lab scientist for Therabody.
[For more on developing Tableau and PowerBI skills, check out our recent post with Sport Horizon, which includes a 15% discount for their online courses.]
I was an 8-year-old kid when I saw Rocky IV, which first inspired me to wear a white lab coat and test athletes, just like Ivan Drago. I moved to the US in 2012 to follow this dream. I completed my Master's in Florida (with an outstanding graduate student award) and began a sports science internship with the Colorado Rapids in 2019.
I remember one pivotal moment that highlighted the need to improve my data analytics skillset: my boss requested an analysis of a player's physical output. Despite spending considerable time on Google Sheets, I was outperformed by a cleaner and more detailed graph that someone else created in just a fraction of the time using Tableau. This experience underscored the limitations of using Excel and Google Sheets for data analysis and visualization, prompting me to explore business intelligence tools.
Professional Development in Data Science
It is a challenge to excel at both Sports Science and Data Science and I found myself overwhelmed by the rapidly expanding data science roles in sports. This led me to delve into programming languages, initially Python, to open new opportunities. Only a few sports scientists can perform the tasks executed by data engineers and data scientists, yet sports scientists can create visuals to communicate messages to coaches. This is the realm of Data Analysts!
Jo, my mentor, and I often pondered the merits of an on-field practical sports science role versus leveraging that title to help build a team's robust back-end structure. This fueled my ambition to excel in both domains and to collaborate more closely with the analytics department to learn about data storage, management, and existing football models.
The Seattle Sounders' reputation for data-driven decisions, and particularly Ravi Ramineni's expertise in Python, inspired me to blend coding with my sports science background. I had already been using Tableau for 4 years at this point, but over the next two years I honed my coding skills, building a repertoire of supervised and unsupervised learning techniques that included linear regression, polynomial regression, cluster analysis, PCA, and XGBoost models.
The remainder of this article shows how these skills combine: Python is used to download and manipulate the data, and Tableau is used for visualization. The figures below are GIFs of each visualization, but you can also visit the 'MLS Story' on my Tableau Online profile to interact with these outputs.
Football revolves around minutes. While 'minutes played' may seem simple, it is a gateway to understanding a player's impact, resiliency, and value to their team. It is often at the core of staff discussions. This curiosity led me to explore insights from publicly available MLS data from the 2018-2022 seasons.
For Python, I recommend using Jupyter Notebook to run the code. I started by web scraping from the FBref website. Web scraping is a game-changer in sports, enabling the extraction of vast amounts of data from websites such as Transfermarkt, FotMob, and SofaScore.
You can also use ChatGPT to help write the code, but I only recommend it if you are competent at reading and reviewing the code. A solid grasp of HTML and of Python tools such as Beautiful Soup and pandas is crucial for efficient web scraping, because it lets you navigate web pages and store the data. Subsequently, connecting this data to platforms such as Tableau opens up a world of possibilities.
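As a rough illustration of that workflow, the sketch below parses an HTML table with Beautiful Soup and loads it into a pandas DataFrame. It is a minimal example under stated assumptions, not the exact code used for this project: the HTML snippet, the table id `stats_standard`, the column names, and the player rows are all made up for illustration, and a real page would first need to be downloaded and inspected to find the right table.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical HTML standing in for a downloaded stats page.
# The table id, column names, and rows are illustrative only.
SAMPLE_HTML = """
<table id="stats_standard">
  <thead><tr><th>Player</th><th>Age</th><th>Min</th></tr></thead>
  <tbody>
    <tr><td>Player A</td><td>23</td><td>2,610</td></tr>
    <tr><td>Player B</td><td>31</td><td>1,480</td></tr>
  </tbody>
</table>
"""


def table_to_frame(html: str, table_id: str) -> pd.DataFrame:
    """Extract one HTML table (found by its id) into a DataFrame of strings."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        raise ValueError(f"No table with id {table_id!r} found")
    # Column names come from the header row.
    header = [th.get_text(strip=True) for th in table.find("thead").find_all("th")]
    # Each body row becomes a list of cell texts.
    rows = [
        [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        for tr in table.find("tbody").find_all("tr")
    ]
    return pd.DataFrame(rows, columns=header)


frame = table_to_frame(SAMPLE_HTML, "stats_standard")
# Scraped values arrive as text, so convert the numeric columns explicitly.
frame["Age"] = pd.to_numeric(frame["Age"])
frame["Min"] = pd.to_numeric(frame["Min"].str.replace(",", "", regex=False))
print(frame)
```

From here, something like `frame.to_csv("mls_minutes_sample.csv", index=False)` (a hypothetical file name) would write a CSV file that Tableau can connect to as a text data source.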
Building Interactive Visuals with Tableau
As Jo has repeatedly stated, “Sports Scientists need to be able to translate data and research”. To translate data for key stakeholders, I chose Tableau because I like its simplicity and because it is so widely used in the business and analytics world! While Tableau helps you make attractive charts, its table calculations, which have over 130 options, can help you manipulate data in numerous ways. With each visual below, I will highlight which table calculations were used.
Each visual corresponds to a real-life football analytics question that can be answered with publicly available MLS data.
Q1: How much time do Designated Players (DPs) typically accumulate on the field, and what impact does their presence have?
The average cost for a DP in MLS is between $4 million and $8 million. For DPs to leave their mark on the field, they usually make up 6-9% of the available team minutes (i.e. based on 10 players on the field and a 34-game regular season). Those 6-9% can have a significant impact on winning a game.
Table Calculations: IF, ELSE, CASE, WHEN, DATEPART, DATEPARSE
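To make the 6-9% figure concrete, the short sketch below works through the arithmetic. It takes the 10 players and 34 regular-season games from the text above; the 90 minutes per game (ignoring stoppage time) is my assumption, and the function names are purely illustrative.

```python
PLAYERS_ON_FIELD = 10          # from the article
REGULAR_SEASON_MATCHES = 34    # from the article
MINUTES_PER_MATCH = 90         # assumption: regulation time only, no stoppage time


def available_team_minutes(
    players: int = PLAYERS_ON_FIELD,
    matches: int = REGULAR_SEASON_MATCHES,
    minutes_per_match: int = MINUTES_PER_MATCH,
) -> int:
    """Total minutes a team can distribute across its players in a season."""
    return players * matches * minutes_per_match


def minutes_for_share(share: float, team_minutes: int) -> int:
    """Minutes that correspond to a given fraction of the team total."""
    return round(share * team_minutes)


def share_of_team_minutes(player_minutes: float, team_minutes: int) -> float:
    """A player's minutes as a percentage of the team total."""
    return 100 * player_minutes / team_minutes


team_minutes = available_team_minutes()
low_minutes = minutes_for_share(0.06, team_minutes)
high_minutes = minutes_for_share(0.09, team_minutes)
print(f"Available team minutes: {team_minutes}")
print(f"6-9% of team minutes: {low_minutes} to {high_minutes} minutes")
```

Under these assumptions the team has 30,600 minutes to distribute, so a 6-9% share corresponds to roughly 1,836 to 2,754 minutes.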
Q2: Is the MLS really the 'retirement league'?
MLS teams now feature a mix of veteran leadership and youthful energy, challenging the perception of MLS as a retirement league. The visual below shows the distribution of playing time across age groups, revealing how teams use talent according to age and which teams lean on seasoned veterans versus emerging youngsters.
Table Calculations: TOOLTIP, PERCENT, GROUP, and BIN
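The same grouping and percentage logic can be prototyped in pandas before building the chart. The sketch below is only an illustration: the teams, ages, and minutes are invented, and the age-group edges and labels are hypothetical choices rather than the bins used in the visual.

```python
import pandas as pd

# Invented sample data; real values would come from the scraped MLS tables.
players = pd.DataFrame({
    "team": ["Team X"] * 4 + ["Team Y"] * 4,
    "age": [19, 24, 29, 34, 21, 22, 27, 31],
    "minutes": [900, 2100, 2600, 1400, 1800, 2400, 2000, 800],
})

# Hypothetical age-group edges; pd.cut treats each bin as (left, right].
AGE_EDGES = [0, 21, 25, 29, 33, 60]
AGE_LABELS = ["21 and under", "22-25", "26-29", "30-33", "34+"]

players["age_group"] = pd.cut(players["age"], bins=AGE_EDGES, labels=AGE_LABELS)

# Sum minutes per team and age group, keeping empty groups as zero.
minutes_by_group = (
    players.groupby(["team", "age_group"], observed=False)["minutes"]
    .sum()
    .unstack(fill_value=0)
)

# Convert each team's row to a percentage of that team's total minutes.
share = minutes_by_group.div(minutes_by_group.sum(axis=1), axis=0) * 100
print(share.round(1))
```

Each row of `share` sums to 100, so teams can be compared directly regardless of how many total minutes appear in the data.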
Q3: Which team rotates the squad vs which teams use consistent players?
One of the pivotal questions that emerged was the role of player durability and squad rotation. Players were categorized based on their accumulated minutes as follows:
“elite” (>2500 min),
“sub-elite” (2000-2500 min),
“good” (1500-2000 min),
“decent” (1000-1500 min),
“off-bench” (<1000 min)
With this, I aimed to highlight the importance of player reliability and the strategic implications for teams. Dividing minutes into these bands makes the analysis more granular and shows how teams have utilized their squads.
Table Calculations: RANK, WHEN, IF, GROUP, ELSEIF, COUNT
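The minutes bands above map naturally onto IF/ELSEIF logic, which can be sketched in Python as follows. The band names and thresholds come from the list above, but how the shared boundaries are handled is my assumption (a value exactly on a threshold goes to the higher band, except 2500, which stays "sub-elite" because "elite" is strictly above 2500). The squad data is invented.

```python
from collections import Counter


def minutes_category(minutes: float) -> str:
    """Map season minutes to a durability band.

    Assumption: boundary values fall into the higher band, except that
    exactly 2500 is "sub-elite" because "elite" requires more than 2500.
    """
    if minutes > 2500:
        return "elite"
    elif minutes >= 2000:
        return "sub-elite"
    elif minutes >= 1500:
        return "good"
    elif minutes >= 1000:
        return "decent"
    return "off-bench"


def squad_profile(player_minutes: dict[str, float]) -> Counter:
    """Count how many players in a squad fall into each band."""
    return Counter(minutes_category(m) for m in player_minutes.values())


# Invented squad for illustration.
example_squad = {
    "Player A": 2890,
    "Player B": 2500,
    "Player C": 1740,
    "Player D": 1000,
    "Player E": 310,
    "Player F": 450,
}
profile = squad_profile(example_squad)
print(profile)
```

Comparing these counts across teams gives a quick read on rotation: a team with many "elite" and "sub-elite" players relies on a consistent core, while a team with most players in the lower bands is rotating heavily.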
Q4: Can you build a dashboard to track players' minutes and their performance over time?
This visual allows you to compare five players and their progression over time. For instance, you can analyze rising stars like Cade Cowell, Ricardo Pepi, and Sam Vines to gain insights into their development and performance trajectories.
By tracking player minutes and performance, teams can make more informed decisions regarding squad management, talent development, and transfer strategies. Start by selecting a team; the players on the left are then filtered based on your selection. Next, click on the circle to add the selected player to one of slots 1-5. After making your selections, click back on the team to update the visual.
Table Calculations: FILTER, GROUP, IF, ACTION, PARAMETER, SET
My journey in upskilling in Data Science and Analytics has been guided by mentors, online courses, and unique insights from coaches. Throughout this experience, football analytics has emerged as a beacon of light, showcasing the profound power of data in unraveling the complexities of the beautiful game.
I have aimed to forge connections across the four major departments: Technical, Performance, Medical, and Scouting. Each department plays a crucial role in the success of a team, and through analytics, I've sought to bridge these areas, leveraging data to enhance decision-making and performance. My primary aim has always been to absorb knowledge, bridge the gap, and ultimately, create something impactful from scratch. By embracing the game's complexities and fostering collaboration across departments, I have learned to unlock new dimensions of understanding and elevate my skills to new heights.
My Tableau Online Profile: MLS_story, MLS_minutes
Remember to check out the 15% discount for Global Performance Insights readers to use on Sport Horizon's Tableau for Sport Scientists and PowerBI for Sport Scientists online courses. Learn more in our recent post.