Toss2Win Analysis — A Newbie’s crack at Cricket Analytics
Always Remember, Beginning is the hardest part — Foundr (Motivation)
First things first, let’s define a few keywords:
Cricket: A game played between two teams of 11 players, with a bat and a ball, over a set number of days or overs.
What’s an Over?: An over consists of 6 legal deliveries.
You mentioned days, what? Well, Test cricket takes place over 5 days of enthralling RED ball cricket, with about 90 overs bowled each day. Each team gets to bat twice, and each turn at batting is called an innings.
Well, is the short form of the game called non-Test cricket? Haha... No. Cricket is a test of sportsmanship, skill, fitness, camaraderie, belief, and perseverance, and cricket's formats are as follows:
White Ball Formats:
50 overs a side: One Day International (ODI)
20 overs a side: Twenty20 (T20), for franchise and international cricket
T10: franchise-based, highlights-style cricket tournaments
Pink/Red Ball: 5 days, TEST CRICKET.
So, what's a toss? : The toss is the first action performed by the captains of each team, where they flip a coin to determine which captain will have the right to choose whether their team will bat or bowl/field in the current match.
Analytics: The meticulous & detailed computational analysis of sports data.
Cricket Analytics: Analysing cricket data to extract information from past matches and to predict future outcomes, using predictive and machine learning algorithms.
Tools that I have used for the below analysis: Anaconda, Jupyter Notebook, Python, Pandas, urllib, Beautiful Soup, and Excel.
Data Source: The data has been pulled from ESPNcricinfo's STATISTICS/STATSGURU/ONE-DAY INTERNATIONALS/TEAM RECORDS database.
Also, let's establish a few things before I dive deep into the intricate details of this mini-project I undertook! This project is part of the assignment that was given to our batch by Vaibhav Pipara as part of the MAD ABOUT SPORTS initiative, Introduction to Cricket Analytics using Python.
- ) What's my Mission? : Create readable and accurate statistics from the ODI data from 2015 to 2021, to find out whether winning the toss wins the team the game.
- ) Which teams am I going to analyze? : All the teams that played ODIs from January 2015 to January 2021 (22 teams), with a special focus on the top 8 teams and their victories when they win the toss.
- ) How am I going to do it? :
Well, it first starts with web scraping the available data down to what I need: the Toss and the Result, along with the 2 playing teams (team and opposition), for sure. (Note: I am not going to detail the web scraping, as it would deviate from the purpose of the article.)
Once the above is sorted out, I need to make sure there are no redundancies by using the Date/Ground details, while cross-verifying a few selected matches on Google for data accuracy. This is just to check that the scraping has been accurate (confident pros can surely skip it).
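The redundancy check above can be sketched in Pandas roughly as follows. This is only a minimal sketch: the rows are invented placeholders, and the column names (Team, Opposition, Ground, Start Date) are my assumptions rather than the exact headers of the scraped sheet.

```python
import pandas as pd

# Invented placeholder rows; the column names are assumptions.
matches = pd.DataFrame(
    {
        "Team": ["Team A", "Team A", "Team B", "Team B"],
        "Opposition": ["Team B", "Team B", "Team C", "Team C"],
        "Ground": ["Ground X", "Ground X", "Ground Y", "Ground Z"],
        "Start Date": ["2015-09-05", "2015-09-05", "2016-01-10", "2016-01-13"],
        "Toss": ["won", "won", "lost", "won"],
        "Result": ["won", "won", "lost", "won"],
    }
)

# Parse the dates so that identical days compare equal regardless of formatting.
matches["Start Date"] = pd.to_datetime(matches["Start Date"])

# A row is treated as redundant when team, opposition, ground and date all repeat.
key = ["Team", "Opposition", "Ground", "Start Date"]
deduped = matches.drop_duplicates(subset=key).reset_index(drop=True)

print(len(matches), "rows before,", len(deduped), "rows after")
```

Rows that repeat the same team, opposition, ground and date are dropped, while two matches between the same sides at different grounds or on different days are kept.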
Once the data is validated, I am going to segregate the information into different worksheets, as labeled below:
master_data: The total data extracted off the website.
home: Each team's performance at home venues.
away: Each team's performance at touring venues.
visualization: The graphical representation of all the data analyzed.
conclusion: A write-up that draws out a ton of talking points about each team, along with the surprising and shocking finds from this project.
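For anyone who prefers to do the home/away split in Pandas rather than by hand in Excel, here is a rough sketch. It assumes a hypothetical Host column naming the team that hosted the match; that column, and the placeholder rows, are my inventions for illustration.

```python
import pandas as pd

# Invented placeholder rows; "Host" is a hypothetical column.
master_data = pd.DataFrame(
    {
        "Team": ["Team A", "Team A", "Team B", "Team B"],
        "Opposition": ["Team B", "Team C", "Team A", "Team C"],
        "Host": ["Team A", "Team C", "Team A", "Team B"],
        "Toss": ["won", "lost", "lost", "won"],
        "Result": ["won", "lost", "lost", "won"],
    }
)

# A match is a home match when the team itself is the host.
is_home = master_data["Team"] == master_data["Host"]

# One DataFrame per worksheet name used in this article.
sheets = {
    "master_data": master_data,
    "home": master_data[is_home].reset_index(drop=True),
    "away": master_data[~is_home].reset_index(drop=True),
}

for name, frame in sheets.items():
    print(name, len(frame))
```

Note that this simple rule files neutral-venue matches under away, so a team with an adopted home (Pakistan in the UAE, for instance) would need its own mapping. Each frame could then be written to its own worksheet with pandas' ExcelWriter and to_excel, which need an Excel engine such as openpyxl installed.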
Explanation of columns:
Planned: The original count of fixtures scheduled by the cricket boards.
Played: The total number of fixtures carried out, i.e. those not abandoned or cancelled due to weather, the pandemic, or any other reason.
Abandoned/Cancelled/No Result: Matches called off, either without a ball being bowled or after play began, due to weather, bad light, the pandemic, etc.
WON: Matches won by the home team.
LOST: Matches lost by the home team.
TIED: An amazing match where both teams played so well that the scores finished level and there was no winner at the end of the day.
Toss Count: The number of tosses won or lost.
Win %: (Total Number of Matches Won / Total Matches Played) * 100
Match Win % + TOSS Win: (Total Number of Matches Won While Winning the Toss / Matches Won) * 100
Match Win % + TOSS Lost: (Total Number of Matches Won While Losing the Toss / Matches Won) * 100
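The three formulas above can be written as a small Python helper. This is a sketch, not the exact spreadsheet logic: the function names are mine, and in the example only the played and won counts (64 and 44) come from England's home record in this article, while the toss split of 22 is an invented number used purely for illustration.

```python
def pct(part, whole):
    """Return part / whole as a percentage, or None when whole is zero."""
    return round(100 * part / whole, 2) if whole else None


def toss_metrics(played, won, won_after_toss_win):
    """Apply the three percentage formulas from the column list above."""
    # Every won match had its toss either won or lost.
    won_after_toss_loss = won - won_after_toss_win
    return {
        "win_pct": pct(won, played),
        "win_pct_toss_won": pct(won_after_toss_win, won),
        "win_pct_toss_lost": pct(won_after_toss_loss, won),
    }


# played and won are from the article; won_after_toss_win=22 is INVENTED.
example = toss_metrics(played=64, won=44, won_after_toss_win=22)
print(example)
```

Because both toss percentages share the same denominator (Matches Won), they always add up to 100 under these definitions.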
The average number of matches played comes to 24.14, so we are going to focus mainly on the teams that have more than 24 or 25 matches recorded.
Well, then, what are my results? (Data Points)
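Picking the teams that sit above the average can be sketched like this in Pandas. The team names and match counts below are invented placeholders, so the average here (28) will not match the 24.14 from the real data.

```python
import pandas as pd

# Invented placeholder counts, not the real Played column.
played = pd.DataFrame(
    {
        "Team": ["Team A", "Team B", "Team C", "Team D", "Team E"],
        "Played": [60, 40, 25, 10, 5],
    }
)

# Average number of matches played across all teams.
average_played = played["Played"].mean()

# Keep only the teams with more matches than the average.
focus = played[played["Played"] > average_played]

print(round(average_played, 2))
print(list(focus["Team"]))
```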
For the top 8 teams on the ICC ODI rankings:
England's record is straightforward: 64 matches on HOME soil, 44 of them won, for a whopping win rate of 68.75% in ODIs over 6 years. The most bizarre stat is that the ENGLAND men's cricket team is victorious almost equally whether they win or lose the toss, with a difference of less than 6% in their win record.
India has played 40 matches, winning the toss only 19 times. Yeah, some poor coin tossing by KING Kohli. However, at home they are winning at a constant rate, and yet have managed only 63% victories when they have won the toss.
Thanks to a few wonderful and consistent performances by the bowlers and the middle order over a few years, the NEW Zealand cricket team has recorded a whopping 75% victory rate at home, with 77% of matches won when they win the toss. Their show is so consistent that, win or lose the toss, the team has a higher victory-to-loss ratio on home soil than any other.
Australia, the team from the land down under, has recorded 70% victories in the said period, with 71% victories when they win the toss.
The remaining teams have an impeccable record as well, as we see from the table posted above. The most interesting stat, though, is that in 6 years, teams touring Bangladesh have not managed to get the better of them when Bangladesh wins the toss, followed by Pakistan (at the UAE).
Sri Lanka has the absolute worst record among the top 8, with only 42% victories at home, and 44% when they have won the toss.
The Away Records are as follows:
India has the best touring record compared to the other teams, as we can see, and, surprisingly, New Zealand has the worst.
South Africa, when touring and handed the decision instead of choosing, have performed amazingly, winning almost 68% of all matches played while losing the toss.
Bangladesh is the most tormented team, with 18% of victories coming when losing the toss and 21% victories overall when touring.
Here's a graph of the teams' performance for better visualization.
*up to Jan 26th, 2021
When it comes to the 50-over white-ball game, it is hard to judge a team's performance by the numbers alone. The above covers only the top 8 teams, and the numbers game is tricky because it leaves out a lot of non-quantifiable factors, so it must be left at this.
India, when playing AWAY, has been absolutely dominant, with an aggregate of 60% victories, and has negated the toss factor. While other teams have done well, South Africa is the only other team that comes close, second-best with up to 50% of matches ending in victory.
At home, apart from Sri Lanka, every other country in the TOP 8 has put on a great display of talent, recording over 60% victories and negating the toss factor in the matches they have played.
Since this is a game of numbers only, I would love to display the win % of teams at home and away with respect to the TOSS.
So, what do you think of the article? Please do let me know in the comments below. Kindly tell me what I could do better and how these analyses could be performed at an amateur level, so that I can grow from being a rookie!
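One way to get these home and away toss numbers straight out of Pandas, instead of counting in Excel, is a groupby. The sketch below follows the Match Win % + TOSS Win definition (wins with the toss won, divided by all wins); the rows and the Venue column are invented for illustration.

```python
import pandas as pd

# Invented placeholder rows; "Venue" is an assumed column with home/away labels.
results = pd.DataFrame(
    {
        "Team": ["Team A"] * 5,
        "Venue": ["home", "home", "home", "away", "away"],
        "Toss": ["won", "lost", "won", "lost", "won"],
        "Result": ["won", "won", "lost", "won", "lost"],
    }
)

# Keep only the victories, since Matches Won is the denominator.
wins = results[results["Result"] == "won"]

# Share of each team's wins, per venue, that came after winning the toss.
toss_share = (
    wins.assign(toss_won=wins["Toss"].eq("won"))
    .groupby(["Team", "Venue"])["toss_won"]
    .mean()
    .mul(100)
)

print(toss_share)
```

The matching TOSS Lost figure for each team and venue is simply 100 minus this value.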