I am always curious about what’s new in the analytics world and how it will impact business outcomes. We constantly hear about the latest big data and analytics product launches, but only a few really have the potential to change how things work on a larger scale.
Recently, IBM (my employer) launched a new analytics tool based on three powerful technologies: cloud, big data and cognitive analytics. For those who don’t know IBM Watson, here is a quick primer:
IBM developed Watson, a supercomputer capable of understanding and answering questions posed in natural language. Watson gained popularity in 2011 when it took first place on Jeopardy, beating former champions.
Watson wins Jeopardy
Banking on its Jeopardy success, IBM went on to develop the IBM Watson business unit to solve larger real-world problems. The challenge with large technology initiatives such as Watson is the cost associated with them. Instead of being a technology for all, it becomes limited to a few deep-pocketed organizations. Another part of the problem is that the average analyst or business user does not have time to learn a new programming language or database skills to uncover insights from large volumes of data. Plus, getting an entire organization to agree on implementing a major software or hardware solution is a mammoth task that can take months, if not years.
IBM knew this, and Watson Analytics (WA) was developed to overcome these challenges. Watson Analytics combines the power of cloud, big data and cognitive analytics in one tool. The best part about WA is that the limited version is completely free at WatsonAnalytics.com. You can run any type of analysis using WA with a few mouse clicks. Here are some examples to help you get started:
Predict how revenue depends on marketing campaigns. (Marketing)
Accurately identify and focus on deals that are most likely to close. (Sales)
How to resolve IT issues that impact critical resources. (IT)
How to create top performers and attract the right talent. (HR)
WA usage is limitless, and it depends on what business problem you are trying to solve.
Now, instead of turning this blog post into an advertisement for my employer’s new product, I would like to put the tool to the test. What better way to test the analytical capabilities of Watson Analytics than to predict the outcome of my favorite game, cricket? Cricket, as some of you may already know, is a highly popular game played by approximately 30 countries worldwide. For the purpose of this post, I will stick to the most popular format of the game, the One Day International (ODI).
Before I jump into the analysis, I want to make sure we set up the right objective for this project. My main focus is to identify the key factors that increase the odds of winning an ODI. Simply put, I want to know what it takes for a team to win a game of cricket. Simple and straightforward!
Hundreds of factors can influence winning or losing a game of cricket, but only a handful have a major impact, so I will focus on:
Teams with players who took 5 wickets
Teams with players who hit 100 runs (century)
Who won the toss
Whether the game was played at home or away
Whether the team chose to bat or bowl first
After setting the right objective and selecting the criteria that can impact the key objective, we are now going to gather data.
Data collection: There are several good cricket data resources online, and one of the better ones is ESPN Cricinfo.com. I hired a cricket enthusiast on Odesk.com to do the data entry and compile a list of all ODI matches from 1984 to March 2014 in csv and xls formats.
30 Years of One Day International Cricket data
Data Preparation: I created a free account at WatsonAnalytics.com. The Odesk hire did a pretty good job of providing clean data, so I spent most of my time making sure the data was accurate and formatted correctly. The next step was to upload the data to WA and test the data quality. To upload data to WA, click the “Add” icon and then click “Drop your file here or tap to browse”. Select the file you would like to upload, and WA will upload it and display it on your WA dashboard.
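If you want to sanity-check a spreadsheet before uploading it, a small script can flag rows with blank fields. The sketch below is only an illustration: the column names are hypothetical (the headers of my actual file are not listed in this post), and it is not how WA computes its own data quality score.

```python
import csv
import io

# Hypothetical column names, assumed for illustration only.
REQUIRED_COLUMNS = [
    "match_winner", "toss_winner", "venue",
    "batted_first", "five_wicket_taker", "century_maker",
]

def find_incomplete_rows(csv_text):
    """Return the 1-based data-row numbers with a missing or blank required value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    bad_rows = []
    for row_number, row in enumerate(reader, start=1):
        # DictReader fills short rows with None, so treat None as blank too.
        if any(not (row[c] or "").strip() for c in REQUIRED_COLUMNS):
            bad_rows.append(row_number)
    return bad_rows

# Toy sample invented for this example, not real match data.
sample = (
    "match_winner,toss_winner,venue,batted_first,five_wicket_taker,century_maker\n"
    "India,India,home,India,yes,no\n"
    "Australia,,away,Australia,no,yes\n"
)
incomplete = find_incomplete_rows(sample)
print(incomplete)
```

For a real file, you would read its contents with `open(path, newline="")` and pass the text to the same function.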
Upload data to Watson Analytics
Watson Analytics Visual Data List
WA has a unique feature that measures and reports data quality in a neat chart. A score of 63 is not bad for 30 years’ worth of sports data.
Watson Analytics Data Quality Score chart
Once the data is uploaded and has a decent quality score, you are ready to run your analysis.
Predictive Analytics: You can perform three types of tasks on the uploaded data, i.e. exploration, prediction and view. For the purpose of this post, we will focus on the prediction. Click on your data set and then click on “Prediction.” This will open up the “Create a Prediction” screen. On this screen, you can complete the following tasks:
1. Name your workbook (I named mine ODI-12)
2. Select target(s) to predict – This is where you will select your key objectives. My objective is to predict what it takes to win the ODI cricket game, so I selected “Match Winner” as the target. You can have multiple objectives as well.
3. Click the “Create Prediction” button.
WA will crunch numbers in the background based on your criteria and will create a dashboard.
Let’s review the WA dashboard and analyze the results of this prediction.
Watson Analytics Prediction dashboard
The WA dashboard is pretty simple and straightforward, as the tool is designed with the business user in mind. The drag-and-drop UI allows users to analyze the results from a desktop or a tablet. There are five sections on this dashboard, and each can be used to customize the results.
The most important one is the “influencing factors” section, which can be used to change the prediction based on one or more factors.
Interpreting results: To understand the prediction results, we can either drill down into the charts under the section “What influences Match winners?” or simply hover the mouse over the center circle. The circle resembles our solar system: the objective, “Match winner”, is the Sun at the center, while the influencing factors are the planets. WA orders results by predictive strength so the end user can focus on the most accurate predictions. The planets closer to the Sun have higher predictive strength than the planets farther away from it. When you hover over the planets close to the Sun (Match winner), you will notice that the factor “5 wicket taker” has 86.7% predictive strength. This means that teams with players who take 5 or more wickets have the highest chances of winning an ODI cricket game.
The second strongest match-winning factor is the “century maker”, i.e. teams with players who score 100 runs have a higher chance of winning an ODI.
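To get an intuition for what a single influencing factor tells us, you can compute a plain win rate for each value of that factor. This is only a rough sketch: WA does not expose how it calculates predictive strength in this post, so the function below is my own simple stand-in, and the rows and field names are toy values invented for illustration, not taken from the ODI dataset.

```python
from collections import defaultdict

def win_rate_by_factor(rows, factor):
    """Map each value of `factor` to the share of rows in which the team won."""
    wins = defaultdict(int)
    totals = defaultdict(int)
    for row in rows:
        totals[row[factor]] += 1
        if row["won"]:
            wins[row[factor]] += 1
    return {value: wins[value] / totals[value] for value in totals}

# Toy rows invented for illustration, NOT real match results.
toy_rows = [
    {"five_wicket_taker": True, "won": True},
    {"five_wicket_taker": True, "won": True},
    {"five_wicket_taker": True, "won": False},
    {"five_wicket_taker": False, "won": False},
    {"five_wicket_taker": False, "won": True},
]
rates = win_rate_by_factor(toy_rows, "five_wicket_taker")
print(rates)
```

A factor is interesting when the win rate differs sharply between its values; in the toy rows, teams with a 5 wicket taker win two out of three games versus one out of two without.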
WA Prediction Chart
We can also drill down into each of the match-winning factors, and I would like to discuss a couple of these here. First, we will look at the detailed prediction charts when we select “one field” as the prediction influencing factor.
Prediction with one influencing factor
There are three prediction charts, with the strongest factor, “5 wicket taker”, displayed on the left and the weakest factor, “who won the toss”, on the right. Let’s click on “5 wicket taker” to review the results further. The main insight is a bar chart showing each of the 5-wicket-taking players along with the teams that won the match. You can hover over the bar chart to get more details about each winning team and its match-winning players.
Match winning influenced by 5 wicket taker
To improve the predictive strength, it is highly recommended to select “combination field”, as this option considers a broader data set before providing results. In our case, selecting “combination field” increases the number of prediction results and the prediction accuracy. We see a 5.7% improvement in predictive strength: a combination of “5 wicket taker” and “who won the toss” drives match winners with 92% predictive strength.
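The idea behind combining factors can be illustrated by grouping matches on two fields at once and comparing win rates per group. Again, this is a hedged sketch rather than WA’s actual method: the function, the field names and the rows are all assumptions made up for this example.

```python
from collections import Counter

def win_rate_by_combination(rows, factors):
    """Map each combination of factor values to the share of rows that were wins."""
    totals = Counter()
    wins = Counter()
    for row in rows:
        key = tuple(row[f] for f in factors)
        totals[key] += 1
        wins[key] += int(row["won"])
    return {key: wins[key] / totals[key] for key in totals}

# Toy rows invented for illustration, NOT real match results.
toy_rows = [
    {"five_wicket_taker": True, "won_toss": True, "won": True},
    {"five_wicket_taker": True, "won_toss": True, "won": True},
    {"five_wicket_taker": True, "won_toss": False, "won": True},
    {"five_wicket_taker": True, "won_toss": False, "won": False},
    {"five_wicket_taker": False, "won_toss": True, "won": False},
    {"five_wicket_taker": False, "won_toss": False, "won": False},
]
combo_rates = win_rate_by_combination(toy_rows, ["five_wicket_taker", "won_toss"])
for key, rate in sorted(combo_rates.items(), reverse=True):
    print(key, rate)
```

Each key is a pair of values in the order the factors were passed in, so `(True, True)` is the group with both a 5 wicket taker and a toss win. Splitting on two fields can separate winners from losers more cleanly than either field alone, but each group also holds fewer matches, so the rates get noisier.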
Prediction with multiple influencing factors
We can now say with 92% predictive strength that the team that wins the toss and has a player who takes 5 wickets will have a better chance of winning an ODI cricket game. I will continue to add more data points to improve the accuracy of this prediction and run a similar analysis on the T20 format of the game.
Now it’s your turn. Here are some next steps:
1. Provide your comments and feedback below and suggest additional fun analytics projects we can run using WA.
2. Share the blog post on your favorite social network!
3. Sign up at WatsonAnalytics.com and run your own analysis.