12 – 6 Steps to Big Data Success for Digital Marketers [Video+PPT+Transcript]


Recently, we gave a keynote presentation at the DigiMarCon West conference. Our goal was to educate traditional digital marketers on how to use data science and big data analytics to amplify digital results. We have recorded the show for our listeners in both audio and video format. We have also added the PowerPoint presentation for you to review with the recording. Enjoy!

A Quick Preview of the Big Data for Digital Marketing show (show notes):

  • Sameer’s career and introduction to analytics
  • Data science vs big data
  • What does N=All mean and how is it tied to Big Data
  • How we improved our churn prediction analysis accuracy
  • Sameer’s 3 point criteria for selecting the right analytics tool
  • Introduction to IBM Watson Analytics, R-Studio and BigML
  • What does Gartner’s recent study tell us about unused data
  • What is the “IBM Consumer Conversation Study” about
  • How consumer behavior is having an impact on digital marketing
  • Why should digital marketers care about data science
  • How to prepare your data for analysis using the right tools
  • How to perform analysis using WatsonAnalytics.com
  • How to predict the virality of your content marketing efforts
  • 6 steps any digital marketer can take using free tools
    Video recording of my keynote



    Listen to the 6 Steps to Big Data Success for Digital Marketers show


      Slideshare powerpoint of the presentation


      Resources discussed in this podcast:

      Read Transcription

      When I started, I started as a campaign manager. I was a digital marketing campaign manager and part of my job was to do basic analysis. I used to look at web analytics, social media data, and do some basic digital marketing analysis that we are all used to. Then one of my friends, who happened to be a Hadoop developer, introduced me to the practice of data science, and I’ve not looked back since then. In this session, what I want is that by the end, all of us are capable enough to run basic data science analytics on our digital marketing data. Let’s get started. If I crossed the chasm, I’m pretty sure you can. When I talk to a lot of people who are not related to the big data or data science industry, they have a different perspective. They think that data science is like the tech wizardry of Iron Man, and big data analytics is like the power of the gods.

      They think they’re competing with each other, but that’s not the case. Data science is a practice. It’s a methodology that you can apply to data of any type, big or small. Big data analytics, by contrast, is specifically geared and targeted towards large volumes of data. What does that mean? When you talk to somebody who is an insider from the big data analytics industry, they’re going to tell you big data is N = All, which means take all the data that is possible. Here’s an example. We were doing a customer churn reduction analysis. Basically we were trying to identify how many customers we are losing, and whether we can predict that. We took the 12 months of interaction history for the customers, and we got up to a 62% prediction accuracy. Which is good, and we were able to predict, with fairly good accuracy, which customers we were going to lose.

      Then we started using all the customer data. We put that data in Hadoop, and we ran the analysis inside of Hadoop. Our accuracy improved to 73%, which in the data science world is a pretty massive jump in terms of prediction. Imagine going from 62% to 73%. That’s the power and promise of big data analytics, and that’s the reason why it’s important to consider it for digital marketing initiatives, especially if we have so much data. Just as the arc reactor powers the Iron Man suit, we digital marketers, we data analysts and we data scientists, we all have access to tools. There’s no shortage of tools in the market. There are so many different types of tools that it’s hard to know what to do with them. What I did for this particular session is I had a 3 point criteria. First, the tool has to be completely free so everybody can access it, and download it, and create an account, or it at least has to have a free tier.

      Second, the tool has to have widespread users so you can go in the forum and participate in the community and have a discussion about it to get your questions answered. Third, it has to be easy to use. Keeping those 3 criteria in mind, I have 3 different tools that I’ve listed here. Specifically, I’ve called out 2 different tools, plus one that I’m going to talk about in more detail.


      IBM Watson Analytics

      First one is Watson Analytics, which is IBM’s analytical platform. Raise your hand if you have heard of a popular TV show, Jeopardy. Great. A few years ago, we created a supercomputer that was successfully able to beat the all-time Jeopardy champions, 2 champions at one time. That computer was called Watson. Now, after the game was over, we were like, “Okay, what are we going to do with this?” We started creating the Watson business unit, and we applied the algorithms inside of that Watson computer to different types of products. One of them was IBM Watson Analytics, which is a predictive analytics platform. Again, it’s completely free to use. You can create an account, you don’t have to be a programmer to use it, and it has Watson’s textual analytics capabilities as well. Again, Watsonanalytics.com. We’re going to talk about how to use this later.

      Then we have the RStudio platform. The R programming language and Python are the 2 most popular programming languages for data scientists. There’s a learning curve to it, so there are courses on edX. One of them is called Analytics Edge, and there’s one on Coursera. All of them are free, so you can get started with R very, very quickly. It’s not like writing application code, and you can download RStudio on your computer, or run it in the cloud.

      Then the final tool is BigML. Again, it’s a very, very popular platform, and the beauty of BigML is, again, there’s a free tier, you can create a free account, and it’s point and click. Just like you use your Marketo, or Silverpop, or social media application, you can point, and click, and run predictive analytics inside of BigML.

      Why is it important to combine digital marketing with data science? What is it all about? Typically we marketers feel very constrained to the tools that we have. We have a web analytics tool, we do social media analysis, maybe we’re going to do some Twitter analysis, maybe CRM analysis, but it’s very limited. Now it’s the time to open up the horizon. When you open up the horizon, you get an opportunity to run a bunch of different types of analysis, and at the same time you get many different results that you can use to improve your digital marketing performance. Let’s look at some of the key reasons why it’s important to combine digital marketing with data science. First reason: in India there is a snack mix called “Chivda”. It’s a combination of lentils, peanuts, nuts, and cereals. We combine those, and there are thousands of different combinations available in the market for the same type of snack, and everything is delicious.

      The reason is that each unique combination provides a new variety of snack. It’s the same concept for digital marketing: when you combine it with data science, you get new things that you have not even thought about before. One other thing, before I jump to this data, is that there’s a massive explosion of data and marketing technology. From a digital marketing standpoint, you get all the Twitter firehose data, you have all this Facebook data, and you don’t know what to do with it. Gartner says 90% of organizational data goes unused. Imagine that, 90%. With this data coming in, we have another problem: the marketing technology explosion. Every single day, I’m pretty sure, just like me, you get bombarded with phone calls and LinkedIn InMails about a new marketing application: “Hey, why don’t you buy my application and let’s start using it? You’re going to get X, Y, Z benefit out of it.”


      IBM Consumer Conversation Study 2015

      We are experiencing a big explosion of marketing technology, and chiefmartech.com, a blog that follows the whole marketing technology space, has an infographic that shows all the different marketing technology platforms. We marketers are facing a major challenge. We have a data problem and we have a technology problem. IBM research conducted a study a few months ago. It was called Consumer Conversation. The idea of the study was to identify the gaps between what consumers are saying and what brands are saying. There is a huge gap. First, 69% of the brands say, “We are delivering a positive experience to our customers digitally.” When you look at the consumer side, it’s a completely different story. 51% of the customers say, “We stopped doing business with the brand because we had a failure of experience on the digital side.” Again, 81% of the brands say, “We understand our customers holistically,” but only 37% of the customers agree.

      There’s a big disconnect. What’s going on here? Again, going back to the Gartner survey, 90% of the data goes unused. That’s what’s creating the problem. Last but not least, it’s all about timing. In marketing, or in the digital marketing world, if you’re not able to deliver the right message at the right time in the right context, then you end up with a situation like this, where this guy was looking for insurance 9 months ago in a convertible, and now it’s too late. It’s all about timing. What are some of the use cases of big data analytics in digital marketing? The first one is customer segmentation and journey analytics. What does that mean? For example, if I’m running a marketing campaign, what if I could define a very specific targeted customer cluster where I can apply the marketing communication message directly to that cluster? Say I’m targeting soccer moms who want to buy a wearable tech product.

      I can define that particular criteria and I can target them with a specific message or content. The other part is, Forrester says more than half of all the interactions that customers have with a brand are multi-device and multi-channel. What it means is we not only have to understand marketing attribution, but also the marketing journey, because all the journeys are taking place in a multi-device and multi-channel environment. The second one is marketing campaign propensity, or marketing performance improvement: if you target the customer in the right way and you understand the journey of your customer, then you improve the chances of getting better results from your marketing campaign. The third one is my favorite. Raise your hand if you ever thought of creating viral content. I have too. Typically what happens, when you think about viral content, is we marketers get in a room, with social media people and content writers, all of them sitting together.

      We do brainstorming sessions and we come up with a list of content that we think is going to go viral. We definitely have some data from social, maybe some share data, maybe some web analytics, and that’s about it. There is no science to it. Later in this session, I’m going to talk about how to apply science to the art of content marketing prediction. Next, predictive attribution. Generally, again, it’s the same thing. We have attribution inside of a web analytics platform that tells us what combination of channels (social, email, marketing, and so on) resulted in optimal order delivery or sign-up, or whatever your objective is. Predictive attribution takes that historical data, applies prediction, and tells you what’s going to happen in the future, let’s say a month from now or a quarter from now. What will be the maximum potential you can drive from your available budget and available marketing channels? That’s what predictive attribution is about.

      Churn and loyalty: we all want to prevent customer churn and we all want to create loyal customers. Data science and statistical analysis can allow you to do that, so you don’t end up in a situation like this, where we are targeting the same customer with a product they’ve already purchased, to the point that they get so frustrated they don’t want to do business with us. This is part of the experience that I was talking about: the disconnect in experience between brands and customers.

      6 Steps

      Let’s get started. As promised, here are the 6 steps on how to apply data science to predict the content marketing popularity.

      Step#1: Prepare your data. This is how the prediction engine works. In order for you to predict something, you’ve got to have historical data. In this case, if I am trying to create popular or viral content, then I have to have a historical list of the content that’s already popular in my industry. There are multiple tools you can use to do that.

      My favorite is BuzzSumo. Again, completely free to create an account. You can go on BuzzSumo, create an account, and then type whatever keyword you have. Let’s say you are trying to create content for health care: you can type health care and it will list all the content, along with the total number of social media shares as well as a couple of other pieces of information. You can download this information in CSV format. The good news is, BuzzSumo has a completely free 14 day pro account trial that allows you to download 10,000 pieces of content, which will be the groundwork for you to run predictive analytics on content. Again, very simple: go to BuzzSumo, create an account, download the content. Or you can use Google Chrome’s scraper plugin. Again, same concept. It’s completely free. You can add it to Google Chrome and then download all the content that’s available using this tool. Once you do that, your content is going to look something like this.

      What I did here is I just downloaded a bunch of content from BuzzSumo and organized it. We have the category for the content, the headline, the snippet, the abstract, and the word count. The most important one is the popularity column. Basically I marked everything that has more than 1,000 shares as 1, and everything else as 0. That’s your first step, which is data preparation.
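      As a rough illustration of that labeling step, here is a minimal Python sketch using only the standard library. The sample rows and the column names (headline, category, word_count, total_shares) are assumptions made up for the example, not the actual BuzzSumo export layout; only the 1,000-share threshold comes from the talk.

```python
import csv
import io

# Hypothetical stand-in for a downloaded CSV export; the column names
# and values are assumptions for illustration only.
SAMPLE_EXPORT = """headline,category,word_count,total_shares
Headline A,Health Care,950,4200
Headline B,Business,400,310
Headline C,Health Care,1200,1000
"""

SHARE_THRESHOLD = 1000  # from the talk: more than 1,000 shares counts as popular


def label_popularity(rows, threshold=SHARE_THRESHOLD):
    """Return copies of the rows with a 0/1 'popular' column added."""
    labeled = []
    for row in rows:
        shares = int(row["total_shares"])
        labeled.append({**row, "popular": 1 if shares > threshold else 0})
    return labeled


rows = list(csv.DictReader(io.StringIO(SAMPLE_EXPORT)))
labeled_rows = label_popularity(rows)

# Write the labeled data back out as CSV text, ready to save and upload.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(labeled_rows[0].keys()))
writer.writeheader()
writer.writerows(labeled_rows)
print(out.getvalue())
```

      Note that a piece with exactly 1,000 shares is labeled 0 here, because the rule as stated is "more than 1,000". If you would rather treat 1,000 as popular, change the comparison to >=.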

      Step#2: Create account. Very simple: go to Watsonanalytics.com and create an account. It’ll ask you for basic information, like your email, so you can quickly create an account and start using Watson Analytics. The third step is to take the data that you created in step 1, upload it into Watson Analytics, and look at some initial patterns. Let’s talk about that. Again, a very simple step.

      Step#3: Upload data. Log in, click add, and upload the data, the CSV file. Once the data’s uploaded inside of Watson Analytics, you can ask simple-language questions. Remember what I was telling you: Watson is based on the original supercomputer, which was built around simple text language.

      You can ask a question like, “What is the most popular publication date?” Watson will give you all the different options, and you can click on these options to see the most popular questions that have been asked, and then it will give you charts and graphs, so you don’t have to create charts, you don’t have to develop anything. You just ask questions, click, and then it will give you your answer. This is just the initial analysis, we’re just exploring the data now. We have not even begun the prediction, but Watson already knows what’s inside of your data.
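      To make that kind of exploratory question concrete, here is a hand-rolled Python sketch of "which publication day has the most popular content?". It is not how Watson Analytics answers the question internally (the talk does not describe that); it is just a simple count over hypothetical rows, using the 0/1 popularity flag from step 1.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical rows: (publication date, popular flag from step 1).
articles = [
    (date(2015, 3, 2), 1),   # Monday
    (date(2015, 3, 3), 0),   # Tuesday
    (date(2015, 3, 9), 1),   # Monday
    (date(2015, 3, 11), 1),  # Wednesday
    (date(2015, 3, 16), 0),  # Monday
]

# Count only the popular pieces, grouped by day of the week.
popular_by_weekday = Counter(
    WEEKDAYS[published.weekday()]
    for published, popular in articles
    if popular == 1
)

top_day, top_count = popular_by_weekday.most_common(1)[0]
print(f"Most popular publication day: {top_day} ({top_count} popular pieces)")
```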

      Step#4: Select prediction criteria. Now remember, we are trying to predict which content is popular. Obviously the popularity column is the prediction criteria. Once you select the popularity column and click next, you’re going to land on the prediction dashboard. Now there are a lot of things happening here. One of the key things, which is in the center and which I call the solar system, tells you the prediction accuracy.

      In this case it’s telling you, with 90% accuracy, to focus on the word count of the blog post, the category it belongs to, and the news desk. In this particular case I’m using data from the New York Times, which is why it has news desk as a content category. So, with 90% accuracy, if you focus on these 3 things, you’re going to get highly popular content.

      Step#5: Analysis. Then you can start drilling down into these charts. What happens is you get different options, so you can look at different combinations. There’s section name and word count, which has 90% accuracy. You have news desk and word count, which has 89% accuracy, so you go with the most accurate one, which is section name and word count, at 90% accuracy. You’re going to have highly popular content if you follow those criteria. Then you can start drilling down into the actual data, and this is the table you get from Watson Analytics. Here in my example it’s telling me that if I were to write a popular blog post, my word count has to be more than 829 words. It has to be in the health care or crossword games category.

      Simple as that, and if we further drill down you can get all data from it. It’s a very scientific approach to writing content instead of just going there and trying to figure out, “What should we write about?” Then finally, you can look at flow charts. The flow charts will tell you what is the sequence of the popularity. If you start with section name, you go to word count, news desk, and so on, so forth. The whole theme here is initially we were just doing brainstorming, and trying to figure out what content works. Here, you’re actually applying science, statistical analysis that’s telling you what really matters for you to write that content.
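      One way to sanity-check a rule like "more than 829 words, in the health care or crossword games category" is to score it against your own labeled data. The Python sketch below does that on a handful of made-up rows. The threshold and the two categories are the ones quoted in the talk; the rows, and therefore the resulting accuracy, are purely hypothetical and are not the 90% figure from the dashboard.

```python
MIN_WORDS = 829  # word-count threshold quoted in the talk
TARGET_CATEGORIES = {"health care", "crossword games"}  # categories quoted in the talk

# Hypothetical labeled rows: (category, word_count, popular flag from step 1).
rows = [
    ("health care", 950, 1),
    ("health care", 400, 0),
    ("crossword games", 1100, 1),
    ("business", 1500, 0),
    ("business", 300, 0),
    ("health care", 900, 0),
]


def predict_popular(category, word_count):
    """Apply the rule: right category AND more than MIN_WORDS words."""
    return int(category in TARGET_CATEGORIES and word_count > MIN_WORDS)


# Accuracy = share of rows where the rule agrees with the actual label.
correct = sum(predict_popular(c, w) == p for c, w, p in rows)
accuracy = correct / len(rows)
print(f"{correct}/{len(rows)} correct, accuracy = {accuracy:.0%}")
```

      Running the same check on content the rule has never seen (for example, a later download) gives a more honest read than scoring it on the data it was derived from.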

      Step#6: Take action. That’s one of the biggest challenges we have in digital marketing. We have so much data, we have so many marketing tools. What’s important is to take action. Go create an account, go start running some initial analysis. The first time you’re not going to get it, but if you run it multiple times, you’ll get to a point where you’re able to run sophisticated analysis.

      You can create content pieces like these. Let’s have a discussion. You can reach out to me at SameerKhan on Twitter and my website, DataCrackle.com. Let’s talk, let’s chat. Thank you so much.
