10 Incredible Big Data Books You Don’t Want to Miss This Fall (With Excerpts)
Reading has been one of my top hobbies, followed by traveling, writing, public speaking and watching movies. I have learned more by reading books than by attending actual events. A great perk of being an IBMer is that I get full access to books24x7.com, so I can quench my thirst with a new book every day.
Over the last few years, as you may have seen, I have been spending more time on data science and big data analytics. That means I am reading more books on these subjects. When I first began exploring books on this topic, I came across tons of them with a high-level message and no depth. To be precise, as of the writing of this post there are 14,313 results on Amazon.com for the keyword “big data” under the books section. How can you separate the signal from the noise (no pun intended 🙂 ) with so many choices at your fingertips?
It was also frustrating that most of these books provided the motivation to get started but lacked substance. With that in mind, I wrote this post to give you a clear and concise list of the best books available on the topic of big data. I have also highlighted the key message of each book so you can make an informed decision. Here are 10 highly acclaimed books that will give you a head start in Big Data.
Big Data: A revolution…is a fascinating account of how Big Data is growing and what the likely repercussions of this expansion could be. The book peppers its narrative with a number of examples from companies such as Google. It also discusses trends such as completeness of data (as opposed to sampling), data-ifying (quantifying and digitizing) information that was previously only vaguely summarized, and the use of new technologies such as NoSQL databases and Hadoop to analyze large volumes of data that don’t lend themselves to traditional analysis. The book also explores the use of Big Data analytics in health, politics and business, and how it is revolutionizing the processing of information. Finally, it discusses likely threats from Big Data, such as the erosion of personal privacy.
Stein Kretsinger, the founder of Advertising.com, calls this book “the Freakonomics of Big Data”. The best thing about it is that you don’t have to be a techie to understand its ideas. It is an interesting read on the applications of predictive analysis and related technologies such as natural language processing and machine learning, rather than a how-to guide. The narrative is humorous and includes several anecdotes from the author’s own life. If you don’t know anything about predictive analysis and want an introduction to what is possible with it, this is a great book to start with.
Think bigger: Developing a successful…. is an essential book for businesses that want to implement Big Data analytics. Companies generate a large amount of data every day through the choices they make and the transactions they engage in. If they analyze this information, it can reveal valuable insights that will revolutionize their business. The real-world explanations and insights contained in this book provide a road map for companies that want to create a Big Data strategy. It examines a number of trends in Big Data such as the mobile revolution, real-time use of Big Data, the Internet of Things, the concept of the quantified self, Big Data and public data, gamification and more. The author also identifies some generic uses for Big Data that apply to all organizations. The book also contains a number of case studies from companies such as Disney, Apple, Amazon, TomTom, Zynga and more.
This book gives valuable insights into how companies can integrate the knowledge they gain from Big Data analysis into their daily business. It is easy to follow, thanks to its straightforward and conversational tone. The book says the analytics revolution has already arrived. We only have to build it into our business processes and operations, rather than treating it as an external batch process. The book contains a number of real-world examples to illustrate what you should and should not do with your corporate Big Data initiatives. It is a must-read for anyone who is looking to make a career in analytics or wants to learn analytics principles to achieve accountable results in business.
The core message of The Signal and the Noise is this: prediction is indispensable in science, politics, business and other sectors. The problem is that people are not very good at it. They are affected by cognitive biases and systemic problems such as information overload. Still, we are constantly learning new things about how events happen, and we can use this knowledge to predict better. Basically, the book applies Bayesian thinking to real-life events. For example, in Chapter 5, the author says that earthquakes cannot be predicted. What we do know is that some areas are more prone to earthquakes than others, and that in some areas an earthquake of magnitude n is ten times more likely than an earthquake of magnitude n+1. The Signal and the Noise is considered one of the best books on applied statistics, a field which has a lot of relevance to Big Data.
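To make the earthquake rule of thumb concrete, here is a minimal Python sketch. It assumes the factor of ten per magnitude step holds exactly across the whole range, which is a simplification; the function name `relative_likelihood` and its `factor` parameter are my own and are not taken from the book.

```python
import math


def relative_likelihood(magnitude: float, reference: float, factor: float = 10.0) -> float:
    """Return how many times more likely a quake of `magnitude` is
    than a quake of `reference`, assuming each extra magnitude step
    makes a quake `factor` times rarer (assumed: factor = 10)."""
    return factor ** (reference - magnitude)


if __name__ == "__main__":
    # Compare smaller quakes against a magnitude 8 event.
    for m in range(5, 9):
        ratio = relative_likelihood(m, reference=8)
        print(f"magnitude {m}: about {ratio:g}x as likely as magnitude 8")
```

Under this assumption, three magnitude steps compound to a factor of a thousand. So although no single earthquake can be predicted, the relative frequencies of large and small ones can still be estimated.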
The proliferation of smartphones and personal computers and the growth of the internet have led to many positive developments. But not everything is as it seems. These technologies have also created avenues for collecting the personal data of unsuspecting citizens. The problem is that this data may be exploited through wiretapping and hacking, and may be misused by internet marketers and cyber criminals. The book lists the ways in which people can be attacked online. It also gives practical tools and tips on how citizens can protect their personal data online, making it the perfect beginner’s guide to online security.
Too Big To Ignore…..gives an excellent introduction to why Big Data is important. The book contains jargon-free, commonsense advice for organizations that want to leverage Big Data. It is also a good resource for those who don’t have a background in the field. The book won’t teach you how to apply Big Data techniques, but it will tell you how different analytics techniques work. The author gives many examples of how Big Data has helped organizations pursue new innovation and growth opportunities. He also gives a nice introduction to infonomics (the economics of information) and to the benefits to be gained from structuring data. For these reasons, Too Big To Ignore is a must-read for company owners, business professionals, IT consultants and chief executives.
Big Data At Work: Dispelling the myths…is a compelling read on where Big Data currently stands and what the recent developments in the field are. It speaks about the importance of hiring good data scientists and lists the ideal traits to look for in such candidates: a deep understanding of Big Data architecture and coding, improvisation, action orientation, evidence-based decision making, strong relationship and communication skills, and skills in visual analytics, statistics, machine learning and unstructured data analytics. The author is careful to write in simple, easy-to-understand language.
Numbersense is the intuition for knowing when to make a U-turn, press on or stop altogether. It is the ability to gather details and recognize decoys. The book begins with the premise that data can be manipulated, so what matters is not how much data you analyze but how you analyze it. Big Data is here to stay and it is going to have a massive impact. Everyone consumes data, so at the very least we must learn how to become smarter consumers. But the author also says that Big Data is moving us backward, because more results mean that we have to spend more time arranging, replicating, analyzing and validating the data. This can cause confusion and doubt. The author also discusses a number of ideas on how people can separate the wheat from the chaff when faced with a deluge of information or when they don’t trust Big Data.
Data scientists will benefit immensely from this book. It shows you how to build a Big Data architecture using clustered hardware and how to use new tools like Cassandra, Hadoop and Storm to capture and analyze web-scale data. The book gives a thorough discussion of data modeling, requirements for data processing, data layers, issues with storage implementation and data architecture. It speaks about the importance of architecture design, data pipelining and data modeling in achieving efficient and effective analytics results. It also discusses the potential of the Lambda architecture, pioneered by Nathan Marz, for data mining. There is no doubt that this is one of the most up-to-date books on Big Data architecture on the market today.