It’s Friday and time for another round of discussion about big data. Today I would like to address some of the areas where big data analysis can make a significant difference. Large-scale analysis is already proving fruitful in areas such as disease control, energy consumption and TV production. However, the opportunities to use big data for solving bigger problems are endless.
1. Health care cost in the US – Health care technology has advanced enormously in the last two decades. The human race has taken a big leap in understanding the composition of DNA, performing robotic surgery and transplanting artificial organs, to name a few. These developments are spectacular and will help all of us in the long run. As we continue to improve health care technology, we also have to focus a lot more on the affordability of health care.
Now I am not referring to easily affordable insurance plans or supporting the insurance industry. That’s not the fundamental problem; it’s just a symptom. The problem is the exorbitant cost of health care, which makes both basic and advanced care unaffordable to the masses. A quick analysis of health care costs will reveal that Americans pay significantly more in medical expenses than people in most developed countries.
Clearly, we have a problem: even though per capita income in the US is higher than in most countries, the cost of medical care cancels out the gains. This is where big data analytics can help us answer why health care is more expensive in the US. On the surface, we know one of the issues is the lack of strong pricing regulations on the medical industry. Another is the lack of transparency on the part of medical institutions. You could end up paying hundreds of thousands of dollars for a bypass surgery with no insight into whether the hospital's profit margin is 20% or 80%.
A typical cardiac surgeon could charge anywhere between $40K and $100K for a major surgery. Good surgeons may perform 3-4 major surgeries per day (50-60 per month). I understand that performing a good surgery is a rare skill and there is one good surgeon for every 300 citizens, but what about a bad surgeon? How do you compensate for a lost life because the surgeon was not able to do his job? Big data can help us answer these questions and provide full transparency of cost. At the very least, it could help us develop a quantitative rating system that combines price analytics and doctors' performance with qualitative feedback.
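To make the rating idea a little more concrete, here is a minimal sketch of how price, outcomes and patient feedback might be folded into one score. Everything in it is an assumption for illustration: the `SurgeonRecord` fields, the `value_score` function, the weights and the two sample records are all hypothetical, and a real system would need risk adjustment for case difficulty and far more data.

```python
from dataclasses import dataclass


@dataclass
class SurgeonRecord:
    """Hypothetical per-surgeon summary; all fields are illustrative."""
    name: str
    avg_price_usd: float   # average billed price for the procedure
    success_rate: float    # fraction of surgeries without major complications (0..1)
    feedback_score: float  # mean patient rating on a 1..5 scale


def value_score(rec, max_price_usd, w_outcome=0.6, w_price=0.2, w_feedback=0.2):
    """Combine outcome, price and feedback into a 0..100 score.

    The weights are arbitrary assumptions, not a validated methodology.
    """
    # Cheaper is better: 1.0 for a free procedure, 0.0 at the highest price seen.
    price_component = 1.0 - min(rec.avg_price_usd / max_price_usd, 1.0)
    # Rescale the 1..5 feedback rating to 0..1.
    feedback_component = (rec.feedback_score - 1.0) / 4.0
    return 100.0 * (
        w_outcome * rec.success_rate
        + w_price * price_component
        + w_feedback * feedback_component
    )


# Made-up sample data, using the price range mentioned above.
records = [
    SurgeonRecord("Surgeon B", 100_000, 0.90, 3.0),
    SurgeonRecord("Surgeon A", 40_000, 0.98, 4.5),
]
max_price = max(r.avg_price_usd for r in records)
ranked = sorted(records, key=lambda r: value_score(r, max_price), reverse=True)

for r in ranked:
    print(f"{r.name}: {value_score(r, max_price):.1f}")
```

The point of the sketch is only that price and performance can sit side by side in one comparable number; choosing the weights is exactly the kind of question a large data set would help answer.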
2. Global warming – Global warming (GW) has been a known issue for a while; the problem is proven, yet few solutions are in action. We all know our greenhouse gas emissions are causing a rapid increase in temperatures across the globe. This increase in temperature could lead to worldwide water shortages, disease outbreaks, species extinction, abnormal weather patterns and loss of vegetation and crops.
Hundreds if not thousands of scientists and researchers have already warned about the problem and have proposed solutions. The key issue is that, due to differences of opinion within the scientific community, the debate is still ongoing with no permanent, wholesale solution in sight. Global dimming is another problem and runs parallel to global warming. The burning of fossil fuels produces greenhouse gases, and the byproducts released also include sulphur dioxide and other poisonous pollutants. These pollutants feed into the clouds, resulting in acid rain, and also increase the density of the clouds; denser clouds reflect more of the sun's heat back into space, a phenomenon called global dimming (GD). Due to global dimming, the water temperature decreases, which results in less rain. A decrease in rain could impact billions of people around the world and is already being felt in Africa.
The majority of the analysis done on these phenomena is performed on sampled data sets. GW scientists operate independently and either submit their analysis to a larger global warming committee (only if they are a part of it) or end up writing a research paper. What we need is a large-scale analytics project where all the findings are collected and analyzed in one place. New learnings can be fed back into this big data analytics engine on an ongoing basis, resulting in newer findings. This type of analysis will rule out the problems arising from sampled data, giving us a holistic view of the problem. It will also provide greater insight into the rate of temperature rise, the reduction in rain and the spread of GW/GD, while pinpointing the worst regions, cities or businesses and identifying workable solutions.
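As a rough illustration of why collecting independent findings in one place helps, the sketch below pools several separate estimates of the same quantity using inverse-variance weighting, a standard way of combining independent measurements. The three "studies" and their numbers are invented for the example (read them as a hypothetical warming trend in °C per decade with its standard error), and the function name `pool_estimates` is my own.

```python
import math


def pool_estimates(estimates):
    """Combine independent (value, standard_error) pairs.

    Each estimate is weighted by 1 / standard_error**2, so more precise
    studies count for more. Returns (pooled_value, pooled_standard_error).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_value = sum(w * value for w, (value, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_value, pooled_se


# Hypothetical, made-up estimates from three independent groups:
# (trend in degrees C per decade, standard error)
studies = [(0.20, 0.05), (0.15, 0.10), (0.25, 0.05)]

pooled_value, pooled_se = pool_estimates(studies)
print(f"pooled trend: {pooled_value:.3f} +/- {pooled_se:.3f}")
```

With these made-up inputs, the pooled standard error comes out smaller than that of any single study, which is the statistical version of the argument above: findings analyzed together can be more reliable than the same findings sitting in separate papers. This assumes the studies are independent and measure the same thing, which real climate data sets may not satisfy.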
3. Financial crisis – Tens of thousands of equity transactions are conducted every day on Wall Street. Numerous analytics and trading products on the market use popular patterns to predict winning trades. However, no software can consistently predict financial crises, let alone winning trades, over longer periods. Well-managed mutual funds generally outperform crazy day-trading software in the long run. From the Great Depression of the 1930s to the financial crisis of 2008/09 and, most recently, the collapse of Greece, the world has not progressed much in its financial forecasting capabilities. We still fear another global financial crisis might be underway, although no one really knows when it will occur or what the magnitude of the collapse will be.
It amazes me that we live in a connected world with supercomputers in our pockets and still cannot predict market stability with accuracy. We continue to rely on techniques developed for smaller data sets instead of embracing big data analytics. We continue to use investor perception as a major market indicator even though sophisticated computers can outplay champions (IBM Watson beating Jeopardy winners). It is easier to blame a bad product, such as subprime loans being the key cause of the 2008 collapse, than to work on a joint solution. What we need is a large-scale analytics system that can raise a red flag for every bad financial move we make as a nation or as a collective human race. There is no shortage of data: we already have trillions of data points as input, and millions more become available every second. With that much data, even relatively simple algorithms can produce more reliable outcomes.
It’s your turn, so feel free to share your feedback, solutions, tweets or shares. How do you think we can solve these problems?