Handle information with data science

Have you ever wondered how Facebook can predict your future friends? Or how Google knows what route you should pick when taking your car to work, depending on what time you leave? These are just two examples of how companies use massive amounts of data to cleverly facilitate the everyday lives of their customers. However, far from all companies know how to best make use of the vast data streams that keep flowing in. This is where Data Science comes into the picture.

As our lives become more and more digitized, the amount of data gathered is also accelerating. It may come from sources as diverse as mobile phones, IoT sensors, cameras, and websites. Many companies are currently building infrastructure for efficient big data handling and storage [1]. Their next step will then likely be to try to figure out how to best make use of this data and how to gain an advantage over their competitors. At first, it is common for such companies to feel overwhelmed by the tidal wave of information. This effect is often referred to as drowning in data while being starved for actual knowledge.

So, what tools are available for companies that wish to break out of this maelstrom of data? The key to their success lies in the field of Data Science, which can be used to gain overview and control over data. Professionals within this field are most often called Data Scientists. Skilled Data Scientists know how to mix their knowledge of mathematics/statistics and programming with pure business expertise, to find patterns and anomalies in data. This leads to new business opportunities and insights that can create value for companies as well as for the whole society.


The Data Scientist profession has been called the sexiest job of the 21st century [2] and lots of people are attracted to the field by the opportunities to work with hot new technologies such as Machine Learning and Artificial Intelligence, which are both closely associated with Data Science. While this may very well be the case, one should also remember that Data Science is so much more than this. For example, before we can even start applying Machine Learning it is important to first [3]:

  • Clean and tidy up data so that it has a suitable structure
  • Identify questions for our data to answer
  • Modify the existing variables/features in data or create new ones
  • Explore data to gain a deeper understanding of it, for example with diagrams

These tasks are often the most time-consuming parts of a Data Scientist’s job. 
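To make these preparatory steps concrete, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption made for illustration: the customer records are invented, and the helper names `clean`, `add_features`, and `explore` are hypothetical rather than part of any particular tool. The question the data is meant to answer here is "how much revenue does a typical order bring in?"

```python
from statistics import mean

# Hypothetical raw records, invented purely for illustration.
raw_rows = [
    {"customer": " Alice ", "orders": "3", "revenue": "120.0"},
    {"customer": "Bob", "orders": "", "revenue": "80.0"},  # missing order count
    {"customer": "carol", "orders": "5", "revenue": "250.0"},
    {"customer": "Dave", "orders": "2", "revenue": "90.0"},
]


def clean(rows):
    """Tidy the data: normalize names, convert types, drop incomplete rows."""
    tidy = []
    for row in rows:
        if not row["orders"] or not row["revenue"]:
            continue  # skip rows with missing values
        orders = int(row["orders"])
        if orders <= 0:
            continue  # a revenue-per-order ratio needs at least one order
        tidy.append({
            "customer": row["customer"].strip().title(),
            "orders": orders,
            "revenue": float(row["revenue"]),
        })
    return tidy


def add_features(rows):
    """Create a new variable: average revenue per order."""
    return [
        {**row, "revenue_per_order": row["revenue"] / row["orders"]}
        for row in rows
    ]


def explore(rows):
    """Summarize the new feature as a first look at the data."""
    values = [row["revenue_per_order"] for row in rows]
    return {
        "n": len(values),
        "min": min(values),
        "mean": mean(values),
        "max": max(values),
    }


rows = add_features(clean(raw_rows))
summary = explore(rows)
print(summary)
```

In a real project each of these steps would be far larger, and the exploration would typically involve diagrams rather than a handful of summary numbers, but the order of work is the same: tidy first, then derive features, then explore.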

On the other hand, it doesn’t matter how skilled Data Scientists are at data analysis and Machine Learning if the people who make the decisions at their company do not listen to them. As a Data Scientist it is therefore of utmost importance to also be a good communicator, so that your discoveries are understood and acted upon. At the same time, however, communication is not a one-way street. A study from 2011 [4] showed a clear positive correlation between a company’s ability to make decisions based on data, so-called Data-Driven Decision Making, and its market performance [5]. It is thus just as important that company decision makers understand the role of their Data Science team as it is for the Data Scientists to convey their messages to the company’s management [1].

But, you might ask yourself, is Data Scientist really a job for the future? Aren’t Data Scientists replacing themselves with the Artificial Intelligence that they create? The answer is no. While Artificial Intelligence is a word on everyone’s lips these days, it is important to know that the machines aren’t nearly ready to think for themselves. Human-like Artificial Intelligence is still science fiction, and machines and computers are essentially stupid [6]. Sure, it is possible to train them to become extremely good at tasks such as image recognition, stock trading, spam filtering, and predicting when electrical components are about to fail. But as soon as you move them away from their intended application area, it is obvious that their understanding of the world is very limited. In other words, they are great at finding answers to specific questions and making correct decisions in very specific cases, but they are useless at posing the right questions and generalizing their knowledge [6]. That is why there is a need for Data Scientists who can ask the right questions, analyze, model, and draw general conclusions from data. Data Science is thus not just a temporary buzzword, but a profession that is here to stay.

 

Contributor

Olof Rännbäck-Garpinger is a consultant at Knightec and holds a doctorate in automatic control from Lund University. He has extensive experience of data analysis work for Svensk Kärnbränslehantering AB and has been active in the field of data science since 2016.

 

[1] Provost, F., & Fawcett, T. (2013). Data science and its relationship to big data and data-driven decision making. Big Data, 1(1), 70-76.

[2] Davenport, T. H., & Patil, D. J. (2012). Data Scientist: The Sexiest Job of the 21st Century. Harvard Business Review, 90(10), 60-68.

[3] Wickham, H., & Grolemund, G. (2016). R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media, Inc.

[4] Brynjolfsson, E., Hitt, L. M., & Kim, H. H. (2011). Strength in Numbers: How Does Data-Driven Decisionmaking Affect Firm Performance? Available at SSRN: https://ssrn.com/abstract=1819486 or http://dx.doi.org/10.2139/ssrn.1819486

[5] McAfee, A., Brynjolfsson, E., Davenport, T. H., Patil, D. J., & Barton, D. (2012). Big data: the management revolution. Harvard Business Review, 90(10), 60-68.

[6] Brynjolfsson, E., & McAfee, A. (2017). The business of artificial intelligence. Harvard Business Review.