How does data science create value from data? To get to the bottom of this question, we need to think about the very nature of data science. Let’s break down the term first. ‘Data’ is an abstracted record of real-world events. ‘Science’ is a process in which one develops a general rule by proving a hypothesis through observation of the results of repeated experiments. Since these results are records of events created by experiments, they are data. If ‘data’ is already an essential foundation of ‘science’ in the first place, then ‘data science’ should sound redundant. And it does. Data science and science are basically the same thing. Therefore, five characteristics of data science can be drawn from the generic steps of science.
1. Business-oriented : The ultimate goal of data science is to solve a business problem, which requires an understanding of the business context.
2. Repeated Experiments : In science, multiple experiments are a must in order to prove a hypothesis. Data science is no different.
3. Data-oriented : If not based on data or evidence, it is not science. Without data, it is not data science.
4. Math & Statistics : Objective and quantitative analysis is done through math and statistics. They are the cornerstone of data science.
5. Generalized Pattern Extraction : Both science and data science inductively draw generally applicable patterns out of evidence in order to solve problems.
Combining these five characteristics, data science can be defined as a problem-solving process that consists of
clearly defining business problems to be solved,
hypothesizing various possible solutions,
gathering and cleansing necessary data for each hypothesis,
performing repetitive experiments using math and statistics,
finding generally applicable patterns out of data,
and developing solutions based on the given hypothesis.
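
As a rough illustration only, the steps above can be sketched in a few lines of Python. Everything specific here is an assumption made up for the sketch and does not come from the text: the business question (does a new checkout page raise the average order value?), the tiny synthetic data set, and the helper names `clean` and `permutation_test`. The repeated-experiment step is represented by a simple permutation test, which is one of many possible statistical methods.

```python
import random
import statistics

# Step 1 (business problem, hypothetical): does the new checkout page
# raise the average order value compared with the old one?
# Step 2 (hypothesis): orders on the new page are larger on average.

# Step 3 (gather and cleanse): synthetic order values, invented for
# illustration. None marks a record with a missing value.
control_raw = [20, 22, None, 19, 21, 23, 20, 22]
variant_raw = [25, 27, 26, None, 28, 24, 26, 27]


def clean(records):
    """Drop records with missing values."""
    return [r for r in records if r is not None]


def permutation_test(group_a, group_b, n_rounds=2000, seed=0):
    """Step 4 (repeated experiments with statistics).

    Shuffle the group labels n_rounds times and count how often a
    difference in means at least as large as the observed one shows up
    by chance. Returns (observed_difference, estimated_p_value).
    """
    rng = random.Random(seed)
    observed = statistics.mean(group_b) - statistics.mean(group_a)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_rounds):
        rng.shuffle(pooled)
        diff = statistics.mean(pooled[n_a:]) - statistics.mean(pooled[:n_a])
        if diff >= observed:
            hits += 1
    return observed, hits / n_rounds


control = clean(control_raw)
variant = clean(variant_raw)
observed, p_value = permutation_test(control, variant)

# Step 5 (pattern) and step 6 (solution): a small p-value suggests the
# difference is unlikely to be a chance artifact, which would support
# rolling out the new page in this made-up scenario.
print(f"observed difference: {observed:.2f}, estimated p-value: {p_value:.4f}")
```

A real project would use far more data and a method chosen to fit the problem. The point of the sketch is only that each step of the definition maps onto a concrete action.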