The Value of Digital Technologies: Big Data

Sofia, 23 March 2018

Severino Meregalli, Scientific Coordinator – DEVO Lab

SDA Bocconi

THE BUSINESS CONTEXT: WHY DATA EXPLOITATION IS SO IMPORTANT

• Dynamism and complexity as structural elements

• Fuzzy business scenarios

• Complexity management and profit linkage

• The fall of management as a science and of prescriptive management

• The fall of the “legendary” long term strategic planning as an antidote to complexity

• The “evergreen” gap between Business Requirements and Information Systems

• Desperate search for sources of insight and knowledge

THE (BIG) DATA LANDSCAPE

• Generating value from data and analytics is one of the pillars of competitive advantage

• Decision-making in complex and dynamic organizations calls for a full exploitation of data resources

• Progressive digitalization of businesses vs skills needed to take advantage of large and complex datasets

• Big Data, Data Discovery and Analytics have suffered all the negative effects of hype and of the rise of improvised players

• Wide range of high performing technologies and players

• Cost/benefit leverage calls for a deep understanding of the real opportunities and hurdles in Data exploitation

DATA EXPLOSION VS ABILITY TO EXECUTE

• There will be a shortage of talent necessary for organizations to take advantage of big data. By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.

• Organizations need not only to put the right talent and technology in place but also structure workflows and incentives to optimize the use of big data.

McKinsey Global Institute 2011

THE MEDIA HYPE

A CROWDED MARKET

HYPE… AS USUAL

Gartner Hype Cycle for Emerging Technologies, 2014

THE CALL FOR A MANAGERIAL APPROACH (DEVO LAB SDA BOCCONI)

[Figure labels: Value, Shortcut]

THE ISSUE

After the first wave of technology adoption for managing and analyzing large datasets, both the academic and the practitioner communities acknowledged the risks of (another) «hype-driven» approach

THREE KEY TOPICS

Physical vs Social

Data quality

Context

PHYSICAL VS. SOCIAL PHENOMENA

• It is relatively less complex to get significant results when the analysis focuses on deterministic phenomena (Natural Sciences) than when it focuses on social phenomena (Social Sciences)

• In Natural Sciences it is possible to explain or understand a phenomenon by observing a singularity (e.g. a star with an odd orbit); the same does not apply to Social Sciences (e.g. trendsetter vs crazy behavior)

• Predictive analysis, as well as the mere understanding of phenomena affected by social variables, still raises issues that are difficult to address, even when companies have large amounts of data and computing power

• The paradox is that in the digital world it is sometimes easier to influence behaviors than to understand them

• The short-term economic value is proportional to the difficulty of the task: higher in Social Sciences, lower in Natural Sciences

DATA QUALITY MANAGEMENT

• The stratification of large amounts of data, with different formats and different scopes, emphasizes the old but evergreen concept of “Garbage in, garbage out”

• Big Data tools and technologies have not yet solved this problem and, in some cases, it has been amplified by the presence of data from sources that are out of control (e.g. Social Networks)

• A “Data Quality” attitude is a precondition for initiating a virtuous cycle of data value exploitation

• Technology is here to help, but we still have issues:

– uniqueness (single source of truth)

– accountability for data quality (not IT)

– consistency of goals between those who produce and those who analyze the data

– availability of consistent and shared information about the data (metadata)

– legal issues

UNDERSTANDING DOMAIN, CONTEXT AND DECODING RESULTS

• The breadth and variety of datasets allow analysts to find numerous correlations between variables that cannot be found in small datasets

• The issue is to understand which correlations are meaningful, since the more variables there are, the more correlations can show significance

• Context is hard to interpret at scale and even harder to maintain when data are reduced to fit into a model. Obtaining and managing context data will be a challenge.

The more variables, the more correlations that can show significance. Falsity also grows faster than information; it is nonlinear (convex) with respect to data

N. Taleb - Professor of risk engineering at New York University’s Polytechnic Institute.
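
The point about variables and significant correlations can be illustrated with a small simulation. This is only a sketch under stated assumptions: every variable is independent random noise, so any "significant" correlation is false by construction. The function names, the sample size of 100 and the threshold of 0.2 (roughly the two-sided 5% cut-off for 100 observations) are illustrative choices, not part of the presentation.

```python
import itertools
import math
import random


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def count_spurious(n_variables, n_observations=100, threshold=0.2, seed=0):
    """Return (number of pairs, pairs with |r| above threshold) for pure noise."""
    rng = random.Random(seed)
    # Every column is independent Gaussian noise: no real relationship exists.
    columns = [
        [rng.gauss(0.0, 1.0) for _ in range(n_observations)]
        for _ in range(n_variables)
    ]
    pairs = list(itertools.combinations(columns, 2))
    hits = sum(1 for x, y in pairs if abs(pearson(x, y)) > threshold)
    return len(pairs), hits


for k in (5, 20, 80):
    total, hits = count_spurious(k)
    print(f"{k:>3} variables: {total:>5} pairs, {hits} exceed |r| > 0.2")
```

The number of pairs grows as k(k-1)/2, so the count of chance correlations grows much faster than the number of variables, even though the number of observations stays the same.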

THE MORE VARIABLES, THE MORE CORRELATIONS THAT CAN SHOW SIGNIFICANCE…

• Contextual data are scarce and very often not available or not consistent with the needs

• Each application domain requires the involvement of experts who know it from the inside. A statistical “brute force” approach does not work well in Social Sciences

• The issue is to find the sweet spot between “obvious” and “false” findings

UNDERSTANDING DOMAIN, CONTEXT AND DECODING RESULTS

THREE PILLARS FOR DATA VALUE

• Differentiate between physical and social phenomena

• Measure the "quality" of available data

– Accuracy

– Reliability

– Completeness

– Consistency

– Timeliness

• Consider the availability of domain experts / knowledge when dealing with social phenomena
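
Some of the quality dimensions listed above can be measured directly on a dataset. The sketch below is a minimal illustration, not a method from the presentation: the records, field names, allowed code list and one-year freshness window are all hypothetical. It covers completeness, consistency and timeliness only; accuracy and reliability need an external reference to compare against and are left out.

```python
from datetime import date

# Hypothetical customer records; field names and values are illustrative only.
records = [
    {"id": 1, "segment": "B2C", "gas": True, "updated": date(2018, 3, 1)},
    {"id": 2, "segment": None, "gas": True, "updated": date(2017, 1, 15)},
    {"id": 3, "segment": "b2c", "gas": False, "updated": date(2018, 2, 20)},
    {"id": 4, "segment": "B2B", "gas": None, "updated": date(2016, 6, 30)},
]

ALLOWED_SEGMENTS = {"B2C", "B2B"}  # assumed code list


def completeness(rows, fields):
    """Share of field values that are not missing (None)."""
    values = [row[f] for row in rows for f in fields]
    return sum(v is not None for v in values) / len(values)


def consistency(rows, field, allowed):
    """Share of non-missing values that belong to the agreed code list."""
    present = [row[field] for row in rows if row[field] is not None]
    return sum(v in allowed for v in present) / len(present)


def timeliness(rows, field, as_of, max_age_days):
    """Share of rows updated within max_age_days of the reference date."""
    fresh = sum((as_of - row[field]).days <= max_age_days for row in rows)
    return fresh / len(rows)


report = {
    "completeness": completeness(records, ["segment", "gas"]),
    "consistency": consistency(records, "segment", ALLOWED_SEGMENTS),
    "timeliness": timeliness(records, "updated", date(2018, 3, 23), 365),
}
print(report)
```

Each measure is a share between 0 and 1, so the same report can be tracked over time or compared across data sources.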

THE ANALYSIS MODEL

[Figure: chart with axes Data Quality (Low to High) and Level of Determinism (Low to High), showing Value; labels: Social phenomena, Physical phenomena]

POSSIBLE PATHS

[Figure: chart with axes Data Quality (Low to High) and Level of Determinism (Low to High), showing Value]

VOLVO CAR CORPORATION CASE HISTORY

The Company

• Global leader in the automotive industry

• Acquired by Geely Auto Group in 2010

• Focus on quality and safety: "Our vision is to design cars that should not crash. In the shorter perspective the aim is that by 2020 no-one should be killed or injured in a Volvo car."

The Needs

• Analyze mechanical performances of the vehicles in real driving conditions in order to improve design, production and after-sales service (warranty) processes

Scope of Work

• Improving quality of data collected from dealers, engineering, production and from diagnostic systems (DRO)

• Build a unified repository of integrated data

Achievements

• Problem identification and prioritization of maintenance activities

• Solving problems of quality during the production processes

• Warranty programs management accuracy

• Potential failure predictive analysis

VOLVO CAR CORPORATION CASE HISTORY

[Figure: chart with axes Data Quality (Low to High) and Level of Determinism (Low to High); Value legend: Full, Partial, Null]

SCE SMART CONNECT CASE HISTORY

The Company

• Southern California Edison is the largest subsidiary of Edison International

• For over a century, the company has provided electricity to about 14 million customers in Southern California (Central, Coastal & Southern California)

The Needs

• Provide customers with a weekly report of energy consumption, so that they can control expenses

Scope of Work

• Acquisition of data from Smart Meters (720 readings per month per customer, about 5.6 billion readings per month in total)

• Integration of smart meter data with expenses and demographic information

Achievements

• Improvement in production and distribution flow management

• Peak usage prediction
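
The smart meter volumes quoted on this slide (720 readings per month per customer, about 5.6 billion readings per month in total) can be sanity-checked with a short calculation. The 30-day month is an assumption; the slide does not state the reading interval.

```python
READINGS_PER_CUSTOMER_PER_MONTH = 720   # from the slide
TOTAL_READINGS_PER_MONTH = 5.6e9        # from the slide ("about 5.6 billion")

# Assumption: a 30-day month, so 720 readings means one reading per hour.
readings_per_day = READINGS_PER_CUSTOMER_PER_MONTH / 30

# Number of meters implied by the two figures taken together.
implied_meters = TOTAL_READINGS_PER_MONTH / READINGS_PER_CUSTOMER_PER_MONTH

print(f"readings per day per meter: {readings_per_day:.0f}")
print(f"implied number of meters: {implied_meters / 1e6:.1f} million")
```

The implied number of meters is lower than the "about 14 million customers" figure on the same slide. One possible reading is that the latter counts people served rather than metered accounts, but the slide does not say.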

SOUTHERN CALIFORNIA EDISON CASE HISTORY

[Figure: chart with axes Data Quality (Low to High) and Level of Determinism (Low to High); Value legend: Full]

GDF SUEZ CASE HISTORY

The Company

• French group, one of the main utilities worldwide (turnover of about 70 billion €)

• Founded in 2008 after the merger of Suez and GDF

• Core business: production and distribution of electricity, natural gas and renewable sources

The Needs

• After the liberalization of the energy market in France, the B2C (CH&P) Business Unit wanted to grow in the electricity market by leveraging its gas market share

• Understand customer segmentation, and where and how to focus sales and marketing initiatives

Scope of Work

• Customer size wasn’t addressed consistently (administrative vs. commercial data)

• Improvement in: Data Quality, CRM & Billing integration, Marketing Campaigns

• Incremental understanding of customer-related phenomena

Achievements

• Customer value-based segmentation

• Prevention of churn due to customer relocation

• Acquisition of «gas-only» customers (electricity)

GDF SUEZ CASE HISTORY

[Figure: chart with axes Data Quality (Low to High) and Level of Determinism (Low to High), showing Value]

LESSONS LEARNED FROM CASE HISTORIES

• The analysis of case studies highlights how a mature approach to Data Value relies on two main dimensions:

– data quality

– the ability to interpret / understand phenomena

• Thanks to the analysis of case histories, it has been possible to identify a first set of Data Value components

DATA VALUE LAYERS: INTRINSIC VALUE

[Figure: layers shown in two columns, Physical and Social: Data Model; Data Volume – Cross Section; Data Volume – Stock; Data Quality; Quantitative tools; Cognitive tools; Domain Expertise]

DATA VALUE LAYERS: POTENTIAL VALUE

[Figure: layers grouped under Data, Tools and Expertise, shown in two columns, Physical and Social: Data Model; Data Volume – Cross Section; Data Volume – Stock; Data Quality; Quantitative tools; Cognitive tools; Context - Data; Context - Models; Domain Expertise]

NEW FRONTIERS, NEW OPPORTUNITIES… NEW ISSUES

• Edge Computing vs Edge Organizations

• Data ownership and side effects control

• Data storage technologies evolution is much slower than data growth

• Big Data, Machine Learning and Quantum Computing: the perfect storm?

• Research, consulting, teaching, software industry: working again without hackathons (or with real rewards and better ethics)

SUMMARY AND RECOMMENDATIONS

• There is no such thing as "big data". There are only data that are manageable or unmanageable with state-of-the-art technologies

• The real challenge is getting «Big Info» and making better decisions

• Natural and Social domains are different

• Data quality is still the precondition for any project

• Context understanding and contextual data are (in social applications) very often the real bottleneck

• Use a checklist to assess data value components before starting a project

• Only consider vendors that are able to provide fully integrated solutions to their data issues (no room for improvised players)

• Failing to capitalize on data sets in Natural Sciences domains is a big mistake… transforming data sets into value in Social Sciences is (still) a big challenge
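
The checklist recommended above could take a form like the following sketch. The component names follow the "Data Value Layers" slides; the 0 to 2 rating scale, the `assess` function and the sample ratings are hypothetical and only show one way to make gaps visible before a project starts.

```python
# Component names follow the "Data Value Layers" slides.
COMPONENTS = [
    "Data Model",
    "Data Volume - Cross Section",
    "Data Volume - Stock",
    "Data Quality",
    "Quantitative tools",
    "Cognitive tools",
    "Domain Expertise",
    "Context - Data",
    "Context - Models",
]


def assess(ratings, minimum=1):
    """Return the components that are unrated or rated below `minimum`.

    Ratings use an assumed scale: 0 = missing, 1 = partial, 2 = adequate.
    """
    gaps = []
    for name in COMPONENTS:
        score = ratings.get(name)
        if score is None or score < minimum:
            gaps.append(name)
    return gaps


# Hypothetical ratings for a project under evaluation.
ratings = {
    "Data Model": 2,
    "Data Volume - Stock": 2,
    "Data Quality": 0,
    "Quantitative tools": 1,
    "Domain Expertise": 1,
}

gaps = assess(ratings)
print("ready to start" if not gaps else "gaps: " + ", ".join(gaps))
```

Treating an unrated component as a gap is a deliberate choice here: it forces every item on the list to be looked at, not only the ones the team already knows about.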

REFERENCES

• Davenport T.H., Big Data at Work: Dispelling the Myths, Uncovering the Opportunities, Harvard Business Review Press, 2013

• Davenport T.H., Data Scientist: The Sexiest Job of the 21st Century, Harvard Business Review, October 2012, http://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/

• Gartner Hype Cycle for Emerging Technologies, 2014

• McKinsey Global Institute, Big data: The next frontier for innovation, competition, and productivity, 2011

• Redman T., Data’s Credibility Problem, Harvard Business Review, December 2013, http://hbr.org/2013/12/datas-credibility-problem/ar/1

• Ross J.W., Beath C.M., Quaadgras A., You may not need Big Data after all, Harvard Business Review, December 2013, http://hbr.org/2013/12/you-may-not-need-big-data-after-all/ar/1

• Taleb N.N., Beware the Big Errors of ‘Big Data’, Wired, 2013, www.wired.com/2013/02/big-data-means-big-errors-people/
