cedric.cnam.fr - f. camillo–i. d’attoma integration of...
Post on 19-Jul-2020
f. camillo – i. d’attoma
Integration of different data collection
techniques using a multivariate
counterfactual approach
Furio Camillo
Alma Mater Studiorum
Università di Bologna
column label   description                                    comments
id             identifier of each sales point                 4517 sales points
pr1            sales value of product category n.1            in millions of euros
pr2            sales value of product category n.2            in millions of euros
pr3            sales value of product category n.3            in millions of euros
pr4            sales value of product category n.4            in millions of euros
pr5            sales value of product category n.5            in millions of euros
pr6            sales value of product category n.6            in millions of euros
pr7            sales value of product category n.7            in millions of euros
pr8            sales value of product category n.8            in millions of euros
pr9            sales value of product category n.9            in millions of euros
treat          treatment indicator of a marketing campaign    1=treated; 2=not treated
outcome        economic return                                in millions of euros
x1             structural variable n.1                        categorical variable
x2             structural variable n.2                        categorical variable
x3             structural variable n.3                        categorical variable
x4             structural variable n.4                        categorical variable
x5             structural variable n.5                        categorical variable
Y (outcome) = f( Pr1-Pr9 (products), X1-X5 (structural), T (treatment) ) + error
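A minimal sketch of how data with this structure could be simulated. The slides do not specify the form of f, so the additive form, the value ranges, the random assignment of the treatment and the function names used here are all assumptions made for illustration only. With a randomly assigned treatment, the plain difference in mean outcome between treated and not treated units should land close to the effect built into the simulation.

```python
import random


def simulate_sales_points(n=4517, effect=32.0, seed=0):
    """Simulate n sales points with outcome = f(products, structural, treatment) + error.

    The additive f below is an assumption; the slides leave f unspecified.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        products = [rng.uniform(0.0, 10.0) for _ in range(9)]  # pr1..pr9 (hypothetical range)
        structural = [rng.randint(1, 3) for _ in range(5)]     # x1..x5 as category codes
        treated = rng.random() < 0.5                           # random assignment (assumption)
        f_value = sum(products) + 0.5 * sum(structural) + (effect if treated else 0.0)
        outcome = f_value + rng.gauss(0.0, 1.0)                # additive error term
        rows.append({"treated": treated, "outcome": outcome})
    return rows


def difference_in_means(rows):
    """Mean outcome of treated units minus mean outcome of not treated units."""
    treated = [r["outcome"] for r in rows if r["treated"]]
    control = [r["outcome"] for r in rows if not r["treated"]]
    return sum(treated) / len(treated) - sum(control) / len(control)


if __name__ == "__main__":
    rows = simulate_sales_points()
    print(round(difference_in_means(rows), 2))
```

When the treatment is not randomly assigned, this naive difference is no longer a fair estimate, which is the situation a counterfactual approach is meant to handle.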
Data exploration (1)
Data exploration (2)
Treated vs. not treated
Simulated effect = 32 million euros
Hypothesis, method and available data about a web-panel
2 different data collection tools: CATI and CAWI
X: pre-treatment information
• Information about the past
• Information on the family
• Information on the social class
• Geo-demographic information
• Value system and lifestyle
T: treatment
• CAWI
• CATI
Y: post-treatment variables (OUTCOMES), the interesting variables of the survey
• Opinions
• Motivations
• Aspirations
• Needs
• Behaviours
The Data Mining approach
• Researchers and analysts do not need any a priori hypothesis
about the distribution of the variables
• High-dimensional data can be analyzed easily
• DM algorithms aim to minimize the complexity, time and
cost of processing
• The results are easy to understand
The data miner produces a “black box”, a kind of automatic tool
that aims to meet decision makers’ daily requirements, but in a
flexible way (U. Fayyad, 2001)
The main reference
Type of job-contract: oral list (CATI) or written list (CAWI)?
Where c is the generic cluster of the
DM approach (multivariate)
The WEB in the Future: different access
points, different access tools
A web survey about the Italian identity:
15 items (10-point scale, 1-10)
Listed below are a number of items that characterize Italy. For each one,
please tell me how far you think it represents / characterizes the
national unity of our country, Italy.
To answer, use a rating from 1 to 10, where 1 means it does not represent the
national unity of our country at all and 10 means it represents the national
identity of our country very much.
The golden question
(active)
1. Artistic and cultural heritage
2. …..
3. The “mafia”
4. …
15. The opera
[Figure: original scale (min, mean, max) plotted against recoded scale (-1, 0, +1)]
A non-linear re-coding method
(MG-Strategy) (endogenous for each respondent)
Ref: F.Camillo – MicroMacro Marketing – 1999/1 –
Il Mulino
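The recoding can be sketched as follows, under one plausible reading of the figure labels: each respondent's own minimum, mean and maximum rating are sent to -1, 0 and +1, with linear interpolation on either side of the mean. This reading, and the function name `recode_respondent`, are assumptions. The cited reference should be consulted for the actual definition of the MG-Strategy.

```python
def recode_respondent(ratings):
    """Recode one respondent's ratings onto [-1, +1] using that respondent's own
    min, mean and max (assumed mapping: min -> -1, mean -> 0, max -> +1)."""
    low, high = min(ratings), max(ratings)
    if low == high:
        # A respondent who gives the same rating everywhere carries no contrast.
        return [0.0] * len(ratings)
    mean = sum(ratings) / len(ratings)
    recoded = []
    for r in ratings:
        if r >= mean:
            recoded.append((r - mean) / (high - mean))  # upper branch: mean..max -> 0..+1
        else:
            recoded.append((r - mean) / (mean - low))   # lower branch: min..mean -> -1..0
    return recoded


if __name__ == "__main__":
    # Hypothetical answers of one respondent to four of the items.
    print(recode_respondent([1, 5, 10, 8]))
```

Because the two branches have different slopes whenever the mean is not midway between the minimum and the maximum, the mapping is non-linear over the full scale, and because the three anchors are computed per respondent it is endogenous to each respondent.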
The outcome variable:
a p-cluster segmentation
Covariate sub-space   T1: smartphone   T2: usual PC   Balancing
sub-space 1                                           yes
sub-space 2                                           yes
sub-space 3                                           no
sub-space 4                                           yes
sub-space 5                                           yes
sub-space 6                                           no
...                                                   ...
sub-space n                                           yes
For sub-space 1:
      Cluster1   Cluster2   Cluster3   ...   Cluster p
T1    0.5        0.3        0.1        ...   0.03
T2    0.47       0.2        0.09       ...   0.1
By comparing the two distributions with a chi-square test, it is possible to
evaluate, for each sub-space, the impact of the treatment (using a smartphone
or not).
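A minimal sketch of this comparison for a single sub-space. The slide reports only proportions, so the counts below are hypothetical, and the function name `chi_square_statistic` is an assumption. The statistic is the standard Pearson chi-square for a treatment-by-cluster contingency table.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency
    table given as a list of rows (treatments) of counts (clusters)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # count expected under independence
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


if __name__ == "__main__":
    # Hypothetical respondent counts per cluster within one sub-space.
    counts = [
        [50, 30, 20],  # T1: smartphone
        [47, 20, 33],  # T2: usual PC
    ]
    statistic, dof = chi_square_statistic(counts)
    print(statistic, dof)
```

A large statistic relative to the chi-square distribution with `dof` degrees of freedom suggests that the cluster distribution differs between the two access tools in that sub-space. The same function would be applied to each sub-space in turn.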
Benefits of the proposed strategy in
“The Internet age”
• Why will the Internet be important in the future?
• Data-driven approach
• Semi-automatic (massive use)
• “Multivariate” use of the information
• Work in progress: SAS software, qualitative and
quantitative covariates (Rubin's science matrix),
application of Bozdogan's ICOMP approach to define
a more automatic stopping rule.
furio.camillo@unibo.it – ida.dattoma2@unibo.it