Course 11: Data Analysis
-
8/4/2019 Curs 11-Data Analysis
Chap. 6. Data analysis
6.1. Information systems used for data analysis
6.2. Descriptive statistics
6.3. Inferential statistics
-
6.1. Information systems used for data analysis
SPSS (Statistical Package for the Social Sciences) is widely used in marketing research for data analysis.
It is used mainly for data gathered with questionnaires, but also for other quantitative data (official statistics, company records, etc.).
The resulting information is presented as tables and charts.
It offers multiple data analysis procedures, such as summarizing data, transforming variables, and running statistical tests.
-
The flow of using the SPSS system for information processing:
1. Data gathering
2. Creating the SPSS database
3. Selecting the data analysis procedure
4. Selecting the variables for analysis
5. Data processing in order to obtain the information
-
Data gathering
Depends on the research method:
Surveys: questionnaire
Secondary data: official statistics, statistical databases, company records, etc.
Avoiding data gathering errors is very important for the success of the research. The researcher should pay special attention to:
Proper training of the operators who collect the data.
Verification in the field to ensure that the interviewers follow the sampling procedures.
Checking the recorded data to determine whether interviewers are cheating.
-
Creating the SPSS database
To create a database in SPSS, the following steps are followed:
Opening a new file
Defining the research variables
Recording data in the database
Verifying the recorded data
-
Start/Programs/SPSS for Windows
-
A new, empty database
-
The window for defining variables
-
Setting the type of data
-
Defining the codes for response categories
-
Defining the codes for missing responses
-
Coding data
The process of identifying and assigning numerical scores or other character symbols to data expressed in words.
Codes make it easier to enter data into databases.
Codes allow data to be processed by computers.
Coding depends on the type of scale used in the questionnaire.
-
Ex: Nominal scale
What brand of cigarettes do you smoke most often?
Winston (1)
L&M (2)
Kent (3)
Marlboro (4)
Winchester (5)
Viceroy (6)
Other. Please specify ________ (7)
Attention: The assigned codes do not represent an order or a specific quantity. They are allotted only to identify a response category (like the numbers on football players' shirts).
Binary (dichotomous) scale: a particular case
Do you smoke?
Yes (1)
No (0)
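As a minimal sketch of the coding step, the Python snippet below stores the codes from the two examples above as lookup tables and converts answers expressed in words into codes. The names BRAND_CODES, SMOKER_CODES and encode, as well as the sample answers, are illustrative assumptions and not part of SPSS.

```python
from collections import Counter

# Code book for the nominal question above (codes taken from the example).
BRAND_CODES = {
    "Winston": 1, "L&M": 2, "Kent": 3, "Marlboro": 4,
    "Winchester": 5, "Viceroy": 6, "Other": 7,
}
# Binary (dichotomous) scale: Yes = 1, No = 0.
SMOKER_CODES = {"Yes": 1, "No": 0}


def encode(answers, code_book):
    """Replace each answer expressed in words with its numeric code."""
    return [code_book[answer] for answer in answers]


# Made-up answers, used only for illustration.
brand_coded = encode(["Kent", "Winston", "Kent", "Other"], BRAND_CODES)
smoker_coded = encode(["Yes", "No", "No", "Yes"], SMOKER_CODES)

print(brand_coded)   # [3, 1, 3, 7]
print(smoker_coded)  # [1, 0, 0, 1]

# Nominal codes only identify categories: counting them is meaningful,
# averaging them is not.
counts = Counter(brand_coded)
print(counts[BRAND_CODES["Kent"]])  # 2
```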
-
Ex: Ordinal scale
1. The rank order scale according to a characteristic:
Please rank the following 5 brands of laundry detergent according to your preference (give rank 1 to the most preferred brand, rank 2 to the second most preferred brand, and so on, down to rank 5 for the least preferred brand).
OMO ARIEL DERO PERSIL TIDE
Coding: in this case, a variable is defined for every response category. The rank assigned by each respondent (from 1 to 5) is entered into the database.
Attention: for ordinal scales, the assigned codes express an order.
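The "one variable per response category" layout can be sketched as follows. Each respondent is a row and each brand is a variable holding the assigned rank. The three responses are made up, and the helper name is_valid_ranking is an assumption introduced for illustration. The per-brand average is the mean score used with ordinal scales to build a final order of the categories.

```python
BRANDS = ["OMO", "ARIEL", "DERO", "PERSIL", "TIDE"]


def is_valid_ranking(row):
    """A row is valid when it uses each rank from 1 to 5 exactly once."""
    return sorted(row[b] for b in BRANDS) == list(range(1, len(BRANDS) + 1))


# One row per respondent, one variable per brand (made-up data).
responses = [
    {"OMO": 3, "ARIEL": 1, "DERO": 4, "PERSIL": 2, "TIDE": 5},
    {"OMO": 2, "ARIEL": 1, "DERO": 5, "PERSIL": 3, "TIDE": 4},
    {"OMO": 4, "ARIEL": 2, "DERO": 3, "PERSIL": 1, "TIDE": 5},
]
all_valid = all(is_valid_ranking(r) for r in responses)

# Mean score per brand: a summarized rank (lower means more preferred).
mean_score = {b: sum(r[b] for r in responses) / len(responses) for b in BRANDS}
final_order = sorted(BRANDS, key=lambda b: mean_score[b])

print(all_valid)
print(final_order)  # ['ARIEL', 'PERSIL', 'OMO', 'DERO', 'TIDE']
```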
-
2. Semantic differential
How important is the quality-price ratio when you choose a brand of laundry detergent?
(5) very important
(4) important
(3) neither important nor unimportant
(2) not important
(1) not at all important
3. Numerical scale
How satisfied are you with the whitening power of Ariel laundry detergent?
Very satisfied 5 4 3 2 1 Very dissatisfied
Usually, in this case, only the extreme values are coded (1 = very dissatisfied, 5 = very satisfied).
4. Likert scale
Please indicate your opinion on the following statement:
"When somebody chooses a laundry detergent, the price is the most important, all brands having about the same whitening power."
(5) strongly agree
(4) agree
(3) neither agree nor disagree
(2) disagree
(1) strongly disagree
-
Interval scale
The midpoint of every interval is recorded in the database. It is used both as the value of the variable and as the code of the response category.
How many cigarettes do you generally smoke in a day?
5-9 (7)
10-14 (12)
15-19 (17)
20-24 (22)
25-29 (27)
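The midpoint coding above can be sketched in a few lines of Python. The interval bounds come from the example; the names INTERVALS, midpoint and CODES, and the sample answers, are assumptions made for illustration.

```python
# Interval bounds from the cigarette question above.
INTERVALS = [(5, 9), (10, 14), (15, 19), (20, 24), (25, 29)]


def midpoint(low, high):
    """Middle point of the interval [low, high]."""
    return (low + high) / 2


# The midpoint serves both as the code and as the value of the variable.
CODES = {f"{low}-{high}": midpoint(low, high) for low, high in INTERVALS}

# Made-up answers from four respondents.
answers = ["10-14", "5-9", "10-14", "20-24"]
values = [CODES[a] for a in answers]

print(CODES)
print(values)
# Because the codes are real quantities, a mean can be computed from them.
print(sum(values) / len(values))  # 13.25
```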
Ratio scale
For this type of scale, coding is not used. The exact value indicated by the respondent is recorded in the database.
Ex: How many hours do you study for an exam during the examination session? ____5 h____
-
Ex: Divide 100 points among the following brands according to your preference for each brand:
ARIEL __40___
DERO __20___
PERSIL __30__
TIDE __10___
Coding: in this case, a variable is defined for every response category (as with the rank order scale). The value assigned by each respondent is entered into the database.
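When verifying recorded data of this kind, a simple consistency check is that each respondent's points add up to 100. The sketch below assumes one variable per brand, as described above; the function name is_valid_allocation and the second, deliberately wrong row are hypothetical.

```python
BRANDS = ["ARIEL", "DERO", "PERSIL", "TIDE"]
TOTAL_POINTS = 100


def is_valid_allocation(row):
    """True when no value is negative and the points sum to 100."""
    points = [row[b] for b in BRANDS]
    return all(p >= 0 for p in points) and sum(points) == TOTAL_POINTS


# The respondent from the example above.
respondent = {"ARIEL": 40, "DERO": 20, "PERSIL": 30, "TIDE": 10}
# A made-up recording error: the points sum to 110.
bad_row = {"ARIEL": 50, "DERO": 20, "PERSIL": 30, "TIDE": 10}

print(is_valid_allocation(respondent))  # True
print(is_valid_allocation(bad_row))     # False
```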
-
6.2. Descriptive statistics
Descriptive analysis
Refers to the transformation of raw data into a form that makes them easy to understand and interpret (summarizing data).
The most common ways to summarize data are: frequency distribution, percentage distribution, and the calculation of central tendency and variation indicators.
Charts can accompany frequency tables to make the information easier to understand.
Attention: Descriptive statistics are computed exclusively at the sample level, using the data collected from the sample members.
-
Selecting the procedures of descriptive analysis in SPSS
-
Frequency table
An arrangement of statistical data in a row-and-column format that exhibits the count of responses and percentages for each category assigned to a variable.
General Happiness
                          Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Very Happy            467      30.8            31.1                 31.1
          Pretty Happy          872      57.5            58.0                 89.0
          Not Too Happy         165      10.9            11.0                100.0
          Total                1504      99.1           100.0
Missing   NA                     13       0.9
Total                          1517     100.0
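To make the columns concrete, the following sketch recomputes them from the counts in the General Happiness table: Percent is relative to all 1517 cases, Valid Percent is relative to the 1504 valid cases, and Cumulative Percent is the running sum of Valid Percent. The function name frequency_table is an assumption; this is not how SPSS is invoked, only the arithmetic behind the table.

```python
def frequency_table(counts, missing):
    """Return (category, frequency, percent, valid percent, cumulative percent) rows.

    counts:  dict mapping each valid category to its frequency
    missing: number of missing answers
    """
    valid_total = sum(counts.values())
    total = valid_total + missing
    rows = []
    cumulative = 0.0
    for category, f in counts.items():
        percent = 100 * f / total              # share of all cases
        valid_percent = 100 * f / valid_total  # share of valid cases only
        cumulative += valid_percent
        rows.append((category, f, round(percent, 1),
                     round(valid_percent, 1), round(cumulative, 1)))
    return rows


counts = {"Very Happy": 467, "Pretty Happy": 872, "Not Too Happy": 165}
table = frequency_table(counts, missing=13)
for row in table:
    print(row)
```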
-
Measures of central tendency: mode, mean, median
Mode is the response category with the highest frequency.
Median is the middle value when the data are arranged in ascending or descending order. It divides the sample into two equal groups (50% of the sample members are to the left of the median and the other 50% to the right).
Mean is the most commonly used measure of central tendency when data are measured on a ratio or interval scale:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i f_i}{n}$$
Mean score is a summarized rank used with ordinal scales to create the final order of the analyzed categories. It is calculated like the mean, but it does not have the same properties.
For binary scale:
$$\bar{x} = \frac{1 \cdot f_{Yes} + 0 \cdot f_{No}}{n} = \frac{f_{Yes}}{n} = p$$
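A small worked example may help here. The sketch below computes the mode, median and mean from a frequency distribution, reading x_i as the value of a response category, f_i as its frequency and the denominator n as the sample size (an interpretation of the formula above). The frequencies are made up, and the function names are assumptions; only statistics.median comes from the Python standard library.

```python
import statistics


def mean_from_frequencies(values, freqs):
    """Mean computed as sum(x_i * f_i) / n, with n the sample size."""
    n = sum(freqs)
    return sum(x * f for x, f in zip(values, freqs)) / n


def mode_from_frequencies(values, freqs):
    """Value of the category with the highest frequency."""
    return max(zip(values, freqs), key=lambda pair: pair[1])[0]


def median_from_frequencies(values, freqs):
    """Middle value of the data once every observation is listed in order."""
    expanded = sorted(x for x, f in zip(values, freqs) for _ in range(f))
    return statistics.median(expanded)


# Interval midpoints with made-up frequencies (n = 100).
values = [7, 12, 17, 22, 27]
freqs = [30, 25, 20, 15, 10]

print(mode_from_frequencies(values, freqs))    # 7
print(median_from_frequencies(values, freqs))  # 12.0
print(mean_from_frequencies(values, freqs))    # 14.5

# Binary scale coded Yes = 1, No = 0: the mean equals the proportion p of "Yes".
p = mean_from_frequencies([1, 0], [35, 65])
print(p)  # 0.35
```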
+=
-
Variation indicators: range, variance, standard deviation, standard error of the mean.
Range measures the spread of the data:
$$\text{Range} = x_{largest} - x_{smallest}$$
Variance is the mean of the squared deviations from the mean. It is an indicator of sample homogeneity.
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 f_i}{n}$$
For binary scale (p expressed as a proportion or as a percentage, respectively):
$$s^2 = p(1 - p) \quad \text{or} \quad s^2 = p(100 - p)$$
Standard deviation is the square root of the variance. It is expressed in the same units as the data.
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 f_i}{n}}$$
For binary scale:
$$s = \sqrt{p(1 - p)} \quad \text{or} \quad s = \sqrt{p(100 - p)}$$
Standard error of the mean: a measure of how much the value of the mean may vary from sample to sample taken from the same distribution.
$$s_{\bar{x}} = \frac{s}{\sqrt{n}}$$
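The indicators above can be checked numerically with a short sketch. It follows the formulas as written on the slide, with n (the sample size) in the denominator of the variance; note that many packages divide by n - 1 instead, for example Python's statistics.variance, while statistics.pvariance divides by n. The data are made up and the function name variance_from_frequencies is an assumption.

```python
import math


def variance_from_frequencies(values, freqs):
    """Mean of the squared deviations from the mean, weighted by frequency."""
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / n
    return sum((x - mean) ** 2 * f for x, f in zip(values, freqs)) / n


# Interval midpoints with made-up frequencies (n = 100, mean = 14.5).
values = [7, 12, 17, 22, 27]
freqs = [30, 25, 20, 15, 10]
n = sum(freqs)

data_range = max(values) - min(values)
variance = variance_from_frequencies(values, freqs)
std_dev = math.sqrt(variance)
std_error = std_dev / math.sqrt(n)

print(data_range)  # 20
print(variance)    # 43.75
print(std_dev)
print(std_error)

# Binary scale coded Yes = 1, No = 0 with p = 0.35: the variance is p(1 - p).
p = 0.35
binary_variance = variance_from_frequencies([1, 0], [35, 65])
print(math.isclose(binary_variance, p * (1 - p)))  # True
```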
-
Selecting the procedures of descriptive analysis in SPSS