co5124.sp52.assignment1 ngo chi nguyen 12528511 in
Post on 24-Oct-2014
DATA ANALYSIS AND DECISION MODELLING
(CO5124.SP52) – ASSIGNMENT 1
NAME : NGO CHI NGUYEN - 12528511
: NGUYEN MONG HIEN – 12608524
: NGUYEN MINH HANH – 12530661
: NGUYEN MINH DAO – 12528600
: HOANG NHAT TAN – 12618888
DATE : 12th Sep, 2011.
Contents
1. Question 1
i. Question 1a
ii. Question 1b
iii. Question 1c
2. Question 2
3. Question 3
4. Question 4
ASSIGNMENT 1
DISCUSSION
Question 1
Step 1: Prepare data for question 1a, question 1b, question 1c
To answer these questions, a table of prices for each supermarket chain needs to be created. From the raw data in the Excel file, using “PHStat > Data Preparation > Unstack Data” (Figure 1.0) with the Name column as the “Grouping Variable Cell Range” and the Price column as the “Stack Data Cell Range”, we obtain Table 1.0:
Figure 1.0
1        3        2
100.20   96.11    101.75
98.21    96.22    101.08
99.21    96.86    99.18
98.98    98.49    101.83
99.13    100.11   102.82
99.43    105.52   104.05
95.00    100.63   100.91
95.71    101.89   102.55
99.61    109.65   103.17
99.18    97.16    97.63
99.25    101.02   100.93
101.66   101.96   102.22
102.58   102.73   103.03
103.02   98.66    103.69
98.29    98.85    104.51
98.88    98.90    101.77
99.71    99.31    102.31
99.92    102.17   102.53
100.64   99.45    98.43
100.84   100.72   101.17
98.58    106.15   103.19
99.20    98.92    101.47
99.87    99.18    100.71
99.99    103.25   109.06
99.45    103.56   109.95
104.08   108.30   108.79
104.69   104.69   110.57
105.99   101.39   108.79
106.12   101.83   98.62
99.05    105.62   104.49
99.49    105.93   98.58
99.18    107.45   103.93
99.16    108.41   107.20
99.38    98.58    102.65
99.55    100.67   98.14
99.55    100.86   98.43
102.50   100.90   98.60
95.66    101.46   102.88
95.77    96.84    98.67
102.49   97.78    101.27
100.26   102.32   102.74
97.84    98.20    102.83
98.36    98.28    102.94
100.58   100.97   103.48
100.65   101.86   102.76
102.07   102.02   103.79
107.01   101.37   103.31
107.01   107.01   104.10
Table 1.0
Description of Table 1.0:
1: Coles supermarkets
2: Woolworths/Safeway supermarkets
3: Other supermarkets
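The unstacking step can also be sketched outside Excel. The following is a minimal Python illustration, not the PHStat implementation: the record layout `(group code, price)` and the function name `unstack` are assumptions, and only the first two rows of Table 1.0 are used as sample data.

```python
from collections import defaultdict

# Hypothetical stacked records (group code, price). The six prices are the
# first two rows of Table 1.0; the record layout itself is an assumption.
stacked = [
    (1, 100.20), (3, 96.11), (2, 101.75),
    (1, 98.21), (3, 96.22), (2, 101.08),
]

def unstack(records):
    """Group prices by supermarket code, keeping first-appearance order."""
    columns = defaultdict(list)
    for group, price in records:
        columns[group].append(price)
    return dict(columns)

table = unstack(stacked)
for group, prices in table.items():
    print(group, prices)
```

Because groups are stored in order of first appearance, the columns come out as 1, 3, 2, which is the column order seen in Table 1.0.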
Step 2: Apply “PHStat > Multiple-Sample Tests > One-Way ANOVA” to the table above
Figure 1.1
Choose the data for “Group Data Cell Range”, then select “First cells contain label” and “Tukey-Kramer Procedure”. We obtain the results shown in the following figures:
Question 1a
Figure 1.2
Based on the result above, we reject H0 for prices at supermarkets because the F value (7.190346097) > F crit (3.060291772).
In conclusion, the average price of the basket of 34 items differs between supermarkets. In other words, the claim is not correct.
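As a cross-check on the one-way ANOVA, the F statistic can be approximated from group summaries alone. The sketch below is only an illustration under stated assumptions: the counts, means and variances are copied from the Total column of the two-factor summary in Question 2 (where they are already rounded), and the function name `one_way_f` is hypothetical.

```python
# Group summaries (count, mean, variance) for chains 1, 2 and 3, copied from
# the Total column of the two-factor summary in Question 2 (rounded values).
groups = {
    1: (48, 100.2704, 7.433966),
    2: (48, 102.6563, 9.547049),
    3: (48, 101.4627, 11.5182),
}
F_CRIT = 3.060291772  # critical value reported in Figure 1.2

def one_way_f(summary):
    """Return (F, df_between, df_within) from per-group count/mean/variance."""
    total_n = sum(n for n, _, _ in summary.values())
    grand_mean = sum(n * m for n, m, _ in summary.values()) / total_n
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in summary.values())
    ss_within = sum((n - 1) * v for n, _, v in summary.values())
    df_between = len(summary) - 1
    df_within = total_n - len(summary)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

f_value, df_b, df_w = one_way_f(groups)
print(f"F = {f_value:.4f} on ({df_b}, {df_w}) df; reject H0: {f_value > F_CRIT}")
```

Since the inputs are rounded, the result should agree with the reported F value only to about two or three decimal places.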
Question 1b
From the table above, F > F_crit (7.190346097 compared with 3.060291772), so we reject H0 for prices at the supermarket chains, which means that at least one mean is different from the others.
In conclusion, the average price of the basket of 34 items differs between supermarkets.
Question 1c
Figure 1.3
Based on the table above, the means of Coles (1) and Woolworths/Safeway (2) differ, with an absolute difference of about 2.385833; the mean of Coles is lower than the mean of Woolworths/Safeway (100.2704 and 102.6563 respectively). There is therefore a significant difference in the average price of the basket between stores, so the first group of analysts is correct.
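The pairwise comparison behind the Tukey-Kramer procedure can be sketched as follows. This is an illustration rather than PHStat's output: the chain means are the rounded values reported in the two-factor summaries, the helper name `critical_range` is hypothetical, and the studentized range value `q_upper` is deliberately left as a parameter because it is not given in the text.

```python
from itertools import combinations
from math import sqrt

# Chain means as reported in the two-factor summaries (48 prices per chain).
means = {"Coles": 100.2704, "Woolworths/Safeway": 102.6563, "Others": 101.4627}

def critical_range(q_upper, ms_within, n_a, n_b):
    """Tukey-Kramer critical range; q_upper must be read from a studentized range table."""
    return q_upper * sqrt(ms_within / 2 * (1 / n_a + 1 / n_b))

# Absolute difference between every pair of chain means.
differences = {
    (a, b): abs(means[a] - means[b]) for a, b in combinations(means, 2)
}
for pair, diff in differences.items():
    print(pair, round(diff, 4))

# A pair is declared significantly different when its absolute difference
# exceeds critical_range(q_upper, ms_within, n_a, n_b) for that pair.
```

The largest gap is between Coles and Woolworths/Safeway, which is the pair singled out in the answer above.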
Question 2
Step 1: Prepare data for question 2
     S1       S2       S3
1    100.20   99.71    99.18
1    98.21    99.92    99.16
1    99.21    100.64   99.38
1    98.98    100.84   99.55
1    99.13    98.58    99.55
1    99.43    99.20    102.50
1    95.00    99.87    95.66
1    95.71    99.99    95.77
1    99.61    99.45    102.49
1    99.18    104.08   100.26
1    99.25    104.69   97.84
1    101.66   105.99   98.36
1    102.58   106.12   100.58
1    103.02   99.05    100.65
1    98.29    99.49    107.01
1    98.88    102.07   107.01
2    101.75   101.77   103.93
2    101.08   102.31   107.20
2    99.18    102.53   102.65
2    101.83   98.43    98.14
2    102.82   101.17   98.43
2    104.05   103.19   98.60
2    100.91   101.47   102.88
2    102.55   100.71   98.67
2    103.17   109.06   101.27
2    97.63    109.95   102.74
2    100.93   108.79   102.83
2    102.22   110.57   102.94
2    103.03   108.79   103.48
2    103.69   98.62    102.76
2    104.51   104.49   103.79
2    104.10   98.58    103.31
3    96.11    98.66    108.41
3    96.22    98.85    98.58
3    96.86    98.90    100.67
3    98.49    99.31    100.86
3    100.11   102.17   100.90
3    105.52   99.45    101.46
3    100.63   100.72   96.84
3    101.89   106.15   97.78
3    109.65   98.92    102.32
3    97.16    99.18    98.20
3    101.02   103.25   98.28
3    101.96   103.56   100.97
3    102.73   108.30   101.86
3    105.62   104.69   102.02
3    105.93   101.39   101.37
3    107.45   101.83   107.01
Table 2.0
Description of Table 2.0:
First column: supermarket (1, 2 and 3)
Second column: State 1 (S1)
Third column: State 2 (S2)
Fourth column: State 3 (S3)
Step 2: Apply “Data > Data Analysis > ANOVA: Two-factor with replication” to Table 2.0
Figure 2.0
Figure 2.1
Input:
Input Range: select the data in Table 2.0
Rows per sample: 16
Step 3: Analysis result
Anova: Two-Factor With Replication

SUMMARY     S1         S2         S3         Total
1
Count       16         16         16         48
Sum         1588.34    1619.69    1604.95    4812.98
Average     99.27125   101.2306   100.3094   100.2704
Variance    4.333012   6.491673   10.41843   7.433966
2
Count       16         16         16         48
Sum         1633.45    1660.43    1633.62    4927.5
Average     102.0906   103.7769   102.1013   102.6563
Variance    3.427366   18.31522   6.162158   9.547049
3
Count       16         16         16         48
Sum         1627.35    1625.33    1617.53    4870.21
Average     101.7094   101.5831   101.0956   101.4627
Variance    17.66901   8.619863   9.5774     11.5182
Total
Count       48         48         48
Sum         4849.14    4905.45    4856.1
Average     101.0238   102.1969   101.1688
Variance    9.708803   11.96402   8.897547

ANOVA
Source of Variation   SS         df    MS         F          P-value    F crit
Sample                136.6128   2     68.30641   7.231241   0.001039   3.063204
Columns               39.26861   2     19.63431   2.078581   0.129093   3.063204
Interaction           24.98265   4     6.245661   0.661195   0.620017   2.438739
Within                1275.212   135   9.446015
Total                 1476.076   143
Figure 2.2
From Figure 2.2, we can test whether the average price differs across the three states and among the supermarkets.
Based on Figure 2.2, in the Sample row (supermarkets), the F value (7.231241) > F_crit (3.063204), so we reject H0 for prices at different supermarkets.
In the Columns row (states), the F value (2.078581) < F_crit (3.063204), so we do not reject H0 for prices in different states.
In conclusion, there is no significant difference in the average prices across the three states. In other words, the belief of the retail analysts is not true.
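The F ratios in Figure 2.2 can be reproduced approximately from the cell means and variances of the summary, since the design is balanced (16 prices per cell). The sketch below is an illustration and not Excel's own routine; the function name `two_way_f` is hypothetical, and the inputs are the rounded values printed in the summary.

```python
# Cell means and variances from the summary in Figure 2.2:
# rows are supermarkets 1-3, columns are states S1-S3, 16 prices per cell.
N = 16
means = [
    [99.27125, 101.2306, 100.3094],
    [102.0906, 103.7769, 102.1013],
    [101.7094, 101.5831, 101.0956],
]
variances = [
    [4.333012, 6.491673, 10.41843],
    [3.427366, 18.31522, 6.162158],
    [17.66901, 8.619863, 9.5774],
]

def two_way_f(means, variances, n):
    """Return F ratios for rows, columns and interaction (balanced design)."""
    r, c = len(means), len(means[0])
    row_means = [sum(row) / c for row in means]
    col_means = [sum(means[i][j] for i in range(r)) / r for j in range(c)]
    grand = sum(row_means) / r
    ss_rows = n * c * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * r * sum((m - grand) ** 2 for m in col_means)
    ss_inter = n * sum(
        (means[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(r) for j in range(c)
    )
    # Within-cell sum of squares: (n - 1) times each cell variance.
    ss_within = (n - 1) * sum(sum(row) for row in variances)
    ms_within = ss_within / (r * c * (n - 1))
    return {
        "Sample": ss_rows / (r - 1) / ms_within,
        "Columns": ss_cols / (c - 1) / ms_within,
        "Interaction": ss_inter / ((r - 1) * (c - 1)) / ms_within,
    }

f_ratios = two_way_f(means, variances, N)
for source, f_value in f_ratios.items():
    print(f"{source}: F = {f_value:.4f}")
```

Each ratio is then compared with the matching F crit value from the ANOVA table, exactly as in the decision rules above.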
Question 3:
Step 1: Prepare data for question 3
     C1       C2       C3
1    100.20   99.18    105.52
1    99.21    104.05   101.02
1    98.98    97.63    101.96
1    98.29    100.93   102.73
1    98.88    102.22   98.66
1    99.71    103.03   98.85
1    99.92    103.69   98.90
1    100.64   104.51   99.31
1    100.84   102.31   108.30
1    98.58    102.53   104.69
1    99.20    98.43    101.39
1    99.87    109.06   101.83
1    104.69   109.95   105.62
1    99.05    98.62    105.93
1    99.49    104.49   107.45
1    99.16    98.58    108.41
1    99.38    103.93   98.58
1    99.55    107.20   100.67
1    99.55    98.14    100.86
1    95.66    98.43    100.90
1    95.77    98.60    96.84
1    102.07   102.74   97.78
1    107.01   102.83   102.32
1    107.01   102.94   107.01
2    98.21    101.75   96.11
2    99.13    101.08   96.22
2    99.43    101.83   96.86
2    95.00    102.82   98.49
2    95.71    100.91   100.11
2    99.61    102.55   100.63
2    99.18    103.17   101.89
2    99.25    101.77   109.65
2    101.66   101.17   97.16
2    102.58   103.19   102.17
2    103.02   101.47   99.45
2    99.99    100.71   100.72
2    99.45    108.79   106.15
2    104.08   110.57   98.92
2    105.99   108.79   99.18
2    106.12   102.65   103.25
2    99.18    102.88   103.56
2    102.50   98.67    101.46
2    102.49   101.27   98.20
2    100.26   103.48   98.28
2    97.84    102.76   100.97
2    98.36    103.79   101.86
2    100.58   103.31   102.02
2    100.65   104.10   101.37
Table 3.0
Description of Table 3.0:
First column: location (1 and 2)
Second column: prices at Coles supermarkets (C1)
Third column: prices at Woolworths supermarkets (C2)
Fourth column: prices at Other supermarkets (C3)
Step 2: Apply “Data > Data Analysis > ANOVA: Two-factor with replication” to Table 3.0
Figure 3.0
Figure 3.1
Input:
Input Range: select the data in Table 3.0
Rows per sample: 24
Step 3: Analysis result
Anova: Two-Factor With Replication

SUMMARY     C1         C2         C3         Total
1
Count       24         24         24         72
Sum         2402.71    2454.02    2455.53    7312.26
Average     100.1129   102.2508   102.3138   101.5592
Variance    7.539274   11.68394   12.24893   11.25642
2
Count       24         24         24         72
Sum         2410.27    2473.48    2414.68    7298.43
Average     100.4279   103.0617   100.6117   101.3671
Variance    7.600104   7.482232   9.776745   9.514677
Total
Count       48         48         48
Sum         4812.98    4927.5     4870.21
Average     100.2704   102.6563   101.4627
Variance    7.433966   9.547049   11.5182

ANOVA
Source of Variation   SS         df    MS         F          P-value    F crit
Sample                1.328256   1     1.328256   0.141476   0.707395   3.909729
Columns               136.6128   2     68.30641   7.275512   0.000991   3.061716
Interaction           42.5169    2     21.25845   2.264299   0.10775    3.061716
Within                1295.618   138   9.388537
Total                 1476.076   143
Figure 3.2
From Figure 3.2, we can test whether the average price differs between locations and among the supermarkets.
In the Sample row (location), we do not reject H0 for prices at different locations because the F value (0.141476) < F_crit (3.909729).
In the Columns row (supermarkets), the F value (7.275512) > F_crit (3.061716), so we reject H0 for prices at different supermarkets.
In conclusion, there is no significant difference in the average prices between locations, although the average prices differ significantly among the supermarkets.
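Because the decisions above depend on reading the right numbers from the ANOVA table, a quick internal consistency check can help. The sketch below recomputes each F ratio as MS divided by the Within mean square, using the SS and df values from Figure 3.2, and applies the F crit comparison; the dictionary names are illustrative assumptions.

```python
# (SS, df) for each source, read from the ANOVA table in Figure 3.2.
table = {
    "Sample": (1.328256, 1),
    "Columns": (136.6128, 2),
    "Interaction": (42.5169, 2),
    "Within": (1295.618, 138),
}
reported_f = {"Sample": 0.141476, "Columns": 7.275512, "Interaction": 2.264299}
f_crit = {"Sample": 3.909729, "Columns": 3.061716, "Interaction": 3.061716}

# Mean square within cells is the denominator of every F ratio.
ms_within = table["Within"][0] / table["Within"][1]
computed_f = {
    source: (ss / df) / ms_within
    for source, (ss, df) in table.items() if source != "Within"
}
for source, f_value in computed_f.items():
    decision = "reject H0" if f_value > f_crit[source] else "do not reject H0"
    print(f"{source}: F = {f_value:.4f} (reported {reported_f[source]}), {decision}")
```

Only the Columns row (supermarkets) exceeds its critical value; the Sample row (location) and the Interaction row do not.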
Question 4
Step 1: Prepare data for question 4
     ALDI1    ALDI2
1    100.20   99.25
1    98.21    101.66
1    99.21    102.58
1    98.98    103.02
1    99.13    98.29
1    99.43    98.88
1    95.00    102.50
1    95.71    95.77
1    99.61    102.49
1    99.18    100.26
1    99.18    97.84
1    99.16    98.36
1    99.38    100.58
1    99.55    100.65
1    99.55    107.01
1    95.66    107.01
2    101.75   101.83
2    101.08   102.82
2    99.18    104.05
2    103.93   100.91
2    107.20   102.55
2    102.65   103.17
2    98.14    97.63
2    98.43    100.93
2    98.60    102.22
2    102.88   103.03
2    98.67    103.69
2    101.27   104.51
2    103.48   102.74
2    102.76   102.83
2    103.79   102.94
2    103.31   104.10
3    96.11    100.11
3    96.22    105.52
3    96.86    100.63
3    98.49    101.89
3    108.41   109.65
3    98.58    97.16
3    100.67   101.02
3    100.86   101.96
3    100.90   102.73
3    101.46   105.62
3    96.84    105.93
3    97.78    107.45
3    102.32   101.86
3    98.20    102.02
3    98.28    101.37
3    100.97   107.01
Table 4.0
Description of Table 4.0:
First column: supermarket chain (1, 2 and 3)
Second column: prices of supermarkets located near ALDI (ALDI1)
Step 2: Apply “Data > Data Analysis > ANOVA: Two-factor with replication” to Table 4.0
Figure 4.0
Figure 4.1
Step 3: Analysis result
Anova: Two-Factor With Replication

SUMMARY     ALDI1         ALDI2         Total
1
Count       16            16            32
Sum         1577.14       1616.15       3193.29
Average     98.57125      101.009375    99.79031
Variance    2.570718333   9.58512625    7.415913
2
Count       16            16            32
Sum         1627.12       1639.95       3267.07
Average     101.695       102.496875    102.0959
Variance    6.52328       2.723369583   4.640122
3
Count       16            16            32
Sum         1592.95       1651.93       3244.88
Average     99.559375     103.245625    101.4025
Variance    9.51760625    10.68253292   13.28095
Total
Count       48            48
Sum         4797.21       4908.03
Average     99.941875     102.250625
Variance    7.675487899   8.219729388

ANOVA
Source of Variation   SS            df   MS         F          P-value    F crit
Sample                89.55638125   2    44.77819   6.457984   0.002394   3.097698
Columns               127.9278375   1    127.9278   18.44996   4.4E-05    3.946876
Interaction           33.47933125   2    16.73967   2.414222   0.095208   3.097698
Within                624.0395      90   6.933772
Total                 875.00305     95
Figure 4.2
Based on Figure 4.2, in the Sample row (supermarket chain), the F value (6.457984) > F_crit (3.097698), so we reject H0 for prices at different supermarkets.
In the Columns row (ALDI1 versus ALDI2), the F value (18.44996) > F_crit (3.946876), so we reject H0 for prices in the two ALDI groups.
In conclusion, the average prices differ between the two ALDI groups: the ALDI1 average (99.941875) is lower than the ALDI2 average (102.250625). In other words, there is increased competition among supermarkets near ALDI.
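The Sample and Columns sums of squares in Figure 4.2 can be recovered from the group sums alone with the usual computational formula, the sum of (group sum)² / n minus (grand total)² / N. The following sketch is an illustration under that assumption; `ss_from_sums` is a hypothetical helper, and the Within mean square is taken from the ANOVA table rather than recomputed.

```python
# Sums and counts from the summary in Figure 4.2.
row_sums = {1: 3193.29, 2: 3267.07, 3: 3244.88}   # 32 prices per chain
col_sums = {"ALDI1": 4797.21, "ALDI2": 4908.03}   # 48 prices per column
MS_WITHIN = 6.933772                               # Within mean square

def ss_from_sums(sums, n_per_group):
    """Between-group sum of squares via the computational formula."""
    grand_total = sum(sums.values())
    total_n = n_per_group * len(sums)
    return (
        sum(s ** 2 for s in sums.values()) / n_per_group
        - grand_total ** 2 / total_n
    )

ss_sample = ss_from_sums(row_sums, 32)
ss_columns = ss_from_sums(col_sums, 48)
f_sample = ss_sample / (len(row_sums) - 1) / MS_WITHIN
f_columns = ss_columns / (len(col_sums) - 1) / MS_WITHIN
print(f"Sample:  SS = {ss_sample:.4f}, F = {f_sample:.4f}")
print(f"Columns: SS = {ss_columns:.4f}, F = {f_columns:.4f}")
```

The Columns effect has a single degree of freedom, so its large F ratio comes entirely from the gap between the ALDI1 and ALDI2 totals.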