Post on 11-Apr-2017
A Six Sigma Analysis of Mobile Data Usage
2016 WCQI, Session W10
Brandon Theiss, PE
Brandon.Theiss@gmail.com
Motivation
Is my current mobile data plan with Republic Wireless optimal given my data usage?
Learning Objectives
• Apply the Six Sigma methodology to non-traditional applications
• Utilize Monte Carlo simulations to make predictions
• Utilize non-parametric hypothesis testing
• Utilize process capability to determine specification limits for non-normal data
4 Major Mobile Phone Carriers
Plans Offered By Verizon
20% of Verizon customers charged overages in past year*
Plans Offered By AT&T
28% of AT&T customers charged overages in past year*
Plans Offered By T-Mobile
12% of T-Mobile customers charged overages in past year*
Plans Offered By Sprint
5% of Sprint customers charged overages in past year*
Plans Offered By Republic Wireless
[Figure: Chart of Total Data Usage. Total usage (MB, axis 0 to 12000) by bill number, with reference values at 1000, 2000, 3000, 5000, 6000, 10000, and 12000]
Bill Number   Total Usage (MB)
1             4254.40
2             4612.40
3             3784.20
4             3039.10
5             5905.40
6             4846.80
7             4517.40
8             3224.90
9             2674.30
10            3203.80
11            1911.60
12            3095.80
The Data Set
Data was collected from March 23, 2015 through March 24, 2016
[Figure: Time Series Plot of Small Verizon, ATT, T-Mobile, Sprint, Republic. Billed amount ($40 to $130) by bill number (1 to 12); series: Verizon (1GB), ATT (2GB), T-Mobile (2GB), Sprint (1GB), Republic (2GB)]
Comparison of Carriers Small Data Plans
Data Speed Potentially Decreased
[Figure: Time Series Plot of Medium Verizon, ATT, T-Mobile, Sprint, Republic. Billed amount ($50 to $120) by bill number (1 to 12); series: Verizon (3GB), ATT (2GB), T-Mobile (2GB), Sprint (3GB), Republic (3GB)]
Comparison of Carriers Medium Data Plans
Data Speed Potentially Decreased
[Figure: Time Series Plot of Large Verizon, ATT, T-Mobile, Sprint, Republic. Billed amount ($65 to $90) by bill number (1 to 12); series: Verizon (6GB), ATT (5GB), T-Mobile (6GB), Sprint (6GB), Republic (5GB)]
Comparison of Carriers Large Data Plans
Comparison of Carriers X-Large Data Plans
[Figure: Time Series Plot of XL Verizon, ATT, T-Mobile, Sprint, Republic. Data (0 to 140) by index (1 to 12); series: Verizon (12GB), ATT (15GB), T-Mobile (10GB), Sprint (12GB), Republic (Not Offered)]
[Figure: Chart of Annual Cost. Annual cost ($0.00 to $1,600.00) by plan; plans in chart order: ATT (15GB), Verizon (12GB), Verizon (1GB), Sprint (1GB), ATT (2GB), Republic (5GB), Verizon (3GB), Sprint (12GB), T-Mobile (10GB), Verizon (6GB), ATT (5GB), Sprint (3GB), Sprint (6GB), T-Mobile (6GB), Republic (3GB), T-Mobile (2GB), Republic (2GB)]
How Much Would Each Plan have cost for the Year?
Summary Report for Total Monthly Usage
Anderson-Darling Normality Test: A-Squared 0.25, P-Value 0.687
N 12, Mean 3755.8, StDev 1107.0, Variance 1225527.2, Skewness 0.314666, Kurtosis -0.123559
Minimum 1911.6, 1st Quartile 3053.3, Median 3504.6, 3rd Quartile 4588.6, Maximum 5905.4
95% Confidence Interval for Mean: 3052.5 to 4459.2
95% Confidence Interval for Median: 3054.0 to 4587.4
95% Confidence Interval for StDev: 784.2 to 1879.6
A First Statistical Approach (monthly data)
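The summary statistics for the monthly data can be reproduced with a few lines of standard-library Python. This is a minimal sketch, not the Minitab procedure used in the talk: the twelve monthly totals are taken from the chart of total data usage (their assignment to bills 1 to 12 is an assumption), and the t critical value is hard-coded from a t-table.

```python
import math
import statistics

# Monthly totals in MB, arranged as bills 1..12 (order is an assumption).
usage = [4254.4, 4612.4, 3784.2, 3039.1, 5905.4, 4846.8,
         4517.4, 3224.9, 2674.3, 3203.8, 1911.6, 3095.8]

n = len(usage)
mean = statistics.mean(usage)
stdev = statistics.stdev(usage)      # sample standard deviation (n - 1)
median = statistics.median(usage)

# Two-sided 95% t critical value for n - 1 = 11 degrees of freedom,
# hard-coded from a t-table (assumption: avoids a SciPy dependency).
T_CRIT_11 = 2.201
half_width = T_CRIT_11 * stdev / math.sqrt(n)
ci_mean = (mean - half_width, mean + half_width)

print(f"N={n} mean={mean:.1f} stdev={stdev:.1f} median={median:.1f}")
print(f"95% CI for mean: {ci_mean[0]:.1f} to {ci_mean[1]:.1f}")
```

The printed mean, standard deviation, median, and confidence interval for the mean should agree with the summary report to within rounding.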
[Figure: Probability Plot for Total Usage, Normal, 95% CI. Percent (1 to 99) versus total usage (0 to 8000)]
Goodness of Fit Test
Normal: AD = 0.248, P-Value = 0.687
Is The Data Normally Distributed?
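The Anderson-Darling statistic reported on the probability plot can be computed directly from its definition. The sketch below is an illustration using only the standard library; the function name `anderson_darling_normal` is hypothetical, the monthly totals are taken from the chart of total data usage, and the p-value calculation is not attempted.

```python
import math
from statistics import NormalDist, mean, stdev

# Monthly totals in MB from the chart of total data usage.
usage = [4254.4, 4612.4, 3784.2, 3039.1, 5905.4, 4846.8,
         4517.4, 3224.9, 2674.3, 3203.8, 1911.6, 3095.8]


def anderson_darling_normal(sample):
    """A-squared against a normal whose mean and stdev are estimated from the sample."""
    ys = sorted(sample)
    n = len(ys)
    dist = NormalDist(mean(ys), stdev(ys))
    total = 0.0
    for i, y in enumerate(ys, start=1):
        lower = dist.cdf(y)                  # F(y_i)
        upper = 1.0 - dist.cdf(ys[n - i])    # 1 - F(y_(n+1-i))
        total += (2 * i - 1) * (math.log(lower) + math.log(upper))
    return -n - total / n


a2 = anderson_darling_normal(usage)
print(f"A-squared = {a2:.3f}")
```

A small A-squared with a large p-value (0.687 in the report) means the normality hypothesis is not rejected for the twelve monthly totals.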
[Figure: I-MR Chart of Total Monthly Usage, observations 1 to 12]
Individual Value: X-bar = 3756, UCL = 6424, LCL = 1088
Moving Range: MR-bar = 1003, UCL = 3278, LCL = 0
Is The Data In Statistical Control?
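The I-MR control limits follow from the average moving range and the standard constants for a moving range of span 2. The following is a sketch under the assumption that the monthly totals are ordered as bills 1 to 12; the constants 1.128 (d2) and 3.267 (D4) are the usual tabulated values.

```python
# Monthly totals in MB, arranged as bills 1..12 (order is an assumption).
usage = [4254.4, 4612.4, 3784.2, 3039.1, 5905.4, 4846.8,
         4517.4, 3224.9, 2674.3, 3203.8, 1911.6, 3095.8]

D2 = 1.128   # unbiasing constant for moving ranges of span 2
D4 = 3.267   # upper control limit factor for moving ranges of span 2

# Absolute differences between consecutive observations.
moving_ranges = [abs(b - a) for a, b in zip(usage, usage[1:])]

x_bar = sum(usage) / len(usage)
mr_bar = sum(moving_ranges) / len(moving_ranges)
sigma_within = mr_bar / D2           # short-term (within) standard deviation

i_ucl = x_bar + 3 * sigma_within     # individuals chart limits
i_lcl = x_bar - 3 * sigma_within
mr_ucl = D4 * mr_bar                 # moving range chart upper limit

print(f"X-bar={x_bar:.0f} UCL={i_ucl:.0f} LCL={i_lcl:.0f}")
print(f"MR-bar={mr_bar:.0f} UCL={mr_ucl:.0f} StDev(Within)={sigma_within:.1f}")
```

The within standard deviation computed here is the StDev(Within) value that the process capability reports use for Cp and Cpk.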
Process Capability Report for Total Usage (1GB)
[Figure: histogram of total usage with USL and Overall/Within normal curves]
Process Data: LSL *, Target *, USL 1000, Sample Mean 3755.84, Sample N 12, StDev(Overall) 1107.04, StDev(Within) 889.313
Overall Capability: Pp *, PPL *, PPU -0.83, Ppk -0.83, Cpm *
Potential (Within) Capability: Cp *, CPL *, CPU -1.03, Cpk -1.03
Performance      Observed   Expected Overall   Expected Within
% < LSL          *          *                  *
% > USL          100.00     99.36              99.90
% Total          100.00     99.36              99.90
Plan Annual Cost
ATT (2GB) $ 1,065.00
Sprint (1GB) $ 1,065.00
Is a 1GB (1,000MB) Limit Appropriate?
Process Capability Report for Total Usage (2GB)
[Figure: histogram of total usage with USL and Overall/Within normal curves]
Process Data: LSL *, Target *, USL 2000, Sample Mean 3755.84, Sample N 12, StDev(Overall) 1107.04, StDev(Within) 889.313
Overall Capability: Pp *, PPL *, PPU -0.53, Ppk -0.53, Cpm *
Potential (Within) Capability: Cp *, CPL *, CPU -0.66, Cpk -0.66
Performance      Observed   Expected Overall   Expected Within
% < LSL          *          *                  *
% > USL          91.67      94.36              97.58
% Total          91.67      94.36              97.58
Is a 2GB (2,000MB) Limit Appropriate?
Plan Annual Cost
Republic (2GB) $ 480.00
T-Mobile (2GB) $ 600.00
ATT(2GB) $ 1,065.00
Process Capability Report for Total Usage (3GB)
[Figure: histogram of total usage with USL and Overall/Within normal curves]
Process Data: LSL *, Target *, USL 3000, Sample Mean 3755.84, Sample N 12, StDev(Overall) 1107.04, StDev(Within) 889.313
Overall Capability: Pp *, PPL *, PPU -0.23, Ppk -0.23, Cpm *
Potential (Within) Capability: Cp *, CPL *, CPU -0.28, Cpk -0.28
Performance      Observed   Expected Overall   Expected Within
% < LSL          *          *                  *
% > USL          83.33      75.26              80.23
% Total          83.33      75.26              80.23
Plan Annual Cost
Republic (3GB) $ 660.00
Sprint (3GB) $ 840.00
Verizon (3GB) $ 1,020.00
Is a 3GB (3,000MB) Limit Appropriate?
Process Capability Report for Total Usage (5GB)
[Figure: histogram of total usage with USL and Overall/Within normal curves]
Process Data: LSL *, Target *, USL 5000, Sample Mean 3755.84, Sample N 12, StDev(Overall) 1107.04, StDev(Within) 889.313
Overall Capability: Pp *, PPL *, PPU 0.37, Ppk 0.37, Cpm *
Potential (Within) Capability: Cp *, CPL *, CPU 0.47, Cpk 0.47
Performance      Observed   Expected Overall   Expected Within
% < LSL          *          *                  *
% > USL          8.33       13.05              8.09
% Total          8.33       13.05              8.09
Plan Annual Cost
ATT (5GB) $ 915.00
Republic (5GB) $ 1,020.00
ATT (5GB) $ 1,500.00
Is a 5GB (5,000MB) Limit Appropriate?
Process Capability Report for Total Usage (6GB)
[Figure: histogram of total usage with USL and Overall/Within normal curves]
Process Data: LSL *, Target *, USL 6000, Sample Mean 3755.84, Sample N 12, StDev(Overall) 1107.04, StDev(Within) 889.313
Overall Capability: Pp *, PPL *, PPU 0.68, Ppk 0.68, Cpm *
Potential (Within) Capability: Cp *, CPL *, CPU 0.84, Cpk 0.84
Performance      Observed   Expected Overall   Expected Within
% < LSL          *          *                  *
% > USL          0.00       2.13               0.58
% Total          0.00       2.13               0.58
Plan Annual Cost
T-Mobile (6GB) $ 780.00
Sprint (6GB) $ 780.00
Verizon (6GB) $ 960.00
Is a 6GB (6,000MB) Limit Appropriate?
Process Capability Report for Total Usage (10GB)
[Figure: histogram of total usage with USL and Overall/Within normal curves]
Process Data: LSL *, Target *, USL 10000, Sample Mean 3755.84, Sample N 12, StDev(Overall) 1107.04, StDev(Within) 889.313
Overall Capability: Pp *, PPL *, PPU 1.88, Ppk 1.88, Cpm *
Potential (Within) Capability: Cp *, CPL *, CPU 2.34, Cpk 2.34
Performance      Observed   Expected Overall   Expected Within
% < LSL          *          *                  *
% > USL          0.00       0.00               0.00
% Total          0.00       0.00               0.00
Plan Annual Cost
T-Mobile (10GB) $ 960.00
Is a 10GB (10,000MB) Limit Appropriate?
~6 Sigma !
Process Capability Report for Total Usage (12GB)
[Figure: histogram of total usage with USL and Overall/Within normal curves]
Process Data: LSL *, Target *, USL 12000, Sample Mean 3755.84, Sample N 12, StDev(Overall) 1107.04, StDev(Within) 889.313
Overall Capability: Pp *, PPL *, PPU 2.48, Ppk 2.48, Cpm *
Potential (Within) Capability: Cp *, CPL *, CPU 3.09, Cpk 3.09
Performance      Observed   Expected Overall   Expected Within
% < LSL          *          *                  *
% > USL          0.00       0.00               0.00
% Total          0.00       0.00               0.00
Plan Annual Cost
Sprint (12GB) $ 960.00
Verizon (12GB) $ 1200.00
Is a 12GB (12,000MB) Limit Appropriate?
Greater than 6 Sigma!
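With only an upper specification limit, Ppk equals PPU, and the expected overall percentage above the limit is the upper tail of a normal distribution. The sketch below recomputes both for each plan limit from the sample mean and overall standard deviation in the capability reports; the function names `ppu` and `pct_above_usl` are illustrative, not taken from the slides.

```python
from statistics import NormalDist

SAMPLE_MEAN = 3755.84      # MB, from the capability reports
STDEV_OVERALL = 1107.04    # MB, from the capability reports


def ppu(usl):
    """Upper overall capability index: (USL - mean) / (3 * sigma)."""
    return (usl - SAMPLE_MEAN) / (3 * STDEV_OVERALL)


def pct_above_usl(usl):
    """Expected percentage of months above USL under a normal model."""
    return 100 * (1 - NormalDist(SAMPLE_MEAN, STDEV_OVERALL).cdf(usl))


for usl in (1000, 2000, 3000, 5000, 6000, 10000, 12000):
    print(f"USL {usl:>5} MB: Ppk = {ppu(usl):5.2f}, "
          f"expected % > USL = {pct_above_usl(usl):6.2f}")
```

A negative Ppk means the mean itself lies above the limit, which is why the 1GB, 2GB, and 3GB limits are exceeded in most months.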
[Figure: Time Series Plot of Data Usage. Daily data usage (0 to 1200) by date, with axis ticks from 3/24/2015 to 2/19/2016]
A Second Statistical Approach (daily data)
Summary Report for Data Usage
Anderson-Darling Normality Test: A-Squared 27.78, P-Value <0.005
N 366, Mean 123.14, StDev 102.64, Variance 10535.65, Skewness 3.9407, Kurtosis 26.1682
Minimum 0.00, 1st Quartile 69.13, Median 96.70, 3rd Quartile 138.00, Maximum 1100.00
95% Confidence Interval for Mean: 112.59 to 133.69
95% Confidence Interval for Median: 88.25 to 102.97
95% Confidence Interval for StDev: 95.71 to 110.67
Descriptive Statistics On Daily Usage
[Figure: Probability Plot for Data Usage, 95% CI; panels: Logistic, Loglogistic, 3-Parameter Loglogistic, Normal (after Johnson transformation)]
Goodness of Fit Test
Logistic: AD = 13.251, P-Value < 0.005
Loglogistic: AD = 9.501, P-Value < 0.005
3-Parameter Loglogistic: AD = 1.975, P-Value = *
Johnson Transformation: AD = 0.171, P-Value = 0.932
If The Data Is Not Normal What Approximates The Data?
[Figure: Johnson Transformation for Data Usage; panels: Probability Plot for Original Data (N 366, AD 27.776, P-Value <0.005), Select a Transformation (P-Value for AD test versus Z Value, with Ref P; P-Value = 0.005 means ≤ 0.005), Probability Plot for Transformed Data (N 366, AD 0.171, P-Value 0.932)]
P-Value for Best Fit: 0.931848
Z for Best Fit: 0.38
Best Transformation Type: SU
Transformation function equals -0.996951 + 0.885314 × Asinh( ( X - 59.1002 ) / 25.8392 )
The Johnson Transformation of the Data
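The fitted Johnson SU transformation is a closed-form function, so it can be applied to any daily value and inverted. This is a minimal sketch using the four coefficients reported in the Johnson transformation output; the constant and function names are illustrative.

```python
import math

# Coefficients from the reported transformation function:
# z = GAMMA + ETA * asinh((x - EPSILON) / LAMBDA)
GAMMA = -0.996951
ETA = 0.885314
EPSILON = 59.1002
LAMBDA = 25.8392


def johnson_su(x):
    """Map a daily usage value onto the approximately normal scale."""
    return GAMMA + ETA * math.asinh((x - EPSILON) / LAMBDA)


def johnson_su_inverse(z):
    """Map a transformed value back to the original usage scale."""
    return EPSILON + LAMBDA * math.sinh((z - GAMMA) / ETA)


for x in (0.0, 96.70, 1100.0):   # minimum, median, maximum of the daily data
    print(f"x = {x:7.2f} -> z = {johnson_su(x):6.3f}")
```

The daily median of 96.70 maps close to zero, as expected if the transformed data are roughly standard normal.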
[Figure: I-MR Chart of Transformed Data Usage, plotted by billing cycle, with several points flagged by test 1]
Individual Value: X-bar = -0.003, UCL = 2.430, LCL = -2.436
Moving Range: MR-bar = 0.915, UCL = 2.989, LCL = 0
Is the Data In Statistical Control?
[Figure: Boxplot of Data Usage. Daily data usage (0 to 1200) by billing cycle, with the mean labeled for each cycle]
Billing Cycle   Mean Data Usage
1               137.239
2               153.747
3               122.071
4               101.303
5               190.497
6               156.348
7               150.58
8               104.029
9               89.1433
10              103.348
11              61.6645
12              106.752
A Third Statistical Approach
[Figure: Residual Plots for Data Usage; panels: Normal Probability Plot, Versus Fits, Histogram, Versus Order]
One-way ANOVA: Data Usage versus Billing Cycle
Method
Null hypothesis: All means are equal
Alternative hypothesis: At least one mean is different
Significance level: α = 0.05
Equal variances were assumed for the analysis.
Factor Information
Factor          Levels   Values
Billing Cycle   12       1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
Analysis of Variance
Source          DF    Adj SS    Adj MS   F-Value   P-Value
Billing Cycle   11    429109    39010    4.04      0.000
Error           354   3416405   9651
Total           365   3845514
Model Summary
S         R-sq     R-sq(adj)   R-sq(pred)
98.2388   11.16%   8.40%       4.99%
Is There Statistically Significant Difference Between The Months?
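The F-value and model summary follow arithmetically from the sums of squares and degrees of freedom in the ANOVA table. The sketch below recomputes them from those reported values only; it does not refit the model, since the daily observations are not part of this transcript.

```python
import math

# Reported values from the one-way ANOVA table.
ss_factor, df_factor = 429109, 11     # Billing Cycle
ss_error, df_error = 3416405, 354     # Error

ss_total = ss_factor + ss_error
df_total = df_factor + df_error

ms_factor = ss_factor / df_factor     # mean square between billing cycles
ms_error = ss_error / df_error        # mean square within billing cycles

f_value = ms_factor / ms_error
s = math.sqrt(ms_error)               # pooled standard deviation
r_sq = ss_factor / ss_total
r_sq_adj = 1 - ms_error / (ss_total / df_total)

print(f"F = {f_value:.2f}, S = {s:.4f}")
print(f"R-sq = {100 * r_sq:.2f}%, R-sq(adj) = {100 * r_sq_adj:.2f}%")
```

The low R-sq shows that billing cycle explains only a small share of the day-to-day variation, even though the differences between cycles are statistically significant.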
But ANOVA Requires The Data to be Normal
Kruskal-Wallis Test: Data Usage versus Billing Cycle
Kruskal-Wallis Test on Data Usage
Billing Cycle   N     Median   Ave Rank   Z
1               31    108.80   217.0      1.84
2               30    130.50   249.8      3.58
3               31    88.40    187.4      0.21
4               30    85.60    160.9      -1.22
5               31    137.90   265.5      4.51
6               31    129.40   234.7      2.82
7               30    88.15    182.3      -0.07
8               31    93.80    187.9      0.24
9               30    75.70    135.9      -2.57
10              31    75.00    148.9      -1.90
11              31    62.50    86.3       -5.35
12              29    73.20    142.6      -2.17
Overall         366            183.5
H = 82.19  DF = 11  P = 0.000
H = 82.19  DF = 11  P = 0.000 (adjusted for ties)
A First Non-Parametric Approach
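The Kruskal-Wallis H statistic can be recovered from the group sizes and average ranks in the test output. The sketch below uses the standard formula without a tie correction and works from the rounded average ranks, so it reproduces the reported H only approximately.

```python
# (N, average rank) for billing cycles 1..12, from the Kruskal-Wallis output.
groups = [(31, 217.0), (30, 249.8), (31, 187.4), (30, 160.9),
          (31, 265.5), (31, 234.7), (30, 182.3), (31, 187.9),
          (30, 135.9), (31, 148.9), (31, 86.3), (29, 142.6)]

total_n = sum(n for n, _ in groups)
overall_rank = (total_n + 1) / 2      # mean rank of all observations

# H = 12 / (N (N + 1)) * sum of n_i * (mean rank_i - overall mean rank)^2
h_stat = 12 / (total_n * (total_n + 1)) * sum(
    n * (rank - overall_rank) ** 2 for n, rank in groups)

df = len(groups) - 1
print(f"N = {total_n}, overall rank = {overall_rank}, H = {h_stat:.2f}, DF = {df}")
```

Under the null hypothesis H is approximately chi-square distributed with 11 degrees of freedom, so a value near 82 corresponds to the reported P = 0.000.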
[Figure: Histogram of Data Usage. Frequency (0 to 20) versus data usage (0 to 1050), one panel per billing cycle (1 to 12). Panel variable: Billing Cycle]
Kruskal-Wallis Test Requires The Distributions To Have Similar Shapes
Mood Median Test: Data Usage versus Billing Cycle
Mood median test for Data Usage
Chi-Square = 70.53  DF = 11  P = 0.000
Billing Cycle   N≤   N>   Median   Q3-Q1
1               10   21   109      68
2               4    26   131      45
3               17   14   88       59
4               19   11   86       46
5               5    26   138      156
6               8    23   129      81
7               16   14   88       78
8               17   14   94       44
9               21   9    76       44
10              22   9    75       46
11              26   5    63       36
12              18   11   73       83
[Figure: individual 95.0% CIs for each billing cycle median, on a scale from 60 to 240]
Overall median = 97
A Second Non-Parametric Approach
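Mood's median test is a chi-square test on a table that counts, for each group, how many observations fall at or below the overall median and how many fall above it. The sketch below recomputes the statistic from the N≤ and N> counts in the test output; the variable names are illustrative.

```python
# (count <= overall median, count > overall median) for billing cycles 1..12.
counts = [(10, 21), (4, 26), (17, 14), (19, 11), (5, 26), (8, 23),
          (16, 14), (17, 14), (21, 9), (22, 9), (26, 5), (18, 11)]

col_totals = [sum(row[j] for row in counts) for j in range(2)]
grand_total = sum(col_totals)

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi_square = 0.0
for row in counts:
    row_total = sum(row)
    for j, observed in enumerate(row):
        expected = row_total * col_totals[j] / grand_total
        chi_square += (observed - expected) ** 2 / expected

df = (len(counts) - 1) * (2 - 1)
print(f"Chi-Square = {chi_square:.2f}, DF = {df}")
```

Because the test uses only counts on either side of the median, it is less sensitive to outliers and to differences in distribution shape than Kruskal-Wallis.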
A Fourth Statistical Approach
[Figure: Boxplot of Data Usage. Daily data usage (0 to 1200) by day of week, with the mean labeled for each day]
Day Of Week   Mean Data Usage
Sunday        163.094
Monday        127.687
Tuesday       87.934
Wednesday     116.76
Thursday      125.612
Friday        117.36
Saturday      124.35
A Fifth Statistical Approach (by days of the week)
[Figure: Histogram of Data Usage. Frequency (0 to 20) versus data usage (0 to 1050), one panel per day (Sunday to Saturday). Panel variable: Day Of Week]
What Do The Distributions Of Each Day Look Like?
SUNDAY
Summary Report for Data Usage
Anderson-Darling Normality Test: A-Squared 4.65, P-Value <0.005
N 52, Mean 163.09, StDev 169.77, Variance 28821.13, Skewness 3.7420, Kurtosis 18.3156
Minimum 0.00, 1st Quartile 74.32, Median 120.15, 3rd Quartile 197.35, Maximum 1100.00
95% Confidence Interval for Mean: 115.83 to 210.36
95% Confidence Interval for Median: 84.71 to 152.41
95% Confidence Interval for StDev: 142.27 to 210.53
Sunday Descriptive Statistics
[Figure: Probability Plot for Data Usage, 95% CI; panels: Exponential, 2-Parameter Exponential, Weibull, 3-Parameter Weibull]
Goodness of Fit Test
Exponential: AD = 4.176, P-Value < 0.003
2-Parameter Exponential: AD = 1.614, P-Value = 0.017
Weibull: AD = 0.728, P-Value = 0.053
3-Parameter Weibull: AD = 0.355, P-Value = 0.475
What Distribution Models Sunday?
[Figure: Histogram of Data Usage with 3-Parameter Weibull fit]
Shape # 1.369, Scale # 125.2, Thresh # 26.73, N 50
# This estimated historical parameter is used in the calculations.
A 3-Parameter Weibull Models Sunday Data
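A 3-parameter Weibull is an ordinary Weibull shifted right by a threshold, which gives simple closed forms for its cumulative distribution and mean. The sketch below evaluates both for the Sunday parameters from the histogram; the function names are illustrative, and the fit excludes the flagged outliers, so its mean is lower than the raw Sunday mean.

```python
import math


def weibull3_cdf(x, shape, scale, thresh):
    """P(X <= x) for a Weibull shifted right by `thresh`."""
    if x <= thresh:
        return 0.0
    return 1.0 - math.exp(-(((x - thresh) / scale) ** shape))


def weibull3_mean(shape, scale, thresh):
    """Mean of the shifted Weibull: thresh + scale * Gamma(1 + 1/shape)."""
    return thresh + scale * math.gamma(1.0 + 1.0 / shape)


# Sunday parameters from the fitted histogram.
SUNDAY = {"shape": 1.369, "scale": 125.2, "thresh": 26.73}

print(f"fitted Sunday mean: {weibull3_mean(**SUNDAY):.1f}")
print(f"P(Sunday usage <= 200): {weibull3_cdf(200.0, **SUNDAY):.3f}")
```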
Red Bars indicate outliers that were excluded from parameter determination
MONDAY
Summary Report for Data Usage
Anderson-Darling Normality Test: A-Squared 5.23, P-Value <0.005
N 52, Mean 127.69, StDev 119.76, Variance 14342.75, Skewness 2.55234, Kurtosis 7.19780
Minimum 0.00, 1st Quartile 67.92, Median 89.85, 3rd Quartile 134.92, Maximum 619.30
95% Confidence Interval for Mean: 94.34 to 161.03
95% Confidence Interval for Median: 78.71 to 108.15
95% Confidence Interval for StDev: 100.37 to 148.52
Monday Descriptive Statistics
[Figure: Probability Plot for Data Usage, 95% CI; panels: Exponential, 2-Parameter Exponential, Weibull, 3-Parameter Weibull]
Goodness of Fit Test
Exponential: AD = 6.080, P-Value < 0.003
2-Parameter Exponential: AD = 6.124, P-Value < 0.010
Weibull: AD = 2.383, P-Value < 0.010
3-Parameter Weibull: AD = 0.398, P-Value = 0.342
What Distribution Models Monday?
[Figure: Histogram of Data Usage with 3-Parameter Weibull fit]
Shape # 1.916, Scale # 74.12, Thresh # 29.30, N 48
# This estimated historical parameter is used in the calculations.
A 3-Parameter Weibull Models Monday Data
Red Bars indicate outliers that were excluded from parameter determination
TUESDAY
Summary Report for Data Usage
Anderson-Darling Normality Test: A-Squared 1.76, P-Value <0.005
N 53, Mean 87.934, StDev 45.017, Variance 2026.544, Skewness 2.02797, Kurtosis 7.44336
Minimum 0.000, 1st Quartile 61.250, Median 81.400, 3rd Quartile 105.600, Maximum 289.700
95% Confidence Interval for Mean: 75.526 to 100.342
95% Confidence Interval for Median: 72.217 to 89.345
95% Confidence Interval for StDev: 37.785 to 55.699
Tuesday Descriptive Statistics
[Figure: Probability Plot for Data Usage, 95% CI; panels: Exponential, 2-Parameter Exponential, Weibull, 3-Parameter Weibull]
Goodness of Fit Test
Exponential: AD = 10.303, P-Value < 0.003
2-Parameter Exponential: AD = 3.239, P-Value < 0.010
Weibull: AD = 0.382, P-Value > 0.250
3-Parameter Weibull: AD = 0.203, P-Value > 0.500
What Distribution Models Tuesday?
[Figure: Histogram of Data Usage with 3-Parameter Weibull fit]
Shape # 1.882, Scale # 57.02, Thresh # 34.60, N 51
# This estimated historical parameter is used in the calculations.
A 3-Parameter Weibull Models Tuesday Data
Red Bars indicate outliers that were excluded from parameter determination
WEDNESDAY
Summary Report for Data Usage
Anderson-Darling Normality Test: A-Squared 2.07, P-Value <0.005
N 53, Mean 116.76, StDev 70.53, Variance 4974.95, Skewness 1.10549, Kurtosis 0.67508
Minimum 0.00, 1st Quartile 69.00, Median 97.10, 3rd Quartile 154.00, Maximum 321.50
95% Confidence Interval for Mean: 97.32 to 136.20
95% Confidence Interval for Median: 77.27 to 113.79
95% Confidence Interval for StDev: 59.20 to 87.27
Wednesday Descriptive Statistics
[Figure: Probability Plot for Data Usage, 95% CI; panels: Exponential, 2-Parameter Exponential, Weibull, 3-Parameter Weibull]
Goodness of Fit Test
Exponential: AD = 5.427, P-Value < 0.003
2-Parameter Exponential: AD = 2.310, P-Value < 0.010
Weibull: AD = 1.186, P-Value < 0.010
3-Parameter Weibull: AD = 0.618, P-Value = 0.113
What Distribution Models Wednesday?
[Figure: Histogram of Data Usage with 3-Parameter Weibull fit]
Shape 1.430, Scale 104.1, Thresh 24.66, N 52
A 3-Parameter Weibull Models Wednesday Data
THURSDAY
Summary Report for Data Usage
Anderson-Darling Normality Test: A-Squared 1.99, P-Value <0.005
N 52, Mean 125.61, StDev 73.13, Variance 5347.80, Skewness 2.03570, Kurtosis 6.42894
Minimum 42.10, 1st Quartile 74.90, Median 102.00, 3rd Quartile 163.27, Maximum 449.50
95% Confidence Interval for Mean: 105.25 to 145.97
95% Confidence Interval for Median: 81.45 to 134.75
95% Confidence Interval for StDev: 61.29 to 90.69
Thursday Descriptive Statistics
[Figure: Probability Plot for Data Usage, 95% CI; panels: Exponential, 2-Parameter Exponential, Weibull, 3-Parameter Weibull]
Goodness of Fit Test
Exponential: AD = 6.944, P-Value < 0.003
2-Parameter Exponential: AD = 1.454, P-Value = 0.025
Weibull: AD = 0.904, P-Value = 0.020
3-Parameter Weibull: AD = 0.324, P-Value > 0.500
What Distribution Models Thursday?
[Figure: Histogram of Data Usage with 3-Parameter Weibull fit]
Shape # 1.364, Scale # 85.54, Thresh # 40.89, N 52
# This estimated historical parameter is used in the calculations.
A 3-Parameter Weibull Models Thursday Data
Red Bars indicate outliers that were excluded from parameter determination
FRIDAY
Summary Report for Data Usage
Anderson-Darling Normality Test: A-Squared 4.30, P-Value <0.005
N 52, Mean 117.36, StDev 81.84, Variance 6697.42, Skewness 2.21566, Kurtosis 5.35910
Minimum 10.70, 1st Quartile 67.70, Median 100.95, 3rd Quartile 122.95, Maximum 435.30
95% Confidence Interval for Mean: 94.58 to 140.14
95% Confidence Interval for Median: 84.26 to 105.70
95% Confidence Interval for StDev: 68.58 to 101.49
Friday Descriptive Statistics
[Figure: Probability Plot for Data Usage, 95% CI; panels: Exponential, 2-Parameter Exponential, Weibull, 3-Parameter Weibull]
Goodness of Fit Test
Exponential: AD = 9.088, P-Value < 0.003
2-Parameter Exponential: AD = 2.477, P-Value < 0.010
Weibull: AD = 0.607, P-Value = 0.111
3-Parameter Weibull: AD = 0.392, P-Value = 0.404
What Distribution Models Friday?
[Figure: Histogram of Data Usage with 3-Parameter Weibull fit]
Shape # 1.670, Scale # 61.32, Thresh # 39.17, N 51
# This estimated historical parameter is used in the calculations.
A 3-Parameter Weibull Models Friday Data
Red Bars indicate outliers that were excluded from parameter determination
SATURDAY
Summary Report for Data Usage
Anderson-Darling Normality Test: A-Squared 4.52, P-Value <0.005
N 52, Mean 124.35, StDev 100.17, Variance 10033.47, Skewness 2.79744, Kurtosis 9.97571
Minimum 0.00, 1st Quartile 69.73, Median 101.85, 3rd Quartile 137.40, Maximum 597.70
95% Confidence Interval for Mean: 96.46 to 152.24
95% Confidence Interval for Median: 82.46 to 121.30
95% Confidence Interval for StDev: 83.94 to 124.22
Saturday Descriptive Statistics
[Figure: Probability Plot for Data Usage, 95% CI; panels: Exponential, 2-Parameter Exponential, Weibull, 3-Parameter Weibull]
Goodness of Fit Test
Exponential: AD = 7.494, P-Value < 0.003
2-Parameter Exponential: AD = 1.317, P-Value = 0.037
Weibull: AD = 1.262, P-Value < 0.010
3-Parameter Weibull: AD = 0.441, P-Value = 0.310
What Distribution Models Saturday?
[Figure: Histogram of Data Usage with 3-Parameter Weibull fit]
Shape # 1.246, Scale # 69.33, Thresh # 44.41, N 50
# This estimated historical parameter is used in the calculations.
A 3-Parameter Weibull Models Saturday Data
Red Bars indicate outliers that were excluded from parameter determination
THE SIMULATION
The Simulation Equation
Bill 1 = Sunday + Monday + Tuesday + Wednesday + Thursday + Friday + Saturday
       + Sunday + Monday + Tuesday + Wednesday + Thursday + Friday + Saturday
       + Sunday + Monday + Tuesday + Wednesday + Thursday + Friday + Saturday
       + Sunday + Monday + Tuesday + Wednesday + Thursday
       + Tuesday + Wednesday + Thursday + Friday + Saturday
         Sunday  Monday  Tuesday  Wednesday  Thursday  Friday  Saturday  Total
Bill1    4       4       5        5          5         4       4         31
Bill2    4       4       4        4          4         5       5         30
Bill3    5       5       5        4          4         4       4         31
Bill4    4       4       4        5          5         4       4         30
Bill5    5       4       4        4          4         5       5         31
Bill6    4       5       5        5          4         4       4         31
Bill7    4       4       4        4          5         5       4         30
Bill8    5       5       4        4          4         4       5         31
Bill9    4       4       5        5          4         4       4         30
Bill10   4       4       4        4          5         5       5         31
Bill11   5       5       5        4          4         4       4         31
Bill12   4       4       4        5          4         4       4         29
The Simulation Parameters
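The simulation equation and parameters can be sketched as a Monte Carlo loop: each bill is the sum of one random draw per day, taken from that weekday's fitted 3-parameter Weibull. This is an illustration only, not the tool used in the talk; the seed, trial count, and function name `simulate_bill` are assumptions, and only Bill 1 is simulated.

```python
import math
import random

# (shape, scale, threshold) from the fitted 3-parameter Weibull histograms.
WEIBULL = {
    "Sunday": (1.369, 125.2, 26.73), "Monday": (1.916, 74.12, 29.30),
    "Tuesday": (1.882, 57.02, 34.60), "Wednesday": (1.430, 104.1, 24.66),
    "Thursday": (1.364, 85.54, 40.89), "Friday": (1.670, 61.32, 39.17),
    "Saturday": (1.246, 69.33, 44.41),
}

# Number of each weekday in Bill 1, from the simulation parameters table.
BILL1_DAYS = {"Sunday": 4, "Monday": 4, "Tuesday": 5, "Wednesday": 5,
              "Thursday": 5, "Friday": 4, "Saturday": 4}


def simulate_bill(day_counts, rng):
    """Sum one shifted-Weibull draw for every day in the billing cycle."""
    total = 0.0
    for day, count in day_counts.items():
        shape, scale, thresh = WEIBULL[day]
        for _ in range(count):
            # random.weibullvariate(alpha, beta): alpha is scale, beta is shape.
            total += thresh + rng.weibullvariate(scale, shape)
    return total


rng = random.Random(2016)             # fixed seed (assumption) for repeatability
TRIALS = 20000
totals = [simulate_bill(BILL1_DAYS, rng) for _ in range(TRIALS)]

sim_mean = sum(totals) / TRIALS
analytic_mean = sum(count * (WEIBULL[day][2] + WEIBULL[day][1]
                             * math.gamma(1 + 1 / WEIBULL[day][0]))
                    for day, count in BILL1_DAYS.items())
frac_over_3gb = sum(t > 3000 for t in totals) / TRIALS

print(f"simulated mean {sim_mean:.0f} MB, analytic mean {analytic_mean:.0f} MB")
print(f"share of simulated bills above 3000 MB: {frac_over_3gb:.1%}")
```

Comparing the simulated mean with the analytic mean is a quick sanity check that the sampler and the parameters are wired together correctly.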
The Simulation Results
ASSESSING CAPABILITY FROM SIMULATION RESULTS
Is a 1GB (1,000MB) Limit Appropriate?
Is a 2GB (2,000MB) Limit Appropriate?
Is a 3GB (3,000MB) Limit Appropriate?
Is a 4GB (4,000MB) Limit Appropriate?
Is a 5GB (5,000MB) Limit Appropriate?
Is a 6GB (6,000MB) Limit Appropriate?
Is a 10GB (10,000MB) Limit Appropriate?
Is a 12GB (12,000MB) Limit Appropriate?
Data Usage (GB)          <1       1-2      2-3       3-4       4-5       5-6      >6       Expected Monthly Charge
Probability              0.000%   0.330%   29.190%   52.890%   15.850%   1.650%   0.090%
Plan            Base
Sprint (1GB)    $40.00            $0.05    $4.38     $7.93     $2.38     $0.25    $0.01    $55.00
Sprint (3GB)    $50.00                               $7.93     $2.38     $0.25    $0.01    $60.57
Sprint (6GB)    $65.00                                                                     $65.00
VZ (1GB)        $50.00            $0.05    $4.38     $7.93     $2.38     $0.25    $0.01    $65.00
ATT (2GB)       $55.00                     $4.38     $7.93     $2.38     $0.25    $0.01    $69.95
ATT (5GB)       $75.00                                                   $0.25    $0.01    $75.26
VZ (3GB)        $65.00                               $7.93     $2.38     $0.25    $0.01    $75.57
Sprint (12GB)   $80.00                                                                     $80.00
VZ (6GB)        $80.00                                                            $0.01    $80.01
VZ (12GB)       $100.00                                                                    $100.00
ATT (15GB)      $125.00                                                                    $125.00
Plan Selection Based on Simulation
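The expected monthly charges in the table are consistent with a base price plus a flat fee weighted by the simulated probability of exceeding the plan limit. The sketch below assumes a $15 fee, which is inferred from the table entries (for example 29.19% × $15 ≈ $4.38) rather than stated on the slide, and it applies only to the plans for which the table lists overage amounts.

```python
# Simulated probability of monthly usage falling in each band (GB).
BAND_PROB = {"<1": 0.0, "1-2": 0.0033, "2-3": 0.2919, "3-4": 0.5289,
             "4-5": 0.1585, "5-6": 0.0165, ">6": 0.0009}

# Lower edge of each band in GB, used to decide which bands exceed a limit.
BAND_LOWER_GB = {"<1": 0, "1-2": 1, "2-3": 2, "3-4": 3,
                 "4-5": 4, "5-6": 5, ">6": 6}

OVERAGE_FEE = 15.00   # assumption: flat fee inferred from the table entries


def expected_charge(base_price, limit_gb):
    """Base price plus the fee weighted by the chance of exceeding the limit."""
    p_over = sum(p for band, p in BAND_PROB.items()
                 if BAND_LOWER_GB[band] >= limit_gb)
    return base_price + OVERAGE_FEE * p_over


for plan, base, limit in [("Sprint (1GB)", 40.0, 1), ("Sprint (3GB)", 50.0, 3),
                          ("ATT (2GB)", 55.0, 2), ("ATT (5GB)", 75.0, 5),
                          ("VZ (3GB)", 65.0, 3), ("VZ (6GB)", 80.0, 6)]:
    print(f"{plan:<13} ${expected_charge(base, limit):.2f}")
```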
         Measured            Simulation
Limit    Ppk      %          Ppk       %
1GB      -0.83    99.36%     -1.22     100%
2GB      -0.53    94.36%     -0.7047   99.67%
3GB      -0.23    75.26%     -0.1984   70.48%
5GB      0.37     13.05%     0.81      1.74%
6GB      0.68     2.13%      1.31      0.09%
10GB     1.88     0.00%      3.33      0.00%
12GB     2.48     0.00%      4.35      0.00%
Comparison of Simulated and Measured Capability
Conclusion
• Mobile phone data usage can be analyzed using:
– Descriptive Statistics
– Run Charts
– Probability Plots
– Control Charts
– Process Capability
• Non-normal data require different hypothesis tests, including:
– Kruskal-Wallis
– Mood Median
• A stochastic simulation model can be created by:
– Determining a distribution that characterizes each factor
– Specifying a mathematical relationship between the factors
• A process capability analysis on simulated data can be used to determine specification limits
Questions?
Contact Information:
Brandon R. Theiss, PE
Rutgers School of Law - Camden
Brandon.Theiss@Rutgers.edu