chapter 7: statistical analysis spreadsheet-based decision support systems prof. name [email protected]...
TRANSCRIPT
![Page 1: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/1.jpg)
Chapter 7: Statistical Analysis
Spreadsheet-Based Decision Support Systems
Prof. Name [email protected] (123) 456-7890University Name
![Page 2: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/2.jpg)
2
Overview
7.1 Introduction 7.2 Understanding Data 7.3 Relationships in Data 7.4 Distributions 7.5 Summary
![Page 3: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/3.jpg)
3
Introduction
Performing basic statistical analysis of data using Excel functions
Statistical features of the Data Analysis Toolpack
Trend curves for analyzing data patterns
Basic linear regression techniques in Excel
Several different distribution functions in Excel
![Page 4: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/4.jpg)
4
Understanding Data
Statistical Functions
Descriptive Statistics
Histograms
![Page 5: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/5.jpg)
5
Statistical Functions
AVERAGE– Finds the mean of a set of data.– =AVERAGE(range or range_name)
MEDIAN– Finds the middle number in a list of sorted data.– =MEDIAN(range or range_name)
STDEV– Finds the standard deviation of a set of data.– This is equal to the square root of the variance, which measures the
difference between the mean of the data set and the individual values.– =STDEV(range or range_name)
![Page 6: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/6.jpg)
6
Figures 7.1 and 7.2
![Page 7: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/7.jpg)
7
Figures 7.3 and 7.4
![Page 8: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/8.jpg)
8
Analysis Toolpack
An Excel Add-In which includes several statistical analysis techniques
To ensure that it is an active Add-in, choose Tools > Add-ins from the menu. Select Analysis Toolpack from the list.
![Page 9: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/9.jpg)
9
Descriptive Statistics
Provides a list of statistical information about your data set including– Mean
– Median
– Standard deviation
– Variance
Go to Tools > Data Analysis > Descriptive Statistics
![Page 10: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/10.jpg)
10
Descriptive Statistics (cont)
The Input Range refers to the location of the data set.
You can check whether your data is Grouped By Columns or Rows.
If there are labels in the first row of each column of data, then check the Labels in First Row box.
The Output Range refers to where you want the results of the analysis to be displayed in the current worksheet.
The Summary Statistics box will calculate the most commonly used statistics from our data.
![Page 11: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/11.jpg)
11
Figure 7.7
Quarterly stock returns for three different companies are recorded. We want to know – Average stock return
– Variability of stock returns
– Which quarters had the highest and lowest stock returns
![Page 12: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/12.jpg)
12
Figures 7.8 and 7.9
![Page 13: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/13.jpg)
13
Figure 7.11
The standard deviation can be used to understand how common outliers are in the data.
![Page 14: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/14.jpg)
14
More Descriptive Statistics
Confidence Level for Mean– The mean is calculated using the specified confidence level (for example, 95% or
99%), the standard deviation, and the size of the sample data.– The confidence level and calculated mean are then added to the analysis report.– You can compare the actual mean to this calculated mean based on the specified
confidence level.
Kth Largest– Gives the largest ranked data value for a specified value of k.– For k = 1, the maximum data value would be returned.
Kth Smallest– Gives the smallest ranked data value for a specified value of k.– For k = 1, the minimum data value would be returned.
![Page 15: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/15.jpg)
15
Descriptive Statistics Functions
PERCENTILE – Returns a value for which a desired percentile k of the specified data_set falls below.– =PERCENTILE(data_set, k)
For example, for the MSFT data, the value for which 95% of the data falls below is
– =PERCENTILE(B4:B27,0.95) = 0.108
PERCENTRANK – Returns the percentile of the data _set which falls below a given value. – =PERCENTRANK(data_set, value)
For example, the percent of the MSFT data which falls below the value 0.108 is– =PERCENTRANK(B4:B27, 0.108) = 0.95, or 95%
![Page 16: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/16.jpg)
16
Histograms
Histograms calculate the number of occurrences, or frequency, which values in a data set fall into various intervals.
Choose the Histogram option from the Analysis Toolpack list.
![Page 17: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/17.jpg)
17
Histograms (cont’d)
The Input Range is the range of the data set.
The Bin Range is used to specify the location of the bin values.
– Bins are the intervals into which values can fall; they can be defined by the user or can be evenly distributed among the data by Excel.
The Output Range is the location of the output, or the frequency calculations for each bin.
The chart options include a simple Chart Output (the actual histogram), Cumulative Percentage for each bin value, and a Pareto organization of the chart.
![Page 18: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/18.jpg)
18
Figures 7.15 and 7.16
![Page 19: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/19.jpg)
19
Figures 7.17 and 7.18
To create your own bin values, make a list of upper bounds for each interval.
![Page 20: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/20.jpg)
20
Figure 7.19
![Page 21: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/21.jpg)
21
Histograms (cont’d)
Histograms can also be formatted.– Right-click on the histogram and change the Chart Options or other
parameters.
![Page 22: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/22.jpg)
22
Histograms (cont)
There are four basic shapes to a histogram: – Symmetric: has only one peak; that is, there is a central high part and almost
equal lower parts to the left and right of this peak.
– Positively skewed: has a peak on the left and many lower points (stretching) to the right.
– Negatively skewed: has a peak on the right and many lower points (stretching) to the left.
– Multiple peaks: imply that more than one source, or population, of data is being evaluated.
![Page 23: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/23.jpg)
23
Relationships in Data
Trend Curves
Regression
![Page 24: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/24.jpg)
24
Data Relationships
Relationships in data are usually identified by comparing two variables: the dependent variable and the independent variable.
– The dependent variable is the variable we are most interested in. By understanding its current behavior we can better predict its future behavior.
– The independent variable is the variable we use as the comparison in order to make this prediction.
![Page 25: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/25.jpg)
25
Trend Curves
Trend curves are used to graph and analyze these relationships between data.
Trend curves graph the data with – the independent variable on the x-axis
– the dependent variable on the y-axis
To add a trend curve to your chart, right-click on the data points in an XY Scatter chart and choose Add Trendline from the drop-down list of options.
![Page 26: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/26.jpg)
26
Trend Curves (cont’d)
There are five basic trend curves which Excel can model: – Linear
– Exponential
– Power
– Moving Average
– Logarithmic
![Page 27: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/27.jpg)
27
Trend Curves (cont’d)
Click on the Options tab to set options for the trend curve.
Set the name of the trendline.
Specify a period forward or backwards for which you want to predict the behavior of your dependent variable.
Check to Display Equation and Display R-Squared Value.
![Page 28: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/28.jpg)
28
Linear Trend Curves
Number of Units Produced each month and the corresponding Monthly Plant Cost are recorded.
The company wants to be able to estimate their plant costs based on the planned production amounts.
The dependent variable is therefore the Monthly Plant Cost and the independent variable is the Units Produced.
![Page 29: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/29.jpg)
29
Figures 7.26 and 7.29
Graph the data and then add a Linear trendline.
![Page 30: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/30.jpg)
30
Figure 7.30
Use the displayed equation to predict future values. First check the accuracy of the equation by calculating the error from the
known data.
Linear trends have the relationship: y = a*x - b
![Page 31: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/31.jpg)
31
Exponential Trend Curves
Sales data for ten years is recorded.
We want to be able to predict sales for the next few years.
The independent variable is Years and our dependent variable is Sales.
![Page 32: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/32.jpg)
32
Figures 7.34 and 7.35
Exponential trends have the relationship: y = a*e^(b*x) or
y = a*EXP(b*x)
![Page 33: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/33.jpg)
33
Power Trend Curves
We are given yearly Production values and yearly Unit Cost for production.
We want to determine the relationship between Unit Cost and Production in order to be able to predict future Unit Costs.
![Page 34: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/34.jpg)
34
Figures 7.39 and 7.40
Power trends have the relationship: y = a*x^b
![Page 35: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/35.jpg)
35
Regression Analysis
We can use some regression analysis parameters to ensure that the relationships we have chosen for our data are “good” fits.
These parameters include– R-Squared value
– Standard error
– Slope
– Intercept
![Page 36: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/36.jpg)
36
R-Squared Value
The R-Squared value measures the amount of influence the independent variable has on the dependent variable.
The closer the R-Squared value is to 1, the stronger the relationship is between the independent and dependent variables.
If the R-Squared value is closer to 0, then there may not be a relationship between these two variables.
![Page 37: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/37.jpg)
37
Figure 7.42
We fit a Linear trendline to the Monthly Plant Cost per Units Produced chart (see Figure 7.44).
The R-Squared value is 0.8137, which is fairly close to 1, implying a good fit.
![Page 38: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/38.jpg)
38
Figure 7.45
The RSQ Excel function can calculate the R-squared value from a set of data.– =RSQ(y_range, x_range)
Note that this function only works with Linear trend curves.
![Page 39: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/39.jpg)
39
Standard Error
The standard error measures the accuracy of any predictions made.
It can be calculated in Excel using the STEYX function– =STEYX(y_range, x_range)
This function can also only be used for Linear trend curves.
![Page 40: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/40.jpg)
40
Slope and Intercept
Two Excel functions can be used with a linear regression line of a collection of data.
SLOPE function– =SLOPE(y_range, x_range)
INTERCEPT function– =INTERCEPT(y_range, x_range)
![Page 41: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/41.jpg)
41
Distributions
Many distributions have Excel functions associated with them. – These functions are basically equivalent to using distribution tables.
– That is, given certain parameters of a set of data for a particular distribution, you would look at a distribution table to find the corresponding area from the distribution curve.
Some common distributions are– Normal
– Exponential
– Uniform
– Binomial
– Poisson
– Beta
– Weibull
![Page 42: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/42.jpg)
42
Normal Distribution
The parameters for this distribution are simply the value we are interested in finding the probability for, and the mean and standard deviation of the set of data.
The function we use with the Normal distribution is NORMDIST– =NORMDIST(x, mean, std_dev, cumulative)
![Page 43: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/43.jpg)
43
Normal Distribution (cont)
The cumulative parameter will be seen in many Excel distribution functions.
This parameter can take the values True or False to determine if you want the value returned from the cumulative distribution function or the probability density function, respectively. – The cumulative distribution function (cdf) will find the probability that a value
in the data set is less than or equal to x.
– The probability density function (pdf) will find the probability that a value is exactly equal to x.
![Page 44: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/44.jpg)
44
Figure 7.48
Annual drug sales at a local drugstore are distributed Normally with a mean of 40,000 and standard deviation of 10,000.
The probability that the actual sales for the year are 42,000 is 0.58, or 58%.
![Page 45: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/45.jpg)
45
Figure 7.49
What is the probability that annual sales will be between 35,000 and 49,000?
To find this value, we will subtract the cdf values for these two bounds.– =NORMDIST(49000, 40000, 10000, True) –
NORMDIST(35000, 40000, 10000, True) This will return a 0.51 probability, or 51% chance.
![Page 46: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/46.jpg)
46
Standard Normal Distribution
If the mean of your data is 0 and the standard deviation is 1, then placing these values in the NORMDIST function with the cumulative parameter as True will find the resulting value from the Standard Normal distribution.
The STANDARDIZE function will convert the x value from a data set of a mean not equal to 0 and a standard deviation not equal to 1 into a value which does assume a mean of 0 and a standard deviation of 1.– =STANDARDIZE(x, mean, std_dev)
The resulting standardized value is then used as the main parameter in the NORMSDIST function– =NORMSDIST(standardized_x)
![Page 47: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/47.jpg)
47
Figure 7.50
Consider the same example used previously to find the probability that a drugstore’s annual sales are 42,000.
![Page 48: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/48.jpg)
48
Uniform Distribution
The Uniform distribution does not actually have a corresponding Excel function; however, a simple formula can be used to model the Uniform distribution.– 1 / (b – a)
Given that a value x is Uniformly distributed between a and b, we can use this formula to determine the probability that x will have an integer value in this interval.
![Page 49: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/49.jpg)
49
Figure 7.51
Consider any values for a and b, then use the formula to calculate the Uniform value.
![Page 50: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/50.jpg)
50
Poisson Distribution
The Poisson distribution has only the mean as its parameter.
The function we use for this distribution is POISSON – =POISSON(x, mean, cumulative)
The Poisson distribution value is the probability that the number events which occur is either between 0 and x (cdf) or equal to x (pdf).
![Page 51: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/51.jpg)
51
Figure 7.52
For example, consider a bakery which serves an average of 20 customers per hour.
Find the probability that at most 35 customers will be served in the next two hours.
![Page 52: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/52.jpg)
52
Exponential Distribution
The Exponential distribution has only one parameter: lambda = 1 / mean of the data set.
The function we use for this distribution is EXPONDIST– =EXPONDIST(x, lambda, cumulative)
The Exponential distribution is commonly used for modeling interarrival times.
![Page 53: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/53.jpg)
53
Figure 7.53
Let us use the same example with the bakery data. Arrival rate is said to be 20 customers per hour. Interarrival mean, or the Exponential mean, is 1 / arrival rate. Therefore,
for this example, the interarrival mean is 1/20 hours per customer arrival.
To find the probability that a customer arrives in 10 minutes, we would set – x = 10/60 = 0.17 hours
– lambda = 1/(1/20) = 20 hours
– =EXPONDIST(0.17, 20, True)
![Page 54: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/54.jpg)
54
Binomial Distribution
The Binomial distribution has the following parameters: the number of trials and the probability of a success.
We are trying to determine the probability that the number of successes is less than or equal to (using cdf) or equal to (pdf) some x value.
The function we use for this distribution is BINOMDIST – =BINOMDIST(x, trials, prob_success, cumulative)
![Page 55: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/55.jpg)
55
Figure 7.54
Suppose a survey shows that 40 percent of people pay more attention to ads in the newspaper, and 60 percent pays more attention to ads on television.
What is the probability that out of 100 people surveyed, 50 of them respond more to ads on television?
![Page 56: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/56.jpg)
56
Beta Distribution
The Beta distribution has the following parameters: alpha, beta, A, and B. – Alpha and beta are determined from the data set
– A and B are optional bounds on the x value for which you want the Beta distribution value
The function we use for this distribution is BETADIST – =BETADIST(x, alpha, beta, A, B)
If A and B are omitted, then a standard cumulative distribution is assumed and they are given the values 0 and 1, respectively.
![Page 57: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/57.jpg)
57
Figure 7.55
Determine the probability that a team can complete a project in 10 days.
Estimate the total time needed to be 1 to 2 weeks; these estimates will be the bound values, or the A and B parameters.
Use a mean and standard deviation of 12 and 3 days to compute the alpha and beta parameters.
![Page 58: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/58.jpg)
58
Weibull Distribution
The Weibull distribution has the parameters alpha and beta.
The function we use for this distribution is WEIBULL – =WEIBULL(x, alpha, beta, cumulative)
The Weibull distribution is most commonly used to determine reliability functions.
![Page 59: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/59.jpg)
59
Figure 7.56
On average, a lightbulb will last 1200 hours, with a standard deviation of 100 hours. We can use these values to calculate alpha and beta.
We can now use the WEIBULL distribution to determine the probability that a lightbulb will be reliable for 55 days = 1320 hours.
![Page 60: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/60.jpg)
60
Summary
The Analysis Toolpack is an Excel Add-In that includes statistical analysis techniques such as Descriptive Statistics, Histograms, Exponential Smoothing, Correlation, Covariance, Moving Average, and others.
The Descriptive Statistics option provides a list of statistical information about a data set, including the mean, median, standard deviation, and variance.
Histograms calculate the number of occurrences, or frequency, which values in a data set fall into various intervals.
Relationships in data are usually identified by comparing the dependent variable and the independent variable.
There are five basic trend curves that Excel can model: Linear, Exponential, Power, Moving Average, and Logarithmic.
Some of the more common distributions that can be recognized when performing a statistical analysis of data are the Normal, Exponential, Uniform, Binomial, Poisson, Beta, and Weibull distributions.
![Page 61: Chapter 7: Statistical Analysis Spreadsheet-Based Decision Support Systems Prof. Name name@email.com Position (123) 456-7890 University Name](https://reader033.vdocuments.us/reader033/viewer/2022061518/56649e235503460f94b1155b/html5/thumbnails/61.jpg)
61
Additional Links
(place links here)