introduction to statistical thinking for decision making

Upload: jing-saguittarius

Post on 05-Apr-2018

235 views

Category:

Documents


0 download

TRANSCRIPT

  • 7/31/2019 Introduction to Statistical Thinking for Decision Making

    1/13

    Introduction to Statistical Thinking for Decision Making

    This site builds up the basic ideas of business statistics systematically and correctly. It is acombination of lectures and computer-based practice, joining theory firmly with practice. Itintroduces techniques for summarizing and presenting data, estimation, confidence intervals and

    hypothesis testing. The presentation focuses more on understanding of key concepts andstatistical thinking, and less on formulas and calculations, which can now be done on smallcomputers through user-friendlyStatistical JavaScript A, etc. A Spanish version of this site isavailable atRazonamiento Estadstico para la Toma de Decisiones Gerencialesand itscollectionof JavaScript.

    Today's good decisions are driven by data. In all aspects of our lives, and importantly in thebusiness context, an amazing diversity of data is available for inspection and analytical insight.Business managers and professionals are increasingly required to justify decisions on the basis ofdata. They need statistical model-based decision support systems.

    Statistical skills enable them to intelligently collect, analyze and interpret data relevant to theirdecision-making. Statistical concepts and statistical thinking enable them to:

    solve problems in a diversity of contexts. add substance to decisions. reduce guesswork.

    This Web site is a course in statistics appreciation; i.e., acquiring a feel for thestatistical way of thinking. It hopes to make sound statistical thinkingunderstandable in business terms. An introductory course in statistics, it isdesigned to provide you with the basic concepts and methods of statistical

    analysis for processes and products. Materials in this Web site are tailored to helpyou make better decisions and to get you thinking statistically. A cardinalobjective for this Web site is to embed statistical thinking into managers, whomust often decide with little information.

    In competitive environment, business managers must design quality into products,and into the processes of making the products. They must facilitate a process ofnever-ending improvement at all stages of manufacturing and service. This is astrategy that employs statistical methods, particularlystatistically designedexperiments, and produces processes that provide high yield and products thatseldom fail. Moreover, it facilitates development of robust products that areinsensitive to changes in the environment and internal component variation.Carefully planned statistical studies remove hindrances to high quality andproductivity at every stage of production. This saves time and money. It is wellrecognized that quality must be engineered into products as early as possible inthe design process. One must know how to use carefully planned, cost-effectivestatistical experimentsto improve, optimize and make robust products andprocesses.

    Business Statistics is a science assisting you to make business decisions under

    uncertainties based on some numerical and measurable scales. Decision makingprocesses must be based on data, not on personal opinion nor on belief.

    The Devil is in the Deviations: Variation is inevitable in life! Every process,every measurement, every sample has variation. Managers need to understandvariation for two key reasons. First, so that they can lead others to apply statisticalthinking in day-to-day activities and secondly, to apply the concept for thepurpose of continuous improvement. This course will provide you with hands-onexperience to promote the use of statistical thinking and techniques to apply themto make educated decisions, whenever you encounter variation in business data.You will learn techniques to intelligently assess and manage the risks inherent indecision-making. Therefore, remember that:

    Just like weather, if you cannot control something, you should learn how tomeasure and analyze it, in order to predict it, effectively.

    http://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/Javastat.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/Javastat.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/Javastat.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504S.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504S.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504S.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/JAVASTATS.HTMhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/JAVASTATS.HTMhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/JAVASTATS.HTMhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/JAVASTATS.HTMhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rdesinexphttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rdesinexphttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rdesinexphttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rdesinexphttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstaexperhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstaexperhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstaexperhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rdesinexphttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rdesinexphttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/JAVASTATS.HTMhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/JAVASTATS.HTMhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504S.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/Javastat.htm
  • 7/31/2019 Introduction to Statistical Thinking for Decision Making

    2/13

    If you have taken statistics before, and have a feeling of inability to graspconcepts, it may be largely due to your former non-statistician instructors teachingstatistics. Their deficiencies lead students to develop phobias for the sweetscience of statistics. In this respect, Professor Herman Chernoff (1996) made thefollowing remark:

    "Since everybody in the world thinks he can teach statistics even

    though he does not know any, I shall put myself in the position of

    teaching biology even though I do not know any"

    Inadequate statistical teaching during university education leads even aftergraduation, to one or a combination of the following scenarios:

    1. In general, people do not like statistics and therefore they try to avoid it.2. There is a pressure to produce scientific papers, however often confronted with"I need

    something quick."

    3.

    At many institutes in the world, there are only a few (mostly 1) statisticians, if any at all.This means that these people are extremely busy. As a result, they tend to advise simpleand easy to apply techniques, or they will have to do it themselves. For my teachingphilosophy statements, you may like to visit the Web siteOn Learning & Teaching.

    4. Communication between a statistician and decision-maker can be difficult. One speaks instatistical jargon; the other understands the monetary orutilitarianbenefit of using thestatistician's recommendations.

    Plugging numbers into the formulas and crunching them have no value by themselves. Youshould continue to put effort into the concepts and concentrate on interpreting the results.

    Even when you solve a small size problem by hand, I would like you to use the availablecomputer software and Web-based computation to do the dirty work for you.

    You must be able to read the logical secret in any formulas not memorize them. For example, incomputing the variance, consider its formula. Instead of memorizing, you should start with somewhy:

    i. Why do we square the deviations from the mean.Because, if we add up all deviations, we get always zero value. So, to deal with this problem, wesquare the deviations. Why not raise to the power of four (three will not work)? Squaring doesthe trick; why should we make life more complicated than it is? Notice also that squaring alsomagnifies the deviations; therefore it works to our advantage to measure the quality of the data.

    ii. Why is there a summation notation in the formula.To add up the squared deviation of each data point to compute the total sum of squareddeviations.

    iii. Why do we divide the sum of squares by n-1.The amount of deviation should reflect also how large the sample is; so we must bring in thesample size. That is, in general, larger sample sizes have larger sum of square deviation from themean. Why n-1 not n? The reason for n-1 is that when you divide by n-1, the sample's varianceprovides an estimated variance much closer to thepopulationvariance, than when you divide byn. You note that for large sample size n (say over 30), it really does not matter whether it isdivided by n or n-1. The results are almost the same, and they are acceptable. The factor n-1 iswhat we consider as the"degrees of freedom".

    This example shows how to question statistical formulas, rather than memorizing them. In fact,when you try to understand the formulas, you do not need to remember them, they are part ofyour brain connectivity. Clear thinking is always more important than the ability to doarithmetic.

    When you look at a statistical formula, the formula should talk to you, as when a musician looksat a piece of musical-notes, he/she hears the music.

    http://home.ubalt.edu/ntsbarsh/Business-stat/IT/ONLINE.HTMLhttp://home.ubalt.edu/ntsbarsh/Business-stat/IT/ONLINE.HTMLhttp://home.ubalt.edu/ntsbarsh/Business-stat/IT/ONLINE.HTMLhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/Utility.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/Utility.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/Utility.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rPopulationhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rPopulationhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rPopulationhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rPopulationhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/Utility.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/IT/ONLINE.HTML
  • 7/31/2019 Introduction to Statistical Thinking for Decision Making

    3/13

    computer-assisted learning: The computer-assisted learning provides you a"hands-on"experience which will enhance your understanding of the concepts and techniques covered inthis site.

    Java, once an esoteric programming language for animating Web pages, is now a full-fledged

    platform for building JavaScript E-labs' learning objects with useful applications. As you used todo experiments in physics labs to learn physics, computer-assisted learning enables you to useany online interactive tool available on the Internet to perform experiments. The purpose is thesame; i.e., to understand statistical concepts by using statistical applets which are entertainingand educating.

    The appearance of computer software, JavaScript, Statistical Demonstration Applets, and OnlineComputation are the most important events in the process of teaching and learning concepts inmodel-based, statistical decision making courses. These e-lab Technologies allow you toconstruct numerical examples to understand the concepts, and to find their significance foryourself.

    Unfortunately, most classroom courses are not learning systems. The way the instructors attemptto help their students acquire skills and knowledge has absolutely nothing to do with the waystudents actually learn. Many instructors rely on lectures and tests, and memorization. All toooften, they rely on"telling." No one remembers much that's taught by telling, and what's tolddoesn't translate into usable skills. Certainly, we learn by doing, failing, and practicing until wedo it right. The computer assisted learning serves this purpose.

    A course in appreciation of statistical thinking gives business professionals anedge. Professionals with strong quantitative skills are in demand. Thisphenomenon will grow as the impetus for data-based decisions strengthens and

    the amount and availability of data increases. The statistical toolkit can bedeveloped and enhanced at all stages of a career. Decision making process underuncertainty is largely based on application of statistics for probability assessmentof uncontrollable events (or factors), as well as risk assessment of your decision.For the foundation of decision making visitOperations/Operational Researchsite.For more statistical-based Web sites with decision making applications, visitDecision Science Resources, andModeling and Simulation Resourcessites.

    The main objective for this course is to learn statistical thinking; to emphasizemore on concepts, and less theory and fewer recipes, and finally to foster activelearning using the useful and interesting Web-sites. It is already a known factthat"Statistical thinking will one day be as necessary for efficient citizenship asthe ability to read and write." So, let's be ahead of our time.

    Further Readings:Chernoff H., A Conversation With Herman Chernoff, Statistical Science, Vol. 11, No. 4, 335-350, 1996.Churchman C., The Design of Inquiring Systems, Basic Books, New York, 1971. Early in the book he statedthat knowledge could be considered as a collection of information, or as an activity, or as a potential. Healso noted that knowledge resides in the user and not in the collection.Rustagi M., et al. (eds.), Recent Advances in Statistics: Papers in Honor of Herman Chernoff on His SixtiethBirthday, Academic Press, 1983.

    The Birth of Probability and Statistics

    The original idea of"statistics" was the collection of information about and forthe"state". The word statistics derives directly, not from any classical Greek orLatin roots, but from the Italian word for state.

    The birth of statistics occurred in mid-17th century. A commoner, named JohnGraunt, who was a native of London, began reviewing a weekly churchpublication issued by the local parish clerk that listed the number of births,christenings, and deaths in each parish. These so called Bills of Mortality alsolisted the causes of death. Graunt who was a shopkeeper organized this data in theform we call descriptive statistics, which was published asNatural and Political

    Observations Made upon the Bills of Mortality. Shortly thereafter he was elected

    http://home.ubalt.edu/ntsbarsh/Business-stat/opre/opre640online.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre/opre640online.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre/opre640online.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/Refop.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/Refop.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/RefSim.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/RefSim.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/RefSim.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/RefSim.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/Refop.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre/opre640online.htm
  • 7/31/2019 Introduction to Statistical Thinking for Decision Making

    4/13

    as a member of Royal Society. Thus, statistics has to borrow some concepts fromsociology, such as the concept ofPopulation. It has been argued that sincestatistics usually involves the study of human behavior, it cannot claim theprecision of the physical sciences.

    Probability has much longer history. Probability is derived from the verb to probemeaning to"find out" what is not too easily accessible or understandable. Theword"proof" has the same origin that provides necessary details to understandwhat is claimed to be true.

    Probability originated from the study of games of chance and gambling during the16th century. Probability theory was a branch of mathematics studied by BlaisePascal and Pierre de Fermat in the seventeenth century. Currently in 21st century,probabilistic modeling is used to control the flow of traffic through a highwaysystem, a telephone interchange, or a computer processor; find the geneticmakeup of individuals or populations; quality control; insurance; investment; and

    other sectors of business and industry.

    New and ever growing diverse fields of human activities are using statistics;however, it seems that this field itself remains obscure to the public. ProfessorBradley Efron expressed this fact nicely:

    During the 20th Century statistical thinking and methodology have becomethe scientific framework for literally dozens of fields including education,agriculture, economics, biology, and medicine, and with increasinginfluence recently on the hard sciences such as astronomy, geology, andphysics. In other words, we have grown from a small obscure field into a

    big obscure field.

    Statistical Modeling for Decision-Making under Uncertainties:

    From Data to the Instrumental Knowledge

    In this diverse world of ours, no two things are exactly the same. A statistician isinterested in both the differences and the similarities; i.e., both departures andpatterns.

    The actuarial tables published by insurance companies reflect their statisticalanalysis of the average life expectancy of men and women at any given age. Fromthese numbers, the insurance companies then calculate the appropriate premiumsfor a particular individual to purchase a given amount of insurance.

    Exploratory analysis of data makes use of numerical and graphical techniques tostudy patterns and departures from patterns. The widely used descriptivestatistical techniques are: FrequencyDistribution; Histograms; Boxplot;Scattergrams and Error Bar plots; and diagnostic plots.

    In examining distribution of data, you should be able to detect importantcharacteristics, such as shape, location, variability, and unusual values. Fromcareful observations of patterns in data, you can generate conjectures aboutrelationships among variables. The notion of how one variable may be associatedwith another permeates almost all of statistics, from simple comparisons ofproportions through linear regression. The difference between association andcausation must accompany this conceptual development.

    Data must be collected according to a well-developed plan if valid information ona conjecture is to be obtained. The plan must identify important variables relatedto the conjecture, and specify how they are to be measured. From the datacollection plan, a statistical model can be formulated from whichinferencescanbe drawn.

    http://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rPopulationhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rPopulationhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rPopulationhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rdensityDisthttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rdensityDisthttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rdensityDisthttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstatInferentiahttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstatInferentiahttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstatInferentiahttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstatInferentiahttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rdensityDisthttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rPopulation
  • 7/31/2019 Introduction to Statistical Thinking for Decision Making

    5/13

    As an example ofstatistical modeling with managerial implications, such as"what-if" analysis, consider regression analysis. Regression analysis is a powerfultechnique for studying relationship between dependent variables (i.e., output,performance measure) and independent variables (i.e., inputs, factors, decisionvariables). Summarizing relationships among the variables by the most

    appropriate equation (i.e., modeling) allows us to predict or identify the mostinfluential factors and study their impacts on the output for any changes in theircurrent values.

    Frequently, for example the marketing managers are faced with the question,What Sample Size Do I Need? This is an important and common statisticaldecision, which should be given due consideration, since an inadequate samplesize invariably leads to wasted resources. The sample size determination sectionprovides a practical solution to this risky decision.

    Statistical models are currently used in various fields of business and science.

    However, the terminology differs from field to field. For example, the fitting ofmodels to data, called calibration, history matching, and data assimilation, are allsynonymous withparameterestimation.

    Your organization database contains a wealth of information, yet the decisiontechnology group members tap a fraction of it. Employees waste time scouringmultiple sources for a database. The decision-makers are frustrated because theycannot get business-critical data exactly when they need it. Therefore, too manydecisions are based on guesswork, not facts. Many opportunities are also missed,if they are even noticed at all.

    Knowledge is what we know well. Information is the communication ofknowledge. In every knowledge exchange, there is a sender and a receiver. Thesender make common what is private, does the informing, the communicating.Information can be classified as explicit and tacit forms. The explicit informationcan be explained in structured form, while tacit information is inconsistent andfuzzy to explain. Know that data are only crude information and not knowledgeby themselves.

    Data is known to be crude information and not knowledge by itself. The sequencefrom data to knowledge is: from Data to Information, from Information toFacts, and finally, from Facts to Knowledge. Data becomes information, whenit becomes relevant to your decision problem. Information becomes fact, when thedata can support it. Facts are what the data reveals. However the decisiveinstrumental (i.e., applied) knowledge is expressed together with some statisticaldegree of confidence.

    Fact becomes knowledge, when it is used in the successful completion of adecision process. Once you have a massive amount of facts integrated asknowledge, then your mind will be superhuman in the same sense that mankindwith writing is superhuman compared to mankind before writing. The followingfigure illustrates the statistical thinking process based on data in constructingstatistical models for decision making under uncertainties.

    Click on the image to enlarge it and THEN print it.The Path from Statistical Data to Managerial Knowledge

    http://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rparamertshttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rparamertshttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rparamertshttp://home.ubalt.edu/ntsbarsh/Business-stat/data.gifhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rparamerts
  • 7/31/2019 Introduction to Statistical Thinking for Decision Making

    6/13

    The above figure depicts the fact that as the exactness of a statistical modelincreases, the level of improvements in decision-making increases. That's why weneed Business Statistics. Statistics arose from the need to place knowledge on asystematic evidence base. This required a study of the rules of computationalprobability, the development of measures of data properties and relationships, and

    so on.

    Statistical inference aims at determining whether any statistical significance canbe attached that results after due allowance is made for any random variation as asource of error. Intelligent and critical inferences cannot be made by those who donot understand the purpose, the conditions, and applicability of the varioustechniques for judging significance.

    Considering the uncertain environment, the chance that"good decisions" are madeincreases with the availability of"good information." The chance that"goodinformation" is available increases with the level of structuring the process of

    Knowledge Management. The above figure also illustrates the fact that as theexactness of a statistical model increases, the level of improvements in decision-making increases.

    Knowledge is more than knowing something technical. Knowledge needswisdom. Wisdom is the power to put our time and our knowledge to the properuse. Wisdom comes with age and experience. Wisdom is the accurate applicationof accurate knowledge and its key component is to knowing the limits of yourknowledge. Wisdom is about knowing how something technical can be best usedto meet the needs of the decision-maker. Wisdom, for example, creates statisticalsoftware that is useful, rather than technically brilliant. For example, ever since

    the Web entered the popular consciousness, observers have noted that it putsinformation at your fingertips but tends to keep wisdom out of reach.

    The notion of "wisdom" in the sense of practical wisdom has entered Westerncivilization through biblical texts. In the Hellenic experience this kind of wisdomreceived a more structural character in the form of philosophy. In this sensephilosophy also reflects one of the expressions of traditional wisdom.

    Business professionals need a statistical toolkit. Statistical skills enable you tointelligently collect, analyze and interpret data relevant to their decision-making.Statistical concepts enable us to solve problems in a diversity of contexts.Statistical thinking enables you to add substance to your decisions.

    That's why we need statistical data analysis in probabilistic modeling. Statisticsarose from the need to place knowledge management on a systematic evidencebase. This required a study of the rules of computational probability, thedevelopment of measures of data properties, relationships, and so on.

    The purpose of statistical thinking is to get acquainted with the statisticaltechniques, to be able to execute procedures using available JavaScript, and to beconscious of the conditions and limitations of various techniques.

    Statistical Decision-Making Process

    Unlike thedeterministicdecision-making process, such aslinear optimizationbysolvingsystems of equations,Parametric systems of equationsand indecisionmaking under pure uncertainty, the variables are often more numerous and moredifficult to measure and control. However, the steps are the same. They are:

    1. Simplification2. Building a decision model3. Testing the model4. Using the model to find the solution:

    http://home.ubalt.edu/ntsbarsh/Business-stat/opre/partVIII.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre/partVIII.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre/partVIII.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/LPTools.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/LPTools.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/LPTools.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/SysEq.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/SysEq.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/SysEq.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/PaRHSSyEqu.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/PaRHSSyEqu.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/PaRHSSyEqu.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/ADuncertain.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/ADuncertain.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/ADuncertain.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/ADuncertain.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/ADuncertain.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/ADuncertain.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/PaRHSSyEqu.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/SysEq.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/LPTools.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre/partVIII.htm
  • 7/31/2019 Introduction to Statistical Thinking for Decision Making

    7/13

    o It is a simplified representation of the actual situationo It need not be complete or exact in all respectso It concentrates on the most essential relationships and ignores the less essential

    ones.o It is more easily understood than the empirical (i.e., observed) situation, and

    hence permits the problem to be solved more readily with minimum time andeffort.5. It can be used again and again for similar problems or can be modified.

    Fortunately the probabilistic and statistical methods for analysis and decision making underuncertainty are more numerous and powerful today than ever before. The computer makespossible many practical applications. A few examples ofbusiness applications are the following:

    An auditor can use random sampling techniques to audit the accounts receivable forclients.

    A plant manager can use statistical quality control techniques to assure the quality of hisproduction with a minimum of testing or inspection.

    A financial analyst may use regression and correlation to help understand the relationshipof a financial ratio to a set of other variables in business.

    A market researcher may use test of significace to accept or reject the hypotheses about agroup of buyers to which the firm wishes to sell a particular product.

    A sales manager may use statistical techniques to forecast sales for the coming year.Questions Concerning Statistical the Decision-Making Process:

    1. Objectives or Hypotheses: What are the objectives of the study or the questions to beanswered? What is the population to which the investigators intend to refer their

    findings?2. Statistical Design: Is the study a planned experiment (i.e.,primary data), or an analysis ofrecords ( i.e.,secondary data)? How is the sample to be selected? Are there possiblesources of selection, which would make the sample atypical or non-representative? If so,what provision is to be made to deal with this bias? What is the nature of the controlgroup, standard of comparison, or cost? Remember that statistical modeling meansreflections before actions.

    3. Observations: Are there clear definition of variables, including classifications,measurements (and/or counting), and the outcomes? Is the method of classification or ofmeasurement consistent for all the subjects and relevant to Item No. 1.? Are therepossible biased in measurement (and/or counting) and, if so, what provisions must bemade to deal with them? Are the observations reliable and replicable (to defend yourfinding)?

    4. Analysis: Are the data sufficient and worthy of statistical analysis? If so, are thenecessary conditions of the methods of statistical analysis appropriate to the source andnature of the data? The analysis must be correctly performed and interpreted.

    5. Conclusions: Which conclusions are justifiable by the findings? Which are not? Are theconclusions relevant to the questions posed in Item No. 1?

    6. Representation of Findings: The finding must be represented clearly, objectively, insufficient but non-technical terms and detail to enable the decision-maker (e.g., amanager) to understand and judge them for himself? Is the finding internally consistent;i.e., do the numbers added up properly? Can the different representation be reconciled?

    7. Managerial Summary: When your findings and recommendation(s) are not clearly put, orframed in an appropriate manner understandable by the decision maker, then the decisionmaker does not feel convinced of the findings and therefore will not implement any of therecommendations. You have wasted the time, money, etc. for nothing.

    What is Business Statistics?

    The main objective of Business Statistics is to makeinferences(e.g., prediction, makingdecisions) about certain characteristics of apopulationbased on information contained in arandom sample from the entire population. The condition for randomness is essential to makesure the sample is representative of the population.

    http://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rseconprimhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rseconprimhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rseconprimhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rseconprimhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rseconprimhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rseconprimhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstatInferentiahttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstatInferentiahttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstatInferentiahttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rPopulationhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rPopulationhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rPopulationhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rPopulationhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstatInferentiahttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rseconprimhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rseconprim
  • 7/31/2019 Introduction to Statistical Thinking for Decision Making

    8/13

    Business Statistics is the science ofgood' decision making in the face of uncertainty and is usedin many disciplines, such as financial analysis, econometrics, auditing, production andoperations, and marketing research. It provides knowledge and skills to interpret and usestatistical techniques in a variety of business applications. A typical Business Statistics course isintended for business majors, and covers statistical study, descriptive statistics (collection,

    description, analysis, and summary of data), probability, and the binomial and normaldistributions, test of hypotheses and confidence intervals, linear regression, and correlation.

    Statistics is a science of making decisions with respect to the characteristics of a group ofpersons or objects on the basis of numerical information obtained from a randomly selectedsample of the group. Statisticians refer to this numerical observation as realization of a randomsample. However, notice that one cannot see a random sample. A random sample is only asample of a finite outcomes of a random process.

    At the planning stage of a statistical investigation, the question of sample size (n) is critical. Forexample, sample size for sampling from a finite population of size N, is set at: N+1, rounded up

    to the nearest integer. Clearly, a larger sample provides more relevant information, and as aresult a more accurate estimation and better statistical judgement regarding test of hypotheses.

    Under-lit Streets and the Crimes Rate: It is a fact that if residential city streets are under-litthen major crimes take place therein. Suppose you are working in the Mayers office and put you

    in charge of helping him/her in deciding which manufacturers to buy the light bulbs from inorder to reduce the crime rate by at least a certain amount, given that there is a limited budget?

    Click on the image to enlarge it and THEN print it.Activities Associated with the General

    Statistical Thinking and Its Applications

    The above figure illustrates the idea ofstatistical inferencefrom a random sample about thepopulation. It also provides estimation for thepopulation's parameters; namely the expected

    value x, the standard deviation, and thecumulative distribution function (cdf)Fx, and theircorresponding sample statistics, mean , sample standard deviation Sx, andempirical (i.e.,observed) cumulative distribution function(cdf), respectively.

    The major task of Statistics is the scientific methodology for collecting, analyzing, interpreting arandom sample in order to draw inference about some particular characteristic of a specificHomogenous Population. For two major reasons, it is often impossible to study an entirepopulation:

    The process would be too expensive or too time-consuming.The process would be destructive.

    In either case, we would resort to looking at a sample chosen from the populationand trying to infer information about the entire population by only examining thesmaller sample. Very often the numbers, which interest us most about the

    population, are the mean and standard deviation , any number -- like the meanor standard deviation -- which is calculated from an entire population, is called aParameter. If the very same numbers are derived only from the data of a sample,then the resulting numbers are called Statistics. Frequently, Greek letters representparametersand Latin letters representstatistics(as shown in the above Figure).

    The uncertainties in extending and generalizing sampling results to the populationare measures and expressed by probabilistic statements called Inferential

    http://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstatInferentiahttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstatInferentiahttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstatInferentiahttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rPopulationhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rPopulationhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rparamertshttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rparamertshttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rparamertshttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rdensityDisthttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rdensityDisthttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rdensityDisthttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rcdffunchttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rcdffunchttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rcdffunchttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rcdffunchttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rparamertshttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rparamertshttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstatistictermhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstatistictermhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstatistictermhttp://home.ubalt.edu/ntsbarsh/Business-stat/risk.gifhttp://home.ubalt.edu/ntsbarsh/Business-stat/risk.gifhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstatistictermhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rparamertshttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rcdffunchttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rcdffunchttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rdensityDisthttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rparamertshttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rPopulationhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstatInferentia
  • 7/31/2019 Introduction to Statistical Thinking for Decision Making

    9/13

    Statistics. Therefore, probability is used in statistics as a measuring tool anddecision criterion for dealing with uncertainties in inferential statistics.

    An important aspect of statistical inference is estimating population values(parameters) from samples of data. An estimate of a parameter isunbiasedif the

    expected value of sampling distribution is equal to that population. The samplemean is an unbiased estimate of the population mean. The sample variance is anunbiased estimate of population variance. This allows us to combine severalestimates to obtain a much better estimate. The Empirical distribution is thedistribution of a random sample, shown by a step-function in the above figure.The empirical distribution function is an unbiased estimate for the populationdistribution function F(x).

    Given you already have a realization set of a random sample, to compute thedescriptive statistics including those in the above figure, you may like usingDescriptive Statistics JavaScript.

    Hypothesis testing is a procedure for reaching a probabilistic conclusive decisionabout a claimed value for a populations parameter based on a sample. To reduce

    this uncertainty and having high confidence that statistical inferences are correct,a sample must give equal chance to each member of population to be selectedwhich can be achieved by sampling randomly and relatively large sample size n.

    Given you already have a realization set of a random sample, to perform

    hypothesis testing for mean and variance 2, you may like usingTesting theMeanandTesting the VarianceJavaScript, respectively.

    Statistics is a tool that enables us to impose order on the disorganized cacophonyof the real world of modern society. The business world has grown both in sizeand competition. Corporate executive must take risk in business, hence the needfor business statistics.

    Business statistics has grown with the art of constructing charts and tables! It is ascience of basing decisions on numerical data in the face of uncertainty.

    Business statistics is a scientific approach to decision making under risk. Inpracticing business statistics, we search for an insight, not the solution. Our searchis for the one solution that meets all the business's needs with the lowest level of

    risk. Business statistics can take a normal business situation, and with the properdata gathering, analysis, and re-search for a solution, turn it into an opportunity.

    While business statistics cannot replace the knowledge and experience of thedecision maker, it is a valuable tool that the manager can employ to assist in thedecision making process in order to reduce the inherent risk, measured by, e.g.,

    the standard deviation .

    Among other useful questions, you may ask why we are interested in estimating

    the population's expected value and its Standard Deviation ? Here are someapplicable reasons. Business Statistics must provide justifiable answers to the

    following concerns for every consumer and producer:

    1. What is your (or your customers) Expectation of the product/service youbuy (or that your sell)? That is, what is a good estimate for ?

    2. Given the information about your (or your customers) expectation, what isthe Quality of the product/service you buy (or that you sell)? That is, what

    is a good estimate for ?3. Given the information about what you buy (or your sell) expectation, and

    the quality of the product/service, how does the product/service compare

    with other existing similar types? That is, comparing several 's, and

    several 's.

    http://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rgoodqualihttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rgoodqualihttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rgoodqualihttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/Descriptive.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/MeanTest.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/MeanTest.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/MeanTest.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/MeanTest.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/variationtest.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/variationtest.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/variationtest.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/variationtest.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/MeanTest.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/MeanTest.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/Descriptive.htmhttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rgoodquali
  • 7/31/2019 Introduction to Statistical Thinking for Decision Making

    10/13

    Common Statistical Terminology with Applications

    Like all profession, also statisticians have their own keywords and phrases to ease

    a precise communication. However, one must interpret the results of any decisionmaking in a language that is easy for the decision-maker to understand.Otherwise, he/she does not believe in what you recommend, and therefore doesnot go into the implementation phase. This lack of communication betweenstatisticians and the managers is the major roadblock for using statistics.

    Population: A population is any entire collection of people, animals, plants orthings on which we may collect data. It is the entire group of interest, which wewish to describe or about which we wish to draw conclusions. In the above figurethe life of the light bulbs manufactured say by GE, is the concerned population.

    Qualitative and Quantitative Variables: Any object or event, which can vary insuccessive observations either in quantity or quality is called a"variable."Variables are classified accordingly as quantitative or qualitative. A qualitativevariable, unlike a quantitative variable does not vary in magnitude in successiveobservations. The values of quantitative and qualitative variables arecalled"Variates" and"Attributes", respectively.

    Variable: A characteristic or phenomenon, which may take different values, suchas weight, gender since they are different from individual to individual.

    Randomness: Randomness means unpredictability. The fascinating fact about

    inferential statistics is that, although each random observation may not bepredictable when taken alone, collectively they follow a predictable pattern calledits distribution function. For example, it is a fact that the distribution of a sampleaverage follows a normal distribution for sample size over 30. In other words, anextreme value of the sample mean is less likely than an extreme value of a fewraw data.

    Sample: A subset of a population or universe.

    An Experiment: An experiment is a process whose outcome is not known inadvance with certainty.

    Statistical Experiment: An experiment in general is an operation in which onechooses the values of some variables and measures the values of other variables,as in physics. A statistical experiment, in contrast is an operation in which onetake a random sample from a population and infers the values of some variables.For example, in a survey, we"survey" i.e."look at" the situation without aiming tochange it, such as in a survey of political opinions. A random sample from therelevant population provides information about the voting intentions.

    In order to make any generalization about a population, a random sample from theentire population; that is meant to be representative of the population, is oftenstudied. For each population, there are many possible samples. A sample statisticgives information about a correspondingpopulation parameter. For example, thesample mean for a set of data would give information about the overall population

    mean .

    It is important that the investigator carefully and completely defines thepopulation before collecting the sample, including a description of the members tobe included.

    Example: The population for a study of infant health might be all children born inthe U.S.A. in the 1980's. The sample might be all babies born on 7th of May in anyof the years.

    http://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rparamertshttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rparamertshttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rparamertshttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rparamerts
  • 7/31/2019 Introduction to Statistical Thinking for Decision Making

    11/13

    An experiment is any process or study which results in the collection of data, theoutcome of which is unknown. In statistics, the term is usually restricted tosituations in which the researcher has control over some of the conditions underwhich the experiment takes place.

    Example: Before introducing a new drug treatment to reduce high blood pressure,the manufacturer carries out an experiment to compare the effectiveness of thenew drug with that of one currently prescribed. Newly diagnosed subjects arerecruited from a group of local general practices. Half of them are chosen atrandom to receive the new drug, the remainder receives the present one. So, theresearcher has control over the subjects recruited and the way in which they areallocated to treatment.

    Design of experiments is a key tool for increasing the rate of acquiring newknowledge. Knowledge in turn can be used to gain competitive advantage,shorten the product development cycle, and produce new products and processes

    which will meet and exceed your customer's expectations.

    Primary data and Secondary data sets: If the data are from a plannedexperiment relevant to the objective(s) of the statistical investigation, collected bythe analyst, it is called a Primary Data set. However, if some condensed recordsare given to the analyst, it is called a Secondary Data set.

    Random Variable: A random variable is a real function (yes, it is called"variable", but in reality it is a function) that assigns a numerical value to eachsimple event. For example, in sampling for quality control an item could bedefective or non-defective, therefore, one may assign X=1, and X = 0 for a

    defective and non-defective item, respectively. You may assign any other twodistinct real numbers, as you wish; however, non-negative integer randomvariables are easy to work with. Random variables are needed since one cannot doarithmetic operations on words; the random variable enables us to computestatistics, such as average and variance. Any random variable has a distribution ofprobabilities associated with it.

    Probability: Probability (i.e., probing for the unknown) is the tool used foranticipating what the distribution of data should look like under a given model.Random phenomena are not haphazard: they display an order that emerges only inthe long run and is described by adistribution. The mathematical description ofvariation is central to statistics. The probability required forstatistical inferenceisnot primarily axiomatic or combinatorial, but is oriented toward describing datadistributions.

    Sampling Unit: A unit is a person, animal, plant or thing which is actuallystudied by a researcher; the basic objects upon which the study or experiment isexecuted. For example, a person; a sample of soil; a pot of seedlings; a zip codearea; a doctor's practice.

    Parameter: A parameter is an unknown value, and therefore it has to beestimated. Parameters are used to represent a certain population characteristic. For

    example, the population mean is a parameter that is often used to indicate theaverage value of a quantity.

    Within a population, a parameter is a fixed value that does not vary. Each sampledrawn from the population has its own value of any statistic that is used toestimate this parameter. For example, the mean of the data in a sample is used to

    give information about the overall mean in the population from which thatsample was drawn.

    Statistic: A statistic is a quantity that is calculated from a sample of data. It isused to give information about unknown values in the corresponding population.

    For example, the average of the data in a sample is used to give information aboutthe overall average in the population from which that sample was drawn.

    http://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rdensityDisthttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rdensityDisthttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rdensityDisthttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstatInferentiahttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstatInferentiahttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstatInferentiahttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rstatInferentiahttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rdensityDist
  • 7/31/2019 Introduction to Statistical Thinking for Decision Making

    12/13

    A statistic is a function of an observable random sample. It is therefore anobservablerandom variable. Notice that, while a statistic is a"function" ofobservations, unfortunately, it is commonly called a random"variable" not afunction.

    It is possible to draw more than one sample from the same population, and thevalue of a statistic will in general vary from sample to sample. For example, theaverage value in a sample is a statistic. The average values in more than onesample, drawn from the same population, will not necessarily be equal.

    Statistics are often assigned Roman letters (e.g. and s), whereas the equivalentunknown values in the population (parameters ) are assigned Greek letters (e.g.,

    , ).

    The word estimate means to esteem, that is giving a value to something. Astatistical estimate is an indication of the value of an unknown quantity based on

    observed data.

    More formally, an estimate is the particular value of an estimator that is obtainedfrom a particular sample of data and used to indicate the value of a parameter.

    Example: Suppose the manager of a shop wanted to know , the meanexpenditure of customers in her shop in the last year. She could calculate theaverage expenditure of the hundreds (or perhaps thousands) of customers who

    bought goods in her shop; that is, the population mean . Instead she could use

    an estimate of this population mean by calculating the mean of a representativesample of customers. If this value were found to be $25, then $25 would be her

    estimate.

    There are two broad subdivisions of statistics: Descriptive Statistics andInferential Statistics as described below.

    Descriptive Statistics: The numerical statistical data should be presented clearly,concisely, and in such a way that the decision maker can quickly obtain theessential characteristics of the data in order to incorporate them into decisionprocess.

    The principal descriptive quantity derived from sample data is the mean ( ),

    which is the arithmetic average of the sample data. It serves as the most reliablesingle measure of the value of a typical member of the sample. If the samplecontains a few values that are so large or so small that they have an exaggeratedeffect on the value of the mean, the sample is more accurately represented by themedian -- the value where half the sample values fall below and half above.

    The quantities most commonly used to measure the dispersion of the values abouttheir mean are the variance s2 and its square root, the standard deviation s. Thevariance is calculated by determining the mean, subtracting it from each of thesample values (yielding the deviation of the samples), and then averaging thesquares of these deviations. The mean and standard deviation of the sample are

    used as estimates of the corresponding characteristics of the entire group fromwhich the sample was drawn. They do not, in general, completely describe thedistribution (Fx) of values within either the sample or the parent group; indeed,different distributions may have the same mean and standard deviation. They do,however, provide a complete description of the normal distribution, in whichpositive and negative deviations from the mean are equally common, and smalldeviations are much more common than large ones. For a normally distributed setof values, a graph showing the dependence of the frequency of the deviationsupon their magnitudes is a bell-shaped curve. About 68 percent of the values willdiffer from the mean by less than the standard deviation, and almost 100 percentwill differ by less than three times the standard deviation.

    http://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rrandomvahttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rrandomvahttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rrandomvahttp://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rrandomva
  • 7/31/2019 Introduction to Statistical Thinking for Decision Making

    13/13