probabilistic modeling incomplete history in 0predrag/classes/2015springb555/s2.pdf · • babbage...
CONFLUENCE OF THREE DISCIPLINES
Probability Theory
• mathematical infrastructure for manipulating probabilities, grounded in the axioms of probability
• wide choice of probabilistic models with well-understood theoretical properties
Statistics
• formulation of the process of narrowing down the solutions based on observed data and experience (knowledge and assumptions)
• leads to selection of optimal or acceptable models with respect to data and experience
• formalizes assessing confidence about those solutions
Computer Science
• provides theory, algorithms, software to manage data and compute solutions
• formalizes studies of the tradeoffs between quality of solutions and available resources (time, space, computer architecture)
PROBABILITY THEORY
Blaise Pascal (1623‐1662) Pierre de Fermat (1601‐1665)
The Problem of Points
• Old game, already known in the 15th century
• Introduced to Pascal by Chevalier de Méré, probably in 1654
• Two players agree to toss a coin until someone wins n times. The bets are placed. They play, but the game is interrupted. How should they split the money so it is fair?
• What is the solution?
Fermat to Pascal, August 29, 1654
Monsieur, Our interchange of blows still continues, and I am well pleased that our thoughts are in such complete adjustment as it seems since they have taken the same direction and followed the same road...
Both solved the problem, but in different ways. Fermat’s approach was combinatorial. Pascal introduced an expectation function.
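The two approaches can be sketched in a few lines. This is a minimal illustration, assuming a fair coin and that, at the interruption, player A still needs `a` wins and player B still needs `b` wins; the function names are hypothetical and chosen only for this example.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


def fermat_share(a: int, b: int) -> Fraction:
    """Fermat-style counting: at most a + b - 1 further tosses settle the game.
    Count the equally likely continuations in which A collects at least a wins."""
    n = a + b - 1
    favorable = sum(comb(n, k) for k in range(a, n + 1))
    return Fraction(favorable, 2 ** n)


@lru_cache(maxsize=None)
def pascal_share(a: int, b: int) -> Fraction:
    """Pascal-style recursion: the value of a position is the average
    of the values of the two positions one toss later."""
    if a == 0:
        return Fraction(1)  # A has already won the whole stake
    if b == 0:
        return Fraction(0)  # B has already won the whole stake
    return (pascal_share(a - 1, b) + pascal_share(a, b - 1)) / 2


# A needs 1 more win, B needs 2 more wins: A should receive 3/4 of the stake.
print(fermat_share(1, 2), pascal_share(1, 2))
```

Both functions return A's fair fraction of the stake, and they agree for every pair of remaining wins, which is the sense in which the two solutions are equivalent.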
PROBABILITY THEORY
Jacob Bernoulli (1655‐1705)
Important quotes from Ars Conjectandi:
• “Probability, indeed, is degree of certainty, and differs from the latter as a part differs from the whole. Truly, if complete and absolute certainty, which we represent by the letter a or by 1...”
• “To predict something is to measure its probability. Therefore, we define the science of prediction or stochastics, as the art of measuring probabilities of things as accurately as possible, to the end that, in judgments and actions, we may always choose or follow that which has been found to be better, more satisfactory, safer, or more carefully considered. On this alone turns all the wisdom of the philosopher and all the practical judgment of the statesman.”
The Art of Conjecturing, 1713
Ars Conjectandi
• Discussed probability
• Introduced subjective notion of probability
• Introduced “Bernoulli trials”
• Proved weak law of large numbers
• Introduced “science of prediction”
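The weak law of large numbers for Bernoulli trials can be checked numerically. The sketch below is an illustration only: it computes the exact probability that the observed frequency of successes differs from p by more than a tolerance eps, and the function name and the chosen values of n, p and eps are assumptions made for the example.

```python
from fractions import Fraction
from math import comb


def prob_far_from_p(n: int, p: Fraction, eps: Fraction) -> float:
    """Exact P(|K/n - p| > eps) for K ~ Binomial(n, p).
    Fractions keep the boundary comparison free of rounding error."""
    total = Fraction(0)
    for k in range(n + 1):
        if abs(Fraction(k, n) - p) > eps:
            total += comb(n, k) * p ** k * (1 - p) ** (n - k)
    return float(total)


# The probability of a large deviation shrinks as the number of trials grows.
for n in (10, 100, 1000):
    print(n, prob_far_from_p(n, Fraction(1, 2), Fraction(1, 10)))
```

The printed probabilities decrease toward zero as n increases, which is what the weak law asserts for any fixed tolerance.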
Translation from: Encyclopedia Stochastikon
PROBABILITY THEORY
Abraham de Moivre (1667‐1754) The Doctrine of Chances, 1738
The Doctrine of Chances
• Introduced the concept of a normal distribution
• Showed that the normal distribution is a limit of the binomial
• Gave the first version of the Central Limit Theorem (proved by Laplace)
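De Moivre's limit can be illustrated by comparing binomial probabilities with the normal density that has the same mean and variance. This is a minimal sketch; the function names and the choice p = 0.5 are assumptions for the example, not something taken from the slides.

```python
from math import comb, exp, pi, sqrt


def binomial_pmf(n: int, k: int, p: float) -> float:
    """P(K = k) for K ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of the normal distribution with the given mean and standard deviation."""
    return exp(-((x - mean) ** 2) / (2 * sd ** 2)) / (sd * sqrt(2 * pi))


def max_abs_gap(n: int, p: float = 0.5) -> float:
    """Largest pointwise gap between the binomial pmf and the matching normal density."""
    mean, sd = n * p, sqrt(n * p * (1 - p))
    return max(abs(binomial_pmf(n, k, p) - normal_pdf(k, mean, sd)) for k in range(n + 1))


# The gap shrinks as n grows.
for n in (10, 100, 1000):
    print(n, max_abs_gap(n))
```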
Shafer, Vovk. The sources of Kolmogorov’s Grundbegriffe. Statistical Science (2006) 21(1): 70-98
If A and B cannot both happen, then: P(A or B) = P(A) + P(B)
PROBABILITY THEORY
• Bernoulli: “A run of a hundred [heads] may be metaphysically possible, but it is physically impossible. It has never happened and never will happen.”
• Probable: probability exceeds half of certainty, e.g. P(A) > 1/2
• Possible: event has a low degree of certainty, e.g. P(A) > 1/20 or 1/30
19th century
• Boltzmann’s second law of thermodynamics claims that dissipative processes are irreversible because the probability of a state with entropy far from the maximum is vanishingly small
• Major players in France, Germany, Russia, Britain (Borel, Fréchet, Lévy, Hadamard, Lebesgue, Gauss, Riemann, von Kries, Ellis, Venn, Kolmogorov, Markov)
20th century
• Lack of clarity and rigor in the probability calculus; Henri Poincaré said “one can hardly give a satisfactory definition of probability”
• David Hilbert’s 6th of 23 open problems, presented at the International Congress of Mathematicians in Paris (1900), was to treat probability axiomatically
PROBABILITY THEORY
Andrey Kolmogorov (1903‐1987)
Grundbegriffe der Wahrscheinlichkeitsrechnung, 1933
Grundbegriffe der Wahrscheinlichkeitsrechnung
• Introduced axioms of probability that stood the test of time
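In modern notation the axioms are usually stated as below. This is the standard textbook formulation, with Ω the sample space, F a collection of events and P the probability measure; these symbols are not taken from the slides. Kolmogorov's 1933 presentation used finite additivity together with a continuity axiom, which jointly are equivalent to the countable additivity shown here.

```latex
\begin{align*}
&\text{(1) Non-negativity:} && P(A) \ge 0 \quad \text{for every event } A \in \mathcal{F},\\
&\text{(2) Normalization:} && P(\Omega) = 1,\\
&\text{(3) Countable additivity:} && P\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \sum_{i=1}^{\infty} P(A_i)
  \quad \text{for pairwise disjoint } A_1, A_2, \ldots \in \mathcal{F}.
\end{align*}
```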
Maurice Fréchet, 1938, introduced him at the colloquium at the University of Geneva with these words:
It was at the moment when Mr. Borel introduced this new kind of additivity into the calculus of probability – in 1909, that is to say – that all the elements needed to formulate explicitly the whole body of axioms of (modernized classical) probability theory came together.
It was not enough to have all the ideas in mind, to recall them now and then; one must make sure that their totality is sufficient, bring them together explicitly, and take responsibility for saying that nothing further is needed in order to construct the theory.
This is exactly what Mr. Kolmogorov did. This is his achievement. (And we do not believe he wanted to claim any others, so far as the axiomatic theory is concerned.)
Shafer, Vovk. The sources of Kolmogorov’s Grundbegriffe. Statistical Science (2006) 21(1): 70-98
STATISTICS
Graunt, John. Natural and political observations mentioned in a following Index, and made upon the Bills of Mortality, 1665.
John Graunt (1620‐1674)
Natural and Political Observations Made upon the Bills of Mortality, 1662 (1663)
Graunt discussed:
• trustworthiness of the data in the “bills” published over a 60‐year period
• description of mortality due to plague, including “imputation” of missing data
• detailed description and analysis of the gender ratio, discovered stability
• provided a “life table” to answer the question of how many men of fighting age lived in London
Follow ups on Graunt’s work:
• John Arbuthnot tested the hypothesis that the ratio of men vs. women was 1
• Christiaan Huygens calculated the expected and median lifetime
STATISTICS
Fienberg. A brief history of statistics in three and one-half chapters: a review essay. Statistical Science (1992) 7(2): 208-225.
STATISTICS
Thomas Bayes (1701‐1761)
Pierre‐Simon Laplace (1749‐1827)
Inverse Probability:
• Both Bayes and Laplace understood it, but Bayes died before publishing his work
• Bayes was first, Laplace went further; he was only 25 when he repeated Bayes’s work
• In the “Bayesian” sense, both used uniform priors
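Inverse probability with a uniform prior can be sketched numerically. The example below is an illustration under stated assumptions: it treats an unknown success probability theta with a uniform prior, approximates the posterior mean on a grid, and compares it with the closed form (s + 1) / (n + 2), known as Laplace's rule of succession. The function names and the sample counts are hypothetical.

```python
from fractions import Fraction


def posterior_mean_grid(successes: int, trials: int, grid_size: int = 10001) -> float:
    """Posterior mean of theta on an evenly spaced grid.
    With a uniform prior the posterior is proportional to the likelihood."""
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    weights = [t ** successes * (1 - t) ** (trials - successes) for t in thetas]
    total = sum(weights)
    return sum(t * w for t, w in zip(thetas, weights)) / total


def rule_of_succession(successes: int, trials: int) -> Fraction:
    """Closed-form posterior mean under the uniform prior."""
    return Fraction(successes + 1, trials + 2)


# Hypothetical data: 7 successes in 10 trials.
print(posterior_mean_grid(7, 10), float(rule_of_succession(7, 10)))
```

The grid value and the closed form agree closely, and both sit between the observed frequency 7/10 and the prior mean 1/2.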
Laplace’s demon:
• “We may regard the present state of the universe as the effect of its past and the cause of its future. An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all items of which nature is composed, if this intellect were also vast enough to submit these data to analysis, it would embrace in a single formula the movements of the greatest bodies of the universe and those of the tiniest atom; for such an intellect nothing would be uncertain and the future just like the past would be present before its eyes.”
An Essay towards solving a Problem in the Doctrine of Chances, 1763
STATISTICS
Carl Friedrich Gauss (1777‐1855)
The Method of Least Squares:
• search for a statistical approach for combining observations
• The data were most frequently observations of planetary positions, orbits, geodesic arcs
• Laplace’s approach was ad hoc, people tried to improve it
• Legendre came up with the method of least squares in 1805, but quantification of uncertainty was missing
• Gauss used normally distributed error terms for a system of linear equations, then maximized the posterior distribution and showed this was the same as Legendre’s method
• Laplace recognized that a normal distribution is important in itself and proved the central limit theorem
• “Gauss‐Laplace synthesis” is the foundation of modern statistics
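The link between normal errors and least squares can be sketched for a straight-line fit. This is a minimal illustration, assuming independent normal errors with a fixed standard deviation and a flat prior on the coefficients, so that maximizing the posterior is the same as maximizing the Gaussian log-likelihood. The data and function names are hypothetical.

```python
from math import log, pi


def least_squares_line(xs, ys):
    """Closed-form least-squares intercept and slope for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def gaussian_log_likelihood(xs, ys, intercept, slope, sigma=1.0):
    """Log-likelihood under independent normal errors.
    It equals a constant minus (sum of squared residuals) / (2 * sigma^2)."""
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return -0.5 * len(xs) * log(2 * pi * sigma ** 2) - sse / (2 * sigma ** 2)


# Hypothetical observations.
XS = [0.0, 1.0, 2.0, 3.0, 4.0]
YS = [1.1, 2.9, 5.2, 7.1, 8.8]

intercept, slope = least_squares_line(XS, YS)
best = gaussian_log_likelihood(XS, YS, intercept, slope)
nudged = gaussian_log_likelihood(XS, YS, intercept + 0.1, slope - 0.05)
print(intercept, slope, best > nudged)
```

Because the log-likelihood depends on the coefficients only through the sum of squared residuals, any move away from the least-squares solution lowers it.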
But Bayesian statistics was not the only way:
• In 1778 Daniel Bernoulli (nephew of Jacob, son of Johann) proposed a method that can be considered a sketch of the maximum likelihood method
• Leonhard Euler wrote an appended commentary to the article criticizing this method as “arbitrary”; ML gets forgotten in light of success of Laplace and Gauss
• Maximum likelihood was later rediscovered in the 20th century, popularized by Fisher
STATISTICS
Fienberg. A brief history of statistics in three and one-half chapters: a review essay. Statistical Science (1992) 7(2): 208-225.
by Adolphe Quetelet
COMPUTER SCIENCE
George Boole (1815‐1864)
Gottfried Wilhelm Leibniz (1646‐1716)
Binary system as basis for computing:
• Leibniz: development of formal logic; advocated binary system for performing calculations (0‐1 or on‐off)
• George Boole: published “Boolean algebra” in 1854
Prehistory:
• people wanted to have devices for counting since forever (e.g. abacus)
• But, those were not general purpose computing machines
COMPUTER SCIENCE
Ada King (1815‐1852), Countess of Lovelace
Charles and Ada:
• Babbage had an idea to construct a machine that can compute anything (called it “Analytical Engine”)
• Ada Lovelace constructed the first program to compute Bernoulli numbers on the Analytical Engine. Considered the first programmer.
• Computer language Ada named after Ada Lovelace
Charles Babbage (1791‐1871)
Mechanical devices:
• Pascal: built the first mechanical adding machine in 1642 (apparently described by Hero of Alexandria)
• Babbage: began constructing a “Difference Engine” in 1822
Part of the Difference Engine assembled by Babbage’s son. Actual Difference Engine constructed from Babbage’s design.
COMPUTER SCIENCE
David Hilbert (1862‐1943)
Computability:
• Turing machine: represents a computing machine, can do what any other computing machine can; the machine can compute any function that can be expressed as an algorithm (Church‐Turing thesis)
• Recursion, lambda calculus and Turing machines are equivalent in terms of representing a class of functions
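A Turing machine is simple enough to simulate in a few lines. The sketch below is an illustration only: the transition-table format, the state names and the example machine (binary increment) are assumptions made for this example. The step budget is there because, as the halting problem shows, no general procedure can decide in advance whether an arbitrary machine will stop.

```python
def run_turing_machine(transitions, tape, state="start", blank="_", max_steps=10_000):
    """Run a single-tape Turing machine until it enters the state 'halt'.
    transitions maps (state, symbol) to (symbol_to_write, move, next_state),
    where move is 'L', 'R' or 'N' (stay in place)."""
    cells = dict(enumerate(tape))  # sparse tape: position -> symbol
    head, steps = 0, 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("step budget exhausted before halting")
        symbol = cells.get(head, blank)
        write, move, state = transitions[(state, symbol)]
        cells[head] = write
        head += {"L": -1, "R": 1, "N": 0}[move]
        steps += 1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)


# Example machine: add 1 to a binary number written most significant bit first.
INCREMENT = {
    ("start", "0"): ("0", "R", "start"),  # scan right to the end of the input
    ("start", "1"): ("1", "R", "start"),
    ("start", "_"): ("_", "L", "carry"),  # step back onto the last digit
    ("carry", "1"): ("0", "L", "carry"),  # 1 + carry gives 0, keep carrying
    ("carry", "0"): ("1", "N", "halt"),   # 0 + carry gives 1, done
    ("carry", "_"): ("1", "N", "halt"),   # ran off the left edge: new leading 1
}

print(run_turing_machine(INCREMENT, "1011"))
```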
Hilbert’s Entscheidungsproblem:
• Is mathematics decidable? Is there a mechanical method that can be applied to any mathematical assertion and eventually tell whether that assertion is true or false?
Alan Turing (1912‐1954)
Kurt Gödel (“Herr Warum”):
• Limits of what could be proved and disproved; Recursion
Alan Turing:
• Halting problem is undecidable
• There is no solution to the “decision problem”
Alonzo Church:
• Lambda calculus
COMPUTER SCIENCE
Von Neumann architecture:
• General purpose computing architecture
• Keeps code and data in the same memory
John von Neumann (1903‐1957)
von Neumann architecture
Konrad Zuse (1910‐1995)
• Turing‐complete electromechanical computer, Z3 (1941)
ENIAC:
• Electronic Numerical Integrator And Computer
• conceived and designed at UPenn
Wikipedia