
Synopsis of CBU Graduate Statistics Course

  1. The Anatomy of Statistics: Models, Hypotheses, Significance and Power
    • The Naming of Parts
    • Experiments and Data
    An Experiment E is prescribed by a Method. When the Experiment E is performed, Data X are observed. Repeated performance of E may produce Data which vary.
    • Models and Parameters

A Model describes how Data arise, by identifying Systematic and Unexplained components.

Data = Systematic + Unexplained.

The Systematic and Unexplained components are linked together through one or more Parameters via a Probability formulation.

Parameters can relate either to the Systematic components (e.g. Mean) or to the Unexplained components (e.g. Standard Deviation, Variance, Degrees of Freedom).
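As a concrete illustration of this decomposition, here is a minimal sketch in Python (the data values are hypothetical, chosen only for illustration): a small sample is split into a Systematic component, estimated by the Mean, and an Unexplained component of residuals, summarised by the Standard Deviation.

    from statistics import mean, stdev

    # Hypothetical reaction-time data (ms); purely illustrative numbers
    data = [512, 498, 530, 475, 541, 503, 488, 519]

    systematic = mean(data)                        # Parameter for the Systematic component
    unexplained = [x - systematic for x in data]   # residuals: the Unexplained component

    # Data = Systematic + Unexplained, element by element
    assert all(abs(x - (systematic + u)) < 1e-9 for x, u in zip(data, unexplained))

    print(f"Mean (Systematic parameter):             {systematic:.1f} ms")
    print(f"SD of residuals (Unexplained parameter): {stdev(unexplained):.1f} ms")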

• Probability

Probability is a fundamental concept which is difficult to define. There are divergent theories on what it means, though there is common agreement on the calculus it obeys. Intuitions can easily lead one astray, as the Birthdays Paradox and the Monty Hall Puzzle illustrate.

• The Monty Hall Puzzle

Three doors lead to 1 Valuable Prize (VP) and 2 Worthless Prizes (WP). The Game Show Host (GSH) asks you to pick one. You select a door. The GSH opens one of the other doors, revealing a WP. He asks if you want to stick with the door you picked or change to the other unopened door. What should you do? Stick, or switch?

The optimal solution turns out to be: always switch. The probability that the chosen door has the VP inside was 1/3 initially, and nothing has been done to change that. However, the opening of the door to a WP has moved all the remaining probability (2/3) to the other unopened door. (A simulation sketch is given at the end of this subsection.)

• Probability vs. Statistics

• Hypotheses and Inference

In the running example, the Data are X = 13 recoveries observed in 16 trials, and p is the underlying recovery probability. What can we say about the 'true' value of p? Point estimates and Confidence Intervals? Are the Data compatible with the 'true' value of p being, say, 0.75? Is the weight of the evidence sufficient for us to prefer to say that p = 0.75 rather than that p = 0.5?
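To make the switching argument tangible, here is a minimal simulation sketch of the Monty Hall game (a sketch added for illustration, not part of the original slides); over many plays, sticking wins about 1/3 of the time and switching about 2/3.

    import random

    def monty_hall_trial(switch):
        """Play one round of the Monty Hall game; return True if the player wins the VP."""
        doors = [0, 1, 2]
        prize = random.choice(doors)    # door hiding the Valuable Prize
        choice = random.choice(doors)   # player's initial pick
        # The GSH opens a door that is neither the player's pick nor the prize
        opened = random.choice([d for d in doors if d != choice and d != prize])
        if switch:
            # Switch to the remaining unopened door
            choice = next(d for d in doors if d != choice and d != opened)
        return choice == prize

    trials = 100_000
    for switch in (False, True):
        wins = sum(monty_hall_trial(switch) for _ in range(trials))
        print(f"switch={switch}: win rate = {wins / trials:.3f}")
    # Prints roughly 0.333 for sticking and 0.667 for switching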

• The weight of the evidence

We will step through a sequence of possible values of p, looking to see how our data X = 13 look under each.

[Slides show the distribution of outcomes for p = 0.1, 0.2, ..., 0.9; X = 13 only begins to show up on the radar at p = 0.4.]

• The Rise and Fall of Probability

The probability of the Data (13/16 recovered) rises and falls as p moves from near 0 to near 1. Viewed as a function of p, this behaviour is described as the Likelihood Function for p relative to the Data X.

[Graphs show the Likelihood values for the values of p examined so far, then the complete Likelihood Function over all possible values of p.]

• Estimation and Inferences

The Likelihood Function is pivotal in understanding how the Data throw light on the Parameters. The value of p where the Likelihood takes its largest value can be a sensible starting point for estimating p. This is called the Maximum Likelihood Estimate (MLE). Often the MLE is the 'natural' one:

MLE(p) = 13/16 = 0.8125
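The likelihood curve and the MLE can be reproduced numerically; the following sketch (assuming, as the example does, a binomial model for 13 recoveries in 16 trials) steps through the same grid of p values and locates the maximum.

    from math import comb

    n, x = 16, 13   # the running example: 13 recoveries in 16 trials

    def likelihood(p):
        """Binomial probability of the observed Data as a function of p."""
        return comb(n, x) * p**x * (1 - p)**(n - x)

    # Step through the same sequence of candidate values of p as the slides
    for p in [i / 10 for i in range(1, 10)]:
        print(f"p = {p:.1f}  L(p) = {likelihood(p):.4f}")

    # The MLE is where the Likelihood peaks; analytically it is x/n
    grid = [i / 1000 for i in range(1, 1000)]
    mle = max(grid, key=likelihood)
    print(f"MLE(p) = {mle:.3f}  (exact value: {x}/{n} = {x / n})")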

The sharpness of the peak of the Likelihood curve tells us the possible scale of the error in this estimate, and Confidence Intervals can be based on this (a sketch follows below). The relative heights of the curve (Likelihood Ratios) are a principal tool for comparing different Parameter values.
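For instance, one standard likelihood-based recipe (a sketch, not from the original notes; it assumes the usual chi-squared calibration, 2·log LR < 3.84, for an approximate 95% interval) keeps every value of p whose likelihood ratio against the MLE stays above exp(-1.92).

    from math import comb, exp

    n, x = 16, 13

    def likelihood(p):
        return comb(n, x) * p**x * (1 - p)**(n - x)

    l_max = likelihood(x / n)   # likelihood at the MLE
    cutoff = exp(-1.92)         # chi-squared calibration for ~95% coverage

    # Keep the values of p not ruled out by a large likelihood ratio
    inside = [p for p in (i / 1000 for i in range(1, 1000))
              if likelihood(p) / l_max > cutoff]
    print(f"Approximate 95% interval for p: ({min(inside):.3f}, {max(inside):.3f})")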

• Key Questions

Does the experimental manipulation have an effect? To what extent does it have an effect? Does the treatment work? How well does it work? Does behaviour B predict pathology P? How well does it predict it?

• Schools of Statistical Inference

Three founders, three schools: Ronald Aylmer FISHER; Jerzy NEYMAN and Egon PEARSON; Rev. Thomas BAYES.

Fisherian Inference (R A Fisher): Likelihood; P values; Tests of Significance; Null Hypothesis Testing.

Neyman & Pearson Inference (J Neyman and E Pearson): testing between Alternative Hypotheses; Size; Power.

Bayesian Inference (T Bayes): Prior and Posterior Probabilities; revision of beliefs in the light of the data.

• R A Fisher: P values and Significance Tests (1)

Null Hypothesis H0, e.g. H0: p = 0.5. Data may give evidence against H0. Order the possible outcomes in terms of degree of deviation from H0; this may involve a judicious choice of Test Statistic.

• R A Fisher: P values and Significance Tests (2)

The P value is the sum of the probabilities of the possible outcomes of the Experiment at least as extreme (improbable) as the Data. The P value is also known as the Significance Level or the Significance of the Data. (A computational sketch follows.)
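Here is a minimal sketch of that definition for the running example, testing H0: p = 0.5 with X = 13 of 16; it assumes 'at least as extreme' is read as 'no more probable under H0 than the observed outcome'.

    from math import comb

    n, x = 16, 13
    p0 = 0.5   # Null Hypothesis H0: p = 0.5

    def prob(k):
        """P(X = k) under H0 for the binomial Experiment."""
        return comb(n, k) * p0**k * (1 - p0)**(n - k)

    # Sum the probabilities of all outcomes at least as improbable as the Data
    p_obs = prob(x)
    p_value = sum(prob(k) for k in range(n + 1) if prob(k) <= p_obs)
    print(f"P value = {p_value:.4f}")   # about 0.021 here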

• R A Fisher: P values and Significance Tests (3)

Sometimes we quote the actual P value, e.g. P = 0.112. Sometimes we quote the P value relative to conventional values, e.g. P > 0.1, P < 0.01, etc.

• R A Fisher: P values and Significance Tests (4)

Sometimes, especially in Tables, a Baedeker-style starring system operates:

  • * means 0.01 ≤ P < 0.05;
  • ** means 0.001 ≤ P < 0.01;
  • *** means P < 0.001.
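The convention is easy to mechanise; here is a small helper (added for illustration, not from the original notes) mapping a P value to its stars.

    def stars(p):
        """Map an observed P value to the conventional starring system."""
        if p < 0.001:
            return "***"
        if p < 0.01:
            return "**"
        if p < 0.05:
            return "*"
        return ""   # not significant at the 5% level

    for p in (0.0004, 0.003, 0.02, 0.112):
        print(f"P = {p}: {stars(p) or 'n.s.'}")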

• R A Fisher and the Design of Experiments

Fisher's influence on mainstream scientific methodology is enormous. In particular, he created a new science of the Design of Experiments: Factors; Covariates; Interaction; Confounding; Randomization.

• Neyman and Pearson: Significance Tests

Deciding between a Null Hypothesis and an Alternative Hypothesis, e.g. Hnull: p = 0.5 vs. Halt: p = 0.75. Two permitted decisions: Accept Hnull, or Reject Hnull. Two types of Error, described in the next section (and computed in the sketch below).
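The following sketch (an illustration, assuming the binomial setting above) carries out the Neyman-Pearson recipe detailed in the next section: fix Alpha in advance, take the best Rejection Region (which, since Halt > Hnull here, has the form X >= c), and read off the Size and Power.

    from math import comb

    def pmf(k, n, p):
        return comb(n, k) * p**k * (1 - p)**(n - k)

    n = 16
    p_null, p_alt = 0.5, 0.75   # Hnull: p = 0.5 vs. Halt: p = 0.75
    alpha = 0.05                # Type I error rate, fixed in advance

    # The likelihood ratio increases with X, so the optimal Rejection Region
    # is {X >= c}; take the smallest c whose Size does not exceed alpha
    for c in range(n + 1):
        size = sum(pmf(k, n, p_null) for k in range(c, n + 1))
        if size <= alpha:
            break

    power = sum(pmf(k, n, p_alt) for k in range(c, n + 1))
    print(f"Reject Hnull when X >= {c}")   # c = 12 here
    print(f"Size  = {size:.3f}")           # about 0.038
    print(f"Power = {power:.3f}")          # about 0.63, so Beta is about 0.37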

• A Tale of Two Errors

Type I: we incorrectly Reject Hnull although Hnull is correct. Alpha, the Type I error rate, measures these 'False Alarms'; 'Size' = Alpha.

Type II: we incorrectly decide to Accept Hnull although Halt is correct. Beta, the Type II error rate, measures these 'Missed Signals'; 'Power' = 1 - Beta.

• N-P Hypothesis Testing

Fix Alpha (in advance!). Find the Rejection Region with the given Size Alpha and the smallest possible Beta; this is intimately linked to the Likelihood Function (strictly, to Likelihood Ratios). Look to see whether the Data fall in the Rejection Region.

If the Data fall in the Rejection Region, then 'We Reject the Null Hypothesis'; we do not accept the alternative hypothesis. Otherwise, 'We Do Not Reject the Null Hypothesis'; we do not accept the null hypothesis. Alpha fixed in advance gets entangled with the observed P value.

• Conventional Hybrid Inference

Dress things up as Hypothesis Testing, but use observed P values as differential indicators of significance. Be aware, and beware! Read Gerd Gigerenzer (1993), The superego, the ego, and the id in statistical reasoning. In A Handbook for Data Analysis (pp. 311-339). Hillsdale, NJ: Erlbaum.

• How Statistics took over the scientific world

• Size and Power

The Size of various Tests; the Power of various Tests; the Size/Power trade-off; what you buy with larger samples.

• Bayesian Inference

Start with a Prior probability distribution over the space of parameters, expressing prior beliefs. Multiply by the Likelihood for the observed data, yielding the Posterior probability distribution, which expresses revised beliefs having observed the new data. Summaries are then based on the posterior distribution.

• A Tale of Two Bayesians
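A minimal sketch of the Bayesian recipe for the running example (an illustration assuming a flat prior over a grid of p values, not part of the original notes); with 13 recoveries in 16 trials the posterior mean comes out near 0.78.

    from math import comb

    n, x = 16, 13
    grid = [i / 1000 for i in range(1, 1000)]   # candidate values of p

    prior = [1.0] * len(grid)                   # flat prior: no initial preference
    like = [comb(n, x) * p**x * (1 - p)**(n - x) for p in grid]

    # Posterior is proportional to prior x likelihood; normalise over the grid
    unnorm = [pr * li for pr, li in zip(prior, like)]
    total = sum(unnorm)
    posterior = [u / total for u in unnorm]

    post_mean = sum(p * w for p, w in zip(grid, posterior))
    print(f"Posterior mean of p = {post_mean:.3f}")   # about 0.778 with this flat prior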
• Conclusion

The concepts we have outlined are the basis of all the statistical procedures that we use, though we usually have to take the mathematical details on trust. The concepts are not very easy, and efforts made establishing a clear understanding will yield dividends. Used effectively, Statistics are a good support; they can, however, be a soft underbelly for examiners, referees, and journal editors.

• Finding out more ...

Next week Peter WATSON will speak on EDA: Exploratory Data Analysis, the first of the remaining course topics:
  1. Exploratory Data Analysis
  2. Categorical Data Analysis
  3. Simple and multiple linear regression
  4. ANOVA of balanced multi-factorial designs: between subject designs, and single subject studies
  5. The General Linear Model and complex designs including Analysis of Covariance
  6. Power analysis
  7. Repeated Measures and Mixed Model ANOVA
  8. Latent variable models: factor analysis and all that
  9. Post-hoc tests, multiple comparisons, contrasts and handling interactions