6  Inference

Statistical inference is the process of drawing conclusions about a population from a sample, either by estimating population parameters or by testing hypotheses about them. In many ways, making inferences from a sample is the main point of statistical analysis.

6.1 Estimation

6.1.1 Definitions

Estimation involves inferences about the value of population parameters.

  • Point Estimate : A single-value estimate of a population parameter.

  • Confidence Interval : A range of values calculated from a sample, constructed so that it would contain the population parameter in a given proportion (typically 95%) of repeated samples.

6.1.2 Example

Returning to the tree height example, the average tree height calculated from the sample of 1000 trees (17.5m) is a point estimate for the population mean tree height. The forestry scientists would typically also calculate a 95% confidence interval (CI) from their sample. The confidence interval depends on the mean, the standard deviation and the size of their sample. In this case, they determined the 95% CI to be [16.9m, 18.1m]. This means that if the sampling were repeated many times, about 95% of the intervals calculated in this way would contain the true population mean height.
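
The repeated-sampling reading of a confidence interval can be checked with a small simulation. The sketch below is illustrative only: it uses a normal approximation for the interval, and the population values (mean 17.5m, standard deviation 9.7m) are assumptions chosen so that the interval width roughly resembles the example. They are not the forestry scientists' data.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, level=0.95):
    """Normal-approximation confidence interval for a population mean."""
    n = len(sample)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    margin = z * stdev(sample) / math.sqrt(n)   # z times the standard error
    centre = mean(sample)
    return centre - margin, centre + margin


def coverage(true_mean, true_sd, n, repeats, level=0.95, seed=0):
    """Fraction of repeated samples whose interval contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        low, high = mean_confidence_interval(sample, level)
        if low <= true_mean <= high:
            hits += 1
    return hits / repeats


# Hypothetical population (assumed values, not taken from the study).
rate = coverage(true_mean=17.5, true_sd=9.7, n=1000, repeats=500)
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

The printed share should be close to 0.95. Any single interval either contains the true mean or it does not; the 95% describes the procedure over many samples.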

6.2 Hypothesis Testing

Hypothesis testing involves testing a claim (hypothesis) about a population (Library 2023).

6.2.1 Definitions

  • Null Hypothesis, \(H_0\) : The hypothesis that an observed effect is simply due to the randomness of sampling.

  • Alternative Hypothesis, \(H_a\) : The hypothesis that an observed effect is a real feature of the population being studied.

  • Statistical Test : A test that determines if a sample provides enough evidence to reject the null hypothesis.

  • Test Assumptions : Statistical tests are based on assumptions about the data that need to be satisfied in order for the test to be valid. The assumptions vary between tests.

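  • p-value : The probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true. A small p-value indicates that the observed data would be unlikely if the null hypothesis were true.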

  • Effect Size : A measure of the size of the phenomenon in question.

  • Statistically Significant : If a statistical test has a p-value less than the specified threshold - typically 5% (0.05) - the result is said to be statistically significant.

  • Power of a Test : The probability that a test rejects the null hypothesis when an effect is real. Power increases with the sample size and with the size of the effect to be detected.
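
The link between power, sample size and effect size can be made concrete with a short calculation. The sketch below assumes a two-sided, two-sample z-test with equal group sizes and a standardised effect size (difference in means divided by the common standard deviation). The function name and the example values are illustrative assumptions.

```python
import math
from statistics import NormalDist


def two_sample_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the difference in means divided by the common
    standard deviation; both groups contain n_per_group observations.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the effect is real.
    shift = effect_size * math.sqrt(n_per_group / 2)
    # Probability that the statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Power for a medium-sized effect at several (hypothetical) group sizes.
for n in (20, 64, 200):
    print(n, round(two_sample_power(0.5, n), 3))
```

With the effect size held fixed, power rises as the group size grows. With an effect size of zero, the function returns the significance level, which is the chance of rejecting a true null hypothesis.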

6.2.2 Example

Based on the observational study, the medical researchers in the eczema treatment experiment developed the hypothesis that once or twice daily treatment with 10mL of the new medication would reduce the extent of eczema. Stated formally, their hypotheses for the statistical test were:

\(H_0\) : There is no difference in the population mean area of eczema between the treatment groups.

\(H_a\) : There is a difference in the population mean area of eczema between at least two of the treatment groups.

They estimated the size of the difference in means (effect size) they expected between the treatment groups and used this information to determine the sample size for each group. After checking that their data satisfied the assumptions for the chosen statistical test (One-way ANOVA), the test was applied and returned a significant p-value at the 5% level (p = 0.014). Since differences at least this large would occur only 1.4% of the time if the null hypothesis were true, they rejected the null hypothesis and concluded that at least one of the treatments gave a statistically significant reduction in eczema when compared with the non-treatment (control) group. Further statistical tests found no statistically significant difference between the once-daily and twice-daily groups (p = 0.73). Based on their experiment, they reported that 10mL once daily of the new medication was an effective treatment for eczema.
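
To show what a three-group comparison of means involves, the sketch below computes the One-way ANOVA F statistic and estimates a p-value by permutation. This is a stand-in technique: the researchers' test obtains its p-value from the F distribution, whereas here group labels are shuffled and the p-value is the share of shuffles whose F statistic is at least as large as the observed one. The eczema areas are simulated, hypothetical values, not the study's data.

```python
import random
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F: between-group variance over within-group variance."""
    values = [x for g in groups for x in g]
    grand = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = 0.0
    for g in groups:
        m = mean(g)
        ss_within += sum((x - m) ** 2 for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_p_value(groups, n_permutations=2000, seed=0):
    """Estimate the p-value by shuffling observations between groups."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            extreme += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Simulated eczema areas (hypothetical values, arbitrary units).
data_rng = random.Random(1)
control = [data_rng.gauss(30, 6) for _ in range(30)]
once_daily = [data_rng.gauss(22, 6) for _ in range(30)]
twice_daily = [data_rng.gauss(22, 6) for _ in range(30)]

p = permutation_p_value([control, once_daily, twice_daily])
print(f"F = {f_statistic([control, once_daily, twice_daily]):.2f}, p = {p:.4f}")
```

A small p-value here says only that at least one group mean differs. Identifying which groups differ requires follow-up comparisons, as in the example above.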

6.2.3 Common Statistical Tests

Table 6.1 lists some of the common statistical tests for both normally distributed and non-normally distributed data. Tests are chosen depending on their purpose.

Table 6.1: Common statistical tests. Adapted from The Statistics Tutor’s Quick Guide to Commonly Used Statistical Tests, pp.10-11 (Statistics Support for Students - Www.statstutor.ac.uk, n.d.)
| Purpose | Dependent Variable | Independent Variable | Parametric Test (normal) | Non-parametric Test (skewed or ordinal) |
|---|---|---|---|---|
| Comparing means of two independent groups | Discrete or continuous | Nominal (two groups) | Independent T-test | Mann-Whitney Test; Wilcoxon Rank Sum |
| Comparing means of two paired (before / after) groups | Discrete or continuous | Ordinal (two groups) | Paired T-test | Wilcoxon Signed Rank Test |
| Comparing means of 3+ independent groups | Discrete or continuous | Nominal (three or more groups) | One-way ANOVA | Kruskal-Wallis Test |
| Relationship between two continuous variables | Continuous | Continuous | Pearson’s Correlation Coefficient | Spearman’s Correlation Coefficient |
| Expected counts for one qualitative variable | Qualitative | None | Not applicable | Chi-squared Test |
| Relationship between two qualitative variables | Qualitative | Qualitative | Not applicable | Chi-squared Test |