What is Independent-Measure ANOVA?
- It is a hypothesis-testing procedure that is used to evaluate mean differences between two or more treatments (or populations).
Review the following concepts before starting this session:
- Variability
- Sum of squares
- Sample variance
- Degrees of freedom
- Hypothesis testing
- Independent-measures t statistic
Analysis of variance (ANOVA)
- ANOVA uses sample data as the basis for drawing general conclusions about populations.
- t-tests are limited to situations in which there are only two treatments to compare.
- ANOVA can be used to compare two or more treatments.
- There are three populations (Population 1, Population 2 & Population 3).
- A treatment is applied to each population (Treatment 1, Treatment 2 & Treatment 3).
- After treatment, the population means are not known (µ1 = ?, µ2 = ?, µ3 = ?).
- Three samples (Sample 1, Sample 2 & Sample 3) are drawn from the populations.
- These samples represent the three populations and are used to draw conclusions about the treatment effect on the populations.
- The objective of the analysis is to decide whether there are mean differences among the three populations; in other words, whether the treatment has a significant effect or not.
Terminology in Analysis of Variance
Factors
- An assignable cause that affects the response (test result).
- Factors can be quantitative (temperature, pressure, applied voltage, holding pressure, processing time, cooling time) or qualitative (type of material, presence of a catalyst).
Levels
- The specific value/setting of a factor is called a level.
- A factor can be set to any value within its operating range; each such value is a level.
Example: For plastic moulding of a part, the following are the factors and levels.
To produce plastic parts, the factors (machine/process parameters) can take any value between Level 1 and Level 2:
- Material melt temperature can be set between 210 and 290 ℃.
- Mould temperature can be set between 30 and 70 ℃.
- Injection pressure can be set between 700 and 1500 bar.
- Holding pressure (as % of the injection pressure) can be set between 30% and 60%.
- Back pressure can be set between 50 and 200 bar.
- A factor can take any value within its operating range.
- Operating range: the values from Level 1 to Level 2.
Example: Material melt temperature operating range: from 210 ℃ to 290 ℃.
The process can be set at 210, 215, 230, 250, 260, … up to 290 ℃.
Statistical Hypotheses for ANOVA
- The null hypothesis (H0) states that the treatment has no effect:
H0 : μ1 = μ2 = μ3
- The alternative hypothesis (H1) states that the treatment has an effect, i.e. at least one population mean differs from the others. For example:
H1 : μ1 ≠ μ2 ≠ μ3 (all three means are different)
H1 : μ1 = μ3, but μ2 is different
Type I Errors and Multiple-Hypothesis Tests
- Type I error: rejecting H0 when it is actually true (there is no real difference).
- In a hypothesis test, the alpha level determines the risk of a Type I error.
- With α = .05, there is a 5% (1-in-20) risk of rejecting a true null hypothesis.
Test-wise alpha level: Type I error
- The test-wise alpha level is the risk of a Type I error, or alpha level, for an individual hypothesis test.
Experiment-wise alpha level: Type I error
- When an experiment involves several different hypothesis tests, the experiment-wise alpha level is the total probability of a Type I error that is accumulated from all of the individual tests in the experiment.
- The experiment-wise alpha level is substantially greater than the value of alpha used for any one of the individual tests.
Suppose an experiment with three treatments compares the mean differences with separate t-tests:
- Test 1 compares treatment I vs. treatment II , uses α = .05
- Test 2 compares treatment I vs. treatment III , uses α = .05
- Test 3 compares treatment II vs. treatment III , uses α = .05
- 5% risk of a Type I error for the first test
- 5% risk of a Type I error for the second test
- 5% risk of a Type I error for the third test
- The three separate tests accumulate to produce a relatively large experiment-wise alpha level.
- ANOVA performs all three comparisons simultaneously in one hypothesis test.
- ANOVA uses one test with one alpha level to evaluate the mean differences and it avoids the problem of accumulation of experiment-wise alpha level.
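The accumulation described above can be quantified: if the three tests were independent, the experiment-wise alpha level would be 1 – (1 – α)^k for k tests. A minimal Python sketch (the function name is illustrative):

```python
# Experiment-wise alpha for k independent hypothesis tests,
# each run at the same test-wise alpha level (illustrative helper).
def experimentwise_alpha(alpha: float, k: int) -> float:
    # P(at least one Type I error) = 1 - P(no Type I error in any test)
    return 1 - (1 - alpha) ** k

# Three pairwise t-tests at alpha = .05, as in the example above:
print(round(experimentwise_alpha(0.05, 3), 4))  # -> 0.1426
```

So three tests at α = .05 accumulate to roughly a 14% experiment-wise risk, nearly three times the test-wise level.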
Steps of Analysis of Variance
The goal of ANOVA is to find out whether a treatment effect exists.
Between-Treatments Variance
- Between-treatment variance simply measures how much difference exists between the treatment conditions.
- There are two possible explanations for these between-treatment differences:
- The differences between treatments are not caused by any treatment effect but are simply the naturally occurring, random and unsystematic differences that exist between one sample and another.
- The differences between treatments have been caused by the treatment effects.
- The term between treatments refers to differences from one treatment to another. With three treatments, for example, we are comparing three different means (or totals) and have df = 3 – 1 = 2
Within-Treatments Variance
- Differences that exist within a treatment represent the random and unsystematic differences that occur when there is no treatment effect causing the responses to differ.
- The term within treatments refers to differences that exist inside the individual treatment conditions. Thus, we compute SS and df inside each of the separate treatments
- Thus, the within-treatments variance provides a measure of how big the differences are when H0 is true.
Total Variance
- Total variability contains two components: variance between treatments and variance within treatments.
Total variance = variance between treatments + variance within treatments
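This additive partition can be verified numerically. A minimal Python sketch using small hypothetical treatment samples (the data are made up for illustration):

```python
# Partitioning total variability into between-treatments and
# within-treatments components, using small hypothetical samples.
samples = [[3, 5, 4], [6, 8, 7], [9, 11, 10]]  # three treatment groups

all_scores = [x for group in samples for x in group]
grand_mean = sum(all_scores) / len(all_scores)

# SS_total: squared deviation of every score from the grand mean
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

# SS_between: group size times squared deviation of each group mean
# from the grand mean
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                 for g in samples)

# SS_within: squared deviations of scores from their own group mean
ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g)
                for g in samples)

print(ss_total, ss_between, ss_within)  # -> 60.0 54.0 6.0
assert abs(ss_total - (ss_between + ss_within)) < 1e-9
```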
The F-Ratio: The Test Statistic for ANOVA
In the F-ratio, the between-treatments variance is compared with the within-treatments variance. For the independent-measures ANOVA:
- The denominator of the F-ratio measures only random and unsystematic variability; it is called the error term.
- The error term provides a measure of the variance caused by random, unsystematic differences.
- The numerator of the F-ratio includes unsystematic variability as in the error term, but it also includes any systematic differences caused by the treatment effect.
- When the treatment effect is zero (H0 is true), the error term measures the same sources of variance as the numerator of the F-ratio, so the value of the F-ratio is expected to be nearly equal to 1.00.
Formula for ANOVA
- Before introducing the individual formulas, let us look at the general structure of the procedure and the organization of the calculations.
- The final calculation for ANOVA is the F-ratio, which is composed of two variances:
- Two variances used in the F-ratio are calculated using the basic formula for sample variance.
Sample variance:
s² = SS / df
Degrees of Freedom calculation for ANOVA
Total Degrees of Freedom, dftotal
- The SStotal value measures the variability for the entire set of N scores.
- dftotal = N – 1
Within-Treatments Degrees of Freedom, dfwithin
- For the SSwithin calculation, the SS inside each treatment is found first, and then these values are added together. Each treatment's SS has df = n – 1.
- When these individual treatment values are added together, we obtain
dfwithin = Σ (n – 1)
= Σ df inside each treatment
Between-Treatments Degrees of Freedom, dfbetween
- SSbetween measures the variability among the set of treatment totals (or means).
- With k treatments:
dfbetween = k – 1
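The three df rules can be checked together on a hypothetical layout (the group sizes below are made up for illustration):

```python
# Degrees of freedom for an independent-measures ANOVA with
# hypothetical group sizes (k = 3 treatments; n = 5, 6, 6).
group_sizes = [5, 6, 6]
N = sum(group_sizes)      # total number of scores (17)
k = len(group_sizes)      # number of treatments (3)

df_total = N - 1                              # 16
df_within = sum(n - 1 for n in group_sizes)   # sum of (n - 1) = 14
df_between = k - 1                            # 2

# The two components add back up to the total df
assert df_total == df_between + df_within
print(df_total, df_between, df_within)  # -> 16 2 14
```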
ANOVA Summary
- The result of the ANOVA is presented in a summary table. The table shows:
- the source of variability (between treatments, within treatments, and total), SS, df, MS, and F
- Refer to the example below.
F-Ratio calculation through Mean Squares
MSbetween = SSbetween / dfbetween
MSwithin = SSwithin / dfwithin
F = MSbetween / MSwithin
F-Ratio distribution
- The distribution of F-ratios is the set of F values that can be obtained when the null hypothesis is true.
- F-ratios are computed from two variances (the numerator and denominator of the ratio). Because variance is always positive, the F value is always a positive number.
- When H0 is true, the numerator and denominator of the F-ratio measure the same variance, so the two sample variances are about the same size and the F-ratio is expected to be near 1.00.
- F-Ratio Curve is not symmetrical but skewed to the right.
- There is a different curve for each set of dfs.
- The F statistic is greater than or equal to zero.
- As the degrees of freedom for the numerator and for the denominator get larger, the curve approximates the normal.
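These properties can be explored numerically, assuming SciPy is available (scipy.stats.f implements the F distribution):

```python
# Critical values of the F distribution (requires SciPy).
from scipy.stats import f

# Right-tail critical value for alpha = .05 with df = (2, 14)
crit = f.ppf(0.95, dfn=2, dfd=14)
print(round(crit, 2))  # -> 3.74

# A different curve exists for each pair of dfs: the alpha = .05
# critical value shrinks as the denominator df grows
for dfd in (5, 14, 100):
    print(dfd, round(f.ppf(0.95, dfn=2, dfd=dfd), 2))
```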
F-Ratio Table
ANOVA Treatment Effect: Measuring Effect Size
ANOVA: Percentage of variance
- The percentage of variance accounted for by the treatment effect is usually called η² (the Greek letter eta, squared) rather than r²: η² = SSbetween / SStotal.
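As a sketch, η² is simply the ratio of the between-treatments SS to the total SS; the numbers below are taken from the worked plant example later in this article:

```python
# Eta squared: proportion of total variability accounted for
# by the treatment effect.
def eta_squared(ss_between: float, ss_total: float) -> float:
    return ss_between / ss_total

# SS values from the worked plant example later in this article:
print(round(eta_squared(197, 674), 3))  # -> 0.292
```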
Assumptions for the Independent-Measures ANOVA
- Within each sample, observations must be independent
- Samples should be selected from populations having a normal distribution.
- And these populations must have equal variances (homogeneity of variance).
Post hoc tests for ANOVA
- If the ANOVA rejects the null hypothesis, additional analyses are conducted to determine exactly which treatments are significantly different and which are not.
- These analyses are called post hoc tests.
- Two common post hoc tests are Tukey's HSD test and the Scheffé test.
Tukey’s Honestly Significant Difference (HSD) Test
- This test computes a single value that determines the minimum difference between treatment means that is necessary for significance.
- This value is called the honestly significant difference, or HSD:
HSD = q √(MSwithin / n)
- It is used to compare any two treatment conditions.
- If the mean difference exceeds Tukey's HSD, we conclude that there is a significant difference between the treatments.
- q is taken from the Studentized Range Statistic (q) table, using dfwithin, the number of treatments, and the α level.
- n is the sample size (number of observations) in each treatment.
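A minimal Python sketch of the HSD computation, with illustrative values (q = 3.77 is the Studentized Range table entry for k = 3 treatments, df = 12, α = .05; MSwithin and n are made up):

```python
import math

# Tukey's HSD: the minimum difference between two treatment means
# needed for significance.  q is read from the Studentized Range
# table for (number of treatments, df_within, alpha).
def tukey_hsd(q: float, ms_within: float, n: int) -> float:
    return q * math.sqrt(ms_within / n)

# Illustrative values: q = 3.77 (k = 3, df = 12, alpha = .05);
# MS_within = 4.0 and n = 5 are hypothetical.
print(round(tukey_hsd(3.77, 4.0, 5), 2))  # -> 3.37
```

Any pair of treatment means differing by more than 3.37 would be declared significantly different under these assumptions.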
Treatment effect: Tukey’s HSD
Studentized Range Statistics (q)
The Scheffé Test
- This test uses an F-ratio to evaluate the significance of the difference between any two treatment conditions.
- The numerator of the F-ratio is an MSbetween treatments that is calculated using only the two treatments under comparison.
- The denominator is the same MSwithin that was used for the overall ANOVA.
- If F is calculated > F table, then there is a significant difference between these two treatments.
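A minimal Python sketch of this computation (the function and the numbers are illustrative, not from the article's data):

```python
# Scheffé test sketch: SS between the two treatments being compared
# is divided by the OVERALL df_between (k - 1), which is what makes
# the test conservative; the denominator reuses the overall MS_within.
def scheffe_f(ss_between_pair: float, k: int, ms_within: float) -> float:
    ms_between_pair = ss_between_pair / (k - 1)
    return ms_between_pair / ms_within

# Illustrative numbers: SS for the pair = 20, k = 3 treatments
# overall, MS_within = 2.5 from the full ANOVA
print(round(scheffe_f(20, 3, 2.5), 2))  # -> 4.0
```

The resulting F is then compared against the same critical value used for the overall ANOVA.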
Example: Test Statistic for ANOVA(Independent measures)
Refer to the data table below for the breaking strength of the product received from Plant P, Plant Q & Plant S. Does the product breaking strength vary from plant to plant?
Calculation steps:
(1) Total sum of squares – uncorrected
(2) Sum of squares between groups (between plants) – uncorrected
(3) Correction factor
(4) Total sum of squares – corrected
(5) Sum of squares between groups (between plants) – corrected
Data Table:
Calculation (1): Total sum of squares (uncorrected) = Σx² (each observation squared, then summed)
Calculation (2): Sum of squares between groups (uncorrected) = Σ(T²/n) (each group total T squared, divided by its group size n, then summed)
Calculation (3): Correction factor = G²/N (the grand total G squared, divided by the total number of observations N)
Calculation: Total Sum of squares (Corrected)
= Total Sum of Squares(uncorrected)– Correction factor
= (1) – (3)
= 153719 – 153045
= 674
Calculation: Sum of squares between groups (Corrected)
= Sum of Squares between groups (uncorrected) – Correction factor
= (2) – (3)
= 153242 – 153045
= 197
Calculation: Sum of squares within group (Corrected)
= Total SS – Between groups SS
= (4) – (5)
= 674 – 197
= 477
Calculation: Degrees of freedom
- Total: df = N – 1 = 17 – 1 = 16
- Between plants: df = k – 1 = 3 – 1 = 2
- Within plants: df = dftotal – dfbetween = 16 – 2 = 14
- Critical F-ratio from the table: F(2, 14) = 3.74 at α = .05
ANOVA: Summary table
ANOVA: Decision
- F-ratio from the table: F(2, 14) = 3.74 at α = .05
- The calculated F-ratio is 2.89, which is less than F(2, 14) = 3.74, so we fail to reject H0.
- There is no significant treatment effect.
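The summary numbers can be reproduced in a few lines from the corrected sums of squares computed above (a sketch using MS = SS/df and F = MSbetween/MSwithin):

```python
# Reproducing the ANOVA summary for the plant example from the
# corrected sums of squares computed above.
ss_between, ss_within = 197, 477
df_between, df_within = 2, 14

ms_between = ss_between / df_between   # 98.5
ms_within = ss_within / df_within      # 477 / 14, about 34.07

f_ratio = ms_between / ms_within
print(round(f_ratio, 2))  # -> 2.89

# Critical value F(2, 14) = 3.74 at alpha = .05; since 2.89 < 3.74,
# we fail to reject H0: no significant treatment effect.
assert f_ratio < 3.74
```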