Pearson Correlation Analysis

What is Correlation?


Prerequisite


Characteristics of a relationship


Form of the Relationship

The Strength or Consistency of the Relationship


Pearson correlation

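Sum of squares

The sum of squares, SS, is the sum of squared deviations of the scores from their mean: SS = Σ(X − M)², or equivalently SS = ΣX² − (ΣX)²/n.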

The sum of Products of Deviation

The definitional formula for the sum of products is

SP = Σ(X − M_X)(Y − M_Y)

or, in computational form,

SP = ΣXY − (ΣX)(ΣY)/n


Formula for the Pearson Correlation

r = SP / √(SS_X · SS_Y)
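As a rough illustration of how SS, SP, and r fit together, the sketch below assumes the standard definitions: SS is the sum of squared deviations, SP is the sum of products of deviations, and r = SP / √(SS_X · SS_Y). The function names and the five pairs of scores are made up for this example.

```python
import math

def sum_of_squares(values):
    """SS: sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def sum_of_products(xs, ys):
    """SP: sum of products of paired deviations (definitional form)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

def pearson_r(xs, ys):
    """r = SP / sqrt(SS_X * SS_Y)."""
    return sum_of_products(xs, ys) / math.sqrt(sum_of_squares(xs) * sum_of_squares(ys))

# Made-up scores for five individuals (illustrative only).
x = [0, 10, 4, 8, 8]
y = [2, 6, 2, 4, 6]

ss_x = sum_of_squares(x)    # 64.0
ss_y = sum_of_squares(y)    # 16.0
sp = sum_of_products(x, y)  # 28.0
r = pearson_r(x, y)         # 28 / sqrt(64 * 16) = 0.875
print(ss_x, ss_y, sp, r)
```

The computational form of SP gives the same value here: ΣXY − (ΣX)(ΣY)/n = 148 − (30)(20)/5 = 28.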

The Pearson Correlation and z-Scores

For a sample, r = Σ(z_X z_Y) / (n − 1)

For a population, ρ = Σ(z_X z_Y) / N
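The z-score view can be checked numerically. The sketch below assumes sample z-scores use the sample standard deviation (n − 1 in the denominator) and population z-scores use the population standard deviation (N in the denominator). The helper names and the data are illustrative, not taken from the original article.

```python
import statistics

def r_from_sample_z(xs, ys):
    """Sample form: sum of z-score products divided by n - 1."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)  # n - 1 denominator
    products = (((x - mean_x) / s_x) * ((y - mean_y) / s_y) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)

def rho_from_population_z(xs, ys):
    """Population form: sum of z-score products divided by N."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sigma_x, sigma_y = statistics.pstdev(xs), statistics.pstdev(ys)  # N denominator
    products = (((x - mean_x) / sigma_x) * ((y - mean_y) / sigma_y) for x, y in zip(xs, ys))
    return sum(products) / n

# Made-up scores (illustrative only).
x = [0, 10, 4, 8, 8]
y = [2, 6, 2, 4, 6]

r_sample = r_from_sample_z(x, y)
rho_population = rho_from_population_z(x, y)
print(r_sample, rho_population)  # both are approximately 0.875
```

Both forms return the same number for the same scores. What differs is whether the scores are treated as a sample or as a complete population when the z-scores are computed.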


Correlations Interpretation

Correlation simply describes a relationship between two variables. It does not explain why the two variables are related.

  1. One of the most common errors in interpreting correlations is to assume that a correlation necessarily implies a cause-and-effect relationship between the two variables.
  2. To establish a cause-and-effect relationship, it is necessary to conduct an experiment in which one variable is manipulated and other variables are purposely controlled.
  3. A correlation by itself cannot be interpreted as proof of a cause-and-effect relationship between the two variables.

The value of a correlation can be affected greatly by the range of scores represented in the data.

  1. A correlation is most trustworthy when it is computed from scores that represent the full range of possible values; be cautious when it is not.
  2. The correlation within a restricted range could be completely different from the correlation that would be obtained from a full range.
  3. Correlation should not be generalized beyond the range of data represented in the sample.
  4. For a correlation to provide an accurate description of the general population, there should be a wide range of X and Y values in the data.

One or two extreme data points, often called outliers, can have a dramatic effect on the value of a correlation.

  1. An outlier is an individual with X and/or Y values that are substantially different (larger or smaller) from the values obtained for the other individuals in the data set. The data point of a single outlier can have a dramatic influence on the value obtained for the correlation.
  2. If you only “go by the numbers,” you might overlook the fact that one extreme data point inflated the size of the correlation.
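To make the outlier effect concrete, the sketch below computes r for five made-up pairs of scores and then again after adding one extreme point. The data and the `pearson_r` helper are illustrative assumptions, not part of the original article.

```python
import math

def pearson_r(xs, ys):
    """Pearson r computed as SP / sqrt(SS_X * SS_Y)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Five made-up individuals with essentially no linear relationship.
x = [1, 2, 3, 4, 5]
y = [3, 5, 2, 4, 3]
r_without_outlier = pearson_r(x, y)

# The same data plus a single extreme individual at (14, 12).
r_with_outlier = pearson_r(x + [14], y + [12])

print(round(r_without_outlier, 2))  # slightly negative, close to zero
print(round(r_with_outlier, 2))     # strongly positive
```

A scatter plot of the six points would show that the large correlation is carried almost entirely by the single extreme individual. For that reason it helps to look at the data as well as the number.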

Strength of the relationship

A correlation measures the degree of relationship between two variables on a scale from 0 to 1 in magnitude; the sign (+ or −) indicates only the direction of the relationship. The squared correlation, r², is used to measure the strength of the relationship.

  1. A correlation of 1.00 does mean that there is a 100% perfectly predictable relationship between X and Y.
  2. A correlation of 0.5 does not mean that you can make predictions with 50% accuracy.
  3. The value r² is called the coefficient of determination because it measures the proportion of variability in one variable that can be determined from the relationship with the other variable.
  4. For a correlation of r = 0.90, r² = 0.81, so 81% of the variability in the Y scores can be predicted from the relationship with X.
  5. r² measures how much of the variance in the dependent variable is accounted for by the independent variable.
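One way to see why r² is read as a proportion of variability is to compare it with the share of SS_Y left unexplained by a straight-line prediction of Y from X. The sketch below uses a least-squares line, which this article does not otherwise cover, purely as an illustration. The data and variable names are made up.

```python
# Made-up scores (illustrative only).
x = [0, 10, 4, 8, 8]
y = [2, 6, 2, 4, 6]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
ss_x = sum((xi - mean_x) ** 2 for xi in x)
ss_y = sum((yi - mean_y) ** 2 for yi in y)
sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Squared Pearson correlation: r^2 = SP^2 / (SS_X * SS_Y).
r_squared = sp ** 2 / (ss_x * ss_y)

# Least-squares line Y_hat = intercept + slope * X (an assumption of this sketch).
slope = sp / ss_x
intercept = mean_y - slope * mean_x
ss_residual = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

# Proportion of the variability in Y that the line accounts for.
proportion_explained = 1 - ss_residual / ss_y

print(r_squared, proportion_explained)  # the two values agree
```

Here r = 0.875, so r² is about 0.77: roughly 77% of the variability in Y is predictable from X, which is less than the 87.5% a naive reading of r might suggest.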

Partial correlation

A partial correlation measures the relationship between two variables while controlling the influence of a third variable by holding it constant.

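With three variables, X, Y, and Z, it is possible to compute three individual Pearson correlations: r_XY (between X and Y), r_XZ (between X and Z), and r_YZ (between Y and Z). The partial correlation between X and Y, holding Z constant, is then

r_XY·Z = (r_XY − r_XZ r_YZ) / √((1 − r_XZ²)(1 − r_YZ²))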
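A minimal sketch of the standard first-order partial correlation, computed from the three pairwise Pearson correlations. The function name and the three input values are hypothetical and chosen only for illustration.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between X and Y with Z held constant.

    Uses (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)).
    """
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical pairwise correlations (not from real data).
r_xy, r_xz, r_yz = 0.6, 0.7, 0.8

r_xy_given_z = partial_correlation(r_xy, r_xz, r_yz)
print(round(r_xy_given_z, 3))  # far smaller than the raw r_xy of 0.6
```

With these inputs, most of the apparent X-Y relationship disappears once Z is held constant, the pattern you would expect when Z drives both X and Y.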


Null Hypotheses

H0 : ρ = 0 (There is no population correlation.)

H0 : ρ ≤ 0 (The population correlation is not positive.)


Alternative hypothesis

H1 : ρ ≠ 0 (There is a real correlation.)

H1 : ρ > 0 (The population correlation is positive.)


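t statistic

t = (r − ρ) / s_r, where s_r = √((1 − r²) / (n − 2)) is the standard error of r and ρ is the value specified by the null hypothesis (here ρ = 0).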


Degrees of Freedom for the t Statistic

df = n − 2
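The hypothesis test can be sketched as follows, assuming the usual t statistic for a Pearson correlation with null value ρ = 0, standard error √((1 − r²)/(n − 2)), and df = n − 2. The function name and the example values of r and n are illustrative.

```python
import math

def t_for_correlation(r, n, rho_null=0.0):
    """Return (t, df) for testing a sample correlation r from n pairs of scores."""
    df = n - 2
    standard_error = math.sqrt((1 - r ** 2) / df)
    t = (r - rho_null) / standard_error
    return t, df

# Hypothetical sample: r = 0.875 obtained from n = 5 pairs of scores.
t, df = t_for_correlation(0.875, 5)
print(round(t, 2), df)  # t is about 3.13 with df = 3
```

The resulting t is then compared with the critical value from a t table for the chosen alpha level, using a two-tailed region for H1 : ρ ≠ 0 or a one-tailed region for H1 : ρ > 0.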


Spearman Correlation


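Formula

The Spearman correlation is the Pearson correlation computed on the ranks of the scores. When there are no tied ranks, the ranks for each variable are the integers 1 to n, so

M = (n + 1)/2

SS for this series of integers = n(n² − 1)/12

and the Pearson formula simplifies to r_s = 1 − 6ΣD² / (n(n² − 1)), where D is the difference between the X rank and the Y rank for each individual.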
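The sketch below ranks two made-up sets of scores and computes the Spearman correlation two ways: with the shortcut r_s = 1 − 6ΣD²/(n(n² − 1)), and as a Pearson correlation on the ranks. It assumes there are no tied scores, because the shortcut is exact only in that case. The helper names are hypothetical.

```python
import math

def ranks_without_ties(values):
    """Rank scores from 1 (smallest) to n (largest); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks

def pearson_r(xs, ys):
    """Pearson r computed as SP / sqrt(SS_X * SS_Y)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Made-up scores with no ties (illustrative only).
x = [10, 20, 30, 40, 50]
y = [1, 3, 2, 5, 4]

rank_x = ranks_without_ties(x)  # [1, 2, 3, 4, 5]
rank_y = ranks_without_ties(y)  # [1, 3, 2, 5, 4]
n = len(x)

# Shortcut formula based on rank differences D.
sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
r_s_shortcut = 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

# Pearson correlation applied directly to the ranks.
r_s_pearson = pearson_r(rank_x, rank_y)

print(r_s_shortcut, r_s_pearson)  # both are approximately 0.8
```

For n = 5 the ranks have mean (5 + 1)/2 = 3 and SS = 5(25 − 1)/12 = 10, which is why the two computations agree.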

Point-Biserial Correlation 


Phi-Coefficient

  1. Convert each of the dichotomous variables to numerical values by assigning a 0 to one category and a 1 to the other category for each of the variables.
  2. Use the Pearson formula with the converted scores.
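The two steps above can be sketched as follows. The category labels, the 0/1 coding, and the eight made-up individuals are assumptions for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Pearson r computed as SP / sqrt(SS_X * SS_Y)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Two dichotomous variables for eight made-up individuals.
studied = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
result = ["pass", "pass", "fail", "fail", "pass", "fail", "fail", "pass"]

# Step 1: assign 0 to one category and 1 to the other for each variable.
studied_coded = [1 if s == "yes" else 0 for s in studied]
result_coded = [1 if r == "pass" else 0 for r in result]

# Step 2: apply the Pearson formula to the converted scores.
phi = pearson_r(studied_coded, result_coded)
print(phi)  # approximately 0.5
```

Swapping which category is coded 0 and which is coded 1 on one variable flips the sign of phi but not its magnitude, so the sign is meaningful only relative to the coding chosen.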

