What is Point Biserial Correlation?
- The point-biserial correlation is a special case of the Pearson correlation in which one variable is continuous and the other variable is binary (dichotomous).
Dichotomous meaning:
A dichotomous scale is a two-point scale whose two options are mutually exclusive opposites. This type of response scale gives the respondent no neutral option when answering a question.
Examples:
Yes – No, True – False, smoker (yes/no), sex (male/female), 0-1 variable.
Examples: Point Biserial Correlation
- Are women or men likely to earn more profit as business entrepreneurs? Is there an association between gender and profit?
- Does Vaccine A or Vaccine B improve immunity? Is there an association between the vaccine type and immunity level?
- Do dogs react differently to yellow and red lights as food signals? Is there an association between the color and the reaction time?
- Are women or men likely to earn more as doctors? Is there an association between gender and earnings as a doctor?
Assumptions: Point Biserial Correlation
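- The assumptions for point-biserial correlation include:
- Continuous and binary: one variable is continuous and the other is binary
- Normally distributed: the continuous variable is approximately normally distributed within each of the two groups
- No outliers: the continuous variable has no extreme values in either group
- Equal variances: the continuous variable has a similar variance in both groups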
Normally Distributed meaning:
In statistics, data are normally distributed when their frequency graph follows a symmetric, bell-shaped curve centered on the mean.
Formula: Point Biserial Correlation
- Find the correlation r between:
- A continuous random variable Y, and
- A binary random variable X that takes the values 0 and 1.
The point-biserial correlation coefficient r is calculated from these data as:
- \Large{r_{pb} = \frac{Y_{1} - Y_{0}}{S_{x}}\sqrt{p_{0}p_{1}}}
- Y0 = mean score for data pairs with x=0,
- Y1 = mean score for data pairs with x=1,
- Sx = standard deviation of the scores for the entire test (all data pairs), computed with n in the denominator,
- p0 = proportion of data pairs with x=0,
- p1 = proportion of data pairs with x=1.
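The terms defined above combine into the point-biserial coefficient: the difference between the two group means, divided by the standard deviation of all scores, multiplied by the square root of p0 × p1. The sketch below computes that quantity and compares it with an ordinary Pearson correlation computed on the same data. The function names and the sample data are illustrative assumptions, and the standard deviation is the population form (dividing by n), which is what makes the two results agree.

```python
from math import sqrt
from statistics import fmean, pstdev

def point_biserial(x, y):
    """Point-biserial r for binary x (0/1) and continuous y."""
    y0 = [yi for xi, yi in zip(x, y) if xi == 0]
    y1 = [yi for xi, yi in zip(x, y) if xi == 1]
    n = len(y)
    p0, p1 = len(y0) / n, len(y1) / n
    s = pstdev(y)  # population standard deviation of all scores
    return (fmean(y1) - fmean(y0)) / s * sqrt(p0 * p1)

def pearson(x, y):
    """Plain Pearson correlation, used here only as a cross-check."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: x is the binary variable, y the continuous one.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1.0, 3.0, 4.0, 6.0, 3.0, 5.0, 6.0, 8.0]

r_pb = point_biserial(x, y)
r_pearson = pearson(x, y)
print(round(r_pb, 4), round(r_pearson, 4))
```

Both numbers should match, which illustrates that the point-biserial coefficient is the Pearson correlation applied to a 0/1 variable.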
Properties: Point-Biserial Correlation
- The point-biserial correlation coefficient measures the strength of the association between two variables in a single number ranging from -1 to +1.
- A value of -1 indicates a perfect negative association, +1 indicates a perfect positive association, and 0 indicates no association at all.
- Linear regression analysis can be used in place of the point-biserial correlation, for example when the binary variable is treated as the independent (predictor) variable.
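- A similar question can also be answered with an independent-samples t-test, the Mann-Whitney U test, the Kruskal-Wallis H test, or the chi-square test. The t-test shares the normality requirement, whereas the Mann-Whitney U and Kruskal-Wallis H tests are rank-based and do not require normally distributed data. These tests examine whether the dependent variable differs between the groups defined by the independent variable; on their own they do not establish a causal relationship.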
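The link to the independent-samples t-test can be made concrete: the pooled-variance t statistic for the two groups equals r × sqrt(n − 2) / sqrt(1 − r²), where r is the point-biserial coefficient. The sketch below checks this identity numerically; the function names and the data are illustrative assumptions.

```python
from math import sqrt
from statistics import fmean, pstdev

def point_biserial(y0, y1):
    """Point-biserial r from the two groups of continuous scores."""
    n = len(y0) + len(y1)
    p0, p1 = len(y0) / n, len(y1) / n
    s = pstdev(y0 + y1)  # population standard deviation of all scores
    return (fmean(y1) - fmean(y0)) / s * sqrt(p0 * p1)

def pooled_t(y0, y1):
    """Independent-samples t statistic with pooled variance."""
    n0, n1 = len(y0), len(y1)
    m0, m1 = fmean(y0), fmean(y1)
    ss0 = sum((v - m0) ** 2 for v in y0)
    ss1 = sum((v - m1) ** 2 for v in y1)
    pooled_var = (ss0 + ss1) / (n0 + n1 - 2)
    return (m1 - m0) / sqrt(pooled_var * (1 / n0 + 1 / n1))

def t_from_r(r, n):
    """t statistic implied by a correlation r over n data pairs."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)

# Hypothetical scores for the group with x = 0 and the group with x = 1.
y0 = [1.0, 3.0, 4.0, 6.0]
y1 = [3.0, 5.0, 6.0, 8.0]

r = point_biserial(y0, y1)
t_direct = pooled_t(y0, y1)
t_via_r = t_from_r(r, len(y0) + len(y1))
print(round(t_direct, 4), round(t_via_r, 4))
```

Because the two statistics are equivalent, testing whether the point-biserial correlation is zero gives the same answer as the pooled-variance t-test on the group means.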
Definition: Biserial Correlation
- It is closely related to the point-biserial correlation coefficient, but it assumes that the dichotomous variable is really a continuous, normally distributed variable that has been split into two categories.
- When that assumption holds, the biserial correlation provides a better estimate of the underlying association.
- Like the point-biserial correlation coefficient, it is a correlation coefficient between dichotomous and continuous data.
- It is denoted by rb
- If there are two sets of data :
X (xi = 0 or 1) – Dichotomous data
Y = {y1, ……………, yn} – Continuous data
Formula: Biserial correlation coefficient (rb)
- \Large{r_{b} = \frac{m_{1} - m_{0}}{s}\cdot\frac{p_{0}p_{1}}{y}}
Where
- m0 = mean of yi when X = 0
- m1 = mean of yi when X = 1
- n0 = number of elements with X = 0
- n1 = number of elements with X = 1
- n = n0+n1
- p0 = n0/n
- p1 = n1/n
- s = population standard deviation of Y
- y = height of the standard normal distribution at z, where P(z'<z) = p0 and P(z'>z) = p1
- In Excel y = NORM.S.DIST(NORM.S.INV(p0),FALSE)
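The biserial coefficient can be sketched from the quantities defined above: (m1 − m0) / s multiplied by p0 × p1 / y, where y is the standard normal density at the point that cuts off an area of p0. The code below uses Python's `statistics.NormalDist` in place of the Excel functions; the function name and the data are illustrative assumptions.

```python
from statistics import NormalDist, fmean, pstdev

def biserial(x, y):
    """Biserial r_b for binary x (0/1) and continuous y."""
    y0 = [yi for xi, yi in zip(x, y) if xi == 0]
    y1 = [yi for xi, yi in zip(x, y) if xi == 1]
    n = len(y)
    p0, p1 = len(y0) / n, len(y1) / n
    s = pstdev(y)  # population standard deviation of Y
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z = std_normal.inv_cdf(p0)  # point with P(z' < z) = p0
    height = std_normal.pdf(z)  # the "y" term of the formula
    return (fmean(y1) - fmean(y0)) / s * (p0 * p1) / height

# Hypothetical data: x is the binary variable, y the continuous one.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1.0, 3.0, 4.0, 6.0, 3.0, 5.0, 6.0, 8.0]

r_b = biserial(x, y)
print(round(r_b, 4))
```

Note that `inv_cdf` requires 0 < p0 < 1, so both groups must be non-empty. With data that are far from normal, the computed value can fall outside the range -1 to +1.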
Relation: Point-biserial and biserial correlation
The biserial correlation coefficient can also be computed from the point-biserial correlation coefficient:
- Biserial correlation coefficient = \Large{r_{b} = \frac{r_{pb}\sqrt{p_{0}p_{1}}}{y}}
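As a hedged illustration of this conversion, the sketch below multiplies a point-biserial coefficient by sqrt(p0 × p1) / y, with y taken as the standard normal density at the point that cuts off an area of p0. The function name and the example values are assumptions made for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def biserial_from_point_biserial(r_pb, p0):
    """Convert a point-biserial r_pb to a biserial r_b, given p0."""
    p1 = 1 - p0
    std_normal = NormalDist()
    height = std_normal.pdf(std_normal.inv_cdf(p0))  # the "y" term
    return r_pb * sqrt(p0 * p1) / height

# Hypothetical inputs: r_pb = 0.485071 with an even split (p0 = 0.5).
r_b = biserial_from_point_biserial(0.485071, 0.5)
print(round(r_b, 4))
```

For an even split the multiplier is about 1.25, so the biserial coefficient is larger in magnitude than the point-biserial coefficient it is derived from.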