Statnote 14: the correlation of two variables (Pearson's 'r')

Anthony Hilton, Richard A. Armstrong

Research output: Contribution to specialist publication or newspaper › Article


Pearson's correlation coefficient (‘r’) is one of the most widely used of all statistics. Nevertheless, care is needed in interpreting the results because, with large numbers of observations, quite small values of ‘r’ become statistically significant even though the X variable may account for only a small proportion of the variance in Y. Hence, ‘r squared’, the proportion of the variance in Y accounted for by X, should always be calculated and included in any discussion of the significance of ‘r’. The use of ‘r’ also assumes that the data follow a bivariate normal distribution (see Statnote 17), and this assumption should be examined prior to the study. If the data do not conform to such a distribution, the use of a non-parametric correlation coefficient should be considered. A significant correlation should not be interpreted as indicating ‘causation’, especially in observational studies, in which the two variables may be correlated because of their mutual correlations with other confounding variables.
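
The point about large samples can be illustrated with a minimal sketch in Python. It is not taken from the Statnote itself: the simulated data, the sample size, the slope of 0.1, and the function names `pearson_r` and `t_statistic` are all assumptions chosen for illustration. The sketch computes ‘r’, ‘r squared’, and the usual t statistic for testing ‘r’ against zero, t = r * sqrt((n - 2) / (1 - r^2)), and compares it with 1.96, the approximate two-sided 5% critical value when n is large.

```python
import math
import random


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical data: a weak linear dependence of Y on X plus normal noise.
random.seed(0)
n = 10000
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.1 * a + random.gauss(0, 1) for a in x]

r = pearson_r(x, y)
r_squared = r * r
t = t_statistic(r, n)

print(f"r = {r:.3f}, r squared = {r_squared:.4f}, t = {t:.2f}")
# 1.96 is only an approximation to the critical value, adequate for large n.
print("significant at the 5% level:", abs(t) > 1.96)
```

With a sample this large the correlation should come out clearly significant, yet ‘r squared’ shows that X accounts for only around 1% of the variance in Y, which is why the two figures are best reported together.
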
Original language: English
Number of pages: 3
Specialist publication: Microbiologist
Publication status: Published - Sept 2008


  • Pearson's correlation coefficient
  • statistics

