In a systematic review recently published in the JBI Database of Systematic Reviews and Implementation Reports, the executive summary stated that "the results demonstrated no statistically significant difference in the quality of life of patients with cancer who had undergone patient navigation programs (pooled weighted difference = 0.41 [95% CI = -2.89 to 3.71], P = 0.81)".1(p.297) Similarly, in another recently published review, the authors reported that "the overall relative risk for hospital readmission was 0.77 [95% CI, 0.70-0.84] (p < 0.00001)".2(p.107) Both examples feature a statistical indicator commonly quoted in the research literature: a confidence interval (CI), specifically a 95% CI.
It is important that consumers of research understand what a CI represents: What is a 95% CI? How are these intervals interpreted? What is the meaning of "confidence"?

Confidence intervals were introduced into science as a statistical tool by Jerzy Neyman in the 1930s. Neyman proposed CIs as a statistical approach for the estimation of an unknown population parameter.3 The true value of a population parameter is a fixed, unknown value, and the CI is an interval of values used to estimate it. According to Neyman, for a CI with an x% confidence level, there is, before the CI is computed, an x% probability that the interval will be "correct", that is, that it will "capture" the true value of the fixed, unknown parameter.3 Different confidence levels are used, such as 50%, 80%, 90%, 95%, and 99%. A researcher using a 95% CI to estimate an unknown parameter knows that, in a series of 100 uses of the 95% CI procedure, the computed CIs are expected to capture the true value of the parameter in 95 cases and to miss it in five cases.3 Before a 95% CI is calculated, a researcher therefore knows that this statistical procedure has a 95% success rate.

After the CI is computed from the available sample data, only two probability statements can be made about it: if the true value of the parameter is "captured" by the CI (which is not known), the probability that the true value lies inside the CI is 100%; and if the true value of the parameter is not "captured" by the CI (which, again, is not known), the probability that the true value lies inside the CI is 0%. Put simply, for a computed CI it is not known whether the true value of the population parameter lies within the interval or not.3
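The long-run interpretation described above can be illustrated with a brief simulation (a minimal sketch, not taken from the cited literature; the normal population, the true mean of 50, the sample size, and the use of the normal critical value 1.96 are all arbitrary assumptions made here for illustration). When samples are drawn repeatedly and a 95% CI for the mean is computed from each, roughly 95% of the computed intervals capture the true value, even though nothing is known about any individual interval:

```python
# Illustrative sketch of Neyman's coverage interpretation: the "95%" describes
# the long-run success rate of the CI procedure, not any single computed interval.
import random
import statistics

random.seed(1)

TRUE_MEAN = 50.0   # fixed "unknown" parameter (known here only so coverage can be checked)
TRUE_SD = 10.0
N = 30             # sample size per study (arbitrary)
REPS = 10_000      # number of independent uses of the CI procedure
Z = 1.96           # normal critical value for a 95% CI

captured = 0
for _ in range(REPS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / N ** 0.5
    lower, upper = mean - Z * se, mean + Z * se
    # Each computed interval either captures the true mean or it does not;
    # only the long-run proportion of captures is close to 95%.
    if lower <= TRUE_MEAN <= upper:
        captured += 1

print(f"Proportion of intervals capturing the true mean: {captured / REPS:.3f}")
# Prints a value close to (slightly below) 0.95, because the normal critical
# value is used here in place of Student's t for simplicity.
```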
Returning to the examples mentioned at the beginning of this article, what can be said about the reported CIs? For these computed CIs it is not known whether the true values of the parameters are "captured" by the intervals or not. It is incorrect to think that there is a 95% probability that the true values of the parameters lie within these computed CIs; no such probability statement can be made.3 Neyman explicitly stated that the probability statement refers to the a priori probability of success before the statistical procedure is used, not to the computed CI. Thus, "confidence" refers to confidence in the statistical procedure, not to confidence in the computed interval. For the examples mentioned above, it would be incorrect to claim that we are sure that the true value of the overall relative risk for hospital readmission is between 0.70 and 0.84;2 similarly, it would be incorrect to claim that we are convinced that the true value of the weighted difference is between -2.89 and 3.71.1 As noted, for any computed CI there is no way to know whether the calculated range of values captures the true value of the parameter or not.
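The distinction can also be written out formally (a sketch using generic notation introduced here only for illustration: θ denotes the fixed, unknown parameter and [L(X), U(X)] the interval computed from the random data X; none of this notation appears in the cited articles). Before the data are observed, the interval endpoints are random, so a probability statement about the procedure is meaningful:

\[
\Pr\{\, L(X) \le \theta \le U(X) \,\} = 0.95 .
\]

After the data are observed, the computed endpoints l and u are fixed numbers, so the statement that θ lies in [l, u] is simply true or false, and, as described above, the only available probabilities are 100% or 0%:

\[
\Pr\{\, l \le \theta \le u \,\} \in \{0,\, 1\} .
\]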
It is essential that reviewers understand the true nature of the "confidence" we have in CIs and do not make the gross error of believing that we are 95% (or any other percentage) sure or convinced about the true value of the parameter. In this regard, the CI procedure is of limited value; however, the CI retains some utility as an indicator of the uncertainty in the estimation of the true value of the parameter.
References