Summary
What is the certainty of evidence?
The term ‘certainty of evidence’ refers to how confident researchers are that the true effect of an intervention lies where the evidence says it does.[1] It is judged across the whole body of evidence for an outcome, such as the clinical trials pooled in a meta-analysis, rather than from a single study. Certainty of evidence does not indicate effect size or clinical significance; it refers only to confidence in the estimate of the effect.
Certainty of evidence is sometimes referred to as ‘quality of evidence’,[2][3] but this is an outdated term. ‘Certainty of evidence’ is the preferred term because it emphasizes confidence in where the true effect lies, rather than study quality alone.[1]
How is the certainty of evidence graded?
The certainty of evidence is assessed using the Grading of Recommendations Assessment, Development and Evaluation framework, or GRADE.[1] This framework rates a body of evidence as high, moderate, low, or very low certainty using a set of standardized criteria that evaluate the risk of bias,[4] inconsistency,[5] imprecision,[6] publication bias,[7] and indirectness of studies. Indirectness refers to how closely the participants, interventions, and outcomes in the available studies match the question being investigated; the poorer the match, the more indirect the evidence.[8] High certainty means that the current evidence is so strong and consistent that future studies are unlikely to result in a different conclusion. Low certainty indicates more doubt and less confidence, and that future studies could easily change the current understanding of a topic.
For example, suppose researchers want to know the effect of caffeine on aerobic exercise performance in female athletes. A meta-analysis of randomized controlled trials has reported, on average, a small effect size. Assessing the certainty of evidence means that the researchers are trying to establish how confident they are that the true effect of caffeine sits within the range of effect sizes reported among the trials. That is, they want to determine whether the effect of caffeine on aerobic exercise really is small, on average. They would have high confidence if the meta-analysis showed a low risk of bias, consistent results, a narrow range of effect sizes, participants who were mostly or entirely female athletes, and no evidence of publication bias. However, confidence would be low if the meta-analysis showed a high risk of bias, inconsistent results (high heterogeneity), a wide range of effect sizes, a mixed population (men, women, athletes, nonathletes, etc.), and likely publication bias.
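The grading logic described above can be sketched in code. This is only an illustrative simplification, not an official GRADE tool: the function name `grade_certainty` and the domain keys are hypothetical, and the sketch assumes the common GRADE convention (not stated in the text above) that a body of randomized controlled trials starts at high certainty and is rated down by one level for a serious concern or two levels for a very serious concern in each domain. In practice, GRADE ratings rest on reviewer judgment rather than arithmetic.

```python
from typing import Mapping

# Certainty levels, ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]

# The five criteria named in the text (keys are hypothetical labels).
DOMAINS = (
    "risk_of_bias",
    "inconsistency",
    "imprecision",
    "indirectness",
    "publication_bias",
)


def grade_certainty(concerns: Mapping[str, int], start: str = "high") -> str:
    """Return a certainty level after rating down for each concern.

    `concerns` maps a domain to the number of levels to rate down:
    0 = no serious concern, 1 = serious, 2 = very serious.
    Assumption: evidence from randomized trials starts at "high".
    """
    unknown = set(concerns) - set(DOMAINS)
    if unknown:
        raise ValueError(f"unknown domains: {sorted(unknown)}")
    if any(value not in (0, 1, 2) for value in concerns.values()):
        raise ValueError("each concern must be 0, 1, or 2")

    index = LEVELS.index(start) - sum(concerns.values())
    # The rating cannot fall below "very low".
    return LEVELS[max(index, 0)]


# Caffeine example, first scenario: no serious concerns in any domain.
print(grade_certainty({}))

# A single serious concern, e.g. a wide range of effect sizes.
print(grade_certainty({"imprecision": 1}))

# Second scenario: serious concerns in all five domains.
print(grade_certainty({domain: 1 for domain in DOMAINS}))
```

Under these assumptions, the three calls print `high`, `moderate`, and `very low`. The sketch ignores factors that can raise a rating and treats every domain as equally weighted, which real assessments do not necessarily do.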
Ultimately, the level of certainty indicates how closely the effects reported across a body of evidence are likely to reflect the truth and how likely it is that future results will align with the current evidence.
References
- Hultcrantz M, Rind D, Akl EA, Treweek S, Mustafa RA, Iorio A, Alper BS, Meerpohl JJ, Murad MH, Ansari MT, Katikireddi SV, Östlund P, Tranæus S, Christensen R, Gartlehner G, Brozek J, Izcovich A, Schünemann H, Guyatt G. The GRADE Working Group clarifies the construct of certainty of evidence. J Clin Epidemiol. (2017 Jul)
- Guyatt GH, Oxman AD, Kunz R, Vist GE, Falck-Ytter Y, Schünemann HJ, GRADE Working Group. What is "quality of evidence" and why is it important to clinicians? BMJ. (2008 May 3)
- Balshem H, Helfand M, Schünemann HJ, Oxman AD, Kunz R, Brozek J, Vist GE, Falck-Ytter Y, Meerpohl J, Norris S, Guyatt GH. GRADE guidelines: 3. Rating the quality of evidence. J Clin Epidemiol. (2011 Apr)
- Guyatt GH, Oxman AD, Vist G, Kunz R, Brozek J, Alonso-Coello P, Montori V, Akl EA, Djulbegovic B, Falck-Ytter Y, Norris SL, Williams JW Jr, Atkins D, Meerpohl J, Schünemann HJ. GRADE guidelines: 4. Rating the quality of evidence--study limitations (risk of bias). J Clin Epidemiol. (2011 Apr)
- Guyatt GH, Oxman AD, Kunz R, Woodcock J, Brozek J, Helfand M, Alonso-Coello P, Glasziou P, Jaeschke R, Akl EA, Norris S, Vist G, Dahm P, Shukla VK, Higgins J, Falck-Ytter Y, Schünemann HJ, GRADE Working Group. GRADE guidelines: 7. Rating the quality of evidence--inconsistency. J Clin Epidemiol. (2011 Dec)
- Guyatt GH, Oxman AD, Kunz R, Brozek J, Alonso-Coello P, Rind D, Devereaux PJ, Montori VM, Freyschuss B, Vist G, Jaeschke R, Williams JW Jr, Murad MH, Sinclair D, Falck-Ytter Y, Meerpohl J, Whittington C, Thorlund K, Andrews J, Schünemann HJ. GRADE guidelines 6. Rating the quality of evidence--imprecision. J Clin Epidemiol. (2011 Dec)
- Guyatt GH, Oxman AD, Montori V, Vist G, Kunz R, Brozek J, Alonso-Coello P, Djulbegovic B, Atkins D, Falck-Ytter Y, Williams JW Jr, Meerpohl J, Norris SL, Akl EA, Schünemann HJ. GRADE guidelines: 5. Rating the quality of evidence--publication bias. J Clin Epidemiol. (2011 Dec)
- Guyatt GH, Oxman AD, Kunz R, Woodcock J, Brozek J, Helfand M, Alonso-Coello P, Falck-Ytter Y, Jaeschke R, Vist G, Akl EA, Post PN, Norris S, Meerpohl J, Shukla VK, Nasser M, Schünemann HJ, GRADE Working Group. GRADE guidelines: 8. Rating the quality of evidence--indirectness. J Clin Epidemiol. (2011 Dec)