The messy reality of eligibility criteria in a Systematic Review

Fact: Getting your screening criteria right is critical. A good set of criteria makes it easy for a screening team to determine if a study or article should be included or excluded from the review. It will also make it easier for other people to replicate your review.

Fact: Getting your screening criteria right is hard. There are two balancing acts to manage, and two major pitfalls to avoid:

  • The ‘too restrictive’ versus ‘too permissive’ balancing act
  • The ‘too abstract’ versus ‘too precise’ balancing act
  • Falling into the ‘refine too late’ pit
  • Falling into the ‘ask your supervisor too late’ pit

The ‘too restrictive’ versus ‘too permissive’ balancing act

Your screening criteria will define the efficiency of the screening process. If your criteria are too permissive, your team will have to wade through far too many irrelevant studies or articles. If they are too restrictive, your team will miss important research that should be included.

Finding ‘just-right’ is a feeling, not a rule, and the untold truth is that it takes trial and error. You need to screen some articles with a set of screening criteria, and ask yourself “When I followed the criteria, was I including things that didn’t seem interesting? Or was I forced to exclude things that looked interesting?” Your answer will indicate whether your criteria need a tweak, and in which direction.

Our rule of thumb: It takes about 150 articles to get a feel for your criteria. Screen that many on your own, and see how you go. Then adjust your criteria until you get that “just-right” feeling: you might occasionally include a study or article that is not relevant, but you never reject one that you feel should be included.

Once you feel you have the criteria “just right”, ask one or two members of your screening team to review the same 150 articles and compare your included and excluded studies or articles. If there is a good match, you are onto a good thing. If there are a lot of mismatches, work together to adjust the criteria. Then independently review another 150 studies or articles with your new criteria, and see how closely your results match. You should be getting a good match by now, but if not, review the criteria again, and screen another 150 articles.
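The comparison step can be sketched in a few lines of Python. This is only a minimal sketch, assuming each screener's decisions are stored as a dictionary mapping an article ID to "include" or "exclude". The function name `compare_decisions` and the sample IDs are hypothetical, not part of any screening tool.

```python
def compare_decisions(screener_a, screener_b):
    """Compare two screeners' include/exclude decisions on the same articles.

    Both arguments are dicts mapping article ID -> "include" or "exclude"
    (an assumed format, for illustration only).
    """
    # Only compare articles that both screeners have assessed.
    shared_ids = sorted(set(screener_a) & set(screener_b))
    mismatches = [a for a in shared_ids if screener_a[a] != screener_b[a]]
    agreement = 1 - len(mismatches) / len(shared_ids) if shared_ids else 0.0
    return {
        "compared": len(shared_ids),
        "mismatches": mismatches,
        "agreement": agreement,
    }


# Hypothetical pilot decisions for four articles
you = {"A1": "include", "A2": "exclude", "A3": "include", "A4": "exclude"}
colleague = {"A1": "include", "A2": "include", "A3": "include", "A4": "exclude"}

result = compare_decisions(you, colleague)
print(result["mismatches"])          # ['A2']
print(f"{result['agreement']:.0%}")  # 75%
```

The list of mismatched IDs is the useful part: those are the articles to sit down and discuss when you adjust the criteria together.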

This process – which we call “pilot screening” – can sometimes feel frustrating to systematic reviewers who want to get their systematic review done ASAP. However, we promise you, every hour you invest in getting your screening criteria right at this early stage will save your screening team *many* collective hours down the track. It also minimises the risk that you all get to the end of the screening process, only to find you have to start all over again because the criteria produced too many mismatches. You’ll get no thanks from your screening team for that.

Note: Feedback from clients suggests that 5-10 disagreements are fairly typical per 100 included articles, which equates to roughly 7-15 disagreements per 150 articles.
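The scaling in that note is simple proportion. Here is a minimal sketch, assuming the disagreement rate carries over unchanged from a batch of 100 to a batch of 150 (the function name `scale_disagreements` is hypothetical):

```python
def scale_disagreements(per_100, batch_size):
    """Scale a disagreements-per-100-articles figure to another batch size."""
    return per_100 * batch_size / 100


low = scale_disagreements(5, 150)
high = scale_disagreements(10, 150)
print(low, high)  # 7.5 15.0
```

The lower bound works out to 7.5, so "7" in the note is a rounded figure.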

The ‘too abstract’ versus ‘too precise’ balancing act

Your screening criteria aim to align the thinking of others. If your criteria are too abstract then they will be interpreted in different ways by different screeners, resulting in TONS of disagreements right from the start (i.e., within the first 100 studies or articles). If your criteria are too precise, there will probably be a lot of them, making them hard for your screening team to read, remember, and apply. You’ll know your criteria are too precise if you’re 10 articles in, and every article takes you a long time to assess – more than 5 minutes – and half of that time is spent reading the criteria because you can never remember them.

Our rule of thumb: Each criterion should be 10-13 words long and in bullet-point format – i.e. typical punctuation doesn’t apply. Each criterion should result in a clear yes-no result. As mentioned above, for every 100 included articles, about 5-10 disagreements is normal, but take that with a grain of salt. If your research is broad, or hard to define, expect more.
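If you keep your criteria in a text file or spreadsheet, the word-count part of this rule of thumb is easy to check automatically. A minimal sketch, assuming words are simply whitespace-separated; the function name and the sample criteria are hypothetical:

```python
def check_criterion_length(criterion, min_words=10, max_words=13):
    """Return (word_count, ok) for one criterion against the 10-13 word rule of thumb."""
    word_count = len(criterion.split())
    return word_count, min_words <= word_count <= max_words


# Hypothetical criteria, for illustration only
criteria = [
    "Study reports a randomised controlled trial in adults aged 18 or over",
    "Published in English",
]

for criterion in criteria:
    count, ok = check_criterion_length(criterion)
    print(f"{count:>2} words, {'ok' if ok else 'revise?'}: {criterion}")
```

A flag here is only a prompt to take a second look. Whether a criterion gives a clear yes-no result still needs a human to judge.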

Falling into the ‘refine too late’ pit

When you refine your criteria, your earlier screening work becomes redundant, and you have to start again with your new criteria. So refining too late can be a huge pit of wasted time.

The solution? Refine early. Every hour you invest in refining your screening criteria to get them right early will save you and your screening team butt-loads of work down the track. It’ll help you avoid the pit of getting to the end of the screening process and having to start all over again because the criteria produced too many disagreements.

This is demonstrated in the increasing popularity of Rapid Reviews. Here, the members of a screening team independently screen a few hundred studies or articles, and use the mismatches between included and excluded articles to refine the screening criteria. The refined criteria are then used to rate the rest of the articles.

Falling into the ‘ask your supervisor too late’ pit

Last, but definitely not least, when you ask your supervisor for feedback, and they think you’ve got your search queries wrong, or your eligibility criteria wrong – guess what – you’ll have to start again. So asking for feedback too late can be a huge pit of wasted time.

You need to make sure that your supervisor can provide informed feedback about the screening criteria and your article set – this requires making them do some screening using your criteria. Getting them involved is useful, because you can lean on their spidey-sense for whether:

  • Your search queries really delivered the right scope of articles to screen, and
  • Your eligibility criteria are useful, and usable for finding the right articles to include.

Our rule of thumb: Make sure your supervisor is part of the pilot screening process, as described above. This is equivalent to them helping you design the methodology of an experiment. They don’t need to be involved in doing the final screening of 1000s of articles, but if they help you refine the criteria, your review is more likely to include articles that are relevant to your question. In addition, if the outcomes of your review do not match what they expected, they will know that the screening criteria were not the source of the problem.

For the uninitiated

Published systematic reviews do not report all the failed attempts at getting the screening criteria right. But unless they are simply replicating another systematic review, everyone goes through this adjustment process. In our opinion, it would be helpful if each review’s pilot screening process was included in the publication.

Don’t rush into your 5000-article screening all guns blazing, like “I’mma smash 500 a day!”. Invest a few hours in making sure your screening criteria work properly in a pilot screening phase. Then you’ll only have to screen the remaining 4850 once, as will your screening team. They will thank you for it.
