We provide three deduplication algorithms:
- Exact match
- Bond SRA method
- Title search
We recommend an initial pass looking for exact matches, which often yields high-confidence results that can be accepted in bulk – hundreds of duplicates gone in one click. Exact match looks for articles that are identical across title and abstract.
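As a rough illustration of the idea (not the product's actual implementation), exact matching can be sketched as grouping records by their title and abstract. The record layout (`id`, `title`, `abstract` keys) and the function name `exact_duplicate_groups` are assumptions made for this example.

```python
from collections import defaultdict


def exact_duplicate_groups(articles):
    """Group article ids whose title AND abstract are both identical.

    `articles` is assumed to be a list of dicts with hypothetical
    keys "id", "title" and "abstract".
    """
    groups = defaultdict(list)
    for article in articles:
        # The comparison key is the exact, unmodified title and abstract.
        key = (article["title"], article["abstract"])
        groups[key].append(article["id"])
    # Only keys shared by more than one article are duplicate sets.
    return [ids for ids in groups.values() if len(ids) > 1]


articles = [
    {"id": 1, "title": "Aspirin in adults", "abstract": "A trial."},
    {"id": 2, "title": "Aspirin in adults", "abstract": "A different trial."},
    {"id": 3, "title": "Aspirin in adults", "abstract": "A trial."},
]
duplicate_groups = exact_duplicate_groups(articles)
```

Because both fields must match character for character, articles 1 and 3 are grouped while article 2 is left alone, which is why results from this pass are usually safe to accept in bulk.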
A second pass with the Bond SRA method returns results that are still likely to be duplicates but may need quick confirmation by a human. The SRA method combines title matching with author and document metadata checks to find duplicates.
A final pass with title search shows you articles that may still be duplicates but require a critical review. We loosely match titles, ignoring punctuation, capitalisation, and diacritics.
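One way to picture loose title matching is to normalise each title before comparing. The sketch below is an assumption about how such normalisation could work, not a description of the product's internals; the function names `normalise_title` and `titles_loosely_match` are hypothetical, and treating punctuation as a word separator is a choice made for this example.

```python
import unicodedata


def normalise_title(title):
    """Return a comparison key that ignores case, punctuation and diacritics."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", title)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # casefold() is a more aggressive lower() intended for comparisons.
    folded = without_marks.casefold()
    # Replace every Unicode punctuation character with a space.
    cleaned = "".join(
        " " if unicodedata.category(ch).startswith("P") else ch
        for ch in folded
    )
    # Collapse runs of whitespace left behind by the replacements.
    return " ".join(cleaned.split())


def titles_loosely_match(first, second):
    """True when two titles are equal after normalisation."""
    return normalise_title(first) == normalise_title(second)
```

Under these assumptions, "Café-Based Trials: A Review." and "CAFE BASED TRIALS A REVIEW" would be flagged as a possible match. Matching on the title alone says nothing about authors or abstracts, which is why these results need a critical review before removal.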
Completing multiple checks for duplicates gives you the fastest way to reach a high-quality dataset that is as close to duplicate-free as possible. Because you control how we search for duplicates, you also control whether to bulk-remove them or not. It’s not a black box.