Citation verification — Evidence

Methodology

The module parses the reference list out of the manuscript, attempts to resolve each entry to a DOI, and then queries two complementary sources:

Semantic Scholar — paper-level metadata, citation count, fields of study, and forward / backward citation graph.
OpenCitations — independent open citation index used to cross-check the S2 view and to fill gaps where S2 has no record.
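
As an illustration of the first step, here is a minimal sketch of pulling a DOI out of a reference string before querying either source. The regular expression, the trailing-punctuation heuristic, and the name extract_doi are assumptions made for this example, not the module's actual code, and the sketch only covers references that print their DOI.

```python
import re
from typing import Optional

# Loose DOI pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
# This is an illustrative assumption, not the module's own pattern.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"<>]+")


def extract_doi(reference: str) -> Optional[str]:
    """Return the first DOI-like string in a reference entry, or None."""
    match = DOI_PATTERN.search(reference)
    if match is None:
        return None
    # Heuristic: trailing punctuation usually belongs to the sentence,
    # not the DOI. A DOI that really ends in ")" would be truncated here.
    candidate = match.group(0).rstrip(".,;:)]}")
    # DOIs are case-insensitive, so lowercase them for stable comparison.
    return candidate.lower()
```

A reference with no printed DOI returns None, which a caller could then pass to a title-based lookup or record as unresolved.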

From the resolved view, the module produces three signals on the report:

Broken DOIs — references whose DOI could not be resolved at either source. Could be a typo, a withdrawn paper, or a mis-extracted reference.
Retracted citations — references that resolve to papers flagged as retracted (Retraction Watch / Crossref retraction metadata). Citing a retracted paper is not automatically wrong, but it deserves a flag.
Self-citation rate — share of references that share at least one author with the manuscript's author list. Surfaced as a figure on the report; not used as a grade input.
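
The self-citation figure can be sketched as a simple set overlap. The name normalisation below (family name plus first initial) and both function names are hypothetical; real author matching has to cope with name variants and collisions that this sketch ignores.

```python
from typing import Iterable, Sequence


def normalise_author(name: str) -> str:
    """Reduce 'Given [Middle] Family' to 'family g' (hypothetical rule)."""
    parts = name.replace(".", " ").lower().split()
    if not parts:
        return ""
    return f"{parts[-1]} {parts[0][0]}"


def self_citation_rate(
    manuscript_authors: Iterable[str],
    references: Sequence[Sequence[str]],
) -> float:
    """Share of references sharing at least one author with the manuscript."""
    own = {normalise_author(a) for a in manuscript_authors} - {""}
    if not references:
        return 0.0
    hits = sum(
        1
        for ref_authors in references
        if own & {normalise_author(a) for a in ref_authors}
    )
    return hits / len(references)
```

For example, a manuscript by "Ada Lovelace" and "Alan Turing" with four references, two of which list one of those authors, would yield a rate of 0.5 under this rule.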

Results

We do not yet publish aggregate statistics here. Per-paper citation-verification output is visible on every assessment that ran the module. The aggregate numbers we still owe this page are:

Median resolution rate for the reference list (share of references resolvable to a DOI).
Distribution of self-citation rates across the production corpus, by field.
Total retracted citations flagged; share of papers with at least one such flag.

Why we have not published numbers yet. Aggregating these honestly requires two things: deduplicating results against the daily backfill cron job, which is still re-running citation verification for older assessments, and stratifying by field. We will publish once the backfill stabilises.
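
To make the deduplication and stratification concrete, here is a minimal sketch that keeps only the latest run per assessment and then takes a per-field median. The Run record, its field names, and the assumption that timestamps are ISO 8601 strings in a single timezone are all hypothetical, not the production schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Run:
    """One citation-verification run (hypothetical record shape)."""
    assessment_id: str
    run_at: str          # ISO 8601, same timezone, so strings sort by time
    field_of_study: str
    resolved: int        # references resolved to a DOI
    total: int           # references parsed from the manuscript


def latest_runs(runs: Iterable[Run]) -> List[Run]:
    """Keep the most recent run per assessment so backfill re-runs count once."""
    latest: Dict[str, Run] = {}
    for run in runs:
        kept = latest.get(run.assessment_id)
        if kept is None or run.run_at > kept.run_at:
            latest[run.assessment_id] = run
    return list(latest.values())


def median_resolution_by_field(runs: Iterable[Run]) -> Dict[str, float]:
    """Median share of resolvable references, stratified by field."""
    rates: Dict[str, List[float]] = defaultdict(list)
    for run in latest_runs(runs):
        if run.total > 0:  # skip manuscripts with no parsed references
            rates[run.field_of_study].append(run.resolved / run.total)
    return {name: median(values) for name, values in rates.items()}
```

Without the deduplication step, an assessment re-run by the backfill would contribute more than one data point to its field's median.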

Caveats — what this doesn't measure

The module flags citations to retracted papers. It does not judge whether a given citation is appropriate (for example, citing a retracted paper in order to discuss its retraction is fine).
Reference parsing is imperfect. Mangled references from which no DOI can be extracted are reported as "unresolved", not "broken". The report distinguishes the two, but the signal remains noisy.
Self-citation rate is a descriptor, not a grade input. We report it because it is useful context, not because high values are automatically a problem.
OpenCitations and Semantic Scholar disagree on roughly 1–3% of citations. We mark such cases as "single-source" rather than fabricate a tiebreak.
Citations to non-DOI sources (preprint URLs, books, datasets, software releases) are out of scope for retraction checking.
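
The "unresolved", "broken" and "single-source" labels described above can be sketched as one small decision function. It assumes that a disagreement means only one of the two sources returned a record, which is an interpretation rather than something this page states. The "resolved" label and the names Status and classify are hypothetical.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    UNRESOLVED = "unresolved"        # no DOI could be extracted
    BROKEN = "broken"                # DOI extracted, found at neither source
    SINGLE_SOURCE = "single-source"  # found at exactly one source
    RESOLVED = "resolved"            # hypothetical label: found at both


def classify(
    doi: Optional[str],
    in_semantic_scholar: bool,
    in_opencitations: bool,
) -> Status:
    """Map a reference's lookup outcome to a report label."""
    if doi is None:
        # Parsing failed, so nothing can be said about the DOI itself.
        return Status.UNRESOLVED
    if not in_semantic_scholar and not in_opencitations:
        return Status.BROKEN
    if in_semantic_scholar != in_opencitations:
        # The sources disagree; report that rather than pick a winner.
        return Status.SINGLE_SOURCE
    return Status.RESOLVED
```

Checking for a missing DOI first keeps parsing failures out of the broken count, which matches the distinction drawn in the caveats.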

Code

Module: checks/layer1/reference_verification.py.