Image forensics — Evidence

Methodology

The module rasterises figures from the PDF, segments candidate panels, and runs a small set of heuristics inspired by published image-forensics work and the ELIS toolkit:

Duplicate-region detection. Block-matching across panels within a figure and across figures within the manuscript. High-confidence matches are surfaced as candidate duplications.
Band-shift heuristic. For Western-blot-style figures, looks for repeated lane signatures with localised offsets — a pattern associated with retouched bands.
Splice-edge detection. Looks for localised discontinuities in JPEG quantisation artefacts (block-grid alignment, estimated quantisation step) along straight edges, characteristic of pasted regions.
Resolution & compression sanity. Flags figures whose effective resolution or compression artefacts are inconsistent with the surrounding manuscript.
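
As an illustration of the block-matching idea behind duplicate-region detection, here is a minimal sketch. It is not the code in checks/layer1/image_forensics.py: the function name find_duplicate_blocks, the parameters block and min_range, and the exact-equality comparison are all assumptions made for the example. A real detector would need approximate matching to survive re-compression and rescaling; exact matching is used here only to keep the logic visible.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Image = List[List[int]]  # grayscale pixels, row-major, values 0..255
Origin = Tuple[int, int]  # (row, col) of a block's top-left corner


def find_duplicate_blocks(
    image: Image, block: int = 4, min_range: int = 8
) -> List[Tuple[Origin, Origin]]:
    """Return pairs of block origins whose block x block pixels are identical.

    Hypothetical helper for illustration; not the module's actual API.
    """
    height, width = len(image), len(image[0])
    seen: Dict[Tuple[int, ...], List[Origin]] = defaultdict(list)
    for r in range(height - block + 1):
        for c in range(width - block + 1):
            pixels = tuple(
                image[r + dr][c + dc] for dr in range(block) for dc in range(block)
            )
            # Skip near-uniform blocks: flat background would match everywhere.
            if max(pixels) - min(pixels) < min_range:
                continue
            seen[pixels].append((r, c))

    pairs: List[Tuple[Origin, Origin]] = []
    for origins in seen.values():
        for i in range(len(origins)):
            for j in range(i + 1, len(origins)):
                (r1, c1), (r2, c2) = origins[i], origins[j]
                # Keep only non-overlapping windows; overlapping ones can
                # match trivially on periodic texture.
                if abs(r1 - r2) >= block or abs(c1 - c2) >= block:
                    pairs.append((origins[i], origins[j]))
    return pairs


# Toy input: an 8x16 gradient in which every pixel value is unique,
# then a 4x4 patch copied from (0, 0) to (2, 10).
clean = [[r * 16 + c for c in range(16)] for r in range(8)]
tampered = [row[:] for row in clean]
for dr in range(4):
    for dc in range(4):
        tampered[2 + dr][10 + dc] = clean[dr][dc]

print(find_duplicate_blocks(clean))     # → []
print(find_duplicate_blocks(tampered))  # → [((0, 0), (2, 10))]
```

The min_range filter reflects a general concern with block matching: uniform regions such as blot backgrounds match each other trivially, so some texture threshold is needed before a match says anything about reuse.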

Results

We do not yet publish precision or recall numbers for this module. Public benchmarks for academic image forensics are limited; the most-cited corpora (e.g. Bik et al.'s flagged-figure datasets) have known coverage and labelling caveats that complicate honest evaluation.

When the data are ready, we will publish the number of preprints scanned, the share that produced a finding at each severity level, and the agreement between the module's flags and human follow-up review on a sample.
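
The agreement statistic has not been chosen yet. As one possibility, the sketch below computes raw agreement and Cohen's kappa for paired flag / no-flag labels. The function name agreement_stats and the choice of kappa are assumptions for illustration, and the labels in the usage lines are toy values, not results from the module.

```python
from typing import Sequence, Tuple


def agreement_stats(
    module_flags: Sequence[bool], human_flags: Sequence[bool]
) -> Tuple[float, float]:
    """Return (observed agreement, Cohen's kappa) for paired binary labels.

    Hypothetical helper for illustration; not part of the module.
    """
    if len(module_flags) != len(human_flags) or not module_flags:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(module_flags)
    # Fraction of items on which the module and the reviewer gave the same label.
    observed = sum(m == h for m, h in zip(module_flags, human_flags)) / n
    p_module = sum(module_flags) / n
    p_human = sum(human_flags) / n
    # Agreement expected by chance if the two label sources were independent.
    expected = p_module * p_human + (1 - p_module) * (1 - p_human)
    if expected == 1.0:
        # Both sources used a single identical label; kappa is undefined.
        return observed, float("nan")
    return observed, (observed - expected) / (1 - expected)


# Toy labels for eight figures (True = flagged); purely illustrative.
module = [True, True, True, False, False, False, False, False]
human = [True, True, False, False, False, False, False, True]
observed, kappa = agreement_stats(module, human)
print(observed)         # → 0.75
print(round(kappa, 3))  # → 0.467
```

Kappa corrects raw agreement for chance, which matters when most figures are unflagged: two sources that both say "no flag" almost everywhere will agree often even if their flags never coincide.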

What a finding here means. A flag is a request for a closer look, not a charge. Many duplications are legitimate (example panels, intentional reuse with citation). The module exists to direct attention, not to assign blame.

Caveats — what this doesn't measure

The module is dormant on abstract-only ingests. No flags can be raised on papers ingested without full text.
Heavily compressed figures (low-DPI scans, aggressive JPEG re-encoding by the publisher pipeline) defeat several heuristics. Findings on such figures are less reliable.
Modern image-generation models can produce panels with no detectable splice. The module is unlikely to flag fully synthetic figures.
Cross-paper duplication (same figure in two unrelated submissions) is out of scope. The module operates within the manuscript only.
Without a labelled benchmark we have no calibrated false-positive rate. Use findings as an input to human review, not as a verdict.

Code

Module: checks/layer1/image_forensics.py · ELIS-style helpers: checks/layer1/elis_unified.py.