Multi-agent AI peer review

Quality signal for preprints.

Eleven specialist agents review every preprint independently, then deliberate to produce a single Publication Fit tier (1–10) anchored to real journals.

Papers are assessed from indexed bioRxiv and medRxiv preprints.

Machine-generated indicators. They assist but do not replace expert peer review.

Trust markers (letter, A–E)
A ticks all integrity boxes, C some, E none. The boxes are data and code availability, COI, ethics, preregistration, and funding; fraud checks cap the grade.
Novelty (number, 1–10)
1 landmark, 5 important, 10 minimal. Lower means more novel.
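
The trust-marker legend above can be sketched as a small grading function. This is only an illustration: the legend fixes A (all boxes), C (some), and E (none) and says that fraud checks cap the grade, but the box names, the thresholds for B and D, and the `fraud_cap` parameter are assumptions made here.

```python
from typing import Iterable, Optional

# Integrity boxes from the legend; the identifiers are hypothetical,
# and data and code availability are counted as separate boxes.
BOXES = (
    "data_availability",
    "code_availability",
    "coi",
    "ethics",
    "preregistration",
    "funding",
)


def trust_grade(ticked: Iterable[str], fraud_cap: Optional[str] = None) -> str:
    """Map ticked integrity boxes to a letter A-E, optionally capped."""
    ticked_set = set(ticked)
    n = sum(1 for box in BOXES if box in ticked_set)
    if n == len(BOXES):
        grade = "A"          # all boxes ticked (from the legend)
    elif n == 0:
        grade = "E"          # no boxes ticked (from the legend)
    elif n >= 4:
        grade = "B"          # assumed threshold
    elif n >= 2:
        grade = "C"          # assumed threshold for "some"
    else:
        grade = "D"          # assumed threshold
    if fraud_cap is not None:
        # Later letters are worse, so the cap is the alphabetical maximum.
        grade = max(grade, fraud_cap)
    return grade
```

For example, a paper that ticks every box but trips a fraud check with an assumed cap of "D" would be graded "D" rather than "A".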

How it works

Four layers. Eleven agents. One 10-tier Publication Fit score.

Layer 1

Deterministic checks

Paper-mill detection, statcheck p-value verification, GRIM tests, data-availability checks, and retraction cross-referencing all run before any LLM is invoked.
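
To illustrate what one of these deterministic checks involves, here is a minimal sketch of a GRIM test: it asks whether a reported mean could arise from integer-valued data at the stated sample size. The function name `grim_consistent` and the choice to accept either half-up or half-even rounding are assumptions for illustration, not a description of this pipeline's implementation.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def grim_consistent(reported_mean: str, n: int) -> bool:
    """Return True if some integer sum over n items rounds to the reported mean.

    Assumes non-negative integer data (for example Likert responses).
    The mean is passed as a string so its reported precision is preserved.
    """
    reported = Decimal(reported_mean)
    decimals = max(0, -reported.as_tuple().exponent)
    quantum = Decimal(1).scaleb(-decimals)  # e.g. 0.01 for two decimals
    approx_sum = int(reported * n)
    # Only integer sums close to mean * n can round to the reported value.
    for total in range(max(0, approx_sum - 1), approx_sum + 3):
        exact = Decimal(total) / Decimal(n)
        for mode in (ROUND_HALF_UP, ROUND_HALF_EVEN):
            if exact.quantize(quantum, rounding=mode) == reported:
                return True
    return False
```

With 28 participants, a reported mean of 5.19 fails the check (the nearest achievable values are 5.18 and 5.21), while 5.18 passes.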

Layer 2

Eleven-agent review

Four integrity agents (methodologist, statistician, ethics, validity), five domain agents, a devil's advocate that stress-tests the consensus, and a positive-evidence reviewer that surfaces what the paper does well — all independent.

Layer 3

Deterministic grading

Evidence and significance labels from each agent map to a 2-axis grade via a rule-based lookup calibrated to eLife reviews. That grade is then converted to a 10-tier Publication Fit score anchored to real journals.
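
A rule-based lookup of this kind might look like the sketch below. Everything specific in it is an assumption: the label vocabularies are borrowed from eLife's published assessment terms, the per-axis consensus is taken as the low median of the agents' labels, the tier table is a placeholder rule, and tier 1 is assumed to be the strongest. The actual calibrated table is not given here.

```python
from statistics import median_low

# Assumed label scales, ordered weakest to strongest.
EVIDENCE = ["inadequate", "incomplete", "solid", "convincing", "compelling", "exceptional"]
SIGNIFICANCE = ["useful", "valuable", "important", "fundamental", "landmark"]

# Placeholder lookup table: combined rank 0..9 maps to tier 10..1.
TIER_LOOKUP = {
    (ev, sig): 10 - (i + j)
    for i, ev in enumerate(EVIDENCE)
    for j, sig in enumerate(SIGNIFICANCE)
}


def consensus(labels, scale):
    """Pick the low-median label so the result is always a real label."""
    ranks = [scale.index(label) for label in labels]
    return scale[median_low(ranks)]


def grade(agent_labels):
    """agent_labels: list of (evidence, significance) pairs, one per agent."""
    ev = consensus([e for e, _ in agent_labels], EVIDENCE)
    sig = consensus([s for _, s in agent_labels], SIGNIFICANCE)
    return ev, sig, TIER_LOOKUP[(ev, sig)]
```

Because the mapping is a plain table with no model call, the same set of agent labels always yields the same grade and tier.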

Layer 4

Opus arbitration

Borderline or low-agreement cases are arbitrated by Claude Opus with the full agent panel as context.
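
The criteria for "borderline" and "low-agreement" are not spelled out here, so the following is only one way the gate could be sketched. It assumes each agent's label has been converted to an integer rank, and the function name `needs_arbitration` and all three thresholds are hypothetical. The arbitration call itself is omitted.

```python
from collections import Counter
from statistics import mean


def needs_arbitration(ranks, min_agreement=0.6, max_spread=2, margin=0.15):
    """Decide whether a panel's integer ranks should be escalated.

    Low agreement: the most common rank holds less than min_agreement of
    the votes, or the ranks span more than max_spread steps.
    Borderline: the mean rank sits within margin of the midpoint between
    two adjacent ranks.
    """
    top_share = Counter(ranks).most_common(1)[0][1] / len(ranks)
    spread = max(ranks) - min(ranks)
    low_agreement = top_share < min_agreement or spread > max_spread
    borderline = abs((mean(ranks) % 1) - 0.5) <= margin
    return low_agreement or borderline
```

Under these assumed thresholds, a panel voting `[2, 2, 2, 2, 3]` is settled deterministically, while `[2, 2, 2, 3, 3]` sits close enough to the boundary to be escalated.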