Quality Controls for Preprints
Putting trust in preprints
With 10,000+ preprints posted weekly, there's no quality filter. AI-generated papers, recycled content, and methodologically flawed studies sit alongside groundbreaking research.
We're building a first-pass filter using open source research integrity tools — helping researchers, journalists, and LLMs distinguish signal from noise.
This is an experiment. All assessments are machine-generated indicators requiring human expert review.
Try: 10.1101/2025.01.15.633214 or paste any bioRxiv URL
The A5–E1 Grading Matrix
Every paper receives a grade combining Strength of Evidence (A–E) and Significance of Findings (1–5)
| Evidence ↓ \ Significance → | 5 Landmark | 4 Fundamental | 3 Important | 2 Valuable | 1 Useful |
|---|---|---|---|---|---|
| A Compelling | A5 | A4 | A3 | A2 | A1 |
| B Convincing | B5 | B4 | B3 | B2 | B1 |
| C Solid | C5 | C4 | C3 | C2 | C1 |
| D Incomplete | D5 | D4 | D3 | D2 | D1 |
| E Inadequate | E5 | E4 | E3 | E2 | E1 |
Significance (1–5) reflects the importance of the research question — not publication tier. A D5 paper asks a landmark-scope question but lacks the evidence to support its claims.
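The two axes compose mechanically: a grade is just the evidence letter joined to the significance digit. A minimal sketch (the dictionaries and function names here are hypothetical, for illustration only):

```python
# Hypothetical helper showing how the two grading axes combine.
EVIDENCE = {"A": "Compelling", "B": "Convincing", "C": "Solid",
            "D": "Incomplete", "E": "Inadequate"}
SIGNIFICANCE = {5: "Landmark", 4: "Fundamental", 3: "Important",
                2: "Valuable", 1: "Useful"}

def combined_grade(evidence: str, significance: int) -> str:
    """Join the evidence letter (A-E) and significance digit (1-5)."""
    if evidence not in EVIDENCE or significance not in SIGNIFICANCE:
        raise ValueError("evidence must be A-E, significance 1-5")
    return f"{evidence}{significance}"
```

So `combined_grade("D", 5)` yields `"D5"`: a landmark-scope question with incomplete evidence.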
The Problem
Preprint quality is impossible to assess at scale
Many papers contain at least one statistical inconsistency
Tens of thousands of preprints are posted to bioRxiv and medRxiv alone each year
Integrated Tools
We combine the best open-source research integrity tools
8,000+ tortured phrases, SCIgen fingerprints, and fabrication indicators
P-value recalculation, GRIM tests, and impossible-result detection
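The GRIM test above rests on a simple observation: a mean of integer-valued data (e.g. Likert responses) with sample size n can only take values that are multiples of 1/n. A minimal sketch of the idea (function name and signature are our own, not any tool's actual API):

```python
def grim_consistent(reported_mean: float, n: int, decimals: int = 2) -> bool:
    """GRIM test sketch: can integer data with sample size n produce a
    mean that rounds to the reported value at the reported precision?"""
    target = round(reported_mean, decimals)
    k = round(reported_mean * n)  # nearest attainable integer total
    # Check neighbouring totals too, to absorb rounding at the boundary.
    return any(round(c / n, decimals) == target
               for c in (k - 1, k, k + 1) if c >= 0)
```

With n = 25, two-decimal means must be multiples of 0.04, so a reported mean of 3.48 is possible while 3.47 is not.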
Verify data availability statements and detect datasets mentioned but not shared
Cross-reference citations against retracted paper database
Methodologist, Statistician, Ethicist, Domain Expert, and other specialist reviewers — with adversarial verification
Three specialist agents (Claude, GPT-4o, Gemini) review independently, then consult and reconcile on borderline papers
REST API for Developers
Embed quality signals into any workflow — a single GET request, zero infrastructure
```shell
# One line. Any DOI, arXiv ID, or bioRxiv URL.
curl -H "X-API-Key: pak_your_key" \
  "https://preprints.ai/v2/score/10.1101/2025.01.15.633214"

# Response
{
  "doi": "10.1101/2025.01.15.633214",
  "grade": "B4",
  "integrity": { "grade": "B", "score": 0.82, "label": "Convincing" },
  "novelty": { "grade": "4", "score": 0.76, "label": "Fundamental" },
  "confidence": 0.85,
  "badge_url": "https://preprints.ai/badge/10.1101/...",
  "report_url": "https://preprints.ai/report/10.1101/..."
}
```
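From a script, the same request is just a URL and one header. A sketch assuming only the endpoint shown above; note that DOI slashes must stay literal in the path:

```python
from urllib.parse import quote

BASE = "https://preprints.ai/v2/score"  # endpoint from the curl example

def score_request(identifier: str, api_key: str):
    """Build the URL and headers for a score lookup. The identifier
    (DOI, arXiv ID, or bioRxiv URL path) may contain slashes, which
    are kept unescaped so they remain path segments."""
    url = f"{BASE}/{quote(identifier, safe='/.')}"
    headers = {"X-API-Key": api_key}
    return url, headers
```

Pass the returned pair to any HTTP client; the response is the JSON object shown above.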
Infrastructure for Publishers
From preprint servers to journals — integrate quality signals at submission, review, or post-publication
Submission Screening
Flag potential integrity concerns before peer review begins. Statcheck, ODDPub, and paper mill detection run automatically on every submission.
Embeddable Badges
One line of HTML. Grade badges update automatically as assessments are revised. Colour-coded for immediate comprehension.
Webhook Callbacks
Publisher tier: receive an HMAC-signed POST to your endpoint the moment an assessment completes. No polling required.
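Verifying the signature on your side is a few lines: recompute the HMAC over the raw request body and compare in constant time. A sketch assuming SHA-256 and a hex-encoded signature (the actual header name and digest algorithm are whatever your integration docs specify):

```python
import hashlib
import hmac

def verify_webhook(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Recompute HMAC-SHA256 over the raw request body and compare it
    to the received signature in constant time."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)
```

Always compare with `hmac.compare_digest`, not `==`, to avoid timing side channels.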
Bulk Processing
Backfill your entire catalogue. Submit up to 500 DOIs per batch job with progress tracking and email notification on completion.
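Splitting a catalogue into submission-sized jobs is a one-liner per chunk. A sketch using the 500-DOI limit stated above (the function name is ours, not the API's):

```python
from typing import Iterator, List

MAX_BATCH = 500  # per-job limit stated above

def batched(dois: List[str], size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Yield the catalogue in chunks of at most `size` DOIs, ready to
    submit as separate batch jobs."""
    if size < 1:
        raise ValueError("size must be positive")
    for i in range(0, len(dois), size):
        yield dois[i : i + size]
```

A catalogue of 1,201 DOIs becomes three jobs of 500, 500, and 201.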
Built on Open Science Infrastructure
Every finding is traceable to a peer-reviewed tool. No proprietary black boxes.