Methodology

Sources

Two pipelines feed this site. (1) A read-only mirror of the public ArcGIS feature service for the 2026 hantavirus tracker, refreshed every 15 minutes. (2) A news ingestion pipeline that pulls Google News RSS in en/de/es/fr plus a curated allowlist of public-health and science publications (CIDRAP, ProMED-mail, BAG.admin.ch, NZZ Wissen, Nature, etc.).

Robots & copyright

We respect publishers' robots.txt directives for AI ingestion (GPTBot, Google-Extended, ClaudeBot). Articles whose publishers disallow AI access are skipped. We never republish full text or paywalled content. AI summaries are capped at 250 words and contain no verbatim quotes, and every article carries source attribution and a 'Read original' link to the publisher.
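
As a rough illustration of the robots.txt check described above, here is a minimal sketch using Python's standard urllib.robotparser. The function name ai_ingestion_allowed and the example robots.txt bodies are hypothetical. The sketch also assumes that a disallow for any one of the three listed agents is enough to skip an article, which the policy above does not spell out.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in the policy above.
AI_AGENTS = ("GPTBot", "Google-Extended", "ClaudeBot")


def ai_ingestion_allowed(robots_txt: str, url: str) -> bool:
    """Return True only if every listed AI agent may fetch `url`.

    Assumption: one disallowed agent is enough to skip the article.
    """
    parser = RobotFileParser()
    # Parse an already-downloaded robots.txt body (no network access here).
    parser.parse(robots_txt.splitlines())
    return all(parser.can_fetch(agent, url) for agent in AI_AGENTS)


# Hypothetical robots.txt bodies, for illustration only.
blocked = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n"
open_site = "User-agent: *\nDisallow: /private/\n"

print(ai_ingestion_allowed(blocked, "https://example.org/news/1"))
print(ai_ingestion_allowed(open_site, "https://example.org/news/1"))
```

Parsing a pre-fetched body keeps the check testable offline; fetching and caching the robots.txt file is left out of the sketch.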

AI generation

Article headlines and summaries on this site are AI-rewritten using Anthropic's Claude Haiku 4.5. The original publication, byline, and link are always shown. Each article page carries a visible disclosure: 'This is an AI-generated summary. For full reporting, read the original.'

Dedup logic

Each new case from the news pipeline is scored against every existing case. Score components, which sum to a maximum of 1.00: location within 10 km (+0.30), onset date within ±3 days (+0.30), age within ±2 years (+0.20), same sex (+0.10), same status (+0.10). Identical ArcGIS case IDs auto-merge. A soft-dup score ≥ 0.85 auto-merges; a score from 0.70 up to but not including 0.85 enters a human review queue; a score under 0.70 is treated as a new case.
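
The scoring rules above can be sketched as follows. This is an illustration under stated assumptions, not the production code: the Case record shape, the function names, and the use of haversine distance for "geographic proximity" are all hypothetical. Scores are held as integer hundredths (0 to 100) so that float rounding cannot shift a score across a threshold.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt
from typing import Optional


@dataclass
class Case:  # hypothetical record shape, for illustration only
    lat: float
    lon: float
    onset: date
    age: int
    sex: str
    status: str
    arcgis_id: Optional[str] = None


def haversine_km(a: Case, b: Case) -> float:
    """Great-circle distance in km, using a mean Earth radius of 6371 km."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = (sin(dlat / 2) ** 2
         + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(h))


def dedup_points(new: Case, old: Case) -> int:
    """Soft-dup score in hundredths: 100 corresponds to a score of 1.00."""
    pts = 0
    pts += 30 if haversine_km(new, old) < 10 else 0           # under 10 km
    pts += 30 if abs((new.onset - old.onset).days) <= 3 else 0  # within 3 days
    pts += 20 if abs(new.age - old.age) <= 2 else 0           # within 2 years
    pts += 10 if new.sex == old.sex else 0
    pts += 10 if new.status == old.status else 0
    return pts


def classify(new: Case, old: Case) -> str:
    """Return 'merge', 'review', or 'new' for one candidate pair."""
    if new.arcgis_id is not None and new.arcgis_id == old.arcgis_id:
        return "merge"  # identical ArcGIS case IDs auto-merge
    pts = dedup_points(new, old)
    if pts >= 85:
        return "merge"
    if pts >= 70:
        return "review"
    return "new"
```

For example, two cases 3 km apart with onsets two days apart and ages one year apart, but differing in sex and status, earn 30 + 30 + 20 = 80 points (0.80) and land in the review queue.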

Confidence threshold

An extraction is auto-published only when all of the following hold: the AI's type-classification confidence is ≥ 0.85; for case-linked items, case-extraction confidence is ≥ 0.80 and geocoding confidence is ≥ 0.85; the dedup score is outside the 0.70–0.85 ambiguity band; and the source publication is on a trusted-source allowlist. Anything else queues for human review.
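
The gate above amounts to a conjunction of checks, sketched here under a few assumptions. The function name, parameter names, and the two-entry TRUSTED_SOURCES set are hypothetical placeholders. The sketch also assumes that items with no dedup score (for example, items not linked to a case) skip the ambiguity-band check, and that the band covers scores from 0.70 up to but not including 0.85.

```python
from typing import Optional

# Illustrative subset only; the real allowlist is not reproduced here.
TRUSTED_SOURCES = {"CIDRAP", "ProMED-mail"}


def should_auto_publish(
    source: str,
    type_conf: float,
    case_linked: bool = False,
    case_conf: float = 0.0,
    geo_conf: float = 0.0,
    dedup_score: Optional[float] = None,
) -> bool:
    """Return True if the item may publish without human review."""
    if source not in TRUSTED_SOURCES:
        return False
    if type_conf < 0.85:
        return False
    # Case-extraction and geocoding thresholds apply to case-linked items only.
    if case_linked and (case_conf < 0.80 or geo_conf < 0.85):
        return False
    # Assumption: a missing dedup score skips the ambiguity-band check.
    if dedup_score is not None and 0.70 <= dedup_score < 0.85:
        return False
    return True


# A case-linked item with a dedup score of 0.75 falls in the ambiguity band.
print(should_auto_publish("CIDRAP", 0.90, True, 0.80, 0.85, 0.75))
```

Any False result here corresponds to the "queues for human review" outcome; the sketch never discards an item.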

Disclaimer

Informational only — not medical advice. Numbers and details may lag the underlying sources. For clinical decisions or exposure guidance, consult your local public-health authority.