Every outlet named here really ran these words. Counting that echo is the point: apparent consensus is not independent confirmation. "47 outlets report X" often means one source, repeated 47 times. The counter-measurement counts what passes as independent but is in fact copied.
News via the GDELT DOC 2.0 API (GDELT: open, free to use). Eight broad beats (politics, economy, technology, health, science, business, sports, weather), English-language only. The cited "independent" sources are the domains that ran the sentence verbatim; they are listed by name on the work page.
Daily. The machine selects on its own: the phrase with the widest spread across distinct source domains is the "headline of the day". Canonical artefact: versioned JSON in src/data/consensus/ — git is the archive.
Pool articles (dedupe by URL) → count verbatim 6-gram title phrases across distinct domains → the most replicated phrase is the headline. Echo index = share of titles that contain a phrase echoed across at least 3 domains. Symbolic provenance: the earliest timestamp marks the source candidate, and the order of the later timestamps traces the cascade.
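The steps above can be sketched in a few lines of Python. This is an illustration, not the project's actual code: the record fields (url, domain, title, seen), the whitespace tokenisation, the alphabetical tie-break and the function names are all assumptions made for the example.

```python
from collections import defaultdict

N = 6  # phrase length in words, as described above


def ngrams(title, n=N):
    """Yield lowercase word n-grams of a title (naive whitespace split)."""
    words = title.lower().split()
    for i in range(len(words) - n + 1):
        yield " ".join(words[i:i + n])


def analyse(articles, min_domains=3):
    # Dedupe by URL: keep the first record seen for each URL.
    by_url = {}
    for a in articles:
        by_url.setdefault(a["url"], a)
    pool = list(by_url.values())

    # Map every 6-gram to the set of distinct domains that ran it.
    domains = defaultdict(set)
    for a in pool:
        for g in set(ngrams(a["title"])):
            domains[g].add(a["domain"])
    if not domains:
        return None

    # Headline: widest spread; ties go to the alphabetically first phrase
    # (hypothetical tie-break, chosen only to make the result deterministic).
    headline = max(sorted(domains), key=lambda g: len(domains[g]))

    # Echo index: share of pooled titles containing a phrase run by >= min_domains domains.
    echoes = {g for g, d in domains.items() if len(d) >= min_domains}
    in_echo = sum(1 for a in pool if echoes & set(ngrams(a["title"])))

    # Provenance: carriers of the headline ordered by timestamp, earliest first.
    carriers = sorted(
        (a for a in pool if headline in set(ngrams(a["title"]))),
        key=lambda a: a["seen"],
    )
    return {
        "headline": headline,
        "spread": len(domains[headline]),
        "echo_index": in_echo / len(pool),
        "source_candidate": carriers[0]["domain"],
        "cascade": [a["domain"] for a in carriers],
    }


# Made-up sample records, for illustration only.
sample = [
    {"url": "u1", "domain": "a.com", "seen": "20240101T080000Z",
     "title": "Central bank holds rates steady amid inflation fears"},
    {"url": "u2", "domain": "b.com", "seen": "20240101T090000Z",
     "title": "Central bank holds rates steady amid inflation worries"},
    {"url": "u3", "domain": "c.com", "seen": "20240101T070000Z",
     "title": "Breaking: central bank holds rates steady amid market calm"},
    {"url": "u4", "domain": "d.com", "seen": "20240101T060000Z",
     "title": "Local team wins championship after dramatic overtime finish"},
    {"url": "u1", "domain": "a.com", "seen": "20240101T080000Z",
     "title": "Central bank holds rates steady amid inflation fears"},
]
result = analyse(sample)
print(result)
```

In the sample, three of the four pooled titles share one 6-gram across three domains, and the domain with the earliest timestamp among them becomes the source candidate.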
The lab experiments with data AND AI. Implemented: v1, the verbatim baseline; v2, TF-IDF/cosine similarity, which catches paraphrased coordination (reworded wire copy that verbatim matching misses); v3, a symbolic, rule-based classifier that separates chain syndication from scattered placement using the graph structure (TLD homogeneity, time window). It is auditable, with no black-box model. Optional/future: deep embeddings and an LLM classifier verified against the graph (prompt disclosed). Condition throughout: every AI step is transparent, and its output is either verified or marked as an estimate.
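To illustrate the v2 idea, here is a minimal standard-library sketch of TF-IDF weighting with cosine similarity over titles. The smoothed IDF formula, the whitespace tokenisation and the function names are assumptions for the example; the lab's actual implementation and its similarity threshold are not specified here.

```python
import math
from collections import Counter


def tfidf_vectors(titles):
    """Return one sparse TF-IDF vector (dict word -> weight) per title."""
    docs = [t.lower().split() for t in titles]
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))  # document frequency
    # Smoothed IDF (an assumed variant): common words keep a small positive weight.
    idf = {w: math.log((1 + n) / (1 + c)) + 1 for w, c in df.items()}
    return [{w: tf * idf[w] for w, tf in Counter(d).items()} for d in docs]


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


# Made-up titles: the first two are reworded copies that share no 6-gram.
titles = [
    "Central bank holds rates steady amid inflation fears",
    "Central bank keeps rates steady as inflation fears persist",
    "Local team wins championship after dramatic overtime finish",
]
v = tfidf_vectors(titles)
paraphrase_score = cosine(v[0], v[1])
unrelated_score = cosine(v[0], v[2])
print(paraphrase_score, unrelated_score)
```

The first two titles have no 6-word phrase in common, so the verbatim baseline would not link them, yet their cosine score is well above that of the unrelated third title. Where to set the cut-off for "paraphrased" is a tuning decision this sketch leaves open.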
Eight lightweight HTTP fetches per day, no API key, no LLM in v1. The site itself is static.
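As a rough sketch of what the eight daily fetches might look like, the snippet below builds one GDELT DOC 2.0 request URL per beat. The endpoint and the query, mode, format, timespan and maxrecords parameters follow the public GDELT DOC 2.0 documentation; the exact query strings, time window and record limit used by this project are not stated, so the values here are assumptions.

```python
from urllib.parse import urlencode

GDELT_DOC = "https://api.gdeltproject.org/api/v2/doc/doc"
BEATS = ["politics", "economy", "technology", "health",
         "science", "business", "sports", "weather"]


def beat_url(beat, max_records=250):
    """Build an article-list request for one beat (assumed query shape)."""
    params = {
        "query": f"{beat} sourcelang:english",  # English-language sources only
        "mode": "ArtList",                      # return a list of articles
        "format": "json",
        "timespan": "1d",                       # assumed: last 24 hours
        "maxrecords": max_records,              # assumed limit
    }
    return f"{GDELT_DOC}?{urlencode(params)}"


urls = [beat_url(b) for b in BEATS]
for u in urls:
    print(u)
```

No API key appears in the request, in line with the note above. Fetching and parsing the responses is left out so the sketch runs without network access.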