Tyche Institute · Patent Topology

From a candidate flood to defensible patent evidence.

A repair-first, reproducible pipeline that turns a flood of AI-trust-stack patent candidates into deduplicated, source-fidelity-checked evidence. It reports the scale of the problem and a useful negative calibration result — not an ownership ranking. "Who owns the AI trust stack?" stays the program's question, not its answer.

Normalized candidate rows: 647,410 (raw AI-trust-stack patent candidates ingested)
Canonical records: 151,890 (after deduplication by DOI / URL / title + authority scoring)
Duplicate-candidate burden: 495,520 (the scale of the dedup problem, made explicit)
INPADOC family IDs: 12,632 (for named clusters; a feasibility result, not a dominance ranking)

Repair-first, gate by gate

Corpus & dedup

647,410 candidate rows reduced to 151,890 canonical records; the 495,520 duplicate burden is reported, not hidden.
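
The dedup step can be pictured with a minimal Python sketch. It assumes a simple key cascade (DOI, then URL, then normalized title) and keeps the record from the most authoritative source for each key. The `Candidate` fields, the `AUTHORITY` scores, and the source names are hypothetical illustrations, not the pipeline's actual schema or scoring.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical authority scores; the pipeline's real scoring is not published here.
AUTHORITY = {"register": 3, "odp": 2, "aggregator": 1}


@dataclass(frozen=True)
class Candidate:
    title: str
    source: str
    doi: Optional[str] = None
    url: Optional[str] = None


def dedup_key(c: Candidate) -> tuple:
    """Pick the strongest identifier available: DOI, then URL, then title."""
    if c.doi:
        return ("doi", c.doi.strip().lower())
    if c.url:
        return ("url", c.url.strip().lower().rstrip("/"))
    # Collapse case and whitespace so trivially different titles match.
    return ("title", " ".join(c.title.lower().split()))


def deduplicate(candidates):
    """Return (canonical records, duplicate burden)."""
    best = {}
    for c in candidates:
        key = dedup_key(c)
        kept = best.get(key)
        # Keep the record whose source has the higher authority score.
        if kept is None or AUTHORITY.get(c.source, 0) > AUTHORITY.get(kept.source, 0):
            best[key] = c
    canonical = list(best.values())
    return canonical, len(candidates) - len(canonical)
```

The duplicate burden is simply the input count minus the canonical count, which is how 647,410 rows and 151,890 canonical records yield 495,520. A cascade like this does not merge a DOI-keyed record with a title-only copy of the same document; a real pipeline would need a further reconciliation pass for that case.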

Authority & date repair

A source-fidelity ladder spans public registers, USPTO ODP, INPADOC, and Lens/EPO. US date repair assigned authoritative ODP dates to 30,458 of 31,242 documents (97.5%); non-US date projection remains blocked.
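
As a rough illustration of how a fidelity ladder might drive date repair, the sketch below takes the date from the highest rung that has one and reports repair coverage. The rung order in `LADDER`, the source keys, and both function names are assumptions made for the example; the text above names the sources but does not state their ranking.

```python
from typing import Optional

# Assumed rung order, most authoritative first (not confirmed by the source).
LADDER = ["public_register", "uspto_odp", "inpadoc", "lens_epo"]


def repair_date(dates_by_source: dict) -> Optional[tuple]:
    """Return (rung, date) from the highest rung that supplies a date."""
    for rung in LADDER:
        date = dates_by_source.get(rung)
        if date:
            return rung, date
    return None  # no rung has a date: the record stays unrepaired


def coverage(repaired: int, total: int) -> float:
    """Share of documents with a repaired date, as a percentage."""
    return round(100 * repaired / total, 1)


print(coverage(30_458, 31_242))  # 97.5
```

Returning the rung alongside the date keeps provenance attached to every repaired value, so a later audit can tell which source each date came from.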

Citation pilot

138 examiner-citation edges over a 20-application sample — kept explicitly as a pilot; no citation-topology claim.

When the classifier failed — on purpose

Triage calibration

The calibration gate accepted drift labels for routing but rejected substantive auto-labels, whose precision was only 17% (strict) and 32% (lenient).
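
A precision gate of this kind can be sketched as follows. The sketch assumes that strict precision counts only fully correct adjudications and that lenient precision also counts partially correct ones. The verdict labels, the 0.80 threshold, and the 100-row sample are hypothetical; the sample is synthetic and chosen only to reproduce the reported rates.

```python
def precision(verdicts, lenient=False):
    """Fraction of auto-labels that reviewers judged acceptable."""
    if not verdicts:
        raise ValueError("need at least one adjudicated label")
    accepted = {"correct", "partial"} if lenient else {"correct"}
    return sum(v in accepted for v in verdicts) / len(verdicts)


def gate(verdicts, threshold=0.80):
    """Accept auto-labels only if strict precision clears the threshold."""
    strict = precision(verdicts)
    loose = precision(verdicts, lenient=True)
    return {"strict": strict, "lenient": loose, "accepted": strict >= threshold}


# Synthetic adjudication sample (not pipeline data).
sample = ["correct"] * 17 + ["partial"] * 15 + ["wrong"] * 68
print(gate(sample))  # {'strict': 0.17, 'lenient': 0.32, 'accepted': False}
```

Gating on the strict figure is the conservative choice: even the lenient reading falls far short of any plausible acceptance threshold, so the rejection does not depend on where the line is drawn.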

Why it matters

The failed over-eager classifier is a positive methods finding: automated substantive labelling is not yet trustworthy, so manual cluster review gates any ownership reading.

Open & reproducible

Scripts, EPO families, ODP repair data, and adjudication traces are archived on Zenodo (DOI 10.5281/zenodo.20644809).

Patent Topology is an evidence-pipeline / method observatory. It reports corpus scale, a source-fidelity ladder, US date repair, family-ID feasibility, and a pilot citation slice. It makes no claim of ownership, concentration, patent-family topology, citation influence, licensing position, market control, legal status, or freedom-to-operate; those require assignee normalization, non-US date projection, manually reviewed cluster assignments, and family/citation closure. Counts are attributed to this observatory and never fused with other surfaces.