Coverage methodology

/v1/health/coverage returns a domain_coverage object per jurisdiction:

    "MGA": {
      "domain_coverage": {
        "operators_with_domain": 92,
        "normative_operators": 111,
        "coverage_pct": 83
      }
    }

This page explains what those three numbers mean — because the honest answer depends on which question you’re asking.

| Field | Meaning |
| --- | --- |
| normative_operators | Operators in this jurisdiction whose license type is one where domain disclosure is normative, i.e., the operator is consumer-facing and the regulator publishes (or could publish) a website field. |
| operators_with_domain | Subset of those operators that have ≥1 row in domains in our DB. |
| coverage_pct | operators_with_domain / normative_operators × 100. Answers the question: of the operators we should have a domain for, what fraction do we actually have one for? |
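As a quick check of the arithmetic, the sketch below recomputes coverage_pct from the two counts in the MGA sample above. Rounding to the nearest integer is an assumption inferred from that sample (92 / 111 is about 82.9, reported as 83), and returning 0 for an empty normative subset is also an assumption. The helper name coverage_pct is illustrative and not part of the API.

```python
import json

# Sample copied from the MGA example above, wrapped in braces to make it valid JSON.
SAMPLE = """
{
  "MGA": {
    "domain_coverage": {
      "operators_with_domain": 92,
      "normative_operators": 111,
      "coverage_pct": 83
    }
  }
}
"""


def coverage_pct(operators_with_domain: int, normative_operators: int) -> int:
    """Percentage of normative operators that have at least one domain.

    Assumptions: the result is rounded to the nearest integer, and an
    empty normative subset is reported as 0 instead of raising.
    """
    if normative_operators == 0:
        return 0
    return round(operators_with_domain / normative_operators * 100)


payload = json.loads(SAMPLE)
cov = payload["MGA"]["domain_coverage"]
recomputed = coverage_pct(cov["operators_with_domain"], cov["normative_operators"])
print(recomputed, recomputed == cov["coverage_pct"])
```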

Different regulators license different activities. A B2B software supplier or a non-remote arcade operator can hold a perfectly valid license without ever running a consumer website. Including them in the denominator deflates coverage in a way that has nothing to do with our scraper quality — it just measures the regulator’s mix of consumer-facing vs. supplier licenses.

The normative subset filters them out. What’s left is operators where a missing domain is interesting — either we missed it (scraper bug we can fix) or the regulator didn’t publish it (upstream ceiling, defer to enrichment).

| Jur | Normative subset | Excluded |
| --- | --- | --- |
| UKGC | license_type_raw = 'Remote' | Non-Remote (physical premises), Ancillary Remote (B2B suppliers, game studios) |
| MGA | license_number LIKE 'MGA/B2C/%' | MGA/B2B/* (suppliers), MGA/CRP/* (Critical Gaming Supply / Recognition / Personal) |
| CW | license_type_raw ILIKE 'b2c%' | b2b (suppliers post-LOK) |
| KH | license_type_raw = 'Interactive Gaming Permit' | CSPA (Casino Software Provider Authorisation, B2B suppliers) |
| AN | 'B2C' = ANY(license_types) | pure-B2B and pure-White-Labeling-only operators |
| TGC | license_type_raw = 'B2C' | B2B (suppliers) |

For MGA we filter on the license-number prefix because license_type_raw carries the gaming-service breakdown (Type 1, Type 2, Type 3, Type 4), not the B2B/B2C class distinction.

For AN, operators with mixed license_types (e.g., ["B2C","B2B"]) are included — having any B2C component makes the operator consumer-facing.
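The per-jurisdiction conditions above can be mirrored as plain predicates, which makes the MGA and AN special cases easy to see. This is a minimal sketch, not the production query: the License dataclass, the NORMATIVE_FILTERS mapping, and the is_normative helper are hypothetical names, and the license numbers in the sample rows are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class License:
    """Hypothetical in-memory stand-in for one license row."""
    jurisdiction: str
    license_number: str = ""
    license_type_raw: str = ""
    license_types: List[str] = field(default_factory=list)


# One predicate per jurisdiction, mirroring the SQL conditions in the table.
NORMATIVE_FILTERS = {
    "UKGC": lambda r: r.license_type_raw == "Remote",
    "MGA": lambda r: r.license_number.startswith("MGA/B2C/"),       # LIKE 'MGA/B2C/%'
    "CW": lambda r: r.license_type_raw.lower().startswith("b2c"),   # ILIKE 'b2c%'
    "KH": lambda r: r.license_type_raw == "Interactive Gaming Permit",
    "AN": lambda r: "B2C" in r.license_types,                       # 'B2C' = ANY(license_types)
    "TGC": lambda r: r.license_type_raw == "B2C",
}


def is_normative(row: License) -> bool:
    """True when the row falls in its jurisdiction's normative subset."""
    check = NORMATIVE_FILTERS.get(row.jurisdiction)
    return bool(check and check(row))


# Made-up rows for illustration only.
rows = [
    License("MGA", license_number="MGA/B2C/000/0000", license_type_raw="Type 1"),
    License("MGA", license_number="MGA/B2B/000/0000", license_type_raw="Type 1"),
    License("AN", license_types=["B2C", "B2B"]),  # mixed: still consumer-facing
    License("AN", license_types=["B2B"]),
    License("UKGC", license_type_raw="Non-Remote"),
]
flags = [is_normative(r) for r in rows]
print(flags)
```

Both MGA rows carry the same license_type_raw, so only the license-number prefix separates them, which is the reason the MGA filter does not use license_type_raw.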

For a given jurisdiction, coverage_pct is the most useful single indicator of “how complete is iGregulator on this regulator?” relative to what’s recoverable. It’s not the same as “% of all licensees that have a domain”; that older framing mixed B2B and consumer-facing operators and obscured the real ceiling.

Two consequences worth flagging:

  • A coverage_pct of 0 means either (a) the upstream regulator doesn’t publish website data at all (CW and TGC, currently), or (b) our scraper hasn’t started picking up the field yet.
  • A coverage_pct near 100 means we’re at the upstream ceiling — any remaining gap is operators who haven’t disclosed a domain to the regulator, not a scraper bug.

Numbers can shift when methodology changes

This metric was introduced in 1.4.0 (changelog entry dated 2026-04-28). Prior versions reported coverage as total_domains / total_licenses per jurisdiction. That ratio inflated KH: an average of 3+ domains per IGL holder made it read 91% even though only 42 of 61 holders had disclosed any domain. It also deflated UKGC: the 17.3% figure used a denominator of 2,660 total operators, 1,594 of them non-remote licensees, which dragged the rate down.

If you compare numbers in a 1.3.x response to numbers in ≥1.4.0, expect movement. The DB hasn’t changed — only the question we’re asking it.
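To see why the two formulas diverge, the sketch below applies both to the same toy dataset. The data is hypothetical and not taken from any registry, and it assumes one license per holder so that total_licenses equals the number of holders. The function names are illustrative.

```python
# Hypothetical toy data, not real registry figures: holder -> disclosed domains.
# Assumption for this sketch: each holder has exactly one license.
holders = {
    "holder_a": ["a1.example", "a2.example", "a3.example"],
    "holder_b": ["b1.example", "b2.example"],
    "holder_c": [],
    "holder_d": [],
}


def legacy_ratio_pct(holders: dict) -> float:
    """Pre-1.4.0 style: total domains divided by total licenses."""
    total_domains = sum(len(domains) for domains in holders.values())
    return total_domains / len(holders) * 100


def operator_coverage_pct(holders: dict) -> float:
    """1.4.0 style: share of holders with at least one domain."""
    with_domain = sum(1 for domains in holders.values() if domains)
    return with_domain / len(holders) * 100


print(legacy_ratio_pct(holders))       # 125.0
print(operator_coverage_pct(holders))  # 50.0
```

Holders with several domains push the legacy ratio up (here past 100) while half of the holders still have no domain at all, which is the same effect described for KH.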

As of 1.4.0:

| Jur | coverage_pct | Reading |
| --- | --- | --- |
| AN | 98% | Near-ceiling: the 19 missing AN operators chose not to disclose a domain upstream |
| MGA | 83% | First post-enrichment run; ceiling appears to be ~89% (12 B2C operators omit the Website Urls section) |
| UKGC | 40% | Matches the upstream ceiling: 482 / 1,180 Remote-license holders are in domain-names.csv, and we’ve ingested ~350 of those |
| KH | 69% | Matches the upstream ceiling: 19 IGL holders chose not to disclose |
| CW | 0% | Upstream doesn’t publish; defer to Phase 4 WHOIS/Tranco enrichment |
| TGC | 0% | Upstream doesn’t publish a website column; same defer-to-enrichment bucket as CW |
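The ceiling figures quoted for MGA and KH can be reproduced from counts given on this page. The sketch assumes the ceiling is the share of normative operators that disclose a domain upstream, rounded to the nearest integer, and it assumes the 61 KH holders mentioned in the changelog note are the KH normative subset. The function name is illustrative.

```python
def upstream_ceiling_pct(normative_operators: int, non_disclosing: int) -> int:
    """Best coverage reachable if every upstream-disclosed domain is ingested.

    Assumption: ceiling = (normative - non-disclosing) / normative,
    rounded to the nearest integer.
    """
    disclosed = normative_operators - non_disclosing
    return round(disclosed / normative_operators * 100)


# MGA: 111 normative operators, 12 omit the Website Urls section.
mga_ceiling = upstream_ceiling_pct(111, 12)
# KH: 61 holders (assumed normative subset), 19 chose not to disclose.
kh_ceiling = upstream_ceiling_pct(61, 19)

print(mga_ceiling, kh_ceiling)  # 89 69
```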

The investigation docs in docs/scrapers/ go deeper on each jurisdiction’s specific upstream behaviour.