# iGregulator — full documentation

Concatenated markdown of every guide page at https://igregulator.io/docs. Source files in the order they appear in the sidebar. Generated at build time; served verbatim at https://igregulator.io/llms-full.txt.

For the canonical machine-readable API schema, fetch the OpenAPI spec: https://api.igregulator.io/openapi.json

---

# File: docs/index.mdx

---
title: Introduction
description: iGregulator API — iGaming licensing intelligence across UKGC, MGA, CGA, Kahnawake, Anjouan, and Tobique.
template: doc
sidebar:
  order: 1
---

:::caution[Legal notice]
Information provided by iGregulator is sourced from public regulator records and is intended for **informational purposes only**. License verification results do not constitute legal advice. Customers are responsible for their own compliance, KYB, and AML decisions. Always confirm critical licensing decisions directly with the issuing regulator. Full terms at [/terms](https://igregulator.io/terms).
:::

iGregulator is a REST API for verifying **iGaming operator licences** against the public registers of six regulators — UK Gambling Commission (UKGC), Malta Gaming Authority (MGA), Curaçao Gaming Authority (CGA, post-LOK), Kahnawake Gaming Commission (KGC), Anjouan Gaming Authority (AGA), and Tobique Gaming Commission (TGC). Daily-refreshed; updated within 24 hours of regulator changes.

> **Building with AI?** → [/docs/for-ai-agents](/docs/for-ai-agents/)
> covers MCP, structured errors, `_meta` provenance, and
> machine-readable resources (llms.txt, OpenAPI) for LLM integrations.

## What you can do with it

- **Verify a domain** — hit `GET /v1/check?domain=X` and get an operator + licence + confidence score in one round-trip.
- **Look up an operator** — search by name or trading name via `GET /v1/operators/search?q=…` and drill into licences, domain portfolios, and regulatory actions.
- **Pull a whole jurisdiction** — paginate `GET /v1/jurisdictions/:code/operators` for a clean dataset of everyone currently licensed.
- **Track regulatory actions** — enforcement decisions (fines, warnings, licence revocations) surface via the operator detail endpoint.

## Data coverage (live as of April 2026)

| Jurisdiction | Licences | Source | Cadence |
| --- | --- | --- | --- |
| UKGC | ~3,460 | Public register ZIP | Daily 03:00 UTC |
| MGA | ~310 | Playwright-scraped SPA | Daily 03:15 UTC |
| CGA | ~650 | OGL PDF parse | Daily 03:30 UTC |
| KGC | ~60 | Interactive Gaming + CSPA HTML | Daily 03:45 UTC |
| AGA | ~1,275 | Embedded JSON on register page | Daily 04:00 UTC |
| TGC | ~160 | Static HTML table (via CF Worker proxy) | Daily 04:15 UTC |

## Who it's for

Four buyer profiles drive the product today:

- **Affiliate sites** — verify that a brand they're promoting is still licensed before writing reviews and paying out referral payments.
- **Compliance + AML teams** — weekly sweeps of their operator counterparties for status changes and enforcement actions.
- **Payment providers** — merchant onboarding checks + ongoing KYB.
- **Investment intelligence** — correlate licence churn, enforcement fines, and domain-expiry signals into early-warning scores.

## Start here

1. [Getting started](/docs/getting-started/) — first curl call in under a minute.
2. [Authentication](/docs/authentication/) — when you graduate from the public 10 req/hr limit.
3. [Confidence scoring](/docs/confidence/) — how the `/v1/check` endpoint picks between `high`, `medium`, and `low`.
4. [API playground](/docs/playground/) — try any endpoint interactively.
5. [Full endpoint reference](/docs/api/) — detailed schemas for every route.

---

# File: docs/getting-started.mdx

---
title: Getting started
description: First curl call in under a minute. No API key required.
template: doc
sidebar:
  order: 2
---

The fastest path to a working integration.
You won't need an API key for this walk-through — the `/v1/check` endpoint is public at 10 requests per IP per hour.

## 1. Send your first request

```bash
# Quote the URL so the shell does not try to expand the "?"
curl "https://api.igregulator.io/v1/check?domain=bet365.com"
```

Response:

```json
{
  "query": { "domain": "bet365.com" },
  "match": {
    "confidence": "high",
    "match_type": "domain_exact",
    "operator": "Hillside (UK Sports) ENC",
    "operator_slug": "hillside-uk-sports-enc",
    "jurisdiction": "UKGC",
    "license_number": "055148-R-331498-001",
    "status": "active",
    "expires_at": null,
    "domain_association": "direct"
  },
  "alternatives": [],
  "confidence": "high"
}
```

That's it — no signup, no key. The `match` object is the answer; `alternatives[]` populates when we're not 100% sure. See [confidence scoring](/docs/confidence/) for the semantics.

## 2. Verify by licence number

Compliance teams often receive a licence number from a regulator and need the reverse lookup — who holds it and what's its status? Same endpoint, different query param:

```bash
curl "https://api.igregulator.io/v1/check?license_number=055148-R-331498-001"
```

Returns the same `{ query, match, alternatives, confidence }` shape; `confidence: high` when the licence number exists in the register, `none` otherwise. Pass `?domain=` **or** `?license_number=`, not both.

## 3. Try a fuzzy match

```bash
curl "https://api.igregulator.io/v1/check?domain=paddypower.com"
```

This domain isn't in our authoritative registry, but the trading-name fuzzy fallback finds it:

```json
{
  "query": { "domain": "paddypower.com" },
  "match": {
    "confidence": "medium",
    "match_type": "trading_name_fuzzy",
    "operator": "Power Leisure Bookmakers Limited",
    "license_number": "001034-R-315831-012"
  },
  "alternatives": [
    { "operator": "PPB Counterparty Services Limited", "similarity": 1 },
    { "operator": "PPB Entertainment Limited", "similarity": 1 },
    { "operator": "PPB GE Limited", "similarity": 1 }
  ]
}
```

Because `paddypower.com` trigram-matches four Flutter group entities at similarity `1.0`, primary selection falls through a documented tiebreaker cascade — see [confidence scoring → Tiebreaking](/docs/confidence/#tiebreaking-for-equal-similarity).

## 4. Graduate to authenticated requests

Move to an API key when you hit the 10-per-hour ceiling, or when you need:

- Higher volume (10k / 100k / unlimited depending on tier)
- The authenticated endpoints: `/v1/operators/:slug`, `/v1/licenses/*`
- Full search results (unauthenticated search caps at 3 rows)

[Create a free account](https://app.igregulator.io/signup) — founding members get the full Starter plan free, no card. Generate a key at [app.igregulator.io/api-keys](https://app.igregulator.io/api-keys) and attach it with a Bearer header:

```bash
curl -H "Authorization: Bearer YOUR_KEY" \
  "https://api.igregulator.io/v1/operators/search?q=paddy"
```

## 5. Explore interactively

Paste any endpoint into the **[API playground](/docs/playground/)** on this site — it's a Scalar-powered try-it-out that runs against the live production API. For authenticated endpoints, paste your key into the Authorize dialog and execute without leaving the page.

## Stability guarantees

All `/v1/*` endpoints are **maintained indefinitely**. When `/v2/*` lands, both versions will run in parallel for a minimum of 12 months.
Individual fields inside v1 get at least **90 days' notice** before removal, surfaced via `Deprecation: true` + `Sunset` response headers (RFC 9745 and RFC 8594, respectively). Full policy in the [changelog](/docs/changelog/).

## Next

- [Authentication](/docs/authentication/) — how to create + rotate keys.
- [Rate limits](/docs/rate-limits/) — quotas, headers, 429 handling.
- [Code examples](/docs/code-examples/) — JS/Python snippets.

---

# File: docs/authentication.mdx

---
title: Authentication
description: Bearer tokens — how to create, use, rotate, and revoke API keys.
template: doc
sidebar:
  order: 3
---

import { Tabs, TabItem, Aside } from '@astrojs/starlight/components';

iGregulator uses **Bearer tokens**: a single opaque API key sent in the `Authorization` header. No OAuth, no JWTs, no per-request signatures. Every authenticated endpoint on `api.igregulator.io` uses the same scheme.

## 1. Overview

| | Public | Authenticated |
| --- | --- | --- |
| Needs a key | — | ✓ |
| Rate limit | 10 req / IP / hour | per-plan quota (Starter 10k/mo, Pro 100k/mo, Business fair-use) |
| Example endpoints | `/v1/check`, `/v1/jurisdictions`, `/v1/operators/search` | `/v1/operators/:slug`, `/v1/licenses/:id`, `/v1/jurisdictions/:code` |
| Who it's for | quick lookup, demo, embed in a landing page | production integrations, bulk jobs, compliance sweeps |

See [/pricing](https://igregulator.io/pricing) for the full plan comparison. Signup is open and free for founding members (full Starter plan); create an account at [app.igregulator.io/signup](https://app.igregulator.io/signup). Paid self-serve billing lands with Phase 2.

## 2. Generating keys

1. Sign in at [app.igregulator.io](https://app.igregulator.io/login).
2. Go to **[API keys](https://app.igregulator.io/api-keys)** in the nav.
3. Click **+ generate new key**. Give it a descriptive label ("Production server", "Local dev", "Staging job").
4. The raw key is displayed **once** — copy it now. We store a SHA-256 hash only and cannot recover the plaintext. Lose it → rotate.

Key format: `igk_XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX` (36 chars total). The `igk_` prefix enables GitHub secret-scanning detection, so if you accidentally commit a key it gets flagged before a bot finds it.

## 3. Using keys

```bash
curl -H "Authorization: Bearer igk_yourkeyhere" \
  https://api.igregulator.io/v1/operators/paddy-power-holdings-limited
```

```js
const res = await fetch(
  'https://api.igregulator.io/v1/operators/paddy-power-holdings-limited',
  { headers: { Authorization: `Bearer ${process.env.IGREGULATOR_API_KEY}` } },
);
if (!res.ok) throw new Error(`${res.status} ${await res.text()}`);
const operator = await res.json();
```

```python
import os

import requests

r = requests.get(
    'https://api.igregulator.io/v1/operators/paddy-power-holdings-limited',
    headers={'Authorization': f'Bearer {os.environ["IGREGULATOR_API_KEY"]}'},
    timeout=10,
)
r.raise_for_status()
operator = r.json()
```

Public endpoints work without a key too, but attaching one **skips the 10/hour IP cap** and uses your plan quota instead — useful when serving dashboards that can burst past the public ceiling.

## 4. Key rotation

Rotation is overlap-based — create the new key, deploy it, then revoke the old one. No grace period is needed at our end; old keys stay valid until you explicitly revoke them.

1. Generate a new key. Label with the rotation reason.
2. Deploy the new key to every consumer (CI variables, running services, teammates' `.env` files).
3. Verify traffic shifted — check the *Last used* column on the [API keys](https://app.igregulator.io/api-keys) page; the old key should show no recent usage.
4. Click **Revoke** on the old key in the dashboard. Confirmation is required.

Revocation is immediate — the next request carrying the old key returns `401 auth_revoked`.

## 5. Security

- **Never commit keys to version control.** We don't scan public repos for you; a leaked key is your risk. GitHub's secret-scanning may flag `igk_` strings, but don't rely on it as a safety net.
- **Use environment variables.** `process.env.IGREGULATOR_API_KEY` in Node, `os.environ['IGREGULATOR_API_KEY']` in Python, Docker secrets in containerised deploys.
- **One key per client.** Separate keys per environment (prod / staging / dev / CI) make revocation surgical — you kill the leaked instance without affecting every consumer.
- **Keys are stored hashed.** SHA-256, never plaintext. If the DB is ever read out, the keys themselves don't leak — only their prefixes (displayed in the UI anyway).
- **HTTPS only.** The API doesn't listen on port 80; HTTP would leak the key in plain text.
- **Report compromises** to founder@igregulator.io. We'll help triage and can check for anomalous usage patterns on our side.

## 6. Rate limits and quotas

Two independent ceilings are enforced on every authenticated request:

- **Per-second rate limit** — caps sustained traffic at your plan's burst ceiling (Starter 5/s, Pro 20/s, Business 100/s, Enterprise unlimited). Breach → `429 rate_limited`.
- **Monthly request quota** — plan quota per calendar month, UTC reset at the first of the month. Breach → `429 quota_exceeded`.

Every authenticated response carries headers you can read to stay ahead of either ceiling:

| Header | Meaning |
| --- | --- |
| `X-Monthly-Quota-Limit` | Your plan's monthly ceiling, or `unlimited` |
| `X-Monthly-Quota-Used` | Count so far this month (omitted when `unlimited`) |
| `X-Monthly-Quota-Remaining` | Quota minus used (omitted when `unlimited`) |
| `X-Monthly-Quota-Reset` | ISO-8601 timestamp when the counter rolls over |
| `X-Monthly-Quota-Warning` | Present when usage ≥ 80%: `80% of monthly limit used` |
| `X-RateLimit-Limit` | Per-second ceiling for the current plan |
| `X-RateLimit-Policy` | Human-readable: `tier=starter;limit=5;window=second` |
| `RateLimit-Policy` | IETF draft format: `"default";q=5;w=1` |

See the [rate limits guide](/docs/rate-limits/) for parser examples and the full tier table.

## 7. Errors

Authenticated endpoints return structured JSON on every non-2xx response. Branch on `code` for behaviour, `details.reason` for refinement, and use `details.suggestion` verbatim in user-facing messaging when present.

| Status | code | When |
| --- | --- | --- |
| 401 | `auth_required` | No `Authorization` header. |
| 401 | `auth_invalid` | Header malformed or key not recognised. |
| 401 | `auth_revoked` | Key was revoked via the dashboard. |
| 402 | `payment_required` | Your plan is `null` or `canceled`. |
| 429 | `rate_limited` | Per-second ceiling breached. Sleep, retry once. |
| 429 | `quota_exceeded` | Monthly quota exhausted. Wait for `reset_at` or upgrade. |

Full code reference lives in the [error handling guide](/docs/errors/).

Example body (401):

```json
{
  "error": "API key has been revoked",
  "code": "auth_revoked",
  "details": {
    "reason": "api_key_revoked",
    "suggestion": "Generate a new API key at https://app.igregulator.io/api-keys. Revoked keys cannot be restored."
  }
}
```

---

# File: docs/rate-limits.mdx

---
title: Rate limits
description: Public per-IP caps, authenticated per-plan quotas, and 429 handling.
template: doc
sidebar:
  order: 4
---

Two independent ceilings: **public per-IP** (no key) and **authenticated per-plan** (monthly quota). Authenticated requests skip the IP ceiling entirely.

## Public endpoints (no key)

| Endpoint | Limit |
| --- | --- |
| `GET /v1/check` | 10 req / IP / hour |
| `GET /v1/jurisdictions` | 10 req / IP / hour |
| `GET /v1/operators/search` | 10 req / IP / hour (+ 3-row cap on `limit`) |

The window is clock-aligned: the counter resets at the top of each UTC hour. A caller that burns 10 requests at 14:58 waits two minutes, not sixty.
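
As a rough illustration of the clock-aligned window, the wait before the public counter resets can be estimated from the current UTC time alone. This is a minimal sketch, assuming the reset lands exactly on the UTC hour as described above; `seconds_until_window_reset` is a hypothetical helper name, not part of any iGregulator client.

```python
import math
from datetime import datetime, timedelta, timezone


def seconds_until_window_reset(now: datetime) -> int:
    """Seconds from `now` until the next top-of-hour boundary in UTC.

    Assumption: the public window resets exactly on the UTC hour.
    """
    now_utc = now.astimezone(timezone.utc)
    next_hour = now_utc.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    # Round up so a retry is never scheduled a fraction of a second too early.
    return math.ceil((next_hour - now_utc).total_seconds())


# A caller that exhausts the limit at 14:58 UTC waits two minutes, not sixty.
blocked_at = datetime(2026, 4, 19, 14, 58, tzinfo=timezone.utc)
wait_seconds = seconds_until_window_reset(blocked_at)
print(wait_seconds)  # 120
```

In a real client the `X-RateLimit-Reset` response header is likely the better source, since it reflects the server's clock rather than the caller's.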
## Authenticated endpoints

| Plan | Monthly quota | Burst |
| --- | --- | --- |
| Starter | 10,000 calls | 5 req/sec |
| Pro | 100,000 calls | 20 req/sec |
| Business | Fair use | No fixed cap |
| Enterprise | Custom | Negotiated |

**Per-plan enforcement lands with Phase 2 billing.** Plan-aware monthly quotas and burst ceilings activate once Stripe goes live. Until then a lightweight **pre-launch daily cap** of 10,000 requests/day per API key acts as a safety net — see below.

## Pre-launch limitations

While Stripe integration is pending, authenticated keys on any plan tier except **business** / **enterprise** carry an additional soft cap of **10,000 requests/day per key**, counted per UTC day and reset at `00:00:00Z`.

Rationale: a key shared with a reviewer or journalist shouldn't be able to silently exhaust our Cloudflare bandwidth budget before billing lands. The 10k/day ceiling is deliberately picked to exceed the Pro plan average (~3,333 req/day over a 30-day month at 100k/mo), so a realistic Pro customer never hits it.

Hit the cap → `429 rate_limited`:

```json
{
  "error": "Pre-launch rate limit exceeded",
  "code": "rate_limited",
  "details": {
    "reason": "prelaunch_daily_cap",
    "current_usage": 10000,
    "limit": 10000,
    "reset_at": "2026-04-21T00:00:00.000Z",
    "suggestion": "Pre-launch keys are capped at 10000 requests/day. Upgrade to a paid plan at https://igregulator.io/pricing for production use."
  }
}
```

Response carries `X-Prelaunch-Daily-Limit`, `-Used`, `-Reset` headers so clients can monitor consumption. Cap lifts automatically when the plan enforcement layer takes over; no client change needed.

## Response headers

Every public-endpoint response includes:

| Header | Meaning |
| --- | --- |
| `X-RateLimit-Limit` | Ceiling for this caller on this endpoint in the current window (e.g. `10`). |
| `X-RateLimit-Remaining` | Requests left. Never negative. |
| `X-RateLimit-Reset` | Unix epoch seconds when the window rolls over. |
| `X-RateLimit-Policy` | Human-friendly policy string: `tier=public;limit=10;window=hour`. Hand-readable, easy to `awk`. |
| `RateLimit-Policy` | IETF draft format ([draft-ietf-httpapi-ratelimit-headers-09](https://datatracker.ietf.org/doc/draft-ietf-httpapi-ratelimit-headers/)): `"default";q=10;w=3600`. Modern HTTP clients (Cloudflare SDK, Kong, etc.) auto-parse this. |
| `X-Upgrade-URL` | `https://igregulator.io/pricing` — surfaced so a UI can link "upgrade to keep going" on 429. |

### Parsing the policy headers

```js
// X-RateLimit-Policy — custom: tier=public;limit=10;window=hour
const policyCustom = Object.fromEntries(
  res.headers.get('X-RateLimit-Policy').split(';').map((kv) => kv.split('=')),
);
// { tier: 'public', limit: '10', window: 'hour' }

// RateLimit-Policy — IETF: "default";q=10;w=3600
const policyIetf = res.headers.get('RateLimit-Policy');
const q = policyIetf.match(/q=(\d+)/)?.[1]; // quota
const w = policyIetf.match(/w=(\d+)/)?.[1]; // window in seconds
```

## Handling 429

```http
HTTP/2 429
X-RateLimit-Limit: 10
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1776604800
X-Upgrade-URL: https://igregulator.io/pricing

{
  "error": "Public rate limit reached (10/hour/IP).",
  "code": "rate_limited",
  "details": {
    "limit": 10,
    "window_seconds": 3600,
    "upgrade_url": "https://igregulator.io/pricing"
  }
}
```

Recommended client behaviour:

1. Check `X-RateLimit-Remaining` from the previous response before each call.
2. On 429, sleep until `X-RateLimit-Reset`, then retry once.
3. If the same caller keeps hitting 429, that's a signal to authenticate or upgrade — not to back-off-and-retry indefinitely.

## Tips

- **Don't scrape the public endpoint.** Paginate the authenticated `/v1/jurisdictions/:code/operators` list once a day and cache — the data only refreshes once a day anyway (03:00 to 04:15 UTC, staggered by jurisdiction).
- **Front-ends that surface check results to end-users** — apply the 10/hour IP limit on *your* server and call the API with a single authenticated key; don't let every browser session hit us directly or the shared IP will burn out your quota.
- **Bulk re-verification** (weekly AML sweep of 2,000 operators) — use the authenticated operator/licence endpoints, not `/v1/check`.

---

# File: docs/endpoints.mdx

---
title: Endpoints
description: Map of every public and authenticated endpoint.
template: doc
sidebar:
  order: 5
---

Every endpoint has full schemas in the [API reference](/docs/api/) and an interactive executor in the [playground](/docs/playground/). This page is a map, not a reference — use it to pick which surface you need.

## Public (no key)

| Endpoint | Use |
| --- | --- |
| `GET /v1/check` | Verify a domain or licence number in one round trip. Confidence-scored. See the [confidence guide](/docs/confidence/). |
| `GET /v1/jurisdictions` | List the six jurisdictions we cover, with name / country / currency / licence types. |
| `GET /v1/operators/search?q=…` | Type-ahead search by operator display name or trading name. Unauthenticated: capped at 3 rows. |

## Authenticated (Bearer token)

| Endpoint | Use |
| --- | --- |
| `GET /v1/jurisdictions/:code` | Single jurisdiction detail. |
| `GET /v1/jurisdictions/:code/operators` | Paginated operators under that jurisdiction. Sorted by `display_name` ascending. |
| `GET /v1/operators/:slug` | Full operator detail — metadata + licences + domains. |
| `GET /v1/operators/:slug/licenses` | Licences for one operator. Append `?include_history=true` for the status-change log. |
| `GET /v1/licenses/:license_id` | Licence by uuid. Useful when you want a pinned detail page. |
| `GET /v1/licenses/:license_id/history` | Status-change timeline for a single licence. |

## System

| Endpoint | Use |
| --- | --- |
| `GET /v1/health` | Liveness probe. 200 if postgres + redis are reachable. |
| `GET /v1/health/coverage` | Per-jurisdiction scraper freshness — last successful scrape, age in hours, fresh vs stale flag per our SLA (UKGC 24 h, MGA / CW / KH 48 h). Public at 10 req / IP / hour. |
| `GET /openapi.json` | Canonical OpenAPI 3.1 spec. Consume it, generate a client, etc. |

## Response shape conventions

- **Top-level list endpoints** return an envelope: `{ q, total, limit, offset, <rows>, _meta }`, where `<rows>` stands for the endpoint's array of results. These support pagination via `?limit=` + `?offset=`. Example: `GET /v1/jurisdictions/:code/operators`, `GET /v1/operators/search`, `GET /v1/operators/:slug/licenses`, `GET /v1/licenses/:id/history`.
- **Single-row endpoints** return the row directly — no wrapping envelope.
- **Nested arrays on detail endpoints are bare arrays, not paginated.** `GET /v1/operators/:slug` returns a full operator with its `licenses[]` + `domains[]` inline, without limit / offset. Need pagination over an operator's licences? Use `GET /v1/operators/:slug/licenses` instead — the paginated form ships a proper envelope.
- Timestamps are ISO-8601 UTC (`2026-04-19T12:00:00Z`).
- Dates are `YYYY-MM-DD`.
- UUIDs are lowercase, hyphenated, v4.

---

# File: docs/confidence.mdx

---
title: Confidence scoring
description: How /v1/check picks between high / medium / low, and when it refuses.
template: doc
sidebar:
  order: 6
---

The `/v1/check` endpoint returns a `match` object with a `confidence` field. This page explains what each level means, how we pick it, and how UIs should render it.

## The three levels

| `confidence` | What it means | Render as |
| --- | --- | --- |
| `high` | Exact or root-domain match in our authoritative registry. | Green check. Safe to say "this site is licensed by X". |
| `medium` | Domain root matched a trading name or operator name. We can identify the operator but can't *prove* this domain is theirs. | Amber / neutral. "Likely operated by X" phrasing. |
| `low` | A weak fuzzy match (operator returned, but below the strong-similarity bar), **or** the domain root is a generic gambling term (`casino.com`, `poker.com`) where too many operators share the label to pick one. | Gray / warning. "We can't confirm this domain." |

A fourth value — `none` — appears in the top-level `confidence` field (not `match.confidence`) when `match` is `null`.

## Why a query missed: `match_absence_reason`

When `match` is `null`, `low`/`none` on its own is ambiguous — it used to conflate "generic term" with "we checked and it isn't there". So on a miss the response carries two extra fields (present **only** when `match` is `null`) so you can phrase the answer precisely instead of guessing:

| `match_absence_reason` | Meaning | Say to your user |
| --- | --- | --- |
| `generic_term` | The label is an ultra-generic gambling word (`casino.com`); we can't map it to one operator. | "Can't identify a specific operator from this domain." |
| `no_record_found` | A specific query we checked against **every** covered register and did not find. | "Not found in any of the N jurisdictions iGregulator covers." — **never** an unqualified "unlicensed". |

`checked_jurisdictions` accompanies the miss — the exact register codes we checked (e.g. `["AN","CW","KH","MGA","TGC","UKGC"]`) — so a "not licensed" claim is always scoped to our coverage, never stated as an absolute.

```json
{
  "query": { "domain": "some-unknown-site.com" },
  "match": null,
  "confidence": "none",
  "match_absence_reason": "no_record_found",
  "checked_jurisdictions": ["AN", "CW", "KH", "MGA", "TGC", "UKGC"]
}
```

These fields are **absent** when there's a match — check `match === null` first, then branch on `match_absence_reason`.

## match_type

Tells you *how* we arrived at the match, useful for debugging and UX differentiation.

| `match_type` | Source |
| --- | --- |
| `domain_exact` | Found the domain in the `domains` table — sourced from the regulator's official register. Carries `domain_association` (`direct` or `white_label`). |
| `trading_name_fuzzy` | Trigram similarity ≥ 0.55 against `operators.trading_names[]` after stripping the TLD. Used when the domain isn't registered but the brand exists. 0.55 was picked empirically against the UKGC register: it catches legitimate variants (`paddypower` ↔ `paddy-power`, `skybet` ↔ `sky-bet`) while rejecting the long tail of single-syllable collisions (`gold`, `star`, `royal`) where the label is too generic to mean one operator. Below 0.55 we land in `low`-confidence territory either way; above it the trigger is stable. |
| `name_similarity` | Last-chance similarity against `operators.display_name` — rarely fires for B2C domains, useful when no trading name was populated upstream. |

## domain_association

When `match_type = domain_exact`, we differentiate:

- **`direct`** — the licensee runs the site themselves. The `operator` field is the company your end-user is gambling with.
- **`white_label`** — the licensee has authorised a third-party brand to trade on the domain under their permit. The `operator` field is the *licensee*, not the brand. UK-licensed white-label arrangements are legal and common; surfacing the relationship lets you show "operated by Brand X under ProgressPlay's UKGC permit".

Fuzzy matches (`trading_name_fuzzy`, `name_similarity`) don't populate `domain_association` — we don't have a domain row to read it from, so the field is `null`.

## Tiebreaking for equal similarity

Brand names like *Paddy Power* trigram-match several sister companies (PPB Counterparty, PPB Entertainment, PPB GE, Power Leisure Bookmakers) at similarity `1.0`. To keep the primary match **stable across DB reindex and VACUUM**, `/v1/check` applies a documented tiebreaker cascade whenever the top candidates are tied on similarity:

1. **similarity DESC** — closeness wins first, as ever.
2. **has_active DESC** — operators with at least one `active` licence are preferred over operators whose licences are all expired / revoked.
3. **oldest_active_issued ASC** — among active-licence candidates, the one whose *oldest* active licence issued first wins. Stability signal: a parent entity that has been licensed longest is the most useful "who actually runs this brand" answer.
4. **total_licenses DESC** — more licences across the register → more likely a parent entity rather than a single-purpose subsidiary.
5. **operator_slug ASC** — lexicographic final fallback. Always deterministic even when every previous rank is tied.

Clients that cache domain → operator mappings can rely on the primary result remaining stable between index rebuilds; any change in primary reflects a change in the underlying registry data, not PG query randomness.

## alternatives[]

Up to 3 runner-up candidates, sorted by similarity descending.

- On `confidence: medium`, these are operators with the same or similar trading name that we ranked below the primary match.
- On `confidence: low` (generic label), this is always `[]` — we refuse to guess when the label is ambiguous.
- On `confidence: high` / `none`, also `[]`.

## Why the generic-label filter exists

Without it, `GET /v1/check?domain=casino.com` would return "Casino MK Limited" with `confidence: medium` — deterministically, because "casino" matches that trading name at similarity 1.0. But the actual casino.com is licensed elsewhere (MGA, Gibraltar) which we don't cover, and the answer "yes, Casino MK owns it" would be wrong.

The blocklist is a substring regex over the normalised label — `casino`, `poker`, `bingo`, `gambling`, `bet`, `slot`/`slots`, `sportsbook`, `roulette`, `blackjack`, `wager`, `lottery`, `gaming`. Matches anywhere in the label, so `casino.com`, `bestcasino.com`, and `casino-bonus.com` all return `confidence: low` with empty `alternatives[]`.
Licensed brands containing one of these keywords (`bet365.com`, `pokerstars.com`) still resolve to `high` because the domain-exact match runs before the generic gate.

---

# File: docs/batch.mdx

---
title: Batch domain check
description: Verify many domains in one request — a KYB sweep or affiliate-list audit without N sequential calls.
template: doc
sidebar:
  order: 8
---

`POST /v1/check/batch` resolves up to **100 domains in one request**, so a KYB sweep or affiliate-list audit is one round trip instead of N. 200 merchants = 2 calls, not 200. The endpoint requires authentication (the single-domain `GET /v1/check` stays keyless) and accepts domains only, not licence numbers.

## Request

```bash
curl -s -X POST https://api.igregulator.io/v1/check/batch \
  -H "Authorization: Bearer $IGREGULATOR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"domains":["bet365.com","www.virginbet.com","casino.com"]}'
```

## Response

`checked_jurisdictions` is returned **once** at the top (it's the same for every row — keeps the payload lean). Each result mirrors the single-check shape: `match`, `confidence`, and `match_absence_reason` on a miss.

```json
{
  "count": 3,
  "checked_jurisdictions": ["AN", "CW", "KH", "MGA", "TGC", "UKGC"],
  "results": [
    {
      "query": { "domain": "bet365.com" },
      "match": { "operator": "Hillside (UK Sports) ENC", "...": "…" },
      "confidence": "high"
    },
    {
      "query": { "domain": "www.virginbet.com" },
      "match": { "operator": "Virgin Bet Limited", "domain_association": "white_label", "...": "…" },
      "confidence": "high"
    },
    {
      "query": { "domain": "casino.com" },
      "match": null,
      "confidence": "low",
      "match_absence_reason": "generic_term"
    }
  ]
}
```

## Partial success

A malformed hostname doesn't fail the batch — that row comes back with an `error` and `match: null`, and every other domain still resolves:

```json
{
  "query": { "domain": "not a domain" },
  "match": null,
  "confidence": "none",
  "error": "invalid_hostname"
}
```

## Limits & semantics

- **Max 100 domains** per request; split longer lists across several requests.
- Domains are resolved with bounded concurrency server-side — order of `results` follows the order you sent.
- Counts as **one request** against your plan quota today.
- Each result uses the same matching as `GET /v1/check` (exact host → eTLD+1 → fuzzy), so `www`/apex variants resolve identically.

## Clients

```python
import os

import requests

KEY = os.environ["IGREGULATOR_API_KEY"]
domains = ["bet365.com", "www.virginbet.com", "casino.com"]  # your list here

r = requests.post(
    "https://api.igregulator.io/v1/check/batch",
    headers={"Authorization": f"Bearer {KEY}"},
    json={"domains": domains[:100]},  # at most 100 domains per request
    timeout=30,
)
r.raise_for_status()

for row in r.json()["results"]:
    m = row["match"]
    if m and row["confidence"] in ("high", "medium"):
        verdict = f"{m['operator']} ({m['status']})"
    elif row.get("error"):
        verdict = f"invalid: {row['error']}"
    else:
        verdict = f"no match ({row.get('match_absence_reason')})"
    print(row["query"]["domain"], "→", verdict)
```

```javascript
const KEY = process.env.IGREGULATOR_API_KEY;
const domains = ['bet365.com', 'www.virginbet.com', 'casino.com']; // your list here

const res = await fetch('https://api.igregulator.io/v1/check/batch', {
  method: 'POST',
  headers: { Authorization: `Bearer ${KEY}`, 'Content-Type': 'application/json' },
  body: JSON.stringify({ domains: domains.slice(0, 100) }), // at most 100 per request
});
if (!res.ok) throw new Error(`${res.status} ${await res.text()}`);
const { results } = await res.json();

for (const row of results) {
  if (row.match && ['high', 'medium'].includes(row.confidence)) {
    console.log(row.query.domain, '→', row.match.operator, row.match.status);
  } else {
    console.log(row.query.domain, '→', row.error ?? `no match (${row.match_absence_reason})`);
  }
}
```

---

# File: docs/point-in-time.mdx

---
title: Point-in-time lookups (as_of)
description: Reconstruct a licence's status as of a past date — strictly within iGregulator's observation window.
template: doc
sidebar:
  order: 9
---

iGregulator keeps the **transition history** of every licence, so you can ask "what was this operator's status on date X" — the question a compliance review asks constantly ("was this merchant licensed at the time of the transaction three months ago?") and the one no incumbent can answer, because none keeps the history.

Pass `?as_of=` to `/v1/check`, `/v1/licenses/{id}`, or `/v1/operators/{slug}`.
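
To make the parameter concrete, the sketch below builds `as_of` request URLs for two of those endpoints. It only constructs strings and sends nothing; `as_of_url` is a hypothetical helper, and the slug and dates are illustrative values, not a statement about what the API holds for them.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.igregulator.io"


def as_of_url(path: str, as_of: str, **params: str) -> str:
    """Build a request URL that carries `as_of` alongside any other query params."""
    query = urlencode({**params, "as_of": as_of})
    return f"{BASE_URL}{path}?{query}"


# Public endpoint: a domain check, time-travelled to a past date.
check_url = as_of_url("/v1/check", "2026-05-20", domain="bet365.com")

# Authenticated endpoint: needs the usual Bearer header when actually sent.
operator_url = as_of_url("/v1/operators/hillside-uk-sports-enc", "2026-05-20")

print(check_url)
print(operator_url)
```

Using `urlencode` rather than string concatenation means a full ISO-8601 datetime (which contains colons) is percent-encoded correctly.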
## The one rule that matters

**`as_of` answers only within our observation window. We never extrapolate a status before `tracking_since` — the moment we first recorded the licence.**

Our history begins when our scraper first saw a record (`change_type: "created"`). We do not know what was true before that, so we never guess. Asking about a date before `tracking_since` returns `knowledge: "before_tracking"` with a **null** status — not a fabricated "active". A tool that invented pre-observation history would force you to assert a historical fact the data never witnessed; that is worse than not having the feature.

> **iGregulator answers "as of date X" only within its observation window —
> it tells you when it started watching, and never invents a status it didn't
> observe.**

## The four states of knowledge

| `knowledge` | When | `status_as_of` |
| --- | --- | --- |
| `observed` | The date is within our window (≥ `tracking_since`). | The real status then. |
| `before_tracking` | The date predates when we started watching. | `null` — unknowable, **not** a guess. `tracking_since` tells you the lower bound. |
| `no_such_license` | We have no history for this licence at all. | `null`. |
| `no_license_resolved` | (`/v1/check` only) A fuzzy match with no specific licence to time-travel. | `null`. |

The `as_of` object also returns `established_by` — the exact history transition in effect on your date (`changed_at`, `new_status`, `change_type`, `source_url`) — so you can see *when* that status was last confirmed relative to your query.

## Date semantics

- A bare `YYYY-MM-DD` is interpreted as **end of that day, UTC** (status at close of day).
- A full ISO-8601 datetime is honoured as given.
- A date **in the future** returns `400` — we never answer about a date we haven't observed. It is never silently clamped to "now".
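The date rules above can be mirrored client-side to catch a bad `as_of` before spending a request. This is a sketch under stated assumptions, not the server's implementation: the name `normalize_as_of` is hypothetical, naive datetimes are assumed to mean UTC, and a bare date for today is treated as future because its end-of-day moment has not passed yet (the server may decide differently).

```python
from datetime import date, datetime, time, timezone
from typing import Optional

# End of day, matching the millisecond precision seen in the API examples.
END_OF_DAY = time(23, 59, 59, 999000)


def normalize_as_of(value: str, now: Optional[datetime] = None) -> datetime:
    """Return the UTC moment an as_of value refers to, or raise ValueError."""
    now = now or datetime.now(timezone.utc)
    if len(value) == 10:
        # Bare YYYY-MM-DD: end of that day, UTC.
        moment = datetime.combine(
            date.fromisoformat(value), END_OF_DAY, tzinfo=timezone.utc
        )
    else:
        # Full ISO-8601 datetime: honoured as given.
        moment = datetime.fromisoformat(value.replace("Z", "+00:00"))
        if moment.tzinfo is None:
            # Assumption: treat a naive datetime as UTC.
            moment = moment.replace(tzinfo=timezone.utc)
    if moment > now:
        # Mirrors the documented 400 for future dates; never clamp to now.
        raise ValueError(f"as_of {value!r} is in the future")
    return moment
```

Raising instead of clamping keeps the client consistent with the API's rule that a future date is an error, not a request for the current status.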
## Examples

`before_tracking` — asking before we started watching:

```json
// GET /v1/licenses/140a822c-…?as_of=2026-01-01
{
  "as_of": "2026-01-01T23:59:59.999Z",
  "knowledge": "before_tracking",
  "status_as_of": null,
  "established_by": null,
  "tracking_since": "2026-04-17T15:15:40.055Z"
}
```

`observed` — a date after a revocation transition:

```json
// GET /v1/licenses/140a822c-…?as_of=2026-05-20
{
  "as_of": "2026-05-20T23:59:59.999Z",
  "knowledge": "observed",
  "status_as_of": "revoked",
  "established_by": {
    "changed_at": "2026-05-13T01:00:05.215Z",
    "new_status": "revoked",
    "change_type": "status_change",
    "source_url": "https://www.gamblingcommission.gov.uk/downloads/business-licence-data.zip"
  },
  "tracking_since": "2026-04-17T15:15:40.055Z"
}
```

## Endpoint notes

- **`/v1/licenses/{id}?as_of=`** — cleanest: one licence, one `as_of` object.
- **`/v1/operators/{slug}?as_of=`** — resolved **per licence** (each licence in the array gets its own `as_of`); we don't collapse a multi-jurisdiction operator into a single status — you aggregate as your policy requires.
- **`/v1/check?domain=X&as_of=`** — the domain→operator attribution is taken as **current**; only the licence **status** is time-travelled. Historical domain attribution (we keep `first_seen` on domains) is a future addition.

---

# File: docs/pagination.mdx

---
title: Pagination
description: limit and offset on list endpoints.
template: doc
sidebar:
  order: 7
---

List endpoints accept `limit` and `offset` query parameters and return a `total` in the envelope so callers know when they've walked to the end.

## Endpoints that paginate

- `GET /v1/operators/search?q=…`
- `GET /v1/jurisdictions/:code/operators`
- `GET /v1/operators/:slug/licenses`

## Parameters

| Param | Default | Max | Notes |
| --- | --- | --- | --- |
| `limit` | 50 | 200 (authenticated) / 3 (unauth on `/operators/search`) | Negative values rejected with 400. |
| `offset` | 0 | — | Zero-based. High offsets have linear scan cost, which starts to matter when walking 10k+ rows; a `cursor` param is planned for that case (see below). |

## Response envelope

```json
{
  "q": "...",
  "total": 1420,
  "limit": 50,
  "offset": 100,
  "operators": [ ]
}
```

- `total` — rows matching the query, ignoring limit/offset.
- `operators.length <= limit`.

## Walking a result set

```bash
# Bash loop — fetch all operators for UKGC.
offset=0
while :; do
  resp=$(curl -sH "Authorization: Bearer $KEY" \
    "https://api.igregulator.io/v1/jurisdictions/UKGC/operators?limit=200&offset=$offset")
  rows=$(echo "$resp" | jq '.operators | length')
  [ "$rows" -eq 0 ] && break
  echo "$resp" | jq '.operators[]'
  offset=$((offset + rows))
done
```

## Why not cursor-based?

Offset pagination is simpler to document, easier for UIs that render page numbers, and cheap for our table sizes (~3,700 operators, ~4,500 licences). When any list crosses 100k rows we'll add a `cursor` query param alongside — offset stays supported for back-compat.

## Rate-limit interplay

Each paginated request is one API call against your quota. A full sweep of 3,700 UKGC operators at `limit=200` is 19 calls — well within Starter's 10k monthly quota, trivial within Pro's 100k.

---

# File: docs/errors.mdx

---
title: Error handling
description: Error code reference, retry strategy, deprecation warnings.
template: doc
sidebar:
  order: 8
---

Errors always return JSON with a stable `code` field. Use `code` for branching in clients, not the HTTP status or the human message (the human message can change).

## Response shape

```json
{
  "error": "Human-readable explanation.",
  "code": "rate_limited",
  "details": { "limit": 10, "window_seconds": 3600 }
}
```

## Code reference

| HTTP | code | When | Retry? |
| --- | --- | --- | --- |
| 400 | `invalid_query` | Missing / malformed query params. | No — fix the request. |
| 400 | `invalid_slug` | Slug path param failed validation. | No. |
| 401 | `auth_required` | No `Authorization` header on a gated endpoint. | No — attach the header. |
| 401 | `auth_invalid` | Header malformed or key not recognised. | No. |
| 401 | `auth_revoked` | Key has been revoked. | No — provision a new key. |
| 404 | `not_found` | Slug / id not in the registry. | No. |
| 429 | `rate_limited` | Hit the public 10/hr or per-plan ceiling. | Yes — wait until `X-RateLimit-Reset`. |
| 500 | `server_error` | Unhandled upstream failure. | Yes — exponential backoff, 3 attempts. |

## Retry strategy

- **4xx** — fix the request, don't retry. The same bad input will always 4xx.
- **429** — sleep until `X-RateLimit-Reset` (Unix epoch seconds), retry once. If you hit 429 again, you're under-provisioned — upgrade or authenticate, don't loop.
- **5xx** — exponential backoff, up to 3 retries (delays of 1s, 2s, 4s). If a 5xx persists after the last retry, you're better off surfacing a failure state than holding the UI hostage.

## Reference implementation (JavaScript)

```js
const sleep = (ms) => new Promise((res) => setTimeout(res, ms));

async function igRequest(path, init = {}, attempt = 0, retried429 = false) {
  const r = await fetch('https://api.igregulator.io' + path, init);
  if (r.ok) return r.json();

  const body = await r.json().catch(() => ({ code: 'parse_error' }));
  const code = body.code ?? 'unknown';

  // 429: wait for the window to reset, then retry exactly once.
  if (r.status === 429 && !retried429) {
    const reset = Number(r.headers.get('X-RateLimit-Reset')) * 1000;
    const sleepFor = Math.max(0, reset - Date.now());
    // Skip the wait if the header is missing or the reset is over 5 minutes away.
    if (reset > 0 && sleepFor < 5 * 60_000) {
      await sleep(sleepFor + 1_000);
      return igRequest(path, init, attempt, true);
    }
  }

  // 5xx: exponential backoff with delays of 1s, 2s, 4s.
  if (r.status >= 500 && attempt < 3) {
    await sleep(2 ** attempt * 1_000);
    return igRequest(path, init, attempt + 1, retried429);
  }

  const err = new Error(body.error ?? r.statusText);
  err.status = r.status;
  err.code = code;
  err.details = body.details;
  throw err;
}
```

---

# File: docs/code-examples.mdx

---
title: Code examples
description: curl / JavaScript / Python snippets for common operations.
template: doc
sidebar:
  order: 9
---

Copy-paste snippets for the two things every integration does first — check a domain, then walk an authenticated list. No SDKs yet; the API is small enough that 15 lines of `fetch` / `requests` does the job.

## Domain check — curl

```bash
curl -sG https://api.igregulator.io/v1/check \
  --data-urlencode 'domain=paddypower.com' | jq
```

## Domain check — JavaScript (fetch)

```js
const res = await fetch(
  'https://api.igregulator.io/v1/check?domain=paddypower.com'
);
const { match, confidence } = await res.json();
if (confidence === 'high' || confidence === 'medium') {
  console.log(
    `${match.operator} (licensed by ${match.jurisdiction}, ${match.license_number}) — confidence ${confidence}`,
  );
} else {
  console.log('No confident match found.');
}
```

## Domain check — Python (requests)

```python
import requests

r = requests.get(
    'https://api.igregulator.io/v1/check',
    params={'domain': 'paddypower.com'},
    timeout=5,
)
r.raise_for_status()
data = r.json()
match = data.get('match')
if match and data['confidence'] in ('high', 'medium'):
    print(f"{match['operator']} — {match['jurisdiction']} {match['license_number']}"
          f" (confidence={data['confidence']})")
else:
    print('No confident match.')
```

## Walk all UKGC operators — JavaScript

```js
const KEY = process.env.IGREGULATOR_KEY;
const BASE = 'https://api.igregulator.io';

async function* paginate(jurisdiction) {
  let offset = 0;
  const limit = 200;
  while (true) {
    const r = await fetch(
      `${BASE}/v1/jurisdictions/${jurisdiction}/operators?limit=${limit}&offset=${offset}`,
      { headers: { Authorization: `Bearer ${KEY}` } },
    );
    if (!r.ok) throw new Error(`HTTP ${r.status}`);
    const body = await r.json();
    if (body.operators.length === 0) return;
    for (const op of body.operators) yield op;
    offset += body.operators.length;
  }
}

for await (const op of paginate('UKGC')) {
  console.log(op.slug, op.display_name);
}
```

## Walk all UKGC operators — Python

```python
import os, requests

KEY = os.environ['IGREGULATOR_KEY']
BASE = 'https://api.igregulator.io'
session = requests.Session()
session.headers['Authorization'] = f'Bearer {KEY}'

def paginate(jurisdiction):
    offset, limit = 0, 200
    while True:
        r = session.get(
            f'{BASE}/v1/jurisdictions/{jurisdiction}/operators',
            params={'limit': limit, 'offset': offset},
            timeout=10,
        )
        r.raise_for_status()
        rows = r.json()['operators']
        if not rows:
            return
        for op in rows:
            yield op
        offset += len(rows)

for op in paginate('UKGC'):
    print(op['slug'], op['display_name'])
```

## Bulk domain verification — rate-limit aware

If you're verifying a list of 500 domains as part of a nightly sweep, authenticate and sleep between requests to stay under the per-second burst cap. Pacing up front is simpler than leaning on retry-on-429 alone; the loop below keeps a single 429 retry as a safety net. (`POST /v1/check/batch` covers the same job in one call per 100 domains.)

```python
import os, time, requests

KEY = os.environ['IGREGULATOR_KEY']
DOMAINS = open('domains.txt').read().splitlines()

S = requests.Session()
S.headers['Authorization'] = f'Bearer {KEY}'

for d in DOMAINS:
    r = S.get(
        'https://api.igregulator.io/v1/check',
        params={'domain': d},
        timeout=5,
    )
    if r.status_code == 429:
        # Wait for the window to reset, then retry this domain once.
        reset = int(r.headers.get('X-RateLimit-Reset', 0))
        wait = max(1, reset - int(time.time()))
        time.sleep(wait + 1)
        r = S.get(
            'https://api.igregulator.io/v1/check',
            params={'domain': d},
            timeout=5,
        )
    r.raise_for_status()
    print(d, r.json()['confidence'])
    time.sleep(0.05)  # 20 req/sec ceiling headroom for Pro
```

---

# File: docs/webhooks.mdx

---
title: Webhooks
description: Push-based alerts on licence changes, expiries, and regulatory actions. HMAC-signed, retried, deliverable to any HTTPS endpoint.
template: doc
sidebar:
  order: 11
---

import { Tabs, TabItem, Aside } from '@astrojs/starlight/components';

iGregulator delivers change alerts via HTTP POST to a URL you control. Ten event types, HMAC-SHA256 signed, up to seven delivery attempts with jittered backoff.

Sign up for alerts in the **[dashboard](https://app.igregulator.io/webhooks)** — no code needed: add a URL, select events, copy the secret once.
> Just want to wire it up fast? Skip to
> [/docs/webhooks/quickstart](/docs/webhooks/quickstart/) for a
> 2-minute tour using webhook.site as the receiver.

## 1. Event types

Ten dot-notation events. Subscribe to any subset per endpoint.

| Event | Fires |
| --- | --- |
| `license.status_changed` | Any transition between `active` / `suspended` / `revoked` / `expired`. |
| `license.expiring_30d` | Active licence with `expiry_date` exactly 30 days from now. |
| `license.expiring_60d` | Same, 60 days. |
| `license.expiring_90d` | Same, 90 days. |
| `license.expired` | Status became `expired` (either explicitly or via date). |
| `license.issued` | New licence first observed in a scraper run. |
| `regulatory_action.added` | Fine / warning / revocation / licence_suspension entry added. |
| `coverage.degraded` | `/v1/health/coverage` transitions to `degraded` for a jurisdiction. |
| `coverage.restored` | `/v1/health/coverage` transitions back to `healthy`. |
| `webhook.endpoint_degraded` | Self-notification: one of your endpoints is failing >20% of deliveries. Fires only to your OTHER endpoints. |

## 2. Envelope

Every event — production or test — carries the same outer shape:

```json
{
  "event": "license.status_changed",
  "event_id": "evt_01HX8EGQK3J7WA6MYTP7ZGYF21",
  "api_version": "2026-04-20",
  "timestamp": "2026-04-20T14:32:00.000Z",
  "livemode": true,
  "data": {
    "license_id": "uuid",
    "license_number": "039028-R-319297-013",
    "operator_id": "uuid",
    "operator_slug": "888-uk-limited",
    "jurisdiction_code": "UKGC",
    "previous_status": "active",
    "new_status": "suspended",
    "changed_at": "2026-04-20T03:04:12.000Z",
    "source_url": "https://www.gamblingcommission.gov.uk/..."
  }
}
```

- `event_id` — ULID, sorted lexicographically. Dedupe on this.
- `data.previous_status` — may be `null` when a licence is first observed (`change_type: created` — status is its initial state).
- `api_version` — date constant. Bumped when a `data` shape gets a breaking change; old subscribers keep receiving the previous version until they upgrade.
- `livemode` — `false` only for `test.ping` events from the dashboard Test button.
- `timestamp` — when we emitted the event. Not when the change happened (that's in `data.*_at`).

### Regulatory action amounts

`regulatory_action.added` includes `amount_minor_units` — in the **smallest currency unit** (pence for GBP, cents for USD). £5 million = 5,000,000 × 100 pence = `500000000`. We store it this way so a £5M fine never gets confused with £5,000.

```json
{
  "action_type": "fine",
  "amount_minor_units": 500000000,
  "currency": "GBP"
}
```

## 3. Headers

Every delivery, including retries:

```
Content-Type: application/json
User-Agent: iGregulator-Webhook/1 (+https://igregulator.io)
X-iGregulator-Event: license.status_changed
X-iGregulator-Event-Id: evt_01HX8EGQK3J7WA6MYTP7ZGYF21
X-iGregulator-Timestamp: 1776717845
X-iGregulator-Delivery-Id: 3b1e2c4a-f8d1-4a7c-8b5f-111111111111
X-iGregulator-Attempt: 2
X-iGregulator-Signature: t=1776717845,v1=abcd…ef01
```

`X-iGregulator-Attempt` — tells your receiver which attempt this is (1 for the first delivery, 2 and up for retries).

`X-iGregulator-Missed-Deliveries` appears on the next successful delivery after one or more deliveries were abandoned (all 7 attempts exhausted). Fetch them via `GET /v1/webhooks/:id/deliveries?status=abandoned`.

## 4. Signature verification

The `X-iGregulator-Signature` header is a Stripe-style CSV:

```
t=<timestamp>,v1=<hex>[,v1=…]
```

- `t` — the timestamp used in the HMAC input.
- `v1` — HMAC-SHA256 hex digest. May appear multiple times when a secret rotation is in progress; each `v1` is the signature computed with a different active secret. Accept the delivery if **any** `v1` value matches.

Signed input = `${t}.${raw_body}` — the HTTP body is included byte-for-byte; do not re-serialise the JSON before verifying, or whitespace drift will break the MAC.
```js
import crypto from 'node:crypto';

function verify(req, rawBody, secret) {
  const header = req.headers['x-igregulator-signature'];
  if (!header) return false;
  const parts = Object.fromEntries(
    header.split(',').map((p) => p.split('=')),
  );
  const signed = `${parts.t}.${rawBody}`;
  const expected = crypto.createHmac('sha256', secret)
    .update(signed).digest('hex');
  // Header may have several v1= values — iterate and accept any.
  const candidates = header
    .split(',')
    .filter((p) => p.startsWith('v1='))
    .map((p) => p.slice(3));
  return candidates.some((candidate) =>
    candidate.length === expected.length &&
    crypto.timingSafeEqual(Buffer.from(candidate), Buffer.from(expected)),
  );
}
```

```python
import hmac, hashlib

def verify(headers, raw_body: bytes, secret: str) -> bool:
    header = headers.get('x-igregulator-signature')
    if not header:
        return False
    parts = dict(p.split('=', 1) for p in header.split(','))
    signed = f"{parts['t']}.{raw_body.decode()}"
    expected = hmac.new(secret.encode(), signed.encode(), hashlib.sha256).hexdigest()
    candidates = [p[3:] for p in header.split(',') if p.startswith('v1=')]
    return any(hmac.compare_digest(c, expected) for c in candidates)
```

```bash
# Debug verification from a saved request. Pass raw body on stdin.
SIG_HEADER="t=1776717845,v1=abcd...ef01"
SECRET="whsec_..."
T=$(echo "$SIG_HEADER" | tr ',' '\n' | awk -F= '/^t/{print $2}')
EXPECTED=$(printf '%s.%s' "$T" "$(cat)" \
  | openssl dgst -sha256 -hmac "$SECRET" -hex | awk '{print $2}')
echo "$SIG_HEADER" | tr ',' '\n' | grep "^v1=" | awk -F= '{print $2}' \
  | grep -Fqx "$EXPECTED" && echo "ok" || echo "mismatch"
```

## 5. Retry policy

Failed deliveries retry on a jittered schedule:

| Attempt | Delay after previous | With ± 20 % jitter |
| --- | --- | --- |
| 1 | — | fired immediately |
| 2 | 30 s | 24 – 36 s |
| 3 | 2 m | 1:36 – 2:24 m |
| 4 | 10 m | 8 – 12 m |
| 5 | 1 h | 48 – 72 m |
| 6 | 6 h | 4.8 – 7.2 h |
| 7 | 24 h | 19.2 – 28.8 h |
| — | abandon after 7 total attempts | |

Failure = any non-2xx response or network error (DNS, timeout, TLS, connection reset).

**3xx redirects are treated as failures on purpose** — point your URL at the final destination. Following them silently would let an attacker redirect deliveries to an internal metadata service after the URL passed creation-time checks. Common gotcha: API gateways that issue a transparent `https://` upgrade on `http://` URLs — register the `https://` form directly to avoid the redirect.

Timeout per attempt: **10 seconds**. Long-running receivers should ack fast and process async (return 2xx immediately, queue the body for your worker).

## 6. Delivery guarantees

- **At-least-once.** A network blip may have us deliver the same event twice. Dedupe on `event_id`.
- **No ordering.** Events from different operators run independently; even events on the same operator can arrive out of order during a retry burst. Use `data.*_at` timestamps inside the payload to sequence consumer-side state.
- **Scraper outages queue.** If a jurisdiction's scraper stalls, detected changes queue up and fire on the next successful run — no lost events, just a delayed batch.
- **Retention:** 30 days for both `webhook_events` (replay window) and `webhook_deliveries` (delivery history). Fetch deliveries via `GET /v1/webhooks/:id/deliveries` while they're still in the window; older rows are pruned daily.

## 7. Integration patterns

### Pattern A — Webhooks primary

The simplest setup. Create an endpoint, subscribe to events, process them in real time.
```js
// Assumes an Express `app`. express.raw() keeps the body as an unparsed
// Buffer, so the signature is checked against the exact bytes we sent.
app.post('/igregulator-webhook', express.raw({ type: 'application/json' }), (req, res) => {
  const rawBody = req.body;
  if (!verify(req, rawBody, process.env.WEBHOOK_SECRET)) {
    return res.status(400).send('bad signature');
  }
  res.sendStatus(200); // ACK fast, process async
  void queueForProcessing(JSON.parse(rawBody)); // your own queue
});
```

### Pattern B — Polling fallback

For agents that can't accept inbound webhooks (locked-down corporate network, local dev). Use `GET /v1/watchlist/events` — see [watchlist docs](/docs/watchlist/). Bootstrap with `since=`, then switch to cursor pagination.

### Pattern C — Hybrid webhook + polling

Run both. Webhooks are the low-latency primary; polling covers the few-hour window where your receiver was down and deliveries might abandon. Because the same `event_id` ships on both channels, dedupe on it and there's no double-processing.

```js
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL);
const DEDUPE_TTL = 30 * 24 * 60 * 60; // match the 30-day event retention window

async function processOnce(event) {
  // SET NX returns null if the key already exists.
  const set = await redis.set(
    `event_seen:${event.event_id}`, '1', 'EX', DEDUPE_TTL, 'NX',
  );
  if (set === null) return; // already processed
  await handleEvent(event); // your business logic
}

// Webhook handler (Express; raw body kept for signature verification):
app.post('/webhook', express.raw({ type: 'application/json' }), async (req, res) => {
  if (!verify(req, req.body, process.env.WEBHOOK_SECRET)) return res.sendStatus(400);
  res.sendStatus(200);
  await processOnce(JSON.parse(req.body));
});

// Hourly polling fallback:
async function pollBackfill() {
  let cursor = await redis.get('watchlist:cursor');
  while (true) {
    const url = new URL('https://api.igregulator.io/v1/watchlist/events');
    if (cursor) url.searchParams.set('cursor', cursor);
    else url.searchParams.set('since', new Date(Date.now() - 3600_000).toISOString());
    url.searchParams.set('limit', '100');
    const r = await fetch(url, {
      headers: { Authorization: `Bearer ${process.env.IGREGULATOR_KEY}` },
    });
    const { events, next_cursor, has_more } = await r.json();
    for (const event of events) await processOnce(event);
    if (next_cursor) {
      await redis.set('watchlist:cursor', next_cursor);
      cursor = next_cursor; // advance, so the next page is not a repeat
    }
    if (!has_more) break;
  }
}
```

```python
import os
from datetime import datetime, timedelta, timezone

import redis
import requests
from flask import Flask, abort, request

app = Flask(__name__)
SECRET = os.environ['WEBHOOK_SECRET']
API_KEY = os.environ['IGREGULATOR_KEY']
# decode_responses=True so the stored cursor comes back as str, not bytes.
r = redis.Redis.from_url(os.environ['REDIS_URL'], decode_responses=True)
DEDUPE_TTL = 30 * 24 * 3600  # match the 30-day event retention window

def process_once(event):
    # SET NX returns None if key already exists.
    if r.set(f"event_seen:{event['event_id']}", '1', ex=DEDUPE_TTL, nx=True) is None:
        return  # already processed
    handle_event(event)  # your business logic

# Webhook handler (Flask):
@app.post('/webhook')
def webhook():
    if not verify(request.headers, request.get_data(), SECRET):
        abort(400)
    event = request.get_json()
    # ACK fast, process async; spawn() stands in for your worker queue.
    spawn(process_once, event)
    return '', 200

# Hourly backfill:
def poll_backfill():
    cursor = r.get('watchlist:cursor')
    while True:
        params = {'limit': 100}
        if cursor:
            params['cursor'] = cursor
        else:
            one_hour_ago = datetime.now(timezone.utc) - timedelta(hours=1)
            params['since'] = one_hour_ago.strftime('%Y-%m-%dT%H:%M:%SZ')
        resp = requests.get(
            'https://api.igregulator.io/v1/watchlist/events',
            headers={'Authorization': f'Bearer {API_KEY}'},
            params=params,
            timeout=10,
        ).json()
        for event in resp['events']:
            process_once(event)
        if resp.get('next_cursor'):
            cursor = resp['next_cursor']  # advance, so the next page is not a repeat
            r.set('watchlist:cursor', cursor)
        if not resp.get('has_more'):
            break
```

## 8. Testing

- **Dashboard Test button** — fires a synthetic `test.ping` event at your endpoint. Does NOT create delivery history rows. Use during wiring to confirm the signature path.
- **webhook.site** — register a webhook.site URL as your endpoint, hit Test, inspect headers + body. Fastest way to see what a delivery looks like before your server exists.
- **ngrok / cloudflared** — tunnel a local dev server to a public URL. `http://localhost` is blocked by our SSRF filter at creation time; a tunnel gives you a real routable host.

## 9. Secret rotation

Rotation overlap is 7 days. The rotation flow:

1. Click **Rotate** on the endpoint in the dashboard.
2. We issue a new secret and stamp the previous one with `expires_at = NOW() + 7 days`.
3. Deliveries sign with **both** secrets during the overlap — every delivery carries `v1=…,v1=…` in the signature header.
4. Update your server to accept the new secret. Existing code that uses the old one keeps verifying until day 7.
5. After day 7, the old secret expires and deliveries sign only with the new one.

## 10. Errors at creation time

- `400 invalid_webhook_url` + `details.reason: private_ip_blocked` — URL resolved to a private / loopback / link-local IP (including `169.254.169.254`, AWS + GCP metadata). Use a public host or a tunnel.
- `400 invalid_webhook_url` + `details.reason: invalid_scheme` — Only `http://` and `https://` accepted; https strongly preferred.
- `400 invalid_query` + `details.reason: invalid_event_type` — You passed an unrecognised event name. Compare against the list in §1.
- `403 quota_exceeded` — You hit your plan's `max_webhook_endpoints`. Pause or delete an endpoint, or upgrade.

## 11. Best practices

- **Respond 2xx in < 5 seconds, process async.** We time out at 10 s; if your receiver regularly takes 5+ s you'll start hitting retries.
- **Verify every delivery.** Skip only for `test.ping` if you treat test events as connectivity checks and not real data.
- **Dedupe on `event_id`.** Always. At-least-once delivery means duplicates will happen eventually.
- **Don't assume ordering.** Use `data.*_at` timestamps.
- **Keep a polling fallback for critical paths** — see Pattern C.
- **Alert on `webhook.endpoint_degraded`.** If one of your endpoints is failing, we notify your OTHER endpoints so you don't have to hear about it from a missing downstream action.

---

# File: docs/webhooks/quickstart.mdx

---
title: Webhooks quickstart
description: First webhook delivery in two minutes using webhook.site — no server required.
template: doc
sidebar:
  order: 1
---

This is "get it working in two minutes" — no signature verification yet, no production-grade receiver. Head to [/docs/webhooks](/docs/webhooks/) when you're ready to harden.

## 1. Set up a receiver

Open [webhook.site](https://webhook.site). It hands you a unique URL like `https://webhook.site/#!/`. Copy the URL — that's your temporary endpoint. Leave the page open; deliveries show up in real time.

## 2. Create the webhook

1. Sign in to the [iGregulator dashboard](https://app.igregulator.io/webhooks).
2. Click **+ create webhook**.
3. Paste the webhook.site URL.
4. Subscribe to any event type for now — `license.status_changed` is the most common; you can change later.
5. Submit. You'll see a reveal dialog with a `whsec_` secret — **copy it** (you'll need it when you harden later) and click "copy + close".

## 3. Fire a test delivery

Back in the dashboard, click **test** on your new endpoint row. A result dialog pops up showing:

- `delivered: true`
- HTTP status your endpoint returned
- Latency in ms
- Response body

Switch to the webhook.site tab: the request is there, complete with headers, including:

```
X-iGregulator-Event: test.ping
X-iGregulator-Event-Id: evt_test_...
X-iGregulator-Signature: t=...,v1=...
```

That's a real production-shape delivery — just flagged `livemode: false` in the envelope so you don't treat it as business data.

## 4. Harden for production

- **Verify signatures.** See [/docs/webhooks § signature verification](/docs/webhooks/#4-signature-verification) — includes Node / Python / shell examples.
- **Replace webhook.site.** Stand up a real HTTP server; the same envelope + headers will land there.
- **Dedupe on `event_id`.** At-least-once delivery means duplicates during retries. See [Pattern C](/docs/webhooks/#7-integration-patterns).
- **Understand retries.** 7 attempts, jittered backoff — start at 30 s, end at ~24 h. Details in [§5 retry policy](/docs/webhooks/#5-retry-policy).

Main reference: [/docs/webhooks](/docs/webhooks/).

---

# File: docs/watchlist.mdx

---
title: Watchlist
description: Track specific operators for automated alerts on licence changes, expiries, and regulatory actions. Webhook push or polling fallback.
template: doc
sidebar:
  order: 12
---

import { Aside } from '@astrojs/starlight/components';

A watchlist is your list of operators you care about, plus the automation layer that fires events when any of them changes.
Add an operator once, get alerted whenever its licence status flips, a regulatory action lands, or an expiry date approaches — without writing polling loops against every endpoint we offer.

## 1. Overview

Plan limits (also in [pricing](/pricing)):

| Tier | Watchlist cap | Webhooks | Polling |
| --- | --- | --- | --- |
| Starter | 25 operators | 1 endpoint | 10 / hour |
| Pro | 250 operators | 5 endpoints | 60 / hour |
| Business | unlimited | 20 endpoints | 600 / hour |
| Enterprise | unlimited | unlimited | unlimited |

Webhooks are the primary alert channel. Polling exists for agents that can't accept inbound HTTP (corporate networks, air-gapped analytics, local dev) and as a backfill mechanism during webhook outages.

## 2. Managing the watchlist

### Dashboard

Sign in and open [app.igregulator.io/watchlist](https://app.igregulator.io/watchlist). Type an operator name — we typeahead against every operator slug we know about. Click to add. Click remove to drop.

### API

Same surface via bearer token:

```bash
# Current watchlist + count + plan cap
curl -H "Authorization: Bearer igk_..." \
  https://api.igregulator.io/v1/watchlist

# Add an operator by slug (lowercase, hyphens)
curl -X POST -H "Authorization: Bearer igk_..." \
  -H "Content-Type: application/json" \
  -d '{"operator_slug":"888-uk-limited"}' \
  https://api.igregulator.io/v1/watchlist/operators

# Remove (idempotent)
curl -X DELETE -H "Authorization: Bearer igk_..." \
  https://api.igregulator.io/v1/watchlist/operators/888-uk-limited

# Paginated listing with current licence status per operator
curl -H "Authorization: Bearer igk_..." \
  "https://api.igregulator.io/v1/watchlist/operators?limit=50&offset=0"
```

Discover slugs with `GET /v1/operators/search?q=`.

## 3. Receiving events

### Webhook push (primary)

Create an endpoint with the `watchlist_only: true` flag (default). Deliveries fire for operators in your watchlist only — no firehose, no noise.
Events covered: - `license.status_changed` - `license.expiring_30d` / `_60d` / `_90d` - `license.expired` - `license.issued` (only if you were watching the operator when the new licence was detected) - `regulatory_action.added` See [/docs/webhooks](/docs/webhooks/) for the signing + retry protocol. Quickstart in 2 minutes: [/docs/webhooks/quickstart](/docs/webhooks/quickstart/). ### Polling (fallback) `GET /v1/watchlist/events` returns the same envelope events, pulled. Cursor-paginated so you don't re-process events between runs. Rate-limited per plan (see §1). Every response carries `X-Poll-RateLimit-Limit`, `-Remaining`, `-Reset`, `-Window`, and a `X-Poll-Recommended-Interval` hint in seconds. ```bash # Bootstrap: 30-day window (matches event retention) curl -H "Authorization: Bearer igk_..." \ "https://api.igregulator.io/v1/watchlist/events?since=2026-04-01T00:00:00Z&limit=100" # Steady state: use the next_cursor from the previous response curl -H "Authorization: Bearer igk_..." \ "https://api.igregulator.io/v1/watchlist/events?cursor=eyJ0cy...&limit=100" ``` Response: ```json { "events": [ { "event": "license.status_changed", "event_id": "evt_...", "api_version": "2026-04-20", "timestamp": "...", "livemode": true, "data": { ... } } ], "next_cursor": "eyJ0c...", "has_more": true } ``` `events` payloads are **identical** to the webhook envelope (minus the signature — polling authenticates via your API key, not per-delivery HMAC). Dedupe on `event_id` whether you're receiving via webhook or poll; you'll use the same key for both. ### Polling best practices - **Persist the cursor.** Save `next_cursor` after each successful batch to your own DB / disk. On restart, resume from it. - **Dedupe on `event_id`.** Crash windows can cause you to re-process the last batch; same dedupe path you'd use for webhooks covers this. - **Respect `X-Poll-Recommended-Interval`.** It's `ceil(3600 / limit)` — sleeping that long between polls guarantees you never hit the hour ceiling. 
Starter = 360 s, Pro = 60 s, Business = 6 s. - **On 429, wait until `reset_at`.** Don't exponential-backoff — the window resets deterministically on the hour boundary. See [/docs/rate-limits](/docs/rate-limits/). - **Hybrid webhooks + polling is the resilient pattern.** Webhooks for low latency, polling for the few-hour windows where your receiver was down and deliveries abandoned. Both ship the same `event_id`, so dedupe makes double-processing a no-op. Example in [/docs/webhooks § Pattern C](/docs/webhooks/#7-integration-patterns). ### `since=` is capped at 30 days That matches our event retention. Older values fail with `400 since_exceeds_retention_window`. Rare in practice — the first call uses `since`, every subsequent call uses the cursor. ## 4. Plan limits in detail If you exceed your watchlist cap mid-month, the next `POST /v1/watchlist/operators` returns `403 watchlist_quota_exceeded` with the current count + cap. Remove an operator or upgrade. Polling ceiling hits return `429 watchlist_events_poll_limit` with a concrete `reset_at` timestamp — not exponential backoff, deterministic hour-boundary reset. Webhooks do not count against this limit. ## 5. What events don't fire for watched operators - **`coverage.degraded` / `coverage.restored`** — these are jurisdiction-level, not operator-level. They're emitted regardless of watchlist membership; anyone subscribed to the event type gets them. - **`webhook.endpoint_degraded`** — self-notification about your own endpoints. Ignores watchlist. ## 6. Troubleshooting - **No events arriving** — your watchlist may be empty, or the operators you track haven't had any changes this month. Run `GET /v1/watchlist/events?since=2026-04-01T00:00:00Z` to confirm what would have fired over a month. - **Too many events** — too-broad subscription. Each of the three expiry windows fires separately; narrow to the one(s) you act on. - **Events for operators I don't watch** — your webhook might have `watchlist_only: false`. 
Set it to `true` with `PATCH /v1/webhooks/:id` and the body `{ "watchlist_only": true }`.

Related: [/docs/webhooks](/docs/webhooks/), [/docs/rate-limits](/docs/rate-limits/).

---

# File: docs/for-ai-agents.mdx

---
title: For AI agents
description: Machine-readable resources, structured errors, MCP server, and integration patterns for LLM-powered integrations.
template: doc
sidebar:
  order: 10
---

Building an AI agent, LLM-powered integration, or automated compliance tooling? Everything you need to wire iGregulator into a language-model workflow lives here.

## Quick discovery

Three machine-readable resources document the entire product. One fetch each — no crawling required.

- **[llms.txt](/llms.txt)** — structured index per the [llmstxt.org spec](https://llmstxt.org). ~3 KB, complete product map with links into the longer docs.
- **[llms-full.txt](/llms-full.txt)** — every `/docs/*` page concatenated into one file. ~37 KB. One fetch buys full context.
- **[OpenAPI 3.1 spec](https://api.igregulator.io/openapi.json)** — authoritative API schema. Every endpoint, request/response shape, auth scheme, rate-limit description.

Every page on this site also emits `` pointers to all three URLs so a crawler hitting the landing page can auto-discover them. The homepage additionally returns `Link` response headers (RFC 8288 / RFC 9727) pointing at the catalog, OpenAPI spec, docs, and server card — so headless agents find them without parsing HTML.

### Well-known discovery endpoints

Standards-based entrypoints under `/.well-known/` (and the apex), for agents that look there first:

- **[/.well-known/mcp/server-card.json](/.well-known/mcp/server-card.json)** — MCP Server Card (SEP-1649): server info, transport endpoint, and the full tool list. (Legacy [/.well-known/mcp.json](/.well-known/mcp.json) is still served too.)
- **[/.well-known/api-catalog](/.well-known/api-catalog)** — API catalog (RFC 9727) linking the OpenAPI spec, docs, `llms.txt`, and the `/v1/health` status endpoint.
- **[/.well-known/agent-skills/index.json](/.well-known/agent-skills/index.json)** — Agent Skills discovery index (v0.2.0). Currently ships a `verify-gambling-license` skill (digest-pinned `SKILL.md`).
- **[/auth.md](/auth.md)** — how to authenticate: bearer API keys (we run no OAuth server, so there are deliberately no `/.well-known/oauth-*` documents).

Content usage is declared via `Content-Signal` in [robots.txt](/robots.txt) — `search`, `ai-input`, and `ai-train` are all permitted.

## Agent-friendly features

Deliberate design choices that make integration cleaner for agents.

### Structured error details

Every error response carries `details.reason` + `details.suggestion`. Agents branch on `reason` instead of free-form text and surface `suggestion` to the user / caller as-is.

```json
{
  "error": "domain is not a valid hostname",
  "code": "invalid_query",
  "details": {
    "field": "domain",
    "reason": "not_a_valid_hostname",
    "suggestion": "Pass a bare hostname — no scheme, no path, no underscores. Example: 'paddypower.com' or 'www.bet365.com'."
  }
}
```

The full `reason` vocabulary is stable; see [error handling](/docs/errors/) for the code + reason matrix.

### Response `_meta` field

Data-returning endpoints include a `_meta` envelope with provenance:

- `scraped_at` — ISO-8601 timestamp when we last pulled this record.
- `source_url` — exact regulator-register URL, or null if the row doesn't map to one URL.
- `confidence_hint` — `authoritative` (direct register dump), `scraped` (HTML / PDF), or `derived` (fuzzy match, not a direct lookup).
- `source_modified_at` — regulator-side modification timestamp if exposed (UKGC only), else null.

An agent can say "verified via UKGC official register, scraped 6 h ago" with a real evidence trail, not a gloss.

### `/v1/health/coverage`

Public endpoint exposing per-jurisdiction scraper freshness. `status: healthy` vs `degraded`, `age_hours`, `record_count` per regulator. See [endpoints](/docs/endpoints/).
Useful for SLA dashboards and status pages.

### Stable `operationId`s

Every endpoint has a stable `operationId` (`checkDomain`, `searchOperators`, `getOperator`, `listJurisdictions`, `getLicense`, `getLicenseHistory`, `checkCoverage`). SDK generators and MCP servers use these as function names — no `getV1CheckDomain` slug noise.

### Dual rate-limit headers

Both the custom and the IETF-draft standard formats ship on every response. Parse either; both are authoritative:

```
X-RateLimit-Policy: tier=public;limit=10;window=hour
RateLimit-Policy: "default";q=10;w=3600
```

The IETF draft (`draft-ietf-httpapi-ratelimit-headers`) is what Cloudflare, Kong, and similar gateways auto-parse; the custom format is human-readable for logs.

### Deprecation lifecycle

Breaking changes are broadcast via standard headers (example shape):

```
Deprecation: true
Sunset:
Link: ; rel="deprecation"; type="text/html"
```

Per RFC 9745 + RFC 8594. Minimum 90 days' notice. An agent caching request shapes can inspect `Sunset` before assuming stability. No fields are currently deprecated; the headers will reappear when the next removal cycle starts.

## MCP server

Live at [`mcp.igregulator.io`](https://mcp.igregulator.io). Streamable HTTP transport (current MCP spec — SSE used for streaming responses). Compatible with Claude Desktop, Cursor, Windsurf, Cline, and any other client that speaks MCP.
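As a rough illustration of what an MCP client sends over Streamable HTTP, the sketch below builds the JSON-RPC 2.0 messages for `tools/list` and `tools/call`, plus the request headers. It makes no network calls. The `/mcp` path is the one quoted in the changelog's smoke-test note; the `domain` argument name and the `igk_example` key are assumptions for illustration only, so read the real argument names from the `inputSchema` that `tools/list` returns. A real session also starts with the MCP `initialize` handshake, which most client libraries perform for you.

```python
import json

# Path as quoted in the changelog's production smoke test.
MCP_ENDPOINT = "https://mcp.igregulator.io/mcp"


def jsonrpc_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request object, the message format MCP uses."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


def tools_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request for a single tool invocation."""
    return jsonrpc_request(
        request_id, "tools/call", {"name": tool_name, "arguments": arguments}
    )


def http_headers(api_key):
    """Headers a Streamable HTTP client sends with each POST."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # The server may answer with plain JSON or with an SSE stream.
        "Accept": "application/json, text/event-stream",
    }


list_tools = jsonrpc_request(1, "tools/list")
# ASSUMPTION: the argument is called "domain"; confirm it in the tool's inputSchema.
check = tools_call(2, "check_domain", {"domain": "example.com"})
headers = http_headers("igk_example")  # hypothetical key, not a real credential

print(json.dumps(check, indent=2))
```

Each message would be POSTed as the request body to `MCP_ENDPOINT` with these headers. Keeping message construction separate from transport makes the payloads easy to unit-test before any client library is involved.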
Tools exposed (lean output — a compact verdict, not the full REST payload):

- `check_domain` — verify a licence by domain (supports `as_of`)
- `check_domain_batch` — up to 100 domains in one call (KYB sweep)
- `search_operators` — search the register by name
- `get_operator` — full operator detail (supports `as_of`)
- `get_operator_regulatory_actions` — enforcement history (fines, suspensions)
- `check_coverage` — data freshness per jurisdiction
- `list_jurisdictions` — all covered regulators
- `get_jurisdiction` — single regulator metadata
- `get_license` — single licence detail
- `get_license_history` — status-change timeline

On a no-match, `check_domain` returns `match_absence_reason` + `checked_jurisdictions` — never collapse a miss into a bare "unlicensed".

Auth = same API keys as the direct HTTP API. Setup walkthrough + example prompts at [/docs/mcp](/docs/mcp/). Discovery via the [MCP Server Card](/.well-known/mcp/server-card.json) (SEP-1649) and the legacy [/.well-known/mcp.json](/.well-known/mcp.json).

## WebMCP (in-browser tools)

Distinct from the server above: the homepage registers [WebMCP](https://webmcp.org/) tools via `navigator.modelContext`, so an agent **driving a browser** can act without wiring up the HTTP/MCP integration at all. Two read-only tools, backed by the public API (10 req/IP/hour, no key):

- `check_gambling_license` — verify by domain or licence number.
- `search_gambling_operators` — search operators by name.

They're feature-detected, so they simply don't appear in browsers without the WebMCP API. Use the server-side MCP for production integrations; WebMCP is the zero-setup path for browser agents.

## Integration patterns

### Merchant onboarding verification

Payment-processor KYB flow:

1. User submits a merchant application with a domain.
2. Agent calls `GET /v1/check?domain=X`.
3. Branch on the response:
   - `confidence: high` + `status: active` → approve.
   - `confidence: medium` → flag for manual compliance review.
   - `confidence: low` → **manual review, not reject** — `low` fires when the domain root is a generic gambling label (`casino.com`, `poker.com`, etc.) and `/v1/check` refuses to guess which operator runs it. The site may well be licensed, just not identifiable from the label alone. Auto-rejection here blocks legitimate operators.
   - `match: null` → **branch on `match_absence_reason`, don't blanket-reject**:
     - `generic_term` → manual review (ambiguous label, see above).
     - `no_record_found` → not in any covered register. Scope the verdict to `checked_jurisdictions` ("not licensed in any of the 6 jurisdictions we cover") rather than an unqualified "unlicensed".
   - Any match with `status: revoked` / `suspended` / `expired` → reject regardless of confidence.

### Daily regulatory sweep

Portfolio-monitoring automation:

1. Agent keeps a list of N operator slugs in its CRM / knowledge base.
2. Daily cron: iterate the list, call `GET /v1/operators/:slug`.
3. Diff against previous day's snapshot. Detect status changes, regulatory actions, expiry windows.
4. Alert compliance team on anomalies.

### Domain reputation scoring

Risk-scoring for gambling-adjacent domains:

1. Agent receives an unknown gambling-related domain.
2. Calls `GET /v1/check?domain=X`.
3. For a confident match, fetches the operator's enforcement history from `GET /v1/operators/{slug}/regulatory-actions` and combines `confidence` + `match_type` + any regulatory actions into a composite risk score.
4. Score feeds the downstream decision (list / delist / require extra verification).

### Bulk KYB sweep (batch)

Auditing a whole merchant book or affiliate list in one shot:

1. Collect the domains (chunks of 100).
2. `POST /v1/check/batch` with `{ "domains": [...] }` — one tool-call per 100 instead of one per domain.
3. Iterate `results`: each row carries `match` + `confidence` + `match_absence_reason` (same semantics as the single check); a malformed entry comes back with `error` and doesn't fail the batch.
4.
   `checked_jurisdictions` is returned once at the top — use it to scope every "not found" verdict.

See [Batch domain check](/docs/batch/).

### Retrospective transaction check (as_of)

"Was this merchant licensed at the time of the transaction?":

1. `GET /v1/check?domain=X&as_of=2026-03-01` (or `/v1/licenses/{id}?as_of=`).
2. Read the `as_of` object, and **honour `knowledge`**:
   - `observed` → `status_as_of` is the real status then; `established_by` shows when it was last confirmed relative to your date.
   - `before_tracking` → the date predates our observation window. Do **not** assert a status — tell the user we weren't watching before `tracking_since`.

This is the difference between a defensible answer and a fabricated one. See [Point-in-time lookups](/docs/point-in-time/).

## Start integrating

1. Fetch [llms-full.txt](/llms-full.txt) for complete docs context in a single request.
2. Review the [OpenAPI spec](https://api.igregulator.io/openapi.json).
3. Try endpoints interactively in the [playground](/docs/playground/).
4. [Create a free account](https://app.igregulator.io/signup) for an API key and MCP server access — free for founding members.

---

Questions, integration help, feedback — founder@igregulator.io.

---

# File: docs/changelog.mdx

---
title: Changelog
description: API-level changes, deprecations, breaking-change policy.
template: doc
sidebar:
  order: 99
---

API-level changes only. Internal refactors, scraper updates, and infra changes don't appear here unless they surface in a response shape or error behaviour.

## Versioning policy

All `/v1/*` endpoints are maintained **indefinitely**. We do not silently retire versioned routes. Future major versions will be introduced under a new path (`/v2/*`), with **v1 and v2 running in parallel for a minimum of 12 months** after v2 launch. Migration guidance lands in this changelog at v2 launch.
## Breaking-change policy

- Breaking changes **always** land behind a new URL path (`/v2/...`) — we never break a stable endpoint's response shape in place.
- Additive changes (new fields, new endpoints, loosened validation) ship at any time without a version bump.
- Deprecation flow for individual fields inside a stable version:
  - `Deprecation: true` header (RFC 9745) on affected responses the day the field is marked.
  - `Sunset` header (RFC 8594) with the removal date, **minimum 90 days** out.
  - Changelog entry below with the migration guidance.
  - For agents: `Link: …; rel="deprecation"` surfaced alongside the header.

## 1.8.0 — 2026-05-29

All additive — no breaking changes.

- **Batch domain check.** New `POST /v1/check/batch` (authenticated) resolves up to 100 domains in one request; each result mirrors the single-check shape, `checked_jurisdictions` is returned once at the top, and a malformed hostname comes back as a per-row `error` without failing the batch. See [Batch domain check](/docs/batch/).
- **`match_absence_reason` + `checked_jurisdictions` on `/v1/check`.** When `match` is `null`, the response now says *why* — `generic_term` (ambiguous label) vs `no_record_found` (checked, not present) — and lists the exact registers checked, so a "not licensed" verdict is scoped to coverage, never absolute. Present only on a miss. See [Confidence scoring](/docs/confidence/).
- **Point-in-time lookups (`?as_of=`).** `/v1/check`, `/v1/licenses/{id}`, and `/v1/operators/{slug}` accept `?as_of=` to reconstruct a licence's status as of a past date from transition history — **strictly within our observation window** (`knowledge: observed | before_tracking | no_such_license`; never extrapolated before `tracking_since`; future dates `400`). See [Point-in-time lookups](/docs/point-in-time/).
- **`domains[].association` on operator detail.** `GET /v1/operators/{slug}` now returns `direct` / `white_label` per domain (the column was always populated; the field was missing from the response).
- **`/v1/check` domain matching now collapses www / apex / subdomain variants** via the Public Suffix List (registrable-domain fallback after exact-host), so `virginbet.com` and `www.virginbet.com` resolve to the same operator. Single-owner guard prevents over-matching shared white-label platform domains.
- **Rate-limit headers on authenticated responses.** `X-RateLimit-Limit` / `-Remaining` / `-Reset` + `X-Upgrade-URL` now accompany authenticated `/v1/*` responses, not just the public path.
- **OpenAPI completeness.** `GET /v1/operators/{slug}/regulatory-actions` and `GET /v1/health/coverage` are now in the spec; `ApiError.code` documents `payment_required` + `quota_exceeded`.
- **MCP server caught up to the API.** New tools `check_domain_batch`, `get_operator_regulatory_actions`, `check_coverage`; `check_domain` + `get_operator` accept `as_of`. Tool output is now **lean** (a compact verdict, not the full REST payload) — and `match_absence_reason` + `checked_jurisdictions` are preserved on a miss, so agents never collapse "not found" into "unlicensed". See [MCP server](/docs/mcp/).

## 1.7.0 — 2026-04-29

- **Tobique Gaming Commission (TGC) re-enabled — 6th jurisdiction live.** The Cloudflare Worker proxy at `igregulator-scraper-proxy.scvgr-agent.workers.dev` now bridges btc → thetgc.ca, bypassing the IP-reputation block that paused us in 1.6.1. Same scraper code, same daily 04:15 UTC slot — only the transport changed (HMAC-signed POST to the Worker, the Worker fetches the upstream CF→CF). Scraper opts in via `USE_PROXY=true`; other scrapers unaffected.
- **New `@igregulator/scraper-utils` package** carrying the reusable `proxyFetch` helper.
Future jurisdictions whose upstream blocks our IP can opt in by setting two env vars (`USE_PROXY=true`, `PROXY_HMAC_SECRET=…`) and adding their hostname to the Worker's `ALLOWED_DOMAINS` list — no code change in the scraper itself.
- **Migration 0018** re-inserts the TGC jurisdiction row removed in 0017.

## 1.6.1 — 2026-04-28

- **Tobique Gaming Commission (TGC) ingestion deferred.** PR #119 shipped the scraper, but the upstream `thetgc.ca` blocks the production origin IP at the Cloudflare edge (HTTP 403 across all UAs and header shapes). Other hosts return 200; this is an IP-reputation block on the Hetzner range. We've surgically rolled the public surfaces back to 5 jurisdictions while the scraper code stays merged. Re-enables cleanly once we land a Cloudflare Worker proxy on `scvgr-agent.workers.dev`. Investigation + path-forward documented in [docs/scrapers/tobique-investigation.md](https://github.com/igregulator/igregulator/blob/main/docs/scrapers/tobique-investigation.md).

## 1.6.0 — 2026-04-28

- **Tobique Gaming Commission (TGC) added — 6th jurisdiction.** *(Reverted on the public surface — see 1.6.1. Scraper code stays merged for re-enable when the proxy lands.)* ~160 licences ingested daily from [thetgc.ca/license-holders/](https://thetgc.ca/license-holders/). License-type vocabulary `B2C` / `B2B`. License numbers are synthesised as `TGC//` (TGC doesn't publish IDs, same convention as KH). Cron 04:15 UTC; regulatory tier shifted +15 min. The fuzzy + domain-exact match flow on `/v1/check` includes TGC automatically.
- **Domain coverage 0% out of the box for TGC.** Same upstream-doesn't-publish-websites bucket as Curaçao. Documented at [/docs/coverage-methodology](/docs/coverage-methodology/). Phase 4 WHOIS / Tranco enrichment will close this for both regulators in one pass.
## 1.5.0 — 2026-04-28

- **Trust-signals + positioning sweep.** New pages: [/about](https://igregulator.io/about), [/terms](https://igregulator.io/terms), [/privacy](https://igregulator.io/privacy). Footer reorganised — Company column now lists About / Changelog / Terms / Privacy. Stale "MCP server (soon)" replaced with a live link.
- **Legal disclaimer surfaced.** Now rendered as a callout at the top of [/docs](https://igregulator.io/docs/) and embedded in the OpenAPI spec's top-level `info.description` so agents reading the spec see it. Same wording: results are informational, customers responsible for their own compliance decisions.
- **Hero copy iteration — pain-driven.** "iGaming licensing intelligence API" → **"Verify gambling operator licenses before they cost you."** Sub-copy mentions the buyer profiles (compliance teams, payment providers, affiliate networks) explicitly. Meta description, og:description, llms.txt opening line aligned.

No data or endpoint changes.

## 1.4.0 — 2026-04-28

- **MGA domain enrichment.** The MGA scraper now fans out from each B2C license to the legacy `authorisation.mga.org.mt/verification.aspx` page and pulls the **Website URL(s)** field. Runs daily in the same 03:15 UTC slot as the primary register pass, ~30 s wall time at p-limit 5, no separate cron entry. Recovers ~110 B2C operator domains from a previous baseline of zero. B2B / CRP licenses are skipped — the upstream verification page omits the Website URL section for non-consumer-facing license classes.
- **Coverage methodology update.** `/v1/health/coverage` now exposes `domain_coverage.{operators_with_domain, normative_operators, coverage_pct}` per jurisdiction. The denominator is operators where domain disclosure is normative for their license type — excludes B2B-only types (CSPA, B2B, Non-Remote, Ancillary, supplier permits, etc.) that don't have consumer-facing domains by design. Methodology change only; no data changes.
**Affected metrics under the new denominator:**
  - **AN** — 98% of B2C operators have ≥1 domain
  - **CW** — 0% (upstream ceiling, no scraper-side work possible — see [scope doc](https://github.com/igregulator/igregulator/blob/main/docs/scrapers/cw-domain-investigation.md))
  - **KH** — 69% of Interactive Gaming Permit holders (gap is upstream non-disclosure)
  - **MGA** — 0% pre-1.4.0 enrichment, ~90% post
  - **UKGC** — 40% of `Remote`-license operators (upstream meaningful ceiling, the rest don't operate consumer sites)
- **UKGC parser audit closed without code changes.** Time-boxed re-audit confirmed the ~17.3% raw figure is 96% of the 482-operator meaningful ceiling (Active + White Label distinct accounts in `domain-names.csv`). Inactive-domain rows aren't ingested deliberately to avoid stale `/v1/check` matches. Findings documented in [docs/scrapers/ukgc-domain-gap.md](https://github.com/igregulator/igregulator/blob/main/docs/scrapers/ukgc-domain-gap.md).

## 1.3.1 — 2026-04-28

- **MCP server verified live in production.** End-to-end smoke against `https://mcp.igregulator.io/mcp`: `tools/list` returns all 7 tools, `check_domain` resolves UKGC + AN matches with correct confidence hints, `api_request_log` rows tagged `source='mcp'`. DNS via Cloudflare proxy, TLS via the existing Origin CA cert (extended to cover the new subdomain). Pricing page lists MCP support on Starter onwards. Claude Code path documented at /docs/mcp via `claude mcp add --transport http`.

## 1.3.0 — 2026-04-28

- **MCP server live at `mcp.igregulator.io`.** Streamable-HTTP transport (current MCP spec, SSE for streaming). Seven tools exposed: `check_domain`, `search_operators`, `get_operator`, `list_jurisdictions`, `get_jurisdiction`, `get_license`, `get_license_history`. Bearer-token auth — same API keys as the direct HTTP API; tool calls forward to api.igregulator.io and count against your existing per-key quota. Setup walkthrough at [/docs/mcp](/docs/mcp/).
Discovery manifests at `/.well-known/mcp.json` on all three iGregulator surfaces.
- **`api_request_log.source` column added.** New requests are tagged `http` or `mcp` so admin analytics can split MCP usage from direct-HTTP usage. No client-visible change.

## 1.2.0 — 2026-04-28

- **Anjouan jurisdiction live.** 5th regulator: Anjouan Gaming Authority (AGA), ~1,275 licences ingested daily from [anjouangaming.com/license-register/](https://anjouangaming.com/license-register/). License-type vocabulary `B2C` / `B2B` / `White Labeling`. The fuzzy + domain-exact match flow on `/v1/check` includes AN automatically — no client-side change needed. Coverage table at [/docs/](/docs/) refreshed.
- **Marketing copy correction.** Hero / meta / llms.txt now say "Daily-refreshed … updated within 24 hours" instead of "Real-time." The data was never real-time; the new wording matches what scrapers actually deliver (cron 03:00–04:00 UTC, depending on jurisdiction).

## 1.1.0 — 2026-04-28

- **Flat field removal on `/v1/check`.** Top-level legacy keys (`licensed`, `jurisdiction`, `license_number`, `operator`, `status`, `expires_at`) **removed** along with the `Deprecation` / `Sunset` / `Link` headers. Read `match.*` instead. Removed ahead of the announced 2026-05-19 sunset because no customers are integrated against the flat shape yet.
- **Pre-launch surface gating.** `trial` keys are now restricted to `/v1/check` only (1,000/day per-key cap). Other authenticated endpoints return `402 payment_required` with `details.reason=endpoint_requires_paid_plan`. Behaviour lifts automatically once `PRELAUNCH_DAILY_CAP=0` (post-Stripe).
- **Webhook retention 7 → 30 days.** `webhook_deliveries` history now matches the `webhook_events` replay window. Existing rows live longer immediately; nothing to migrate.
- **Generic-label blocklist on `/v1/check` widened.** Now substring-match instead of exact-match — `bestcasino.com`, `casino-bonus.com` return `confidence: low` like `casino.com` already did.
Licensed brands containing a generic keyword (`bet365.com`, `pokerstars.com`) still resolve via the domain hit before the gate runs.
- **OpenAPI spec** documents the per-jurisdiction `license_types` vocabulary inline so client-side enums can be coded against the audited values (UKGC `Remote`/`Non-Remote`/`Ancillary Remote`, MGA `Type 1-4`/`B2B`/`B2C`, CW `B2C`/`B2B`, KH `Interactive Gaming Permit`/`CSPA`).

## 1.0.0 — 2026-04-19

**Initial public docs release.**

- **/v1/check** response shape: `{ query, match, alternatives, confidence }`. `match.confidence` ∈ `high | medium | low`, `match_type` ∈ `domain_exact | trading_name_fuzzy | name_similarity`, `domain_association` ∈ `direct | white_label | null`.
- **Public endpoints**: `/v1/check`, `/v1/jurisdictions`, `/v1/operators/search`. Rate-limited 10 req / IP / hour.
- **White Label ingestion** live for UKGC — domains with `Status = 'White Label'` in the UKGC register now load with `association = 'white_label'` on the domain row.
- **OpenAPI 3.1 spec** available at [api.igregulator.io/openapi.json](https://api.igregulator.io/openapi.json) (canonical source; Scalar + starlight-openapi both consume it). The old Swagger UI on `api.igregulator.io/docs` now 301-redirects to [/docs/api/](/docs/api/) — unified docs at [/docs](/docs/) supersede it.

## 0.3.0 — 2026-04-17

- **Regulatory actions** surfaced on the operator detail page. Cross-jurisdiction feed: UKGC Public Register, MGA Decisions, CGA Warnings.
- New endpoint: `/v1/operators/:slug/regulatory-actions` (authenticated).

## 0.2.0 — 2026-04-08

- Kahnawake Gaming Commission jurisdiction added.
- Curaçao scraper migrated to the post-LOK OGL PDF source.
- Licence category harmonisation: `remote | non-remote | ancillary | permit | other`.

## 0.1.0 — 2026-03-28

- First operational release. UKGC-only coverage.
- Core schema: `operators`, `licenses`, `domains`, `jurisdictions`.
- Dashboard lives at `app.igregulator.io`.

---