How it works
The gateway keeps personal data on-shore while you use any public LLM. It sits on the egress path, swaps every identifier for a placeholder before the prompt crosses the border, lets the model answer on the placeholders, and restores the real values on the way back — so nothing sensitive ever leaves your network.
The round-trip
Draft a reply to Hans Muster (AHV 756.1234.5678.97, IBAN CH93 0076 …)
│ 🛡 the only thing that crosses the border ↓
Draft a reply to [PERSON_1] ([AHV_1], [IBAN_1])
│ the model answers on placeholders ↓
"Dear [PERSON_1], we'll refund CHF 240 within five business days…"
│ 🛡 restored on the way back ↓
"Dear Hans Muster, we'll refund CHF 240 within five business days…"The model produces a correct, personalised answer having only ever seen [PERSON_1]. Even if the provider logs every prompt, it logged placeholders.
Deterministic by design
The detection is not an LLM and not a cloud API — both would defeat the purpose (you cannot use an unreliable thing as your reliability boundary, and a cloud “PII API” means you have already sent the data away to find it). It is regex plus checksums, run locally:
- Swiss AHV / AVS — matched by shape, validated by its EAN-13 check digit.
- IBAN (CH/LI) — validated by ISO-7064 mod-97.
- Card numbers — validated by the Luhn algorithm.
- Swiss phone numbers and emails — matched by pattern.
The checksums are what make it trustworthy: a random 13-digit string with the wrong check digit is not flagged, and a real AHV cannot slip through by being reformatted. It runs air-gapped and fails closed: if something looks like an identifier, it is withheld, not waved through. The principle is old and dull and correct: never trust the model to police itself; put a deterministic code boundary around it.
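The three checksums named in the list are short enough to show in full. The sketch below is one plausible way to write them in Python, not the gateway's own implementation. The function names are invented for the example, the `756` prefix test is an assumption about what "matched by shape" covers, and the IBAN check is written generically rather than restricted to CH/LI.

```python
import re


def ean13_ok(digits):
    """EAN-13: weights 1,3,1,3,... over the first 12 digits, then a check digit."""
    if not re.fullmatch(r"[0-9]{13}", digits):
        return False
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == int(digits[12])


def ahv_ok(text):
    """Strip separators so reformatting cannot hide a number, then validate."""
    digits = re.sub(r"[\s.\-]", "", text)
    return digits.startswith("756") and ean13_ok(digits)


def iban_ok(text):
    """ISO 7064 mod-97: move the first four characters to the end, map letters
    to numbers (A=10 ... Z=35), and require the result mod 97 to equal 1."""
    iban = re.sub(r"\s", "", text).upper()
    if not re.fullmatch(r"[A-Z]{2}[0-9]{2}[A-Z0-9]{11,30}", iban):
        return False
    rearranged = iban[4:] + iban[:4]
    return int("".join(str(int(ch, 36)) for ch in rearranged)) % 97 == 1


def luhn_ok(text):
    """Luhn: double every second digit from the right, subtract 9 if above 9."""
    digits = re.sub(r"[\s\-]", "", text)
    if not re.fullmatch(r"[0-9]{12,19}", digits):
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        n = int(d)
        if i % 2 == 1:
            n *= 2
            if n > 9:
                n -= 9
        total += n
    return total % 10 == 0


# The same AHV number passes whether written with dots or with no separators.
assert ahv_ok("756.1234.5678.97") and ahv_ok("7561234567897")
assert not ahv_ok("756.1234.5678.98")  # wrong check digit
```

Normalising separators before validating is what stops a reformatted number from slipping past, and the check digit is what keeps most arbitrary digit runs from being flagged.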
In production
- An OpenAI-compatible reverse proxy: your app changes its base_url and nothing else. (Or an SDK/middleware, or an API-gateway plugin.)
- Runs on-premise or in a Swiss region. The token↔value map, the one place the real personal data lives, is in memory, per-session, and never persisted or transmitted.
- Every request emits an audit line for the DPO (“14:32 — 1 name, 1 AHV, 1 IBAN redacted before egress”) — your record of processing, for free.
- Model-agnostic. Gemini, Claude, DeepSeek, a model hosted abroad — the boundary doesn't care, because nothing sensitive reaches any of them.
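The audit line mentioned above only needs the types and counts of what was redacted, never the values. As a rough sketch, assuming a hypothetical `audit_line` helper and a separator chosen for the example (the real log format may differ):

```python
from collections import Counter
from datetime import datetime


def audit_line(redacted_labels, now):
    """Summarise what was redacted, by type only; never include the values."""
    counts = Counter(redacted_labels)  # keeps first-seen order of the labels
    parts = ", ".join(f"{n} {label}" for label, n in counts.items())
    return f"{now:%H:%M} | {parts or '0 identifiers'} redacted before egress"


line = audit_line(["name", "AHV", "IBAN"], datetime(2024, 1, 1, 14, 32))
```

Because the function receives labels rather than the token-to-value map, the log cannot leak personal data even if it is shipped to a central store.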
Where it stops
- Structured identifiers (AHV, IBAN, card, phone, email) are the deterministic core. Free-form data — names, street addresses — needs a named-entity model; run a small one locally alongside, and fail closed on high-risk flows.
- Some tasks genuinely need the real value (validate this IBAN; compute an age from a date of birth). Tokenization is a per-field policy, not a blanket switch.
- This is data minimisation and residency — not a DPIA, a lawful-basis analysis, or a contract. It removes the transfer question for the data it redacts; it does not remove your other obligations.
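A per-field policy can be as plain as a lookup table with a fail-closed default. The sketch below is illustrative only: the `Action` values, the field names, and the choice to block unlisted fields are assumptions, not the product's configuration format.

```python
from enum import Enum


class Action(Enum):
    TOKENIZE = "tokenize"  # replace with a placeholder before egress
    PASS = "pass"          # the task needs the real value
    BLOCK = "block"        # refuse to send the request at all


# Hypothetical policy table: field names and choices are illustrative.
POLICY = {
    "ahv": Action.TOKENIZE,
    "iban": Action.TOKENIZE,
    "date_of_birth": Action.PASS,  # e.g. the task computes an age
}


def decide(field):
    """Fail closed: a field with no explicit policy is blocked, not passed."""
    return POLICY.get(field, Action.BLOCK)
```

Letting a real value through is then an explicit, reviewable entry in the table rather than a side effect of a missing rule.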
Defence in depth, deliberately dumb — the outer layer, not the only one.