Once detection marks a span, Masker replaces it with a token. The token travels to your LLM in place of the real value. On the response leg, Masker swaps it back before the text reaches your caller. The LLM never sees real PHI; your caller never hears a placeholder.

Every token Masker produces shares four properties:
- Format-stable — the same shape regardless of input length, so the model is never surprised by a short or long token
- Type-aware — the LLM can still tell a phone from a name from a date
- Reversible — Masker can rehydrate the original on the response leg
- Non-revealing — given only the token, you cannot recover the original value
Token format
All tokens follow this structure:

| Field | Meaning | Example |
|---|---|---|
| scheme | Versioned scheme identifier | MSKV1 (vault-deterministic) or MSK1 (reversible AEAD) |
| kind | Entity type | PHONE, NAME, SSN, MRN, EMAIL, DOB |
| kid | Key ID: which key was used to produce this token | K_HEALTHCARE |
| value | The opaque token body | Base32-encoded, approximately 22 characters |
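
To make the four fields concrete, here is a minimal parsing sketch. It assumes the fields appear in table order and are separated by dots, which is an inference from the `MSK*.*.*.*` pattern used during rehydration rather than a documented guarantee. `Token`, `parse_token`, and the sample token body are hypothetical names and values chosen for illustration.

```python
from typing import NamedTuple

# Scheme identifiers listed in the table above.
KNOWN_SCHEMES = {"MSKV1", "MSK1"}


class Token(NamedTuple):
    scheme: str  # versioned scheme identifier, e.g. MSKV1
    kind: str    # entity type, e.g. PHONE
    kid: str     # key ID, e.g. K_HEALTHCARE
    value: str   # opaque token body


def parse_token(text: str) -> Token:
    """Split a token into its four fields (assumes '.'-separated fields)."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}")
    token = Token(*parts)
    if token.scheme not in KNOWN_SCHEMES:
        raise ValueError(f"unknown scheme: {token.scheme}")
    return token


# The token body below is made up for illustration only.
example = parse_token("MSKV1.PHONE.K_HEALTHCARE.ABCDEFGHIJKLMNOPQRSTUV")
print(example.kind, example.kid)
```

Because the kind travels in the clear, downstream code (and the LLM) can tell a phone from a name without ever seeing the original value.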
Two tokenization schemes
Masker offers two schemes. You choose one per agent.

- Vault deterministic (default)
- Reversible AEAD

Use vault-deterministic when you want same-value → same-token behavior across a session, for example so the LLM can refer to “the patient” consistently across multiple turns.

Algorithm: HMAC-SHA256 of (kid_secret || normalized_input) → first 16 bytes → base32-encoded.

Storage: A row is written to a SQLite vault at /data/vault.db on the Fly volume.

Scheme prefix: MSKV1

Properties
| Property | Detail |
|---|---|
| Deterministic | The same input always produces the same token within the same kid. The LLM can recognize “this is the same person as in the previous turn.” |
| Reversible | Masker looks up the token in the vault to retrieve the original. |
| Per-agent isolation | Different agents use different kids, so token namespaces don’t collide across customers. |
| Vault-bound | If the vault is lost, tokens are opaque forever. This is a feature when you need hard erasure — drop the vault, and all tokens referencing it become permanently unreadable. |
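
The vault-deterministic pipeline can be sketched roughly as follows. This is an illustration under stated assumptions, not Masker's implementation: it reads "HMAC-SHA256 of (kid_secret || normalized_input)" as an HMAC keyed with the kid secret over the normalized input, the `normalize` rule and the vault table layout are hypothetical, the token fields are assumed to be dot-separated, and an in-memory SQLite database stands in for /data/vault.db. Unpadded standard Base32 of 16 bytes comes out at 26 characters, so the real encoding may differ in detail.

```python
import base64
import hashlib
import hmac
import sqlite3
from typing import Optional


def normalize(value: str) -> str:
    # Hypothetical normalization (drop whitespace, lowercase); the real rules may differ.
    return "".join(value.split()).lower()


def open_vault(path: str = ":memory:") -> sqlite3.Connection:
    # Hypothetical schema: one row per token, holding the original value.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS vault (token TEXT PRIMARY KEY, original TEXT NOT NULL)"
    )
    return conn


def tokenize(conn: sqlite3.Connection, kind: str, kid: str,
             kid_secret: bytes, value: str) -> str:
    # Assumed reading of the algorithm: HMAC keyed with the kid secret.
    digest = hmac.new(kid_secret, normalize(value).encode("utf-8"), hashlib.sha256).digest()
    # First 16 bytes, Base32-encoded, padding stripped.
    body = base64.b32encode(digest[:16]).decode("ascii").rstrip("=")
    token = f"MSKV1.{kind}.{kid}.{body}"
    # Keep the first original seen for a token; later duplicates are ignored.
    conn.execute("INSERT OR IGNORE INTO vault (token, original) VALUES (?, ?)", (token, value))
    conn.commit()
    return token


def lookup(conn: sqlite3.Connection, token: str) -> Optional[str]:
    row = conn.execute("SELECT original FROM vault WHERE token = ?", (token,)).fetchone()
    return row[0] if row else None


vault = open_vault()
first = tokenize(vault, "PHONE", "K_HEALTHCARE", b"demo-secret", "555 010 0199")
second = tokenize(vault, "PHONE", "K_HEALTHCARE", b"demo-secret", "5550100199")
print(first == second)       # same normalized input, same kid
print(lookup(vault, first))  # original recovered from the vault
```

The sketch also shows why the scheme is vault-bound: the token body is a truncated keyed hash, so without the vault row there is nothing to decrypt.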
Choosing a scheme
| You want… | Pick |
|---|---|
| Same value → same token (consistent reference within a call) | Vault deterministic |
| No shared state across regions | Reversible AEAD |
| Hard erasure (drop the vault, tokens are dead forever) | Vault deterministic |
| Self-describing tokens that survive restarts | Reversible AEAD |
| Default for healthcare voice agents | Vault deterministic |
The healthcare-default policy ships with vault-deterministic tokenization. You can change it per agent in the portal or via the create-agent API.
Rehydration
On the response leg, Masker scans the LLM’s output for any token matching the pattern MSK*.*.*.*. For each match:
- Vault deterministic: look up the original value in /data/vault.db, replace inline.
- Reversible AEAD: decrypt with the kid key, replace inline.

If a token cannot be rehydrated, Masker emits a rehydration_failed event and replaces the token with [REDACTED:KIND]. The failure is recorded in the audit log.
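
The scan-and-replace loop might look like the sketch below. The regular expression is an assumed rendering of the `MSK*.*.*.*` pattern (the character classes are guesses), and `rehydrate` and `resolve` are hypothetical names. The `resolve` callable stands in for either the vault lookup or the AEAD decryption and returns `None` on failure; the returned failure list stands in for the rehydration_failed events.

```python
import re
from typing import Callable, List, Optional, Tuple

# Assumed regex form of MSK*.*.*.*: scheme, kind, kid, Base32 body.
TOKEN_RE = re.compile(r"\b(MSK[A-Z0-9]*)\.([A-Z]+)\.([A-Z0-9_]+)\.([A-Z2-7]+)\b")


def rehydrate(text: str,
              resolve: Callable[[str], Optional[str]]) -> Tuple[str, List[str]]:
    """Replace every token in `text`; return the new text and the failed tokens."""
    failures: List[str] = []

    def swap(match: "re.Match[str]") -> str:
        token = match.group(0)
        original = resolve(token)
        if original is None:
            # Failure path: record the token and fall back to [REDACTED:KIND].
            failures.append(token)
            return f"[REDACTED:{match.group(2)}]"
        return original

    return TOKEN_RE.sub(swap, text), failures


# Illustrative stand-in for the vault; the token body is made up.
known = {"MSKV1.PHONE.K_HEALTHCARE.ABCDEFGHIJKLMNOPQRSTUVWXYZ": "555-0100"}
reply = ("Call MSKV1.PHONE.K_HEALTHCARE.ABCDEFGHIJKLMNOPQRSTUVWXYZ "
         "about MSKV1.NAME.K_HEALTHCARE.ZZZZ2222.")
restored, failed = rehydrate(reply, known.get)
print(restored)
print(failed)
```

Falling back to a typed placeholder keeps the sentence readable for the caller while making sure an unresolved token never leaks through verbatim.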
Key management
Each kid is a logical key identifier mapped to actual key material in your environment, for example through a variable such as MASKER_KEY_K_HEALTHCARE.
You can register multiple kids at once to support key rotation:
1. Add the new key: Set MASKER_KEY_K_HEALTHCARE to the new key material. Keep the old key registered as MASKER_KEY_K_HEALTHCARE_OLD.
2. Update the policy: Switch your agent’s policy to reference the new kid. New tokens will use the new key; existing live tokens still rehydrate via the old key.
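
One way to picture several keys being registered at once is a small registry keyed by kid. The sketch below assumes key material is exposed as `MASKER_KEY_<kid>` variables, as the variable names above suggest; `load_keys` and `key_for` are hypothetical helpers, and treating the material as UTF-8 text is an assumption made purely for illustration.

```python
from typing import Dict, Mapping

PREFIX = "MASKER_KEY_"


def load_keys(env: Mapping[str, str]) -> Dict[str, bytes]:
    """Collect every non-empty MASKER_KEY_<kid> entry into a kid -> key map."""
    return {
        name[len(PREFIX):]: value.encode("utf-8")
        for name, value in env.items()
        if name.startswith(PREFIX) and value
    }


def key_for(kid: str, keys: Mapping[str, bytes]) -> bytes:
    """Return the key registered for `kid`, or raise if none is registered."""
    try:
        return keys[kid]
    except KeyError:
        raise LookupError(f"no key registered for kid {kid!r}") from None


# Illustrative environment: the new key plus the old one kept for live tokens.
env = {
    "MASKER_KEY_K_HEALTHCARE": "new-demo-secret",
    "MASKER_KEY_K_HEALTHCARE_OLD": "old-demo-secret",
    "PATH": "/usr/bin",
}
keys = load_keys(env)
print(sorted(keys))
```

In practice you would pass `os.environ` instead of the literal dictionary. While both entries are present, tokens produced under either key can still be resolved; removing the old entry ends rehydration for tokens that depend on it.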