Data residency is typically one of the first questions a HIPAA security review asks. This page is the direct answer: what data Masker stores, where it lives in each deployment model, what crosses which boundaries, and what your options are for strict residency requirements.

Core principle: PHI is never persisted

In every deployment model, Masker holds PHI in process memory for the duration of masking — tens of milliseconds — and then drops it. There is no code path that writes original PHI to disk, to logs, or to the audit chain. The audit chain records that a detection occurred and what kind of entity it was, but never records the original value.
On the public demo at try.masker.dev, original payloads are not persisted at all. Only the encrypted audit chain is retained. On self-hosted deployments, the same holds by default — only the encrypted chain is kept unless you explicitly enable encrypted payload retention.
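
As a rough illustration of the audit-chain property described above, the sketch below appends events that carry only the entity kind and span offsets, and links each event to its predecessor with an HMAC. The field names, the `GENESIS` value, and the `append_event` and `verify_chain` helpers are assumptions made for this example, not Masker's actual record format. For simplicity the sketch MACs only the metadata.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # hypothetical starting link for an empty chain


def append_event(chain, key, entity_kind, start, end):
    """Append an audit event that records kind and offsets, never the value."""
    prev_mac = chain[-1]["mac"] if chain else GENESIS
    body = {"kind": entity_kind, "start": start, "end": end, "prev": prev_mac}
    payload = json.dumps(body, sort_keys=True).encode()
    mac = hmac.new(key, payload, hashlib.sha256).hexdigest()
    chain.append({**body, "mac": mac})
    return chain


def verify_chain(chain, key):
    """Recompute every MAC and check each event links to its predecessor."""
    prev_mac = GENESIS
    for event in chain:
        body = {k: event[k] for k in ("kind", "start", "end", "prev")}
        payload = json.dumps(body, sort_keys=True).encode()
        expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
        if event["prev"] != prev_mac or not hmac.compare_digest(expected, event["mac"]):
            return False
        prev_mac = event["mac"]
    return True
```

Because each MAC covers the previous MAC, editing or reordering any earlier event makes verification fail from that point on.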

Hosted deployment (Masker-managed)

The Masker-hosted service runs in Fly.io US-West:
  • Primary region: sea (Seattle)
  • Failover: sjc (San Jose)

What is stored and where

Data | Storage | Encryption
Database (masker.db) | Fly volume, US-West | AES-256 at rest
Token vault (vault.db) | Fly volume, US-West | AES-256 at rest
Audit chain | Fly volume, US-West | AES-256 at rest, HMAC-chained
Encrypted payloads (opt-in) | Fly volume, US-West | Application-side encryption with your kid key

Data in transit

  • Inbound (your voice platform → Masker): TLS 1.3
  • Outbound to your LLM (Masker → OpenAI or other): TLS 1.3, but only tokens cross this hop — PHI is masked before this connection is made
  • Inter-region: Fly internal network, encrypted
Masker signs a BAA with production beta customers covering this hosted deployment.
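
For the TLS 1.3 hops listed above, a client calling Masker can refuse to negotiate anything older. The following is a minimal sketch using Python's standard `ssl` module. The helper name `tls13_client_context` is made up for this example, and nothing here is specific to Masker's own client libraries.

```python
import ssl


def tls13_client_context():
    """Build a client context that refuses anything older than TLS 1.3."""
    ctx = ssl.create_default_context()  # verifies certificates and hostnames
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx
```

The returned context can be passed to `http.client.HTTPSConnection(host, context=ctx)` or to `urllib.request.urlopen(url, context=ctx)`.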

Self-hosted deployment (your VPC)

When you run Masker in your own VPC, data residency is whatever your VPC’s residency is. Masker does not phone home, does not write data to any path you have not configured, and does not include analytics or telemetry.
The self-hosted path is the recommended deployment for production healthcare workloads. PHI never leaves your network on the way to the model — it leaves as tokens over TLS.

Region guidance by jurisdiction

Jurisdiction | Recommended region
US (HIPAA) | Any US region on AWS, GCP, or Azure
EU / UK (GDPR, UK GDPR) | eu-* regions; Masker-hosted EU region targeted for late 2026
Canada (PIPEDA, PHIPA) | ca-central-1 (AWS) or northamerica-northeast1 (GCP)
India (DPDP Act) | ap-south-1 (AWS Mumbai)
Australia (Privacy Act / My Health Records) | ap-southeast-2 (AWS Sydney)
Masker has also been verified to run on AWS GovCloud and Azure Government for sovereign cloud requirements — contact us for the offline activation flow.
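
One way to keep this guidance from drifting is to encode it as a pre-deployment check. The sketch below copies the region patterns from the table into a lookup. The jurisdiction keys and the `region_matches` helper are assumptions for illustration, and the US row is left out because the table allows any US region across three providers.

```python
from fnmatch import fnmatch

# Patterns copied from the table above; US is omitted because the table
# allows any US region on AWS, GCP, or Azure.
RECOMMENDED_REGIONS = {
    "EU/UK": ["eu-*"],
    "Canada": ["ca-central-1", "northamerica-northeast1"],
    "India": ["ap-south-1"],
    "Australia": ["ap-southeast-2"],
}


def region_matches(jurisdiction, region):
    """Return True if `region` matches a recommended pattern for `jurisdiction`."""
    patterns = RECOMMENDED_REGIONS.get(jurisdiction, [])
    return any(fnmatch(region, pattern) for pattern in patterns)
```

A CI step could call `region_matches` with the target region and fail the deploy when it returns False.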

Token vault

For vault-deterministic tokenization, Masker uses a SQLite database stored at /data/vault.db. This vault maps PHI spans to stable tokens so that the same value receives the same token across calls within a session. The vault file is:
  • Encrypted at rest (AES-256)
  • Stored only in your configured data volume
  • Never transmitted to Masker infrastructure in a self-hosted deployment
  • Purgeable for right-to-erasure (see below)
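
To make the "same value, same token" behavior concrete, here is a minimal sketch of deterministic tokenization backed by SQLite. The table schema, the `[KIND_digest]` token format, and the use of a keyed HMAC to derive the token are all assumptions for this example; Masker's real vault layout may differ, and at-rest encryption is not shown.

```python
import hashlib
import hmac
import sqlite3


def open_vault(path=":memory:"):
    """Open (or create) a vault; the docs place the real file at /data/vault.db."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS vault (token TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    return conn


def tokenize(conn, key, kind, value):
    """Return a stable token for `value`, storing the mapping on first sight."""
    digest = hmac.new(key, value.encode(), hashlib.sha256).hexdigest()[:12]
    token = f"[{kind}_{digest}]"
    conn.execute(
        "INSERT OR IGNORE INTO vault (token, value) VALUES (?, ?)", (token, value)
    )
    conn.commit()
    return token


def detokenize(conn, token):
    """Look up the original value, or None if the mapping is gone."""
    row = conn.execute("SELECT value FROM vault WHERE token = ?", (token,)).fetchone()
    return row[0] if row else None
```

Deriving the token from a keyed hash means repeated values map to the same token without a lookup, while the vault row is what makes the token reversible.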

Key material

Masker does not manage your encryption keys. Key material lives in your secret store:
  • Fly.io hosted: Fly Secrets
  • AWS: AWS Secrets Manager
  • GCP: GCP Secret Manager
  • Self-hosted or air-gapped: HashiCorp Vault, or any secrets backend you configure
The MASKER_AUDIT_HMAC_KEY and any kid keys for reversible-AEAD tokens are read at startup from environment variables. They are never logged and never written to the data volume.
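
The startup behavior described above can be sketched as a fail-fast loader. Only the variable name `MASKER_AUDIT_HMAC_KEY` comes from this page; the hex encoding, the `load_key` helper, and the `MissingKeyError` type are assumptions for illustration.

```python
import os


class MissingKeyError(RuntimeError):
    """Raised at startup when a required key variable is absent."""


def load_key(name, environ=os.environ):
    """Read a hex-encoded key from the environment and fail fast if absent."""
    raw = environ.get(name)
    if not raw:
        # Report only the variable name; the value must never reach a log.
        raise MissingKeyError(f"{name} is not set")
    return bytes.fromhex(raw)
```

Passing `environ` explicitly keeps the loader testable without touching the real process environment.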

What crosses which boundary

For a typical hosted-Masker and OpenAI deployment:
[Voice platform]  ─TLS─▶  [Masker (Fly US-West)]  ─TLS─▶  [OpenAI (US-East)]
   PHI (yes)                  PHI (in-memory only)             tokens only
For a customer VPC and OpenAI deployment:
[Voice platform]  ─TLS─▶  [Masker (your VPC)]   ─TLS─▶  [OpenAI (US-East)]
   PHI (yes)                  PHI (in-memory only)             tokens only
For a customer VPC and self-hosted LLM deployment:
[Voice platform]  ─TLS─▶  [Masker (your VPC)]   ─TLS─▶  [Your LLM (your VPC)]
   PHI (yes)                  PHI (in-memory only)             tokens only
In every variant, PHI reaches Masker only over TLS, and only tokens leave Masker toward the model. If both your voice platform residency and your LLM provider residency are concerns, the customer VPC and self-hosted LLM pattern is the cleanest solution; Masker has been deployed this way at larger pilots.

In-memory PHI lifecycle

To be precise about what “in-memory only” means during a request:
  1. Request received: Masker receives the request body. PHI is in process memory.
  2. Detection: Pass 1 (regex) and Pass 2 (NER) detectors run. Spans and entity kinds are computed.
  3. Tokenization: Tokens are generated. For vault-deterministic tokens, the mapping is written to vault.db. The original value is passed to the HMAC for the audit event; it is not written to the chain itself.
  4. Masked body constructed: The masked request body is built. The original PHI is dropped from the working buffer.
  5. Original buffer released: The original buffer goes out of scope. Rust drops it. The memory is reclaimed. The original strings existed in process memory for milliseconds.
For the most sensitive workloads, you can enable MASKER_ZEROIZE_BUFFERS=true, which actively zeroes the working buffer after masking before the allocator reclaims it. This costs approximately 1 ms per request and is off by default.
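
The lifecycle above can be sketched end to end in a few lines. This is an illustration only: the two regex patterns stand in for Pass 1, NER is omitted, the `[KIND_index]` token format is a placeholder, and the `zeroize` argument merely mimics what `MASKER_ZEROIZE_BUFFERS=true` is described as doing. Masker itself is written in Rust, so none of these names come from its code.

```python
import re

# Hypothetical Pass 1 patterns; a real deployment would also run NER (Pass 2).
PATTERNS = {
    "SSN": re.compile(rb"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(rb"\b\d{3}-\d{3}-\d{4}\b"),
}


def detect(buf):
    """Step 2: return (start, end, kind) spans found in the buffer."""
    spans = []
    for kind, pattern in PATTERNS.items():
        spans.extend((m.start(), m.end(), kind) for m in pattern.finditer(buf))
    return sorted(spans)


def mask(buf, zeroize=False):
    """Steps 3-5: build the masked body, then optionally zero the original."""
    out, cursor = bytearray(), 0
    for index, (start, end, kind) in enumerate(detect(buf)):
        out += buf[cursor:start]
        out += f"[{kind}_{index}]".encode()  # placeholder token format
        cursor = end
    out += buf[cursor:]
    if zeroize:
        buf[:] = bytes(len(buf))  # overwrite in place before the buffer is freed
    return bytes(out)
```

The input is a mutable `bytearray` so it can be overwritten in place. In Python this is only a demonstration of the idea, since the runtime may hold other copies of the data.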

Right to erasure

For GDPR Article 17 and equivalent requirements, Masker provides cryptographic erasure without requiring you to track every location a token may have been written:
  • Drop the vault: For vault-deterministic tokens, deleting vault.db makes every prior token irreversible. The token still exists wherever it was stored, but cannot be reversed to recover the original value.
  • Rotate the key: For reversible-AEAD tokens, rotating the kid key and discarding the old key makes prior tokens unreadable.
  • Purge audit log entries: Use masker vault purge --subject <id> to remove audit log entries for a specific subject. This requires a subject ID to have been attached at request time — configure that at the agent level if you need per-subject erasure.
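
The "drop the vault" option can be illustrated with a toy vault file: once the file is deleted, a token that was copied into a transcript or downstream log still exists but no longer resolves to anything. The schema and the `store`, `lookup`, `drop_vault`, and `demo` helpers are assumptions for this sketch and do not reflect Masker's actual vault format or CLI.

```python
import os
import sqlite3
import tempfile

SCHEMA = "CREATE TABLE IF NOT EXISTS vault (token TEXT PRIMARY KEY, value TEXT)"


def store(path, token, value):
    """Record a token-to-value mapping in the vault file."""
    conn = sqlite3.connect(path)
    try:
        conn.execute(SCHEMA)
        conn.execute("INSERT OR REPLACE INTO vault VALUES (?, ?)", (token, value))
        conn.commit()
    finally:
        conn.close()


def lookup(path, token):
    """Return the original value for `token`, or None if no mapping exists."""
    conn = sqlite3.connect(path)
    try:
        conn.execute(SCHEMA)
        row = conn.execute("SELECT value FROM vault WHERE token = ?", (token,)).fetchone()
        return row[0] if row else None
    finally:
        conn.close()


def drop_vault(path):
    """Delete the vault file; tokens stored elsewhere can no longer be reversed."""
    if os.path.exists(path):
        os.remove(path)


def demo():
    """Show a token resolving before the vault is dropped and not after."""
    path = os.path.join(tempfile.mkdtemp(), "vault.db")
    store(path, "[NAME_ab12]", "Jane Doe")
    before = lookup(path, "[NAME_ab12]")
    drop_vault(path)
    after = lookup(path, "[NAME_ab12]")
    return before, after
```

In a real deployment, erasure also depends on removing backups and snapshots of the vault file, which this sketch does not cover.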

What Masker does not control

Two residency concerns are outside Masker’s scope:
  • Your voice platform’s residency. If your Vapi, Bolna, or Retell tenant processes data in a region you need to avoid, that is a conversation with that vendor. Masker can only mask what reaches it.
  • Your LLM provider’s residency. Masker ensures only tokens cross that boundary, which significantly reduces the impact. But it does not move the LLM provider’s infrastructure to a different region.
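If both concerns are blockers, the customer VPC and self-hosted LLM deployment keeps Masker and the model inside infrastructure whose region you choose. The voice platform's own processing region still has to be settled with that vendor.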