

Masker is a Rust service that exposes an OpenAI-compatible /v1/chat/completions endpoint. You can run it on Masker-managed infrastructure, inside your own cloud account, or fully air-gapped with no outbound internet access. The right choice depends on where your PHI must reside and what your compliance program requires.
  • Masker hosted. Best for pilots, demos, and early-stage teams. PHI residency: Fly.io US-West (sea/sjc). BAA available on production beta.
  • Self-hosted VPC. Best for production and regulated environments. PHI residency: your cloud account. BAA not needed — PHI never leaves your VPC.
  • Air-gapped / on-prem. Best for hospitals, DoD, and strict security postures. PHI residency: your own hardware. BAA not needed — Masker is fully offline.

Option 1 — Masker hosted

The hosted service runs at masker-voice.fly.dev. You do not deploy or manage any infrastructure.
1. Sign in to the portal. Open masker-voice.fly.dev/portal/login and sign in with GitHub.
2. Create an agent. Click New Agent, choose an upstream model, and copy the generated proxy URL.
3. Drop the URL into your voice platform. Paste the proxy URL as the Custom LLM URL in Vapi, ElevenLabs, Bolna, or any OpenAI-compatible platform. No SDK, no code changes.
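Outside a voice platform, you can exercise the proxy with any OpenAI-compatible client. A minimal sketch, assuming the proxy exposes the standard /v1/chat/completions path (the agent path and model name below are illustrative):

```shell
# PROXY_URL is the value copied from the portal (the path shape here is
# an assumption; use your agent's actual URL).
PROXY_URL="https://masker-voice.fly.dev/p/your-agent-id"
# A standard OpenAI chat-completions body; no Masker-specific fields.
BODY='{"model":"gpt-4o","messages":[{"role":"user","content":"Hi, I need to reschedule my appointment."}]}'
echo "$BODY"
# Smoke-test against the live proxy:
#   curl -s "$PROXY_URL/v1/chat/completions" \
#        -H "Content-Type: application/json" -d "$BODY"
```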
What you get: zero infrastructure, instant setup, and a BAA available to all production beta customers.

What to consider: PHI transits Masker’s servers on Fly.io US-West before being tokenized. If your compliance program requires PHI to stay in your own environment, use one of the self-hosted options below and plan a migration before going live with real patient data.

Option 2 — Self-hosted in your VPC

Masker ships as a container image at ghcr.io/masker-dev/masker:latest. You can run it on Docker, Kubernetes, or your own Fly.io account.
Running the container directly with Docker is the quickest path to a self-hosted instance, suitable for single-node deployments and development environments.
docker run -d \
  --name masker \
  -p 8080:8080 \
  -v masker-data:/data \
  -e MASKER_LISTEN_ADDR=0.0.0.0:8080 \
  -e MASKER_PUBLIC_URL=https://masker.your-vpc.example \
  -e MASKER_DATABASE_URL=sqlite:///data/masker.db \
  -e MASKER_OPENAI_API_KEY=$OPENAI_API_KEY \
  --env-file ./masker.secrets.env \
  ghcr.io/masker-dev/masker:latest
masker.secrets.env must contain at minimum:
MASKER_KEY_K_HEALTHCARE=<base64 32-byte key>
MASKER_SESSION_SECRET=<base64 32-byte key>
MASKER_AUDIT_HMAC_KEY=<base64 32-byte key>
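Each value is 32 random bytes, base64-encoded. One way to generate all three at once, assuming openssl is available (any CSPRNG that emits 32 random bytes works equally well):

```shell
# Write the three required secrets to masker.secrets.env.
for name in MASKER_KEY_K_HEALTHCARE MASKER_SESSION_SECRET MASKER_AUDIT_HMAC_KEY; do
  printf '%s=%s\n' "$name" "$(openssl rand -base64 32)"
done > masker.secrets.env
chmod 600 masker.secrets.env   # keep key material out of other users' reach
```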
Place a TLS-terminating load balancer (ALB, GCLB, nginx) in front of the container. Masker speaks plain HTTP inside the VPC.
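For the nginx case, a minimal TLS-termination sketch might look like this (server name, certificate paths, and the upstream address are placeholders for your environment):

```nginx
# TLS terminates here; traffic to Masker inside the VPC is plain HTTP.
server {
    listen 443 ssl;
    server_name masker.your-vpc.example;

    ssl_certificate     /etc/ssl/masker.crt;
    ssl_certificate_key /etc/ssl/masker.key;

    location / {
        proxy_pass http://127.0.0.1:8080;   # the Masker container
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto https;
    }
}
```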
For multi-replica Docker Compose setups, switch to tokenization: reversible-aead in your mask policy so replicas don’t need to share SQLite vault state. Alternatively, point MASKER_DATABASE_URL at a Postgres instance.
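A sketch of the Postgres variant for a multi-replica Compose setup (service names, credentials, and the Postgres image tag are illustrative, not a vetted production configuration):

```yaml
# Two Masker replicas sharing one Postgres vault instead of per-node SQLite.
services:
  masker:
    image: ghcr.io/masker-dev/masker:latest
    deploy:
      replicas: 2
    environment:
      MASKER_LISTEN_ADDR: 0.0.0.0:8080
      MASKER_DATABASE_URL: postgres://masker:changeme@db:5432/masker
    env_file: ./masker.secrets.env
    depends_on: [db]
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: masker
      POSTGRES_PASSWORD: changeme
      POSTGRES_DB: masker
```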

Getting a self-hosted deployment

Self-hosted deployments require an activation step during the beta period. Email hello@masker.dev to request a self-hosted license. The team will provide the image credentials and walk you through the initial setup.

Option 3 — Air-gapped / on-premises

Masker has no required outbound network calls beyond your chosen upstream LLM. If your environment runs the LLM internally — Ollama, vLLM, or your own Azure OpenAI tenant — Masker can run completely offline. Configure the internal LLM endpoint:
MASKER_OPENAI_BASE_URL=https://internal-llm.your-vpc.example
MASKER_OPENAI_API_KEY=internal-token
Masker implements the OpenAI chat-completions API, so any compatible upstream works without code changes. The Gemma-4 NER model ships inside the container image. There are no model downloads at runtime, no telemetry, and the container does not phone home. For on-premises activation, contact us to request the offline activation flow. It uses a signed license file in place of GitHub OAuth for portal access.

Health checks

Every Masker deployment exposes a health endpoint you can wire into your load balancer or readiness probe:
GET /health
A healthy instance returns:
{"status":"ok","version":"x.y.z","db":"ok","vault":"ok","upstream":"ok"}
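A readiness check can gate on the subsystem fields rather than just an HTTP 200. A sketch using grep against the payload shape above (in practice, replace the hard-coded HEALTH value with a live curl against your instance):

```shell
# Sample payload; in practice: HEALTH="$(curl -s http://localhost:8080/health)"
HEALTH='{"status":"ok","version":"x.y.z","db":"ok","vault":"ok","upstream":"ok"}'
for field in '"status":"ok"' '"db":"ok"' '"vault":"ok"' '"upstream":"ok"'; do
  echo "$HEALTH" | grep -q "$field" || { echo "unhealthy: $field"; exit 1; }
done
echo healthy
```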

Upgrades

Masker follows semantic versioning. Minor and patch releases are drop-in upgrades. Major releases include a migration note in the GitHub release.
docker pull ghcr.io/masker-dev/masker:latest
docker stop masker && docker rm masker
docker run ...   # same flags, new image
Database migrations run automatically at boot. The vault token format (MSKV1.*) is forward-compatible across all minor versions.

Observability

Set MASKER_METRICS_ADDR to expose a Prometheus metrics endpoint at that address. Available metrics:
  • masker_requests_total{agent, status}
  • masker_request_duration_seconds{stage} — stages: detection, tokenize, upstream, rehydrate
  • masker_redactions_total{kind, pass}
  • masker_vault_size_bytes
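A matching Prometheus scrape job might look like this, assuming a MASKER_METRICS_ADDR of 0.0.0.0:9090 (the target hostname is a placeholder):

```yaml
scrape_configs:
  - job_name: masker
    static_configs:
      - targets: ["masker.internal:9090"]   # host:port from MASKER_METRICS_ADDR
```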
Logs are emitted as JSON to stdout by default. Forward them with your existing pipeline — CloudWatch, Datadog, GCP Logging, or Loki all work without any Masker-specific configuration.

Current limitations

  • No Helm chart yet. Raw Kubernetes manifests only. Helm chart is on the May 30 roadmap.
  • No Terraform module yet. Same timeline.
  • Each Masker deployment is independent. A multi-tenant control plane is not available in the current release.