> ## Documentation Index
> Fetch the complete documentation index at: https://docs.masker.dev/llms.txt
> Use this file to discover all available pages before exploring further.

# Choosing how to deploy Masker in your infrastructure

> Compare hosted SaaS, self-hosted VPC, and air-gapped on-premises options — and understand what each means for data residency and BAA coverage.

Masker is a Rust service that exposes an OpenAI-compatible `/v1/chat/completions` endpoint. You can run it on Masker-managed infrastructure, inside your own cloud account, or fully air-gapped with no outbound internet access. The right choice depends on where your PHI must reside and what your compliance program requires.

| Option                   | Best for                                 | PHI residency                | BAA                                    |
| ------------------------ | ---------------------------------------- | ---------------------------- | -------------------------------------- |
| **Masker hosted**        | Pilots, demos, early-stage teams         | Fly.io US-West (`sea`/`sjc`) | Available on production beta           |
| **Self-hosted VPC**      | Production, regulated environments       | Your cloud account           | Not needed — PHI never leaves your VPC |
| **Air-gapped / on-prem** | Hospitals, DoD, strict security postures | Your own hardware            | Not needed — Masker is fully offline   |

## Option 1 — Masker hosted

The hosted service runs at [masker-voice.fly.dev](https://masker-voice.fly.dev). You do not deploy or manage any infrastructure.

<Steps>
  <Step title="Sign in to the portal">
    Open [masker-voice.fly.dev/portal/login](https://masker-voice.fly.dev/portal/login) and sign in with GitHub.
  </Step>

  <Step title="Create an agent">
    Click **New Agent**, choose an upstream model, and copy the generated proxy URL.
  </Step>

  <Step title="Drop the URL into your voice platform">
    Paste the proxy URL as the Custom LLM URL in Vapi, ElevenLabs, Bolna, or any OpenAI-compatible platform. No SDK, no code changes.
  </Step>
</Steps>

**What you get:** zero infrastructure, instant setup, and a BAA available to all production beta customers.

**What to consider:** PHI transits Masker's servers on Fly.io US-West before being tokenized. If your compliance program requires PHI to stay in your own environment, use one of the self-hosted options below and plan a migration before going live with real patient data.

## Option 2 — Self-hosted in your VPC

Masker ships as a container image at `ghcr.io/masker-dev/masker:latest`. You can run it on Docker, Kubernetes, or your own Fly.io account.

<Tabs>
  <Tab title="Docker">
    The quickest path to a self-hosted instance. Suitable for single-node deployments or development environments.

    ```bash theme={null}
    docker run -d \
      --name masker \
      -p 8080:8080 \
      -v masker-data:/data \
      -e MASKER_LISTEN_ADDR=0.0.0.0:8080 \
      -e MASKER_PUBLIC_URL=https://masker.your-vpc.example \
      -e MASKER_DATABASE_URL=sqlite:///data/masker.db \
      -e MASKER_OPENAI_API_KEY=$OPENAI_API_KEY \
      --env-file ./masker.secrets.env \
      ghcr.io/masker-dev/masker:latest
    ```

    `masker.secrets.env` must contain at minimum:

    ```bash theme={null}
    MASKER_KEY_K_HEALTHCARE=<base64 32-byte key>
    MASKER_SESSION_SECRET=<base64 32-byte key>
    MASKER_AUDIT_HMAC_KEY=<base64 32-byte key>
    ```

    Place a TLS-terminating load balancer (ALB, GCLB, nginx) in front of the container. Masker speaks plain HTTP inside the VPC.

    <Tip>
      For multi-replica Docker Compose setups, switch to `tokenization: reversible-aead` in your mask policy so replicas don't need to share SQLite vault state. Alternatively, point `MASKER_DATABASE_URL` at a Postgres instance.
    </Tip>
  </Tab>

  <Tab title="Kubernetes">
    Sample manifests are available in the repo at `deploy/k8s/`. They include:

    * `Deployment` with resource requests (1 CPU, 2 GB RAM; 1 GPU optional for NER)
    * `Service` and `Ingress` with TLS termination
    * `PersistentVolumeClaim` for the `/data` volume
    * `Secret` and `ConfigMap` separation
    * `HorizontalPodAutoscaler` keyed on request latency

    For multi-replica deployments (EKS, GKE, AKS), use `tokenization: reversible-aead` in your mask policy to eliminate the need for a shared vault, or configure `MASKER_DATABASE_URL` to point to a managed Postgres instance.

    <Note>
      A Helm chart is on the May 30 roadmap. Until then, apply the raw manifests from `deploy/k8s/`.
    </Note>
  </Tab>

  <Tab title="Fly.io (your account)">
    Same approach as the Masker-hosted deployment, but running entirely inside your Fly.io organization. PHI stays in your account.

    ```toml fly.toml theme={null}
    app = "masker-yourname"
    primary_region = "sea"

    [build]
      image = "ghcr.io/masker-dev/masker:latest"

    [http_service]
      internal_port = 8080
      force_https = true

    [[mounts]]
      source = "masker_data"
      destination = "/data"

    [env]
      MASKER_LISTEN_ADDR = "0.0.0.0:8080"
      MASKER_PUBLIC_URL  = "https://masker-yourname.fly.dev"
      MASKER_DATABASE_URL = "sqlite:///data/masker.db"
    ```

    Then set secrets separately so they are never stored in `fly.toml`:

    ```bash theme={null}
    flyctl secrets set \
      MASKER_SESSION_SECRET=$(openssl rand -base64 32) \
      MASKER_KEY_K_HEALTHCARE=$(openssl rand -base64 32) \
      MASKER_AUDIT_HMAC_KEY=$(openssl rand -base64 32) \
      MASKER_OPENAI_API_KEY=sk-...
    ```
  </Tab>
</Tabs>

### Getting a self-hosted deployment

Self-hosted deployments require an activation step during the beta period. Email [hello@masker.dev](mailto:hello@masker.dev) to request a self-hosted license. The team will provide the image credentials and walk through the initial setup.

## Option 3 — Air-gapped / on-premises

Masker has no required outbound network calls beyond your chosen upstream LLM. If your environment runs the LLM internally — Ollama, vLLM, or your own Azure OpenAI tenant — Masker can run completely offline.

Configure the internal LLM endpoint:

```bash theme={null}
MASKER_OPENAI_BASE_URL=https://internal-llm.your-vpc.example
MASKER_OPENAI_API_KEY=internal-token
```

Masker implements the OpenAI chat-completions API, so any compatible upstream works without code changes.

The Gemma-4 NER model ships **inside the container image**. There are no model downloads at runtime, no telemetry, and the container does not phone home.

For on-premises activation, [contact us](mailto:hello@masker.dev) to request the offline activation flow. It uses a signed license file in place of GitHub OAuth for portal access.

## Health checks

Every Masker deployment exposes a health endpoint you can wire into your load balancer or readiness probe:

```
GET /health
```

A healthy instance returns:

```json theme={null}
{"status":"ok","version":"x.y.z","db":"ok","vault":"ok","upstream":"ok"}
```

## Upgrades

Masker follows semantic versioning. Minor and patch releases are drop-in upgrades. Major releases include a migration note in the GitHub release.

```bash theme={null}
docker pull ghcr.io/masker-dev/masker:latest
docker stop masker && docker rm masker
docker run ...   # same flags, new image
```

Database migrations run automatically at boot. The vault token format (`MSKV1.*`) is forward-compatible across all minor versions.

## Observability

Set `MASKER_METRICS_ADDR` to expose a Prometheus metrics endpoint on that address. Available metrics:

* `masker_requests_total{agent, status}`
* `masker_request_duration_seconds{stage}` — stages: `detection`, `tokenize`, `upstream`, `rehydrate`
* `masker_redactions_total{kind, pass}`
* `masker_vault_size_bytes`

Logs are emitted as JSON to stdout by default. Forward them with your existing pipeline — CloudWatch, Datadog, GCP Logging, or Loki all work without any Masker-specific configuration.

## Current limitations

* No Helm chart yet. Raw Kubernetes manifests only. Helm chart is on the May 30 roadmap.
* No Terraform module yet. Same timeline.
* Each Masker deployment is independent. A multi-tenant control plane is not available in the current release.
