The HIPAA Privacy Rule defines two paths to de-identification: Expert Determination and Safe Harbor. Masker is built around Safe Harbor because it is the more prescriptive of the two, and the one that can be audited from logs. Safe Harbor enumerates 18 identifier categories under 45 CFR §164.514(b)(2). To qualify as de-identified under Safe Harbor, identifiers in all 18 categories must be removed. This page is an honest, current account of where Masker stands against each one.
Coverage is continuously expanding. The May 30, 2026 production launch targets at least “partial” status for every remaining in-progress category. If your workload depends on a category listed as partial or in progress, contact us — we have accelerated roadmap items for pilot customers before.
Coverage matrix
| # | Category | Status | Notes |
|---|---|---|---|
| A | Names | ✓ Full | Pass 2 (Gemma-4 NER), high recall on conversational input |
| B | Geographic subdivisions smaller than a state | ✓ Full | ZIP, city, and street address covered via regex and NER |
| C | Dates more granular than year (birth, admission, discharge, death) | ◐ Partial | Birth, admission, and discharge dates: full. Certain year-only phrasings still pass through. |
| D | Telephone numbers | ✓ Full | E.164, North American, and spoken-word forms |
| E | Fax numbers | ✓ Full | Same detector as telephone |
| F | Email addresses | ✓ Full | RFC 5322 simplified pattern |
| G | Social Security numbers | ✓ Full | Dashed, undashed, and spoken-word forms |
| H | Medical record numbers (MRN) | ✓ Full | Tenant-configurable patterns for custom formats |
| I | Health plan beneficiary numbers | ◐ Partial | Common payer formats (Medicare, BCBS): full. Long-tail payer formats require custom policy. |
| J | Account numbers | ✓ Full | Generic numeric patterns and Luhn-checked account numbers |
| K | Certificate and license numbers | ◐ Partial | Major US state driver’s license formats covered. DEA and NPI numbers: partial. |
| L | Vehicle identifiers and serial numbers, including license plates | ⏳ In progress | Targeted for May 30, 2026 |
| M | Device identifiers and serial numbers | ⏳ In progress | Targeted for May 30, 2026 |
| N | Web URLs | ⏳ In progress | URLs containing query-string PHI are currently redacted; full-URL redaction in progress |
| O | IP addresses | ✓ Full | IPv4 and IPv6; the default action is redact (not tokenize) |
| P | Biometric identifiers including finger and voice prints | ⏳ In progress | Voice fingerprinting not in scope today |
| Q | Full-face photographs and comparable images | N/A | Masker does not process images |
| R | Any other unique identifying number, characteristic, or code | ⏳ In progress | Intentionally vague category; best-effort via NER — full coverage cannot be claimed |
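
One way to use the matrix above is to check a workload's required categories against it before going live. The sketch below is illustrative only: the statuses are a snapshot transcribed from the table as written on this page, and `Status`, `COVERAGE`, and `coverage_gaps` are hypothetical names, not part of Masker's API.

```python
from enum import Enum


class Status(Enum):
    FULL = "full"
    PARTIAL = "partial"
    IN_PROGRESS = "in_progress"
    NOT_APPLICABLE = "n/a"


# Snapshot of the coverage matrix on this page, keyed by Safe Harbor letter.
# This is a hand-copied snapshot and will go stale as coverage changes.
COVERAGE = {
    "A": Status.FULL, "B": Status.FULL, "C": Status.PARTIAL,
    "D": Status.FULL, "E": Status.FULL, "F": Status.FULL,
    "G": Status.FULL, "H": Status.FULL, "I": Status.PARTIAL,
    "J": Status.FULL, "K": Status.PARTIAL, "L": Status.IN_PROGRESS,
    "M": Status.IN_PROGRESS, "N": Status.IN_PROGRESS, "O": Status.FULL,
    "P": Status.IN_PROGRESS, "Q": Status.NOT_APPLICABLE,
    "R": Status.IN_PROGRESS,
}


def coverage_gaps(required):
    """Return the required categories that are not at full coverage."""
    return {
        letter: COVERAGE[letter]
        for letter in sorted(required)
        if COVERAGE[letter] is not Status.FULL
    }


# Example: a workload that handles names, phone numbers, SSNs,
# health plan beneficiary numbers, and URLs.
gaps = coverage_gaps({"A", "D", "G", "I", "N"})
for letter, status in gaps.items():
    print(f"Category {letter}: {status.value}")
```

Any category returned here is one where the notes column in the table should be read closely, or where tenant-level customization may be needed.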
What each status means
Full coverage
A category is full when:
- All common shapes are matched by the Pass 1 regex detector with high precision
- All conversational forms are caught by the Pass 2 NER model with confidence > 0.6
- Coverage has been validated against at least two benchmark sets (the external i2b2/n2c2, MedDialog, and Switchboard sets, or Masker’s own internal benchmark)
- Recall on the internal evaluation set is ≥ 95%
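The first two criteria describe a two-pass design: a regex pass for well-formed shapes, then an NER pass for conversational forms, kept only above a confidence of 0.6. The sketch below shows how such a pipeline could be wired together. It is an assumption-heavy illustration, not Masker's implementation: the patterns, the `Span` type, the function names, the NER output format, and the rule that a regex hit wins over an overlapping NER hit are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int
    end: int
    category: str  # Safe Harbor letter, e.g. "G" for SSN
    source: str    # "regex" or "ner"


# Pass 1: illustrative patterns only (assumed, not Masker's detectors).
PASS1_PATTERNS = {
    "G": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),             # dashed SSN
    "F": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),  # simplified email
}

NER_THRESHOLD = 0.6  # the "confidence > 0.6" criterion


def pass1(text):
    """Regex pass: return one span per pattern match."""
    return [
        Span(m.start(), m.end(), category, "regex")
        for category, pattern in PASS1_PATTERNS.items()
        for m in pattern.finditer(text)
    ]


def pass2(ner_predictions):
    """NER pass: keep predictions strictly above the threshold.

    Each prediction is a (start, end, category, confidence) tuple,
    a made-up stand-in for real model output.
    """
    return [
        Span(start, end, category, "ner")
        for start, end, category, confidence in ner_predictions
        if confidence > NER_THRESHOLD
    ]


def detect(text, ner_predictions):
    """Merge both passes; a regex hit wins over an overlapping NER hit."""
    spans = pass1(text)
    for span in pass2(ner_predictions):
        overlaps = any(span.start < r.end and r.start < span.end for r in spans)
        if not overlaps:
            spans.append(span)
    return sorted(spans, key=lambda s: s.start)


text = "This is Maria Lopez, my social is 123-45-6789."
# Hypothetical NER output: one confident name, one low-confidence guess.
ner_out = [(8, 19, "A", 0.93), (24, 30, "A", 0.41)]
spans = detect(text, ner_out)
for s in spans:
    print(s.category, s.source, repr(text[s.start:s.end]))
```

In this example the name is kept from the NER pass, the SSN comes from the regex pass, and the low-confidence prediction on "social" is dropped.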
Partial coverage
A category is partial when:
- Common and well-known shapes work reliably
- Long-tail or unusual formats need tenant-level customization (custom regex per tenant)
- Recall on the internal evaluation set is at least 70% but below 95%
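The recall thresholds for full and partial coverage can be expressed as a small function. This is a sketch of the recall criterion only: a category also has to meet the other listed criteria before it is labeled full or partial. The names `recall` and `recall_tier` are hypothetical, exactly 95% is treated as full (per the ≥ 95% criterion), and the page does not define a label for recall below 70%, so the sketch returns `None` there.

```python
from typing import Optional

FULL_RECALL = 0.95     # full: recall >= 95%
PARTIAL_RECALL = 0.70  # partial: recall from 70% up to, but not including, 95%


def recall(true_positives: int, false_negatives: int) -> float:
    """Recall = identifiers caught / identifiers present in the evaluation set."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("evaluation set contains no identifiers")
    return true_positives / total


def recall_tier(value: float) -> Optional[str]:
    """Map a recall value to the tier its threshold allows.

    Returns None below 70%, where this page defines no label.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("recall must be between 0 and 1")
    if value >= FULL_RECALL:
        return "full"
    if value >= PARTIAL_RECALL:
        return "partial"
    return None


# 970 of 1000 identifiers caught: meets the recall bar for full.
print(recall_tier(recall(970, 30)))
# 800 of 1000 identifiers caught: meets the recall bar for partial only.
print(recall_tier(recall(800, 200)))
```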
In progress
A category is in progress when the work is scoped, the evaluation set exists, and a target ship date is tracked. For all in-progress categories, the May 30, 2026 production launch is the date Masker commits to reaching at least “partial” status.
PCI-DSS scope reduction
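Card-number detection in this section rests on the Luhn checksum, a standard check-digit scheme used by payment card numbers. A minimal sketch of that check follows, assuming a simple candidate pattern of 13 to 19 digits with optional single spaces or hyphens. `luhn_valid`, `redact_cards`, and the `[CARD]` placeholder are illustrative names, not Masker's API.

```python
import re


def luhn_valid(digits: str) -> bool:
    """Return True if a string of ASCII decimal digits passes the Luhn check."""
    if not (digits.isascii() and digits.isdigit()):
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0


# Candidate shape (an assumption): 13 to 19 digits, optionally separated
# by single spaces or hyphens.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def redact_cards(text: str, placeholder: str = "[CARD]") -> str:
    """Replace digit runs that look like card numbers and pass Luhn."""
    def replace(match):
        digits = re.sub(r"[ -]", "", match.group())
        return placeholder if luhn_valid(digits) else match.group()
    return CANDIDATE.sub(replace, text)


# 4111 1111 1111 1111 is a widely used test number that passes Luhn.
print(redact_cards("My card is 4111 1111 1111 1111."))
# Changing the last digit breaks the checksum, so the text is left alone.
print(redact_cards("Order ref 4111 1111 1111 1112."))
```

The checksum filters out most arbitrary digit runs, which keeps false positives down, but passing Luhn alone does not prove a number is a payment card.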
In addition to HIPAA Safe Harbor, Masker performs Luhn-checked detection of credit card numbers (category J covers account numbers more broadly). Cardholder data is redacted before egress to your LLM provider, which means the LLM never processes raw card numbers. This reduces the PCI-DSS scope of your voice AI pipeline. Every session compliance report includes a PCI-DSS scope section confirming whether cardholder data was detected and redacted before egress.
How coverage appears in your reports
Every session compliance report includes the Safe Harbor coverage matrix stamped at the time the report is generated. The matrix is not static: when you upgrade Masker and a category is promoted from partial to full, your next report reflects that improvement automatically. Earlier reports continue to show the coverage that was true at the time they were generated. This is intentional: auditors need point-in-time evidence.
Expert Determination
If you prefer Expert Determination over Safe Harbor (typically because you need to retain certain identifier categories for clinical utility), Masker supports this. Configure the relevant entity actions as `passthrough`, document the expert determination separately, and the audit log will capture the active policy at the time of each call.
Masker does not provide expert determination services. That relationship is between you and a qualified statistician.
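To make the idea of per-entity actions concrete, here is a sketch of a policy that passes dates through while still redacting or tokenizing other categories, and records the active policy alongside each call. The action names `redact`, `tokenize`, and `passthrough` appear on this page. Everything else is an assumption for illustration: the policy shape, the token and placeholder formats, the span format, and the function name are hypothetical and do not reflect Masker's real configuration format.

```python
# Hypothetical policy: Safe Harbor category letter -> action.
# Category C (dates) is retained, as an expert determination might allow.
POLICY = {"A": "tokenize", "C": "passthrough", "G": "redact"}
DEFAULT_ACTION = "redact"


def apply_policy(text, spans, policy):
    """Apply per-category actions to non-overlapping (start, end, category) spans.

    Returns the transformed text and an audit record that captures the
    policy in force for this call.
    """
    pieces, actions, cursor = [], [], 0
    token_counts = {}
    for start, end, category in sorted(spans):
        action = policy.get(category, DEFAULT_ACTION)
        pieces.append(text[cursor:start])
        if action == "passthrough":
            pieces.append(text[start:end])
        elif action == "tokenize":
            token_counts[category] = token_counts.get(category, 0) + 1
            pieces.append(f"<{category}_{token_counts[category]}>")
        else:
            # "redact", and any unrecognized action, removes the value.
            pieces.append(f"[REDACTED:{category}]")
        actions.append({"category": category, "action": action})
        cursor = end
    pieces.append(text[cursor:])
    audit = {"policy": dict(policy), "actions": actions}
    return "".join(pieces), audit


text = "Maria Lopez was admitted on 03/14/2025, SSN 123-45-6789."


def span_of(substring, category):
    """Locate a substring in the example text and label it with a category."""
    start = text.index(substring)
    return (start, start + len(substring), category)


spans = [
    span_of("Maria Lopez", "A"),
    span_of("03/14/2025", "C"),
    span_of("123-45-6789", "G"),
]
masked, audit = apply_policy(text, spans, POLICY)
print(masked)
print(audit["actions"])
```

Storing a copy of the policy in the audit record, rather than a reference to it, is what lets a later reader see which categories were passed through at the time of the call.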