The HIPAA Privacy Rule defines two paths to de-identification: Expert Determination and Safe Harbor. Masker is built around Safe Harbor because it is the more prescriptive of the two and the one that can be audited from logs. Safe Harbor enumerates 18 identifier categories under 45 CFR §164.514(b)(2). For data to qualify as de-identified under Safe Harbor, identifiers in all 18 categories must be removed. This page is an honest, current account of where Masker stands against each one.
Coverage is continuously expanding. The May 30, 2026 production launch targets at least “partial” status for every remaining in-progress category. If your workload depends on a category listed as partial or in progress, contact us — we have accelerated roadmap items for pilot customers before.

Coverage matrix

| # | Category | Status | Notes |
| --- | --- | --- | --- |
| A | Names | ✓ Full | Pass 2 (Gemma-4 NER), high recall on conversational input |
| B | Geographic subdivisions smaller than a state | ✓ Full | ZIP, city, and street address covered via regex and NER |
| C | Dates more granular than year (birth, admission, discharge, death) | ◐ Partial | Birth, admission, and discharge dates: full. Certain year-only phrasings still pass through. |
| D | Telephone numbers | ✓ Full | E.164, North American, and spoken-word forms |
| E | Fax numbers | ✓ Full | Same detector as telephone |
| F | Email addresses | ✓ Full | RFC 5322 simplified pattern |
| G | Social Security numbers | ✓ Full | Dashed, undashed, and spoken-word forms |
| H | Medical record numbers (MRN) | ✓ Full | Tenant-configurable patterns for custom formats |
| I | Health plan beneficiary numbers | ◐ Partial | Common payer formats (Medicare, BCBS): full. Long-tail payer formats require custom policy. |
| J | Account numbers | ✓ Full | Generic numeric patterns and Luhn-checked account numbers |
| K | Certificate and license numbers | ◐ Partial | Major US state driver’s license formats covered. DEA and NPI numbers: partial. |
| L | Vehicle identifiers and serial numbers, including license plates | ⏳ In progress | Targeted for May 30 |
| M | Device identifiers and serial numbers | ⏳ In progress | Targeted for May 30 |
| N | Web URLs | ⏳ In progress | URLs containing query-string PHI are currently redacted; full-URL redaction in progress |
| O | IP addresses | ✓ Full | IPv4 and IPv6, default action is redact (not tokenize) |
| P | Biometric identifiers including finger and voice prints | ⏳ In progress | Voice fingerprinting not in scope today |
| Q | Full-face photographs and comparable images | N/A | Masker does not process images |
| R | Any other unique identifying number, characteristic, or code | ⏳ In progress | Intentionally vague category; best-effort via NER; full coverage cannot be claimed |
Current tally: 9 full, 3 partial, 5 in progress, 1 N/A.
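If you want to check your own workload against the matrix, it can be transcribed into a small lookup table. The sketch below is illustrative only: the `COVERAGE` dictionary, the status strings, and the `blocking_categories` helper are assumptions made for this example, not part of any Masker API. It reproduces the tally above and lists which of the categories you depend on are not yet at full status.

```python
from collections import Counter

# Status per Safe Harbor category letter, transcribed by hand from the matrix.
# The dictionary and status strings are hypothetical, not a Masker data format.
COVERAGE = {
    "A": "full", "B": "full", "C": "partial", "D": "full", "E": "full",
    "F": "full", "G": "full", "H": "full", "I": "partial", "J": "full",
    "K": "partial", "L": "in_progress", "M": "in_progress",
    "N": "in_progress", "O": "full", "P": "in_progress", "Q": "n/a",
    "R": "in_progress",
}


def tally(coverage):
    """Count how many categories sit at each status."""
    return Counter(coverage.values())


def blocking_categories(coverage, required):
    """Return the required category letters that are not at full status."""
    return sorted(c for c in required if coverage[c] != "full")
```

For example, a workload that depends on names, dates, and URLs would call `blocking_categories(COVERAGE, {"A", "C", "N"})` to see which of those categories need a custom policy or a roadmap conversation.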

What each status means

A category is full when:
  • All common shapes are matched by the Pass 1 regex detector with high precision
  • All conversational forms are caught by the Pass 2 NER model with confidence > 0.6
  • Coverage has been validated against at least two external benchmark sets (i2b2/n2c2, MedDialog, Switchboard, or Masker’s own internal benchmark)
  • Recall on the internal evaluation set is ≥ 95%
A category is partial when:
  • Common and well-known shapes work reliably
  • Long-tail or unusual formats need tenant-level customization (custom regex per tenant)
  • Recall on the internal evaluation set is between 70% and 95%
If you depend on a partial category, you can harden coverage with a custom policy — contact us to scope that work.
A category is in progress when the work is scoped, the evaluation set exists, and a target ship date is tracked. For all in-progress categories, the May 30, 2026 production launch is the date Masker commits to reaching at least “partial” status.
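The quantitative parts of these definitions can be sketched as a small classifier. This is a simplified illustration, not Masker's release process: the `CategoryEval` type and `classify` function are hypothetical, only the recall and benchmark-count criteria are modeled, the 70% boundary is assumed to be inclusive, and anything below it is assumed to map to in progress (the text above defines in progress by scoping and tracking, not by a recall number).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CategoryEval:
    """Hypothetical evaluation summary for one identifier category."""
    recall: float             # recall on the internal evaluation set, 0.0 to 1.0
    external_benchmarks: int  # number of benchmark sets coverage was validated against


def classify(ev: CategoryEval) -> str:
    """Map an evaluation summary to a status label using the stated thresholds."""
    # Full requires both >= 95% recall and validation on at least two benchmark sets.
    if ev.recall >= 0.95 and ev.external_benchmarks >= 2:
        return "full"
    # Assumption: 70% is inclusive, and high recall without enough
    # benchmark validation is reported as partial rather than full.
    if ev.recall >= 0.70:
        return "partial"
    # Assumption: anything below 70% is treated as still in progress.
    return "in_progress"
```

The regex-precision and NER-confidence criteria are left out because they are properties of the detectors rather than a single number per category.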

PCI-DSS scope reduction

In addition to HIPAA Safe Harbor, Masker performs Luhn-checked detection of credit card numbers (category J covers account numbers more broadly). Cardholder data is redacted before egress to your LLM provider, which means the LLM never processes raw card numbers. This reduces the PCI-DSS scope of your voice AI pipeline. Every session compliance report includes a PCI-DSS scope section confirming whether cardholder data was detected and redacted before egress.
Masker is not itself a cardholder data environment and does not seek PCI certification. The scope reduction benefit is about what your LLM provider and downstream systems never see.
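For readers unfamiliar with the Luhn check mentioned above, here is a minimal sketch of Luhn-validated card redaction. It is not Masker's detector: the `CANDIDATE` pattern, the `[CARD]` placeholder, and the function names are assumptions for illustration, and spoken-word forms are not handled. The checksum step itself is the standard Luhn algorithm.

```python
import re


def luhn_valid(digits: str) -> bool:
    """Return True if a string of ASCII decimal digits passes the Luhn checksum."""
    if not (digits.isascii() and digits.isdigit()):
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0


# Hypothetical candidate pattern: 13 to 19 digits, optionally separated
# by single spaces or hyphens.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def redact_cards(text: str, placeholder: str = "[CARD]") -> str:
    """Replace Luhn-valid digit runs with a placeholder; leave other numbers alone."""
    def repl(match: re.Match) -> str:
        digits = re.sub(r"[ -]", "", match.group())
        return placeholder if luhn_valid(digits) else match.group()

    return CANDIDATE.sub(repl, text)
```

The checksum filters out most digit runs that merely look like card numbers (order IDs, tracking numbers), which keeps false positives down compared with a length-only pattern.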

How coverage appears in your reports

Every session compliance report includes the Safe Harbor coverage matrix stamped at the time the report is generated. The matrix is not static — when you upgrade Masker and a category is promoted from partial to full, your next report reflects that improvement automatically. Earlier reports continue to show the coverage that was true at the time they were generated. This is intentional: auditors need point-in-time evidence.
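The point-in-time behavior can be pictured as each report holding its own read-only copy of the matrix. The sketch below illustrates the idea only; `SessionReport`, `generate_report`, and the field names are hypothetical and do not describe Masker's report format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from types import MappingProxyType


@dataclass(frozen=True)
class SessionReport:
    """Hypothetical report record holding a frozen view of coverage."""
    session_id: str
    generated_at: datetime
    coverage: MappingProxyType  # read-only snapshot taken at generation time


def generate_report(session_id: str, live_coverage: dict) -> SessionReport:
    """Stamp the report with a copy of the matrix as it stands right now."""
    # Copy first so later changes to live_coverage cannot leak into this report.
    snapshot = MappingProxyType(dict(live_coverage))
    return SessionReport(session_id, datetime.now(timezone.utc), snapshot)
```

A report generated before an upgrade keeps showing the status it was stamped with, while a report generated afterward reflects the promoted status.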

Expert Determination

If you prefer Expert Determination over Safe Harbor — typically because you need to retain certain identifier categories for clinical utility — Masker supports this. Configure the relevant entity actions as passthrough, document the expert determination separately, and the audit log will capture the active policy at the time of each call. Masker does not provide expert determination services. That relationship is between you and a qualified statistician.

Why the honest framing matters

Most privacy products in this space claim full Safe Harbor coverage. Many fail on red-team tests. Masker would rather lose a deal where the customer needs a category not yet covered than win it on a claim that cannot be defended in an audit. If a category gap is a blocker, the options are: accelerate it in the roadmap, harden partial coverage with a custom policy, or wait for the full implementation. All three are available.