Top 5 HIPAA Compliance Mistakes Cloud SaaS Companies Make (and What Each One Actually Costs)
Alexander Sverdlov
Security Analyst

Key Takeaways
- A signed BAA with AWS, GCP, or Azure does not make you HIPAA-compliant. It moves liability for the underlying infrastructure controls only. Everything you build on top is your problem.
- The single most common 7-figure mistake we see is ePHI in observability tooling (CloudWatch logs, Datadog traces, Sentry breadcrumbs, BigQuery analytics). Most teams discover it during an OCR audit, not before.
- If your staging environment was ever populated from a production snapshot, you have ePHI in staging right now. Roughly 7 out of 10 SaaS audits we run surface this exact pattern.
- Unmanaged collaboration tools (Slack DMs, Notion pages, shared Google Docs, screenshots in Linear tickets) account for the majority of "covered entity complains that PHI is somewhere it should not be" cases. These are the easiest leaks to find and the hardest to clean up after the fact.
- There is no such thing as a "HIPAA certification." There is only a documented 164.308(a)(1)(ii)(A) Risk Analysis, kept current, and the controls you implemented because of it. Without that document, every other control is unfalsifiable.
- All five mistakes below are fixable in 90 days for under \$80,000 of professional services plus internal time. The cost of not fixing them is measured in resolution agreements that start at \$1.5 million.
In April we got a call from the CTO of a 35-person digital health startup. They had been live with their first hospital system customer for nine months. The customer's security team had just run a routine review and flagged that the startup's Sentry organization was capturing patient names, dates of birth, and visit dates inside JavaScript error stack traces. The Sentry retention policy was two years. The total ePHI footprint in Sentry was estimated at 142,000 records.
The CTO opened the call with a sentence we have heard many times: "But we have a BAA with Sentry." They did. It was signed in 2023. It did not matter. The customer had a 30-day clock to determine whether this was a reportable breach under their own BAA with the startup, which would trigger downstream notifications to roughly 90,000 individuals.
The startup avoided a Notice of Privacy Breach by deleting the data, executing a corrective action plan with the covered entity, and signing a remediation addendum. The total cost (legal, breach counsel, our engagement, internal engineer time, customer-mandated re-audit, and 18 months of additional monitoring) came to roughly \$340,000. Their Series A bridge round closed three months late because the lead investor wanted the corrective plan signed first.
None of the controls in this article are exotic. All of them are listed in 45 CFR Part 164. We see the same five mistakes in roughly 80 percent of cloud SaaS engagements we run for healthcare-adjacent companies. If you are building anything that touches a hospital, a payer, a provider group, a pharma sponsor, a clinical trial site, or a Medicare Advantage plan, read on.
Context
Why Cloud SaaS HIPAA Looks Different From Hospital HIPAA
HIPAA was written in 1996 for paper records inside hospital walls. The Security Rule was finalized in 2003 for client-server environments. The HITECH Act of 2009 added breach notification and pushed Business Associates into direct accountability. None of the drafters were thinking about a 12-person Vercel-hosted React app talking to a Supabase backend that syncs nightly to Snowflake for analytics.
The regulation is still the regulation. The Office for Civil Rights (OCR) does not care that your stack is more modern than the controls. They will still ask you for three documents in any audit: your most recent 164.308(a)(1)(ii)(A) Risk Analysis, your 164.316(b)(1) policies and procedures, and your 164.312(b) audit logs demonstrating that you can review activity in systems containing ePHI. If any of the three is missing or out of date, you are not arguing about specific controls anymore. You are arguing about whether the program exists.
If you only look at Tier 1, your HIPAA program will appear excellent. If you only look at Tier 1, you will also miss the entire blast radius of any actual incident. The five mistakes below all live in Tiers 2, 3, and 4. They are also the five places we find evidence of ePHI in roughly 70 to 90 percent of audits.
Mistake One
Treating the BAA as the Compliance Boundary
A Business Associate Agreement (BAA) is a contract. It is not a control. It allocates responsibility under HIPAA for what each party will do with ePHI, and it commits both sides to specific obligations under 45 CFR 164.504(e). Signing it does not bring a single technical control into existence.
In real engagements, the BAA is the thing that customers ask for and the thing that founders assume is the program. The actual program is much larger. Here is what the BAA covers and what it leaves to you.
| What the BAA does | What it does not do |
|---|---|
| Commits the vendor to specific uses and disclosures of ePHI | Make any specific technical control exist in your application |
| Requires the vendor to report incidents and security breaches | Cover services within the vendor that are not BAA-eligible (most marketing analytics, free tiers, beta features) |
| Imposes the same obligations on subprocessors | Audit whether the vendor's other customers' configurations are isolated from yours |
| Allocates breach notification responsibilities | Replace your own Risk Analysis under 164.308(a)(1)(ii)(A) |
| Provides contractual remedies if the vendor breaches its obligations | Forgive you for sending ePHI to a service that is configured outside its HIPAA-eligible scope |
The specific trap with AWS, GCP, and Azure BAAs
Each hyperscaler publishes a list of "HIPAA-eligible services." Anything not on that list is outside the BAA, even though the BAA is signed. At the time of writing, AWS lists roughly 175 eligible services out of 200+. Bedrock generative AI services have specific configuration requirements before they are BAA-covered. Athena queries against S3 buckets containing ePHI are eligible. SageMaker notebooks are eligible only with specific encryption settings. Check the list every quarter: services move on and off it, and the list that applied when you signed the BAA is not the list that applies today.
How to fix it. Build a single living document, in Notion or your favorite tool, that maps every place ePHI lives or could live to: the vendor, whether a BAA is signed, the date of the most recent BAA review, the specific service-level configuration that makes it BAA-eligible (e.g. "AWS S3 bucket policy: SSE-KMS with CMK, bucket logging enabled, public access blocked, MFA delete on, replicated to us-east-2"), the data classification, and the assigned owner. We call this the ePHI Register. It is the document we ask for first in every engagement. If it does not exist, we build it in the first week.
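The register fields described above can also be kept as structured data so that gaps are flagged mechanically. The following is a minimal sketch, not a prescribed format: the `RegisterEntry` class, the `needs_attention` function, and the 90-day review window (chosen to match a quarterly review) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class RegisterEntry:
    """One row of the ePHI Register: a place where ePHI lives or could live."""
    system: str                        # e.g. "Sentry (frontend errors)"
    vendor: str
    baa_signed: bool
    last_baa_review: Optional[date]    # None means the BAA was never reviewed
    eligible_configuration: str        # settings that keep the service in BAA scope
    data_classification: str
    owner: str


def needs_attention(entry: RegisterEntry, today: date, max_age_days: int = 90) -> list:
    """Return the reasons an entry should be escalated (empty list means it passes)."""
    reasons = []
    if not entry.baa_signed:
        reasons.append("no BAA signed")
    if entry.last_baa_review is None:
        reasons.append("BAA never reviewed")
    elif today - entry.last_baa_review > timedelta(days=max_age_days):
        reasons.append("BAA review is stale")
    if not entry.eligible_configuration.strip():
        reasons.append("no BAA-eligible configuration recorded")
    if not entry.owner.strip():
        reasons.append("no owner assigned")
    return reasons
```

Running `needs_attention` over every entry on a schedule turns the register from a static page into a recurring check. The same fields work equally well as columns in a spreadsheet or a Notion database.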
The cost of skipping it. The startup in our opening anecdote had eight BAAs on file. They did not have an ePHI Register. They had no way to discover, before the customer told them, that Sentry was capturing patient data. The register would have surfaced the gap in about 90 minutes, on the day the Sentry account was created.
Mistake Two
Letting ePHI Bleed Into Observability Tooling
This is the single most common 7-figure mistake we see. Logs and traces are written by engineers who are debugging an outage at 2am. Nobody is thinking about the difference between a patient ID and a patient name when the production database is on fire. The default behavior of nearly every modern logging library is to serialize the entire argument list of the function that threw the error. If that function took a Patient object as input, that Patient object is now in Sentry, Datadog, and CloudWatch.
The seven specific places where ePHI most often leaks into observability:
- Sentry error stack traces. Local variables and function arguments are captured automatically. A patient object passed to a date-of-birth validator becomes a Sentry event with DOB in scope.
- Datadog APM request traces. Query strings, request bodies, and database query parameters are captured by default. Patient names show up in URL paths and in SQL parameters.
- CloudWatch application logs. Anything written to stdout from a Lambda or ECS task lands here. console.log(patient) in development becomes a CloudWatch log group in production.
- LogRocket / FullStory session replays. Capture every DOM mutation in the browser. PHI rendered on a clinician screen is now a recording in a third-party service.
- PostHog / Mixpanel product analytics. Event properties commonly include patient IDs, plan names, MRNs as "user properties" when an engineer wanted to debug a funnel.
- GitHub Actions / CircleCI workflow logs. Integration tests against staging often run with real data and verbose logging, and the logs are retained for 90 days and visible to the entire engineering team.
- Slack #alerts channel. PagerDuty / Opsgenie alert payloads include the offending request body. Slack search indexes 18 months by default. Tier 3 leaks into Tier 4.
The fix is structural, not curative
Trying to scrub ePHI from logs after the fact is a losing battle. The fix is to make it impossible to log ePHI in the first place. Three controls do most of the work: (1) a typed PHI wrapper class that throws if serialized to JSON or string, (2) a Sentry / Datadog beforeSend hook that drops events containing fields named like ePHI (mrn, ssn, dob, patient_name, address), and (3) a CI check that fails the build if anything in the diff calls a logger with a parameter typed as PHI. We have implemented this pattern for clients in TypeScript, Python, and Go. The first version takes about three engineer-days. After that it pays for itself on every incident.
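Controls (1) and (2) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation described above: the `PHI` class, its `reveal` method, and the key pattern are hypothetical names. The `before_send(event, hint)` signature follows the hook the Sentry Python SDK accepts via `sentry_sdk.init(before_send=...)`, where returning `None` discards the event.

```python
import json
import re


class PHI:
    """Wraps a sensitive value so that accidental serialization fails loudly."""

    def __init__(self, value):
        self._value = value

    def reveal(self):
        """The only sanctioned way to read the value; easy to grep for in review."""
        return self._value

    def __str__(self):
        raise TypeError("PHI must not be converted to a string")

    def __format__(self, spec):
        raise TypeError("PHI must not be formatted into a string")

    def __repr__(self):
        # Tracebacks and debuggers call repr(), so stay opaque instead of raising.
        return "<PHI redacted>"


# Field names that suggest ePHI; a neighbouring letter blocks the match,
# so "adobe" is not mistaken for "dob" but "ip_address" is still caught.
PHI_KEY_PATTERN = re.compile(
    r"(?:^|[^a-z])(mrn|ssn|dob|patient_name|address)(?:$|[^a-z])", re.IGNORECASE
)


def contains_phi_key(node) -> bool:
    """Recursively check whether any dict key in the payload looks like ePHI."""
    if isinstance(node, dict):
        return any(
            PHI_KEY_PATTERN.search(str(key)) or contains_phi_key(value)
            for key, value in node.items()
        )
    if isinstance(node, (list, tuple)):
        return any(contains_phi_key(item) for item in node)
    return False


def before_send(event, hint):
    """Drop the whole event if any field name looks like ePHI."""
    return None if contains_phi_key(event) else event
```

With this wrapper, `str(PHI("Jane"))`, an f-string, or `json.dumps({"p": PHI("Jane")})` raises `TypeError` instead of emitting the value. The key filter only inspects field names, not values, and it will miss camelCase keys such as `patientDob`, so it complements the wrapper rather than replacing it.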
How to find what is already in there. Run a one-time data-discovery sweep across your Sentry, Datadog, CloudWatch, and any session replay tool you use. Search for regex patterns that look like SSN (3-2-4 digits with separators or contiguous), DOB (anything that parses as a date with year in a plausible patient range), and a small dictionary of common first names. Most teams find something in 30 minutes. The bigger the team, the bigger the find.
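The sweep can start as a small script run over exported log lines. The sketch below assumes the logs are already available as text; the regexes, the two date formats, the 1900 to 2010 birth-year window, and the five-name dictionary are illustrative assumptions to tune against your own data.

```python
import re
from datetime import datetime

# SSN-shaped: 3-2-4 digits, with "-" or " " separators or contiguous.
SSN_RE = re.compile(r"\b\d{3}[- ]?\d{2}[- ]?\d{4}\b")
# Two common date spellings; extend for the formats your logs actually contain.
DATE_RE = re.compile(r"\b(?:\d{4}-\d{2}-\d{2}|\d{1,2}/\d{1,2}/\d{4})\b")
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")
# Hypothetical starter dictionary; a real sweep would load a larger list.
FIRST_NAMES = {"james", "mary", "maria", "john", "jennifer"}


def looks_like_dob(text, year_range=(1900, 2010)):
    """True if text parses as a real calendar date with a plausible birth year."""
    for fmt in DATE_FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue
        return year_range[0] <= parsed.year <= year_range[1]
    return False


def find_candidates(line):
    """Return the kinds of suspected ePHI found in one log line, in a fixed order."""
    hits = []
    if SSN_RE.search(line):
        hits.append("ssn")
    if any(looks_like_dob(m.group()) for m in DATE_RE.finditer(line)):
        hits.append("dob")
    words = {w.lower() for w in re.findall(r"[A-Za-z]+", line)}
    if words & FIRST_NAMES:
        hits.append("first_name")
    return hits
```

For example, `find_candidates("validate_dob(1984-03-12) for Maria")` returns `["dob", "first_name"]`. The birth-year window deliberately excludes recent years to avoid flagging ordinary dates, which means it will miss pediatric patients; widen it if that matters for your population. Expect false positives (any nine-digit number looks like an SSN) and treat every hit as a lead for manual review.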
The cost. One specific resolution agreement we worked alongside (not as primary counsel) cost the client \$2.1 million in OCR penalty plus another \$3.4 million in remediation, audit, monitoring, and lost revenue. The original leak was a single CloudWatch log group nobody had reviewed in 14 months.
Mistake Three
Letting Production Data Leak Into Staging, Dev, and Analytics
The pattern is universal. A staging environment is initially populated with synthetic data. Six months later, a tricky reproduction case forces an engineer to copy a production snapshot into staging "for debugging." The copy is never deleted. New engineers join. The snapshot becomes the canonical fixture for QA. Over time, the staging environment becomes a less-protected mirror of production, with weaker auth, broader IP access, more permissive logging, and the same ePHI.
The same pattern happens with analytics. A data team gets read-only access to a production replica. The replica is used to build a BigQuery dataset for reporting. The BigQuery dataset is queried by Looker. Looker dashboards are shared by email link. The chain ends with ePHI rendered inside a Looker dashboard that a marketing manager can access because they were given "view-only" Looker permissions for a different report.
The fix. Set a hard rule: production data only leaves production through a deterministic de-identification pipeline. The pipeline runs nightly. It outputs a de-identified dataset that preserves cardinality and join behavior but contains no values that resemble real ePHI. Names become a deterministic hash mapped to a fake-name dictionary. Dates of birth are shifted by a consistent per-record offset. MRNs become a different format with the same uniqueness. Free-text fields are scrubbed by a Named Entity Recognition model tuned to over-redact: a false positive costs a little test fidelity, while a false negative leaves ePHI in staging.
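The deterministic transforms for names, dates of birth, and MRNs can be sketched with a keyed hash (HMAC) so that the same input always maps to the same output without being reversible by anyone who lacks the key. Everything named here is hypothetical: the record keys (`id`, `name`, `dob`, `mrn`), the four-entry name dictionary, and the one-year maximum shift. Free-text scrubbing with NER is out of scope for this sketch.

```python
import hashlib
import hmac
from datetime import date, timedelta

# Hypothetical dictionary; it must be large for names to keep their cardinality.
FAKE_NAMES = ["Alex Rivera", "Sam Chen", "Jordan Patel", "Casey Novak"]


def _keyed_int(secret: bytes, label: str, value: str) -> int:
    """Deterministic integer derived from a secret key: same input, same output."""
    digest = hmac.new(secret, f"{label}:{value}".encode(), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big")


def fake_name(secret: bytes, real_name: str) -> str:
    """Map a real name to a stable entry in the fake-name dictionary."""
    return FAKE_NAMES[_keyed_int(secret, "name", real_name) % len(FAKE_NAMES)]


def shift_dob(secret: bytes, record_id: str, dob: date, max_days: int = 365) -> date:
    """Shift a date by a per-record offset of 1..max_days in either direction."""
    k = _keyed_int(secret, "dob", record_id)
    magnitude = 1 + k % max_days              # never zero, so the date always moves
    sign = 1 if (k // max_days) % 2 == 0 else -1
    return dob + timedelta(days=sign * magnitude)


def fake_mrn(secret: bytes, real_mrn: str) -> str:
    """Replace an MRN with a differently formatted value that stays unique."""
    tag = hmac.new(secret, f"mrn:{real_mrn}".encode(), hashlib.sha256).hexdigest()
    return "X" + tag[:16]


def deidentify(record: dict, secret: bytes) -> dict:
    """De-identify one record; 'id' is assumed to be an internal surrogate key."""
    return {
        "id": record["id"],
        "name": fake_name(secret, record["name"]),
        "dob": shift_dob(secret, record["id"], record["dob"]),
        "mrn": fake_mrn(secret, record["mrn"]),
    }
```

Because the offset is derived from the record ID, every date belonging to one patient can be shifted by the same amount, which keeps intervals between events intact. The secret must live only in production; anyone holding it can recompute the mapping for a guessed input.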
Then enforce it. The staging environment cannot connect to production network endpoints. The data warehouse pulls from the de-identified replica only. The analytics tools see the de-identified dataset, period. Real production access requires a break-glass procedure, time-limited, audited, and reviewed by a security-team member who is not on the requesting engineer's team.
Cost to implement. For a 30-person engineering team with a moderately complex data model, building this pipeline is a 4-to-6-week project. Tooling like Tonic, Mockaroo Enterprise, or Gretel can shave a few weeks off, with annual cost in the \$15K-\$60K range. Custom-built scrubbing with Python and a NER library can be done for engineering time alone but is more brittle.
Mistake Four
Letting Slack, Notion, and Google Drive Become an Unmanaged ePHI Repository
There is no engineer who has not, at some point, pasted a customer support ticket containing real patient information into a Slack DM with a colleague to ask "is this a bug or is this the user being weird?" There is no product manager who has not, at some point, attached a screenshot of a clinician's screen to a Notion spec to illustrate a workflow. There is no support agent who has not, at some point, forwarded a patient email to an internal Google Group to coordinate a response.
Each of these is a HIPAA violation in slow motion. The combined Slack workspaces of a typical 50-person healthcare SaaS contain, by our measurement, between 300 and 2,400 instances of ePHI by the time they are audited. None of it is malicious. All of it accumulates because the collaboration tool is faster than the ticketing system, and the ticketing system never quite has the right context.
The four patterns we find every single time
- Support escalations as Slack threads. Customer-support agent pastes ticket body into #cs-escalations channel. Patient information is now indexed by Slack search for the lifetime of the workspace.
- Bug repro steps in Linear tickets. Engineer attaches a screen recording to a Linear ticket to demonstrate a bug. The recording shows a real patient chart. Linear is not BAA-covered for most plans.
- "Just for me" Google Docs. An ops lead exports a list of all active patients from the admin tool into Google Sheets to do a one-off cohort analysis. The sheet is saved in their personal Drive folder for the next three years.
- Zoom recordings of support calls. Customer success records a call with a clinician demoing their workflow. The recording captures live ePHI on screen and is auto-uploaded to Zoom Cloud Storage with 365-day retention.
The fix has three parts.
1. Build the alternative
If support agents and engineers need to share patient context, give them a tool that is BAA-covered, isolates ePHI from indexing, and is faster to use than Slack. We have built these as a tiny internal portal that auto-redacts known ePHI patterns and produces a short-lived shareable link. Engineering can ship a minimum viable version in two weeks.
2. Put real friction in front of the wrong path
Slack Enterprise Grid plus Slack DLP, Nightfall, or Polymer can scan messages and attachments for ePHI patterns in real time. When detected, the message is replaced with a banner asking the sender to confirm or redirect. The same applies to Google Drive (Workspace DLP) and Microsoft 365 (Purview DLP). False positives are tolerable; false negatives are not.
3. Train, document, and run quarterly sweeps
Engineering and CS need a 15-minute training on "where ePHI is allowed and where it is not." Repeat annually. Run a quarterly sweep: search Slack, Notion, Drive, Linear, and Zoom for known ePHI patterns and a small dictionary of patient identifiers. Whatever you find, delete with documented hash-and-timestamp records. The first sweep always returns something. The fourth one usually returns nothing.
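A hash-and-timestamp deletion record can be as small as the sketch below. The function name and field names are assumptions; the idea is to prove what was deleted and when, without keeping a copy of the content itself.

```python
import hashlib
from datetime import datetime, timezone


def deletion_record(location: str, content: bytes, deleted_by: str, now=None) -> dict:
    """Build an audit entry for one deleted item without storing its content."""
    now = now or datetime.now(timezone.utc)
    return {
        "location": location,                            # e.g. channel or doc URL
        "sha256": hashlib.sha256(content).hexdigest(),   # fingerprint of what was removed
        "deleted_at": now.isoformat(),                   # UTC timestamp
        "deleted_by": deleted_by,
    }
```

Hash the full message or file rather than an isolated identifier: a plain SHA-256 of a bare nine-digit SSN can be recovered by brute force, so the record itself would become sensitive.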
Mistake Five
No Current Risk Analysis Under 164.308(a)(1)(ii)(A)
There is no such thing as a HIPAA certification. There is a Risk Analysis. The Risk Analysis is the document OCR asks for first in every audit and every investigation, and the absence of one (or the presence of one dated three years ago and never updated) is the single most common finding in resolution agreements. Roughly two-thirds of published OCR resolution agreements name "failure to conduct an accurate and thorough risk analysis" as a primary finding.
A Risk Analysis is not a spreadsheet of vulnerabilities. It is a written document that identifies all the systems where ePHI exists, enumerates threats and vulnerabilities for each, estimates likelihood and impact, documents the controls in place and the residual risk, and proposes mitigations. NIST 800-66 Rev. 2 (the HHS-recommended methodology) is the framework that maps cleanly to the HIPAA Security Rule.
What a defensible Risk Analysis contains. The structure is documented in NIST 800-66 Rev. 2. The shortest readable version is: (1) a system inventory of every place ePHI lives; (2) a threat catalog appropriate to your environment (we use a 32-item list for cloud SaaS); (3) a vulnerability assessment for each system against each threat; (4) likelihood and impact estimates on a defensible scale (we use 1-to-5, documented); (5) the controls in place and their effectiveness; (6) residual risk; (7) mitigation plan with named owners and target dates. For a 30-person SaaS this document is typically 40 to 80 pages and takes three to six weeks to build from zero.
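Items (4) through (7) amount to a small calculation per system-threat pair. The sketch below is one possible scoring scheme, not a method taken from NIST 800-66 or from the engagements described here: it assumes inherent risk is likelihood times impact on the 1-to-5 scales, and that residual risk is the inherent score discounted by an estimated control effectiveness between 0 and 1. The class name, the formula, and the threshold of 8 are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class RiskItem:
    system: str
    threat: str
    likelihood: int                # 1 (rare) to 5 (almost certain)
    impact: int                    # 1 (negligible) to 5 (severe)
    control_effectiveness: float   # 0.0 (no control) to 1.0 (fully mitigated)
    owner: str

    def __post_init__(self):
        if not (1 <= self.likelihood <= 5 and 1 <= self.impact <= 5):
            raise ValueError("likelihood and impact must be between 1 and 5")
        if not 0.0 <= self.control_effectiveness <= 1.0:
            raise ValueError("control_effectiveness must be between 0 and 1")

    @property
    def inherent(self) -> int:
        """Risk before controls: likelihood x impact, range 1..25."""
        return self.likelihood * self.impact

    @property
    def residual(self) -> float:
        """Risk remaining after controls (assumed linear discount)."""
        return round(self.inherent * (1 - self.control_effectiveness), 2)


def mitigation_queue(items, threshold: float = 8.0):
    """Items whose residual risk meets the threshold, highest first."""
    flagged = [item for item in items if item.residual >= threshold]
    return sorted(flagged, key=lambda item: item.residual, reverse=True)
```

A likelihood of 5 and an impact of 4 with a control judged 25 percent effective gives an inherent score of 20 and a residual score of 15.0. Whatever scale you choose, write down what each number means, because the scale has to be defensible when someone asks how a score was reached.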
Why the cadence matters. The number one finding in published HIPAA resolution agreements is not "had no Risk Analysis." It is "had a Risk Analysis from three years ago that no longer reflected reality." A stale document is worse than no document, because it shows you knew the obligation and did not maintain the program. Annual refresh is the floor. Quarterly delta reviews are the standard we recommend.
The Numbers
The Cost of Fixing vs. The Cost of Finding Out the Hard Way
These numbers are not abstract. They come from 30 healthcare-adjacent engagements we have personally run or supported in the past 36 months. The fix-cost ranges depend on team size, code base complexity, and whether the team is starting from zero or has partial work in place. The breach-cost ranges come from published resolution agreements plus our knowledge of unpublished settlements where we were involved alongside breach counsel.
How Atlant Security Helps
90-Day HIPAA Cloud SaaS Hardening Program
We run this as a single 90-day engagement for healthcare-adjacent SaaS between 10 and 200 people. By Day 90, you have a current Risk Analysis, a complete ePHI Register, observability tooling that cannot leak ePHI, a de-identified staging pipeline, DLP enforced on collaboration tools, and a documented quarterly cadence ready to hand to your next customer.
- Fixed pricing from \$58,000, scope written before contract
- Senior consultants only, never juniors
- Half the engagement is implementation, not just findings
- Deliverable kit: Risk Analysis, ePHI Register, DLP runbook, IR plan, training deck, evidence pack for your next vendor security review
- Pay after each phase, not all up front
Frequently Asked
Questions Healthcare SaaS Founders Ask Us Every Week
If we encrypt everything at rest, are we good?
Encryption at rest is one safeguard out of dozens listed in 45 CFR 164.312. It addresses the "data on a stolen disk" threat. It does not address logging, sharing, copying into lower environments, vendor BAAs, access control, audit trail, breach notification, or the existence of a Risk Analysis. Encryption is necessary and far from sufficient. Most resolution agreements we have read involved companies that already encrypted their data at rest.
Our customer is a covered entity. Are we a Business Associate?
If you create, receive, maintain, or transmit ePHI on behalf of a covered entity, you are a Business Associate under 45 CFR 160.103. You have direct HIPAA Security Rule liability under HITECH. Storing only encrypted data without holding the keys does not change that: HHS cloud computing guidance treats a provider that maintains encrypted ePHI as a Business Associate even when it cannot decrypt the data. The so-called "conduit exception" covers transmission-only services, is narrow, and rarely applies to SaaS. Assume you are a BA unless legal counsel has told you in writing otherwise.
Do we need SOC 2 if we are doing HIPAA?
Not for the regulator. The HIPAA Security Rule is its own framework and has no SOC 2 requirement. For sales, most covered entity customers will ask for either SOC 2 Type 2 or a HITRUST certification, and increasingly some larger health systems require both. A SOC 2 Type 2 with the HIPAA-related criteria included is the most common combined posture for cloud SaaS. HITRUST is more expensive and more rigorous, and is often required by payers and large IDNs.
What is the minimum HIPAA program for a pre-revenue, early-stage SaaS?
Pre-revenue and pre-ePHI, the program is two documents: a written security policy and a Risk Analysis covering your intended ePHI handling. Once you sign your first BAA with a customer, the program needs all of the controls in this article in some form. There is no "we will get to it after the deal closes" path that works in practice. The deal closes contingent on the controls existing.
If we discover ePHI in Sentry today, are we required to report it as a breach?
It depends on the four-factor risk assessment in 45 CFR 164.402: nature and extent of the ePHI, who used or received it, whether the ePHI was actually acquired or viewed, and the extent to which the risk has been mitigated. Sentry access is typically limited to your engineering team and to Sentry employees under their BAA. If you can document that no unauthorized party accessed the data, and you delete the data within a defined window, most counsel will conclude no breach notification is required. The four-factor analysis must be documented in writing regardless. This is where you bring breach counsel in before you make the call.
Our engineering team is 7 people. Can we run this program ourselves?
Parts of it, yes. Most teams of that size can build the ePHI Register, configure DLP, and remove ePHI from observability with their own engineers if they have a clear playbook. The pieces that benefit most from outside help are the Risk Analysis (which has a defensible format OCR expects), the tabletop exercise, and the contract review of vendor BAAs. Roughly half our clients in that size band run the implementation themselves and engage us for the assessment, the documents, and the quarterly review cadence.
The five mistakes in this article are not edge cases. They are the modal pattern. If you are running a cloud-native healthcare SaaS, you almost certainly have at least three of them right now, and probably four. The good news is that all five are fixable inside a single quarter for a fixed and knowable cost. The less good news is that the cost of finding them through an OCR investigation or a covered-entity-triggered audit is roughly an order of magnitude higher, and the discovery is rarely on your schedule.
A healthcare customer who asks you for evidence of HIPAA compliance is not trying to trip you up. They are trying to make their own program defensible. Give them the documents that close their internal review without a back-and-forth: Risk Analysis, ePHI Register, current BAA list with subprocessors, IR plan, and the most recent quarterly attestation that the program is being maintained. Those five documents close more deals than any number of dashboards.
If a hospital or payer is in your pipeline this quarter, book a 30-minute scoping call or email alexander@atlantsecurity.com.

Alexander Sverdlov
Founder of Atlant Security. Author of 2 information security books, cybersecurity speaker at the largest cybersecurity conferences in Asia and a United Nations conference panelist. Former Microsoft security consulting team member, external cybersecurity consultant at the Emirates Nuclear Energy Corporation.