  • Healthcare
  • Governance
  • Reference Architecture

HIPAA-Compliant Agents on Google Cloud: a Reference Architecture

Where PHI lives at every step of an agent request, and which Google Cloud services are HIPAA-eligible today — a named-component reference architecture.

Zinch Engineering · 14 May 2026 · 14-min read

Ask an AI Overview whether you can build a HIPAA-compliant agent on Google Cloud and you get a confident yes, sourced from Google's healthcare landing pages and an analyst summary or two. The answer is correct the way a brochure is correct: directionally true, operationally useless. It does not tell you which services carry a Business Associate Agreement, where protected health information (PHI) sits as a request moves through an agent, or which platform components — months after the April 2026 rebrand — are not yet eligible to touch PHI at all.

This is the version a healthcare engineering team can build against: it maps the post-rebrand Gemini Enterprise Agent Platform components onto the HIPAA Security Rule's technical safeguards and names where PHI lives at each step. One commitment runs through every section. When this article says "HIPAA-compliant," it means an agent on Google Cloud's HIPAA-eligible services — Vertex AI Platform, Gemini models, Model Armor, and MedLM — under Google's BAA. That qualifier is not boilerplate; it is the difference between a claim you can defend to an auditor and one you cannot.

1. The compliance surface

HIPAA is not one rule. A healthcare agent touches four obligations, and an architecture that satisfies one while ignoring the others reads, in a breach investigation, as non-compliant.

  1. The Security Rule — administrative, physical, and technical safeguards for electronic PHI. The technical safeguards (access control, audit controls, integrity, transmission security) are what an agent architecture satisfies in code and configuration, and what the rest of this article maps to.
  2. The Privacy Rule — use and disclosure of PHI: minimum necessary, permitted purposes, patient rights. An agent that retrieves more of a record than the task requires is a minimum-necessary problem before it is a security problem.
  3. The Breach Notification Rule — what a breach is and the notification obligations that follow. Its architectural relevance is blunt: your audit logs are what let you scope a breach; their absence turns a contained incident into an unbounded one.
  4. 42 CFR Part 2, where SUD data is in scope — a stricter federal overlay, separate from HIPAA. Section 11 covers what it changes.

The technical safeguards are deliberately technology-neutral; a reference architecture's job is to bind them to named platform components. Read the HHS HIPAA Security Rule guidance directly — it is the source the rest of this maps against.

2. What "HIPAA-compliant agent" actually means

There is no HIPAA-compliant product you can buy, and there is no HIPAA certification. Both statements surprise people, and both matter for evaluating vendors.

HIPAA compliance is a property of an organization's handling of PHI — its safeguards, agreements, and practices, assessed against the rule. It is not a property of a platform, model, or agent in isolation. What Google Cloud offers is HIPAA-eligible services: services Google covers under a BAA and has built to support a customer's compliance. "HIPAA-compliant platform" is not that — it skips the step where the customer configures the eligible services correctly.

The honest decomposition is two layers. A compliant platform substrate: Google offers HIPAA-eligible services and signs a BAA, and the eligible set for an agent workload is Vertex AI Platform, Gemini models, Model Armor, and MedLM. Google's published HIPAA compliance documentation carries the authoritative Covered Products list — that is the document you check, not a sales claim. And a compliant deployment: the customer, with their partner, configures those services so PHI is access-controlled, encrypted, logged, segmented, and retained per the customer's own HIPAA policies. This is where the actual compliance work lives.

3. The shared responsibility model

Cloud compliance runs on a shared responsibility model, and the line follows what each party controls. Google owns physical security of the data centers, the HIPAA-eligible service's infrastructure security, encryption at rest and in transit as a platform default, and the BAA itself — its contractual commitment as a business associate for the covered services. The customer, with their partner, owns everything that is configuration: which identities can call which agent, which tools an agent is scoped to invoke, whether PHI in a prompt is redacted before a model sees it, network isolation, audit-log retention duration, whether the workload touches only Covered Products services, and the minimum-necessary scoping of every retrieval.

The practical test for any component: is this Google's infrastructure responsibility, or my configuration responsibility? For a healthcare agent the answer is "configuration" far more often than teams expect. That is the model working as designed — and it is most of the engagement when Zinch builds a healthcare agent.

4. Reference architecture: where PHI lives at each step

A healthcare agent request is not one event — it is a pipeline, and PHI enters, changes form, and persists at distinct stages. Compliance means knowing, for every stage, what data is present and what control governs it. Trace a prior-authorization determination end to end:

  1. Intake. Case data arrives from an EHR connector or intake form. PHI present: yes, full PHI. Governing control: Agent Gateway (authN/authZ, allowlist, rate limit).
  2. Pre-model inspection. The outbound prompt is assembled and inspected before a model. PHI present: yes, PHI in the prompt payload. Governing control: Model Armor, detection and redaction on the outbound path.
  3. Model call. The policy-cleared prompt reaches a Gemini model or MedLM. PHI present: minimized PHI, or none, per the redaction policy. Governing control: HIPAA-eligible model service under the BAA; VPC Service Controls perimeter.
  4. Tool execution. The agent calls tools (eligibility lookup, policy retrieval, criteria parsing). PHI present: yes, tools read and return PHI. Governing control: Agent Identity, per-agent scoped credentials.
  5. Response assembly. Model output and tool results compose into a response. PHI present: yes, PHI in the draft output. Governing control: Model Armor, inbound inspection on the response path.
  6. Memory persistence. Case state is written so a multi-turn workflow can resume. PHI present: yes, if state carries PHI. Governing control: encryption at rest, retention controls, and a hard eligibility constraint (see section 8).
  7. Audit log. Every step above is recorded. PHI present: the log records that PHI was processed, not PHI values. Governing control: Agent Registry; retention per your HIPAA policy.

Two things to internalize. First, PHI is present at almost every step — intake, prompt, tool calls, response, persisted state. The architecture's job is per-step control, not perimeter control. Second, the model call is where you have the most room to design: steps 2 and 5, Model Armor on the outbound and inbound paths, structurally reduce how much PHI reaches a model at all. A redaction policy that lets the model see only the minimum the task needs is the minimum-necessary principle expressed as architecture.
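The per-step idea can be sketched in a few lines of plain Python, with a regex stand-in for the screening layer. Every name here is illustrative, not a platform API; real screening is Model Armor policy and real authorization is IAM at the Gateway:

```python
# Illustrative sketch of per-step PHI control in a single agent request.
# All function and pattern names are hypothetical, not platform APIs.
import re

MRN = re.compile(r"\bMRN-\d+\b")

def screen(text: str) -> str:
    """Stand-in for Model Armor: redact MRN-shaped identifiers (steps 2 and 5)."""
    return MRN.sub("[REDACTED-MRN]", text)

def handle_request(caller: str, case_text: str, allowlist: set, audit: list) -> str:
    if caller not in allowlist:                    # step 1: authN/authZ at intake
        raise PermissionError(f"caller {caller!r} not allowlisted")
    safe_prompt = screen(case_text)                # step 2: outbound inspection
    draft = f"determination for: {safe_prompt}"    # step 3: stubbed model call
    response = screen(draft)                       # step 5: inbound inspection
    audit.append({"caller": caller, "phi_processed": True})  # step 7: fact, not values
    return response

audit_log: list = []
result = handle_request("intake-svc", "eligibility check for MRN-12345",
                        {"intake-svc"}, audit_log)
```

Note what the audit entry records: that PHI was processed, not the PHI itself. That distinction carries through to section 9.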

5. Model Armor for PHI redaction

Model Armor is the platform's screening layer for model interactions. In a HIPAA architecture its job is to enforce PHI handling policy on data crossing into and out of a model, at two points. On the outbound (prompt) path, before an assembled prompt reaches a Gemini model or MedLM, Model Armor detects PHI patterns — names, MRNs, dates of birth, account numbers, free-text identifiers — then redacts, tokenizes, or blocks per your policy. On the inbound (response) path, the model's output is inspected before it returns to the agent, catching PHI that should not be in a response.

Three design questions decide whether the layer earns its place:

  1. Redact, tokenize, or block? Redaction strips the value; tokenization replaces it with a reversible reference a later step can rehydrate; blocking refuses the request. A prior-auth agent that needs a member ID for a tool call but not in the model prompt is a tokenization case. Decide this per data element, not globally.
  2. Where does the policy live? The Model Armor policy is itself a governed artifact — in version control, changed through review, with a change history that is part of your audit story. A policy edited ad hoc in a console is a finding waiting to happen.
  3. What is the failure mode? If Model Armor cannot evaluate a prompt, the request fails closed. A request that cannot be screened does not proceed.

Model Armor reduces PHI exposure structurally. It does not replace the customer's responsibility to design for minimum necessary — a redaction policy is a backstop, not a substitute for an agent scoped to ask for less.
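Design question 1 can be sketched as a per-element policy, with a dict standing in for the token vault a later step rehydrates from. This illustrates the redact/tokenize/block decision only; Model Armor's actual policy is configured on the platform, not written as application code:

```python
# Illustrative per-data-element PHI policy: redact, tokenize, or block.
# Element names and the vault are hypothetical, not Model Armor's policy format.
import hashlib

POLICY = {
    "patient_name": "redact",    # the model never needs the value
    "member_id": "tokenize",     # a tool call needs it later; the model does not
    "ssn": "block",              # never crosses into a prompt at all
}

def apply_policy(element: str, value: str, vault: dict) -> str:
    action = POLICY.get(element, "block")          # unknown elements fail closed
    if action == "redact":
        return f"[{element.upper()} REDACTED]"
    if action == "tokenize":
        token = "tok_" + hashlib.sha256(value.encode()).hexdigest()[:12]
        vault[token] = value                       # reversible reference for rehydration
        return token
    raise PermissionError(f"{element} is blocked from model prompts")

vault: dict = {}
redacted = apply_policy("patient_name", "Jane Doe", vault)
token = apply_policy("member_id", "M-998877", vault)
```

The default action for an unrecognized element is block, which is design question 3 expressed in the same place: anything the policy cannot classify does not proceed.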

6. Agent Identity and IAM

The Security Rule's access-control safeguard is not "the service account can reach the database." It is each actor having the minimum access its function requires — and in an agent system, the agent is an actor.

The commitment: every agent runs as its own IAM principal with its own scoped credentials. Not a shared service account across a fleet. A prior-auth agent gets exactly the EHR scopes, policy store, and tools it needs, and nothing else. That buys three things under HIPAA. Minimum necessary, enforced — an agent cannot retrieve PHI outside its scope because the credential does not reach it. Audit attribution — when Agent Registry logs a PHI access, it logs which agent identity made it; a shared credential makes "which agent touched this record" unanswerable. Lifecycle control — an agent identity is provisioned, rotated, and decommissioned as a unit, so retiring an agent revokes its access cleanly.

The honest scoping note: per-agent identity is more provisioning work than a shared account, and teams under deadline pressure reach for the shortcut an auditor finds first. Design the identity model before the first agent ships — retrofitting it onto a fleet built on a shared credential is a re-architecture, not a tweak.
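The identity model can be sketched with IAM principals and role bindings reduced to plain data. Scope names here are illustrative; the real bindings are IAM configuration:

```python
# Illustrative sketch of per-agent scoped credentials. Agent IDs and scope
# names are hypothetical; in production these are IAM principals and bindings.
AGENT_SCOPES = {
    "prior-auth-agent": {"ehr.eligibility.read", "policy-store.read"},
    "claims-agent": {"claims.read", "claims.adjudicate"},
}

def assert_scope(agent_id: str, required: str) -> None:
    """Minimum necessary, enforced: access outside the scope set never resolves."""
    if required not in AGENT_SCOPES.get(agent_id, set()):
        raise PermissionError(f"{agent_id} lacks scope {required}")

assert_scope("prior-auth-agent", "ehr.eligibility.read")  # within scope: proceeds
```

A shared service account collapses that table to one row, which is exactly what makes "which agent touched this record" unanswerable.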

7. Agent Gateway: where runtime policy lands

If Agent Identity is who an agent is, the Agent Gateway is what gets enforced when an agent is called. It is the runtime control point — the controls that have to be live on every request rather than configured once and trusted:

  • Authentication and authorization on every call — no agent invocation reaches the agent without an authenticated, authorized caller.
  • Allowlists — which callers, tools, and downstream services are permitted, enforced as policy, not as a convention the calling code is trusted to follow.
  • Rate limits — a security control, not just a cost control; the limit bounds the blast radius of an endpoint under attack.
  • Model Armor inspection at the enforcement point — the Gateway is where the Model Armor policy from section 5 is applied to live traffic. The policy is defined elsewhere; the Gateway is where it lands.

A control that exists in configuration but is not enforced at runtime is not a control. The Gateway is the difference between "we have a Model Armor policy" and "every request was screened" — and for an auditor, only the second sentence is worth anything.
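The four runtime controls can be sketched as one enforcement point. This is an illustration of the fail-closed shape, not the Gateway's actual interface; the rate-limit window logic is deliberately elided:

```python
# Illustrative sketch of Gateway enforcement: every call is authenticated,
# allowlisted, rate-limited, and screened, and screening failures fail closed.
class Gateway:
    def __init__(self, allowlist, rate_limit, screen):
        self.allowlist = allowlist
        self.rate_limit = rate_limit   # max calls per caller (window logic elided)
        self.calls = {}
        self.screen = screen           # the Model Armor policy hook from section 5

    def invoke(self, caller, prompt, agent):
        if caller not in self.allowlist:              # authZ on every call
            raise PermissionError("caller not allowlisted")
        self.calls[caller] = self.calls.get(caller, 0) + 1
        if self.calls[caller] > self.rate_limit:      # bounds the blast radius
            raise RuntimeError("rate limit exceeded")
        try:
            safe = self.screen(prompt)                # screening on live traffic
        except Exception as exc:
            raise RuntimeError("unscreened request refused") from exc  # fail closed
        return agent(safe)

gw = Gateway({"intake-svc"}, rate_limit=2,
             screen=lambda p: p.replace("MRN-1", "[MRN]"))
result = gw.invoke("intake-svc", "case MRN-1", lambda p: f"ok: {p}")
```

The point of putting the screen call inside invoke is structural: there is no code path where a request reaches the agent without having been screened.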

8. Memory and state with PHI

This is where the named-services qualifier becomes a hard architectural constraint rather than a phrasing convention.

Agent workflows need state — a multi-turn prior-authorization process accumulates case context across an assembly-and-draft pass, and that state has to live somewhere. Memory Bank is the platform's managed persistent-memory service and the natural home for it — except that, as of this writing, Memory Bank is not on Google's Covered Products list. Neither is Agent Runtime. Neither is Agent Garden. That list is the authority, and the document to verify against your build date, because eligibility changes as Google brings services under the BAA. What that means concretely, today:

  • PHI does not go into a service that is not HIPAA-eligible. If case state carries PHI and Memory Bank is not on the Covered Products list, PHI-bearing state does not persist in Memory Bank — the line between a defensible "HIPAA-compliant agent on Google Cloud" claim and an indefensible one.
  • Keep PHI in eligible storage and reference it from state. Persist PHI in a HIPAA-eligible Google Cloud storage service, encrypted at rest, with its own access controls and retention. Let the agent's working state carry references — tokens, record identifiers — not PHI values. The state coordinates; the eligible store holds.
  • Encryption at rest is the floor. Retention controls and a deletion path are what make the store compliant.

On the "right to be forgotten" question: HIPAA grants no general right to erasure the way GDPR does, but it grants patients rights over their PHI, and other regimes that apply to your data can require deletion. The architecture has to find and delete a specific patient's PHI — straightforward when PHI lives in a known, eligible, queryable store, hard when it has been smeared across agent state, logs, and caches. The reference-from-state pattern keeps deletion tractable. When Memory Bank, Agent Runtime, or Agent Garden land on the Covered Products list, this constraint relaxes — the verification step does not.
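The reference-from-state pattern is small enough to sketch directly. A dict stands in for the HIPAA-eligible store; in production that store is encrypted, access-controlled, and retained per policy:

```python
# Illustrative sketch of the reference-from-state pattern: PHI lives in an
# eligible store, agent working state carries only a reference. The dict is
# a stand-in for a HIPAA-eligible storage service.
import uuid

eligible_store: dict = {}

def persist_phi(patient_id: str, record: dict) -> str:
    ref = f"ref-{uuid.uuid4().hex[:8]}"
    eligible_store[ref] = {"patient_id": patient_id, "record": record}
    return ref                  # only this reference enters agent state

def delete_patient(patient_id: str) -> int:
    """Deletion stays tractable: one queryable store, one pass."""
    refs = [r for r, v in eligible_store.items() if v["patient_id"] == patient_id]
    for r in refs:
        del eligible_store[r]
    return len(refs)

state = {"case": "PA-42", "phi_ref": persist_phi("p-001", {"dx": "E11.9"})}
```

Serializing that state leaks nothing: the reference is opaque, and the eligible store is the single place a deletion request has to reach.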

9. Agent Registry as the audit primitive

The Security Rule requires audit controls — mechanisms that record and examine activity in systems containing electronic PHI. For an agent system, Agent Registry is that mechanism. What to log:

  • Every agent, enrolled with its identity, version, scoped permissions, and owner — the Registry is the inventory of what is running.
  • Every tool call, with which agent, which tool, when, and against which resource, because a tool call is a PHI access.
  • Every model call, with which agent and which model — the fact and metadata of the call, not the PHI content of the prompt.

An access record that itself contains PHI has multiplied your exposure surface, not reduced it.

How long to retain: per the customer's HIPAA retention policy. HIPAA's documentation-retention requirement is six years for required documentation; many covered entities set audit-log retention to match or exceed it, and state law can push it longer. The commitment is that retention is configured deliberately to the customer's policy — not left at a platform default and discovered too short during an investigation. How to demonstrate it to an auditor: the Registry is queryable. "Show every PHI access by this agent in this window" is a query, not an archaeology project. The test of an audit primitive is not that it logs — it is that the log answers an auditor's actual question quickly.
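The shape of a useful entry, and of the auditor's query against it, can be sketched in a few lines. Field names are illustrative, not the Agent Registry schema:

```python
# Illustrative audit entries: the fact and metadata of a PHI access, never
# PHI values, in a log that answers an auditor's question as a query.
from datetime import datetime, timezone

registry_log: list = []

def log_access(agent_id: str, tool: str, resource_ref: str) -> None:
    registry_log.append({
        "agent": agent_id,
        "tool": tool,
        "resource": resource_ref,   # an opaque reference, never a PHI value
        "at": datetime.now(timezone.utc).isoformat(),
    })

def accesses_by_agent(agent_id: str) -> list:
    """'Show every PHI access by this agent' as a query, not archaeology."""
    return [e for e in registry_log if e["agent"] == agent_id]

log_access("prior-auth-agent", "eligibility_lookup", "ref-a1b2c3")
log_access("claims-agent", "claims_read", "ref-d4e5f6")
```

Note the resource field: it pairs with the reference-from-state pattern in section 8, so the log can locate the access without ever containing the record.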

10. Network controls and the region constraint

PHI moving over a network is the Security Rule's transmission-security safeguard; PHI sitting in a service reachable from outside an intended perimeter is an access-control problem. Two controls do most of the work. VPC Service Controls puts a service perimeter around the Google Cloud services in your healthcare workload — even a compromised credential cannot move PHI to a project outside the perimeter, because the perimeter, not just the IAM binding, has to permit the path. Private Service Connect gives private connectivity so traffic between your environment and the platform's services does not traverse the public internet; combined with the encryption-in-transit default, that is transmission security expressed as network architecture.

The model-region constraint catches teams late, so name it early. Gemini models and MedLM are available in specific regions, and the set in a given region shifts as Google rolls capability out. A HIPAA architecture frequently has a data-residency requirement — PHI processed in a defined geography. The constraint is the intersection: the model you need, available in the region you are required to process in. If the model you designed against is not offered in your required region, that is a design problem to find before staging, not during. Treat the model-and-region intersection as a fixed input to the architecture.
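Treating the intersection as a fixed input is easy to make mechanical. The availability map below is entirely hypothetical — model and region names included — and the real check is against Google's current region documentation at design time:

```python
# Illustrative model-and-region intersection check. The AVAILABILITY map is
# hypothetical; verify against Google's published region availability.
AVAILABILITY = {
    "us-central1": {"gemini-model-a", "medlm-model-b"},
    "europe-west4": {"gemini-model-a"},
}

def viable(required_region: str, required_model: str) -> bool:
    """True only when the model you need exists in the region you must use."""
    return required_model in AVAILABILITY.get(required_region, set())

check = viable("europe-west4", "medlm-model-b")  # a design problem, found early
```

Running this as a design-review gate, rather than discovering the gap in staging, is the whole point of the constraint.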

11. The BAA conversation, and 42 CFR Part 2

The BAA conversation

A covered entity sharing PHI with a vendor needs that vendor to be a business associate bound by a BAA — a contract committing the vendor to safeguard PHI and defining the obligations. Google offers a BAA for Google Cloud covering its HIPAA-eligible services. What to verify:

  1. The BAA is in place before PHI flows — not in parallel with the first PHI-bearing request. The agreement precedes the data.
  2. Every service in your architecture that touches PHI is on Google's Covered Products list. Walk the diagram service by service. The boxes that are not on the list — Agent Runtime, Memory Bank, Agent Garden, as of this writing — do not get PHI. Re-run this check at your build date; the list is versioned and it moves.
  3. The eligible set for an agent workload is Vertex AI Platform, Gemini models, Model Armor, and MedLM. If your PHI path stays inside that set under the BAA, the "HIPAA-compliant agent on Google Cloud" claim is defensible; if it does not, it is not.

Ask at design time, not at launch — the BAA's covered-services boundary is an input to the architecture. The final word belongs to the customer's counsel; the engineering team's job is to hand counsel an architecture where every PHI-touching service is verifiably on the covered list.

42 CFR Part 2: when SUD data is in scope

42 CFR Part 2 protects the confidentiality of substance use disorder patient records held by federally assisted SUD programs. It is stricter than HIPAA, separate from HIPAA, and an architecture handling SUD data has to satisfy both. The canonical text is the eCFR's Part 2 — required reading if SUD data is anywhere in your scope. When Part 2 applies, three architectural changes are non-negotiable, and all three have to be designed in:

  1. Consent capture and enforcement. Part 2 has historically required specific patient consent for disclosures, with tighter constraints than HIPAA's permitted-use framework. The architecture has to capture that consent and enforce it — an agent cannot disclose Part 2 data on a path the patient has not consented to. Consent is a gate the agent checks.
  2. Data segmentation. Part 2 data has to be identifiable and segregable so the stricter rules apply to it specifically. An architecture that commingles Part 2 data indistinguishably with general PHI cannot apply Part 2 controls selectively — it applies them to everything or fails.
  3. Redisclosure tracking. Part 2 restricts redisclosure — a recipient of Part 2 data is themselves constrained. The architecture has to track that a data element is Part 2-protected so downstream handling honors the restriction.
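All three changes can be sketched in one small function: consent as a gate, segmentation as an explicit marking, and a marking that travels on disclosure. Names and the consent structure are illustrative only:

```python
# Illustrative sketch of Part 2 mechanics: consent gate, data segmentation
# via an explicit marking, and redisclosure tracking. All names hypothetical.
consents = {("patient-1", "payer-A")}   # captured consent: (patient, recipient)

def disclose(record: dict, recipient: str) -> dict:
    # Segmentation: the marking makes Part 2 data identifiable and segregable
    if record.get("part2"):
        # Consent gate: no disclosure on a path the patient has not consented to
        if (record["patient_id"], recipient) not in consents:
            raise PermissionError("no Part 2 consent for this disclosure path")
    out = dict(record)
    out["disclosed_to"] = recipient     # redisclosure tracking: the marking
    return out                          # travels with the data to the recipient

rec = {"patient_id": "patient-1", "part2": True, "note": "[SUD record]"}
sent = disclose(rec, "payer-A")
```

Because the part2 flag survives the disclosure, the recipient's own handling code can honor the redisclosure restriction — which is why commingled, unmarked data fails: there is nothing for the gate to check.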

12. HITRUST CSF: platform capability vs. customer control

HITRUST CSF is a control framework mapping across HIPAA, other regulations, and industry standards into one assessable set of controls. A common buyer question is whether an agent architecture "meets HITRUST out of the box." It does not — for any platform, from any vendor. HITRUST CSF assesses an organization's control implementation; a platform can support HITRUST controls but does not hold them. The customer implements and the customer is assessed. The HITRUST CSF reference is the authoritative description of the framework.

Where the platform helps, mapped to this architecture: access control and identity — Agent Identity and IAM (section 6) provide the per-principal, least-privilege capability HITRUST access-control categories ask for; audit logging — Agent Registry (section 9) provides the activity-recording capability; encryption — at-rest and in-transit defaults plus the network controls (section 10) provide the cryptographic-control capability.

Where the customer still owns the control: whether those capabilities are configured to the HITRUST requirement; the administrative and physical safeguards HITRUST also assesses — policies, training, incident response, physical security; and the assessment itself, which the organization earns. The platform gives you the technical capability to satisfy the technical controls. It does not give you the certification, and it does not configure itself.

13. The architecture, in one pass

PHI enters through the Agent Gateway, which authenticates and rate-limits every call. Model Armor redacts or tokenizes PHI on the outbound prompt before it reaches a Gemini model or MedLM — both HIPAA-eligible under Google's BAA — and inspects the inbound response. The agent runs as its own IAM principal with Agent Identity, scoped to exactly the tools and data its function requires. PHI-bearing state stays in HIPAA-eligible storage, referenced — not copied — by the agent's working state. Agent Registry logs every agent, tool call, and model call as the audit primitive, and VPC Service Controls plus Private Service Connect keep the workload inside a perimeter and off the public internet. Where SUD data is in scope, 42 CFR Part 2's consent, segmentation, and redisclosure mechanics are in the data layer from the start. Every PHI-touching service in that description is on Google's Covered Products list — which is what makes "HIPAA-compliant agents on Google Cloud" an honest claim, and why Agent Runtime, Memory Bank, and Agent Garden, not yet on that list, are not on the PHI path here.

This is the architecture under Zinch's healthcare blueprints. The prior authorization blueprint is a single-agent build that exercises it directly — Model Armor on every PHI access, Agent Registry tying every drafted determination to the named reviewer who finalized it. The claims processing adjudication blueprint is the multi-agent version, where the same controls extend across an agent-to-agent handoff; the enterprise A2A guide covers what production-grade A2A demands on top of the protocol, and signed Agent Cards are part of why a cross-agent PHI handoff stays auditable. For the orchestration layer underneath, the ADK build guide is the code-first walkthrough, and the platform overview maps how the Build, Scale, Govern, and Optimize pillars fit around an agent your engineering team owns.

If you are scoping a healthcare agent and want this architecture pressure-tested against your specific PHI flow, regulatory overlay, and data-residency constraints, talk to the healthcare practice — the six-week delivery methodology builds the governance bundle, the eval set, and the audit trail as first-class deliverables, not a post-launch retrofit.

14. FAQ


Is the Gemini Enterprise Agent Platform HIPAA-eligible?

The platform is not a single service — it is a set of components, and they are not uniformly eligible. As of this writing, the HIPAA-eligible services for an agent workload are Vertex AI Platform, Gemini models, Model Armor, and MedLM, covered under Google's BAA. Agent Runtime, Memory Bank, and Agent Garden are not on Google's Covered Products list. The accurate statement is not "the platform is BAA-eligible" — it is "these specific services are, and these are not." Verify the Covered Products list against your build date; the list is versioned and Google brings services under the BAA over time.

How does Model Armor handle PHI?

Model Armor inspects model interactions at two points. On the outbound path it screens the assembled prompt before a Gemini model or MedLM and redacts, tokenizes, or blocks PHI per your policy — enforcing that the model receives the minimum PHI the task requires. On the inbound path it inspects the model's response before it returns to the agent. At runtime the policy is enforced at the Agent Gateway. Model Armor is a structural backstop — it reduces PHI exposure but does not replace designing the agent to ask for less in the first place.

Can Memory Bank store patient identifiers?

Not as of this writing. Memory Bank is not on Google's Covered Products list, so PHI — including patient identifiers — does not belong in it under a "HIPAA-compliant on Google Cloud" claim. The pattern is to persist PHI in a HIPAA-eligible Google Cloud storage service, encrypted at rest with its own access and retention controls, and let the agent's working state carry references — tokens or record identifiers — rather than PHI values. If Memory Bank later lands on the Covered Products list, that constraint relaxes; the verification step against the list does not.

When does 42 CFR Part 2 apply, and what does it change?

42 CFR Part 2 applies when the data includes substance use disorder patient records held by a federally assisted SUD program. It is a separate, stricter federal regime — an architecture handling SUD data has to satisfy both Part 2 and HIPAA. Three changes are non-negotiable and all must be designed in: consent capture and enforcement (the agent checks consent as a gate before any Part 2 disclosure), data segmentation (Part 2 data must be identifiable and segregable so its stricter rules apply selectively), and redisclosure tracking (the Part 2 marking travels with the data). It changes the data model, so it is an architecture-time decision.

Does this architecture meet HITRUST CSF out of the box?

No architecture from any vendor meets HITRUST out of the box, because HITRUST CSF assesses an organization's control implementation, not a platform. What the platform provides is the technical capability to satisfy the technical controls — Agent Identity and IAM for access control, Agent Registry for audit logging, encryption and network controls for cryptographic protection. The customer still owns whether those capabilities are configured to the assessed requirement, the administrative and physical safeguards HITRUST also covers, and the assessment itself. The capability is real; the implementation and the certification are the organization's.

What is the difference between "HIPAA-compliant," "HIPAA-aware," and "HIPAA-certified"?

"HIPAA-compliant agents on Google Cloud" is honest when the agent runs on HIPAA-eligible services — Vertex AI Platform, Gemini models, Model Armor, MedLM — under Google's BAA, configured correctly. The claim names a specific, verifiable substrate. "HIPAA-aware" is a softer claim that usually means "we know HIPAA exists and have considered it" without committing to the eligible-services substrate — what a vendor says when they cannot make the stronger claim honestly. What no vendor can claim is "HIPAA-certified": HIPAA has no certification body, so the designation does not exist. Treat that claim as a signal to look elsewhere.

One workflow. One outcome. Code your team owns.

Ship the first agent in two weeks. See where it leads.

  • Code

    Lives in your Git org, owned from commit one.

  • Governance

    Model Armor and Agent Registry on day one.

  • Speed

    Two weeks to a runnable pilot. Eight to production.

Not ready to talk? Take the 4-min readiness assessment