
Vertex AI SDK to Google Gen AI SDK: a Migration Guide

The June 24, 2026 deadline ends support for five vertexai.* modules. A code-first walkthrough: before/after samples, parity tests, and a 6-week roadmap.

Zinch Engineering · 14 May 2026 · 14-min read

If you ask an AI Overview whether you need to migrate off Vertex AI, you will most likely be told some version of "no — the platform was just renamed to Gemini Enterprise Agent Platform, there is nothing to do." That answer is half right and operationally dangerous.

The rebrand is naming-only. Your projects, your endpoints, your IAM bindings, your deployed models — none of that changes because Google changed a product name. But the Vertex AI SDK for Python — the google-cloud-aiplatform package, specifically its vertexai.* generative surface — is being genuinely deprecated. Google announced the timeline well ahead of the rebrand, and the rebrand did not move it. On June 24, 2026, five modules stop receiving updates and the recommended path is the Google Gen AI SDK (google-genai).

This guide is the engineer's version of that answer: the exact modules, before/after code you can paste into a Colab and run, a parity test pattern, and a six-week roadmap. It is deliberately the most thorough walkthrough we could write, because the migration itself is not hard — it is the quietly wrong migration that costs you a production incident.

1. The June 24, 2026 deadline: what's actually being removed

Two separate things travel under the word "migration," and conflating them is the single most common mistake.

  1. The platform rename. Vertex AI became Gemini Enterprise Agent Platform on April 22, 2026. This is a marketing and console-navigation change. It does not deprecate an API, break a deployed endpoint, or require a code change.
  2. The SDK deprecation. The generative-AI portion of the Vertex AI SDK for Python — the part you import as vertexai.generative_models and friends — is deprecated in favor of the Google Gen AI SDK. This requires a code change, and it has a date.

The non-generative parts of google-cloud-aiplatform — custom training jobs, pipelines, the model registry, feature store, batch prediction infrastructure — are not part of this deprecation. If your codebase only uses Vertex AI for MLOps plumbing and never calls vertexai.generative_models, you may have nothing to migrate. The deprecation is scoped to the generative SDK surface, and that scoping is what makes the work bounded.

2. Who this guide is for

This is written for the person who owns the decision and the person who does the work:

  • Engineering leaders scoping the migration as a line item — you need the date, the blast radius, and a defensible estimate.
  • Platform engineers who will run the dependency audit and own the cutover.
  • ML platform owners with fine-tuned checkpoints and serving infrastructure who need to know what survives.

If you are none of those and you maintain a single script that calls Gemini, skip to section 5 — your migration is about fifteen lines.

3. The five legacy modules being deprecated

Here is the precise mapping. Each entry pairs a module in vertexai.* with what it becomes under google.genai.

  • vertexai.generative_models (GenerativeModel, text + chat generation, tool calling) → client.models.generate_content / client.chats
  • vertexai.language_models (legacy PaLM text/chat/embedding models: TextGenerationModel, TextEmbeddingModel) → client.models.generate_content and client.models.embed_content on current models
  • vertexai.vision_models (Imagen image generation, ImageTextModel captioning) → client.models.generate_images and multimodal generate_content
  • vertexai.tuning (supervised fine-tuning job creation and management) → client.tunings.tune
  • vertexai.caching (explicit context caching via CachedContent) → client.caches.create

Two things to internalize from this mapping before you write any code.

First, vertexai.language_models is not a one-to-one port. It wrapped the PaLM-generation models. The replacement is not "the same models under a new import" — it is "you are now on current Gemini models, addressed through a unified generate_content call." If you are still on text-bison or textembedding-gecko, the migration and a model upgrade happen together. Budget for the eval work that implies.
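
For the embedding half of that module specifically, the replacement call is client.models.embed_content. A minimal sketch (the model name is illustrative; choose your target embedding model as part of that eval work):

# Sketch: the Gen AI SDK call that replaces a legacy TextEmbeddingModel call.
from google import genai

client = genai.Client(vertexai=True, project="my-project", location="us-central1")

result = client.models.embed_content(
    model="gemini-embedding-001",  # illustrative; pick deliberately
    contents="The build broke again.",
)
print(len(result.embeddings[0].values))  # embedding dimensionality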

Second, the Google Gen AI SDK collapses five module namespaces into one client. There is no genai.generative_models and genai.tuning. There is a Client, and capabilities hang off it as client.models, client.chats, client.tunings, client.caches, client.files. The mental model shifts from "import the model class I need" to "construct one client, call methods on it."
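
A quick way to internalize that shape is to poke at one client in a REPL. This sketch only introspects the namespaces named above; nothing here is a new API:

from google import genai

client = genai.Client(vertexai=True, project="my-project", location="us-central1")

# Every capability hangs off the one client as a namespaced attribute.
for surface in ("models", "chats", "tunings", "caches", "files"):
    print(surface, "->", type(getattr(client, surface)).__name__)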

4. Pre-migration checklist

Do not start rewriting code. Start with an audit. The rewrite is mechanical; the audit is where you find the surprises.

Dependency audit

Find every place the legacy surface is imported. A repository-wide grep is faster and more honest than memory:

# Every legacy generative import across the repo
grep -rn --include="*.py" \
  -e "import vertexai" \
  -e "from vertexai" \
  .
 
# Pin inventory — what version is actually installed?
pip show google-cloud-aiplatform | grep -E "^(Name|Version)"

Pay attention to transitive usage. A shared internal library, a notebook checked into the repo, a Cloud Function with its own requirements.txt — each is a separate migration unit. The grep above catches direct imports; for transitive risk, check whether any first-party package you depend on pins google-cloud-aiplatform for its generative features.
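
For the transitive side, a reverse dependency listing answers the question directly. A sketch using pipdeptree (a separate tool, not part of either SDK):

# Which installed packages pull in the legacy SDK?
pip install pipdeptree
pipdeptree --reverse --packages google-cloud-aiplatform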

IAM and project configuration review

The Google Gen AI SDK, when pointed at Vertex (vertexai=True), uses the same Application Default Credentials and the same project/location model as the legacy SDK. The permission you need is unchanged — aiplatform.endpoints.predict on the project, granted via roles/aiplatform.user or a tighter custom role. There is no new role to provision.

What does change is where configuration lives. The legacy SDK relied on a global vertexai.init(project=..., location=...) call that mutated module state. The Gen AI SDK takes configuration as explicit Client(...) constructor arguments. Inventory every vertexai.init() call site now — each one becomes a Client construction, and you want to decide deliberately whether that is one shared client or several.
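
That instance scoping is also what makes multi-region or multi-project processes tractable. A minimal sketch, project and locations illustrative:

from google import genai

# Two explicit clients; no module-level state for one to clobber in the other.
us_client = genai.Client(vertexai=True, project="my-project", location="us-central1")
eu_client = genai.Client(vertexai=True, project="my-project", location="europe-west4")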

Current SDK version inventory

Record the installed google-cloud-aiplatform version per service before you touch anything. You want this for two reasons: it is your rollback target, and behavior in the legacy SDK shifted across its own minor versions. Knowing you were on a specific version makes a parity discrepancy diagnosable instead of mysterious.

Install the replacement

The Google Gen AI SDK is a separate package. It installs alongside the legacy one — they do not conflict at the package level, which is what makes a parallel cutover possible.

pip install google-genai

The source and release notes live in the googleapis/python-genai repository. Read its changelog before you pin a version — it moves quickly.

5. Before/after: text generation

The most common call in most codebases. Here is the legacy pattern:

# BEFORE — Vertex AI SDK (deprecated)
import vertexai
from vertexai.generative_models import GenerativeModel, GenerationConfig
 
vertexai.init(project="my-project", location="us-central1")
 
model = GenerativeModel("gemini-2.0-flash")
response = model.generate_content(
    "Summarize the migration in one sentence.",
    generation_config=GenerationConfig(
        temperature=0.2,
        max_output_tokens=256,
    ),
)
print(response.text)

And the Google Gen AI SDK equivalent:

# AFTER — Google Gen AI SDK
from google import genai
from google.genai import types
 
client = genai.Client(
    vertexai=True,
    project="my-project",
    location="us-central1",
)
 
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Summarize the migration in one sentence.",
    config=types.GenerateContentConfig(
        temperature=0.2,
        max_output_tokens=256,
    ),
)
print(response.text)

Read the diff carefully — the changes are small and there are exactly four of them:

  1. Import. from google import genai and from google.genai import types. The types module is where every config and content class now lives.
  2. No global init. vertexai.init(...) is gone. Project and location are constructor arguments to genai.Client(...). The vertexai=True flag is what tells the client to route to your Vertex backend rather than the Gemini Developer API — omit it and you are calling a different service.
  3. Model is an argument, not an object. You do not instantiate GenerativeModel("...") and hold it. The model name is a string argument on each generate_content call.
  4. generation_config became config, and GenerationConfig became types.GenerateContentConfig. The fields inside — temperature, max_output_tokens — keep their names.

The convenient part: response.text still works. The accessor for the simple case is stable across both SDKs, which is exactly why a smoke test passes and a careless migration ships. The response object is not identical — more on that in section 13.

For multi-turn chat, the shape is similar — client.chats.create(model=...) returns a chat session, and chat.send_message(...) replaces the legacy model.start_chat() / chat.send_message() pair.
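
A minimal chat sketch, reusing the client constructed in the example above:

chat = client.chats.create(model="gemini-2.0-flash")
first = chat.send_message("List the five deprecated vertexai modules.")
followup = chat.send_message("Which one covered embeddings?")
print(followup.text)  # the session object carries the prior turns for you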

6. Before/after: multimodal generation

Passing an image alongside text. Legacy:

# BEFORE — Vertex AI SDK (deprecated)
import vertexai
from vertexai.generative_models import GenerativeModel, Part
 
vertexai.init(project="my-project", location="us-central1")
 
model = GenerativeModel("gemini-2.0-flash")
response = model.generate_content([
    Part.from_uri(
        "gs://my-bucket/diagram.png",
        mime_type="image/png",
    ),
    "Describe the architecture in this diagram.",
])
print(response.text)

Gen AI SDK:

# AFTER — Google Gen AI SDK
from google import genai
from google.genai import types
 
client = genai.Client(
    vertexai=True,
    project="my-project",
    location="us-central1",
)
 
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=[
        types.Part.from_uri(
            file_uri="gs://my-bucket/diagram.png",
            mime_type="image/png",
        ),
        "Describe the architecture in this diagram.",
    ],
)
print(response.text)

The Part abstraction survives the move — it now lives at types.Part. One signature change to watch: Part.from_uri takes the URI as a keyword argument (file_uri=) in the Gen AI SDK, where the legacy SDK accepted it positionally. The same applies to Part.from_bytes (data= and mime_type=). These are the kind of changes a type checker catches and a quick eyeball does not — run mypy or pyright over the migrated code.
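
The same keyword-only discipline applies to inline bytes. A minimal sketch, reusing the client above:

# Inline image bytes instead of a GCS URI; both keywords are required.
with open("diagram.png", "rb") as f:
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/png"),
        "Describe the architecture in this diagram.",
    ],
)
print(response.text)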

7. Before/after: model tuning

Supervised fine-tuning. This is where teams with real ML investment need to slow down. Legacy:

# BEFORE — Vertex AI SDK (deprecated)
import vertexai
from vertexai.tuning import sft
 
vertexai.init(project="my-project", location="us-central1")
 
tuning_job = sft.train(
    source_model="gemini-2.0-flash",
    train_dataset="gs://my-bucket/train.jsonl",
    validation_dataset="gs://my-bucket/val.jsonl",
    epochs=4,
)
# Poll until done, then read the tuned endpoint
tuning_job.refresh()
tuned_model = tuning_job.tuned_model_endpoint_name

Gen AI SDK:

# AFTER — Google Gen AI SDK
from google import genai
from google.genai import types
 
client = genai.Client(
    vertexai=True,
    project="my-project",
    location="us-central1",
)
 
tuning_job = client.tunings.tune(
    base_model="gemini-2.0-flash",
    training_dataset=types.TuningDataset(
        gcs_uri="gs://my-bucket/train.jsonl",
    ),
    config=types.CreateTuningJobConfig(
        epoch_count=4,
        validation_dataset=types.TuningValidationDataset(
            gcs_uri="gs://my-bucket/val.jsonl",
        ),
    ),
)
# Poll, then read the tuned model
tuning_job = client.tunings.get(name=tuning_job.name)
tuned_model = tuning_job.tuned_model.endpoint

The dataset arguments are now typed objects (types.TuningDataset, types.TuningValidationDataset) instead of bare GCS strings, and several field names changed — epochs is epoch_count, source_model is base_model. The accessor for the finished endpoint moved from tuned_model_endpoint_name to tuned_model.endpoint.
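
The "poll, then read" comment above deserves one concrete shape. A sketch using the types.JobState enum; the interval and failure handling are yours to choose:

import time

# Continues the tuning example above.
while tuning_job.state in (
    types.JobState.JOB_STATE_PENDING,
    types.JobState.JOB_STATE_RUNNING,
):
    time.sleep(60)
    tuning_job = client.tunings.get(name=tuning_job.name)

if tuning_job.state != types.JobState.JOB_STATE_SUCCEEDED:
    raise RuntimeError(f"tuning ended in {tuning_job.state}")
tuned_model = tuning_job.tuned_model.endpoint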

8. Before/after: caching

Explicit context caching — reusing a large shared prefix across many calls. Legacy:

# BEFORE — Vertex AI SDK (deprecated)
import vertexai
from vertexai.generative_models import GenerativeModel, Part
from vertexai.caching import CachedContent
import datetime
 
vertexai.init(project="my-project", location="us-central1")
 
# LARGE_SHARED_CONTEXT is the long shared prefix, e.g. a reference document
cached = CachedContent.create(
    model_name="gemini-2.0-flash",
    contents=[Part.from_text(LARGE_SHARED_CONTEXT)],
    ttl=datetime.timedelta(minutes=60),
)
 
model = GenerativeModel.from_cached_content(cached_content=cached)
response = model.generate_content("Question against the cached context.")
print(response.text)

Gen AI SDK:

# AFTER — Google Gen AI SDK
from google import genai
from google.genai import types
 
client = genai.Client(
    vertexai=True,
    project="my-project",
    location="us-central1",
)
 
cached = client.caches.create(
    model="gemini-2.0-flash",
    config=types.CreateCachedContentConfig(
        contents=[types.Part.from_text(text=LARGE_SHARED_CONTEXT)],
        ttl="3600s",
    ),
)
 
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Question against the cached context.",
    config=types.GenerateContentConfig(
        cached_content=cached.name,
    ),
)
print(response.text)

The structural change here is the one worth naming: in the legacy SDK you built a special model object from the cache (GenerativeModel.from_cached_content). In the Gen AI SDK there is no special object — you create the cache, get back a name, and pass that name as cached_content in an ordinary generate_content config. The cache is a parameter, not a model variant. Also note ttl is now a duration string ("3600s"), not a timedelta.
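
Because the cache is just a named resource, its lifecycle runs through the same namespace. A sketch; confirm the exact config class against your pinned SDK version before relying on it:

# Extend the TTL while the shared prefix is still hot
client.caches.update(
    name=cached.name,
    config=types.UpdateCachedContentConfig(ttl="7200s"),
)

# Drop it when the workload winds down; cached content bills for storage time
client.caches.delete(name=cached.name)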

9. Authentication and project configuration changes

This deserves its own section because it is the change that touches every file and the one most likely to be done thoughtlessly.

The legacy SDK's vertexai.init(project=..., location=...) set global module state. Every subsequent GenerativeModel(...) in the process picked up that configuration implicitly. It was convenient and it was a footgun — a library that called init() could silently reconfigure your application.

genai.Client is an explicit object. Configuration is passed at construction and scoped to that instance. This is strictly better for testability and for any process that talks to more than one project or region, but it changes how you structure code:

# A single shared client, constructed once at module load.
# This is the right default for most services.
from google import genai
 
_client = genai.Client(
    vertexai=True,
    project="my-project",
    location="us-central1",
)
 
def summarize(text: str) -> str:
    response = _client.models.generate_content(
        model="gemini-2.0-flash",
        contents=f"Summarize: {text}",
    )
    return response.text

The client reads Application Default Credentials the same way the legacy SDK did. Three practical notes:

  • Local development still uses gcloud auth application-default login. Nothing changes.
  • On Google Cloud compute (Cloud Run, GKE, Cloud Functions), the attached service account is picked up automatically. Nothing changes.
  • project and location can come from environment (GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_LOCATION) plus GOOGLE_GENAI_USE_VERTEXAI=true instead of constructor arguments — useful for keeping configuration out of code, but be explicit about which mechanism you chose so it is not ambiguous in review.
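
For the environment-variable route, construction collapses to a no-argument client. A minimal sketch, assuming the three variables are exported in the service's environment:

# Assumes in the environment:
#   GOOGLE_CLOUD_PROJECT=my-project
#   GOOGLE_CLOUD_LOCATION=us-central1
#   GOOGLE_GENAI_USE_VERTEXAI=true
from google import genai

client = genai.Client()  # configuration resolved from the environment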

The migration is your chance to replace scattered vertexai.init() calls with one deliberately-placed client. Take it.

10. Streaming and async patterns

Two API-surface changes that a search-and-replace will miss.

Streaming. The legacy SDK exposed streaming via a stream=True keyword on generate_content. The Gen AI SDK uses a separate method name:

# BEFORE — Vertex AI SDK (deprecated)
for chunk in model.generate_content(prompt, stream=True):
    print(chunk.text, end="")
 
# AFTER — Google Gen AI SDK
for chunk in client.models.generate_content_stream(
    model="gemini-2.0-flash",
    contents=prompt,
):
    print(chunk.text, end="")

There is no stream=True flag in the Gen AI SDK. If your grep for the migration only looks for generate_content, it finds the streaming calls — but the fix is "rename the method," not "change the import," so flag streaming call sites separately.

Async. Async support is namespaced under client.aio rather than expressed through separate async classes:

# AFTER — Google Gen AI SDK, async
response = await client.aio.models.generate_content(
    model="gemini-2.0-flash",
    contents=prompt,
)
 
# Async streaming
async for chunk in await client.aio.models.generate_content_stream(
    model="gemini-2.0-flash",
    contents=prompt,
):
    print(chunk.text, end="")

The client.aio surface mirrors the synchronous one method-for-method. If your service is built on asyncio, every call moves under aio — a mechanical change, but a pervasive one. Inventory async call sites in the audit so the estimate reflects them.

11. Parity test framework: validate before you cut over

This is the section that separates a clean migration from an incident. Because response.text works identically in both SDKs, the failure mode is not a crash — it is a behavioral drift that your smoke test cannot see. The defense is a parity harness: run a representative set of inputs through both SDKs and assert the outputs are equivalent before the legacy code leaves your repo.

# parity_check.py — run BEFORE decommissioning the legacy SDK.
# Requires both packages installed simultaneously, which is supported.
import vertexai
from vertexai.generative_models import GenerativeModel, GenerationConfig
from google import genai
from google.genai import types
 
PROJECT, LOCATION, MODEL = "my-project", "us-central1", "gemini-2.0-flash"
 
# --- legacy path ---
vertexai.init(project=PROJECT, location=LOCATION)
_legacy_model = GenerativeModel(MODEL)
 
def legacy_generate(prompt: str) -> dict:
    r = _legacy_model.generate_content(
        prompt,
        generation_config=GenerationConfig(temperature=0, max_output_tokens=512),
    )
    return {
        "text": r.text,
        "finish_reason": str(r.candidates[0].finish_reason),
        "total_tokens": r.usage_metadata.total_token_count,
    }
 
# --- new path ---
_client = genai.Client(vertexai=True, project=PROJECT, location=LOCATION)
 
def genai_generate(prompt: str) -> dict:
    r = _client.models.generate_content(
        model=MODEL,
        contents=prompt,
        config=types.GenerateContentConfig(temperature=0, max_output_tokens=512),
    )
    return {
        "text": r.text,
        "finish_reason": str(r.candidates[0].finish_reason),
        "total_tokens": r.usage_metadata.total_token_count,
    }
 
# --- the harness ---
GOLDEN_PROMPTS = [
    "Classify the sentiment of: 'the build broke again'. One word.",
    "Extract the date from: 'ship it by June 24'. ISO format only.",
    # ...load your real representative inputs, ideally from production logs
]
 
def run_parity():
    mismatches = []
    for prompt in GOLDEN_PROMPTS:
        old, new = legacy_generate(prompt), genai_generate(prompt)
        # temperature=0 makes text comparison meaningful but not guaranteed
        # identical — assert on structure, review text deltas by hand
        if old["finish_reason"] != new["finish_reason"]:
            mismatches.append((prompt, "finish_reason", old, new))
        if old["total_tokens"] != new["total_tokens"]:
            mismatches.append((prompt, "token_count", old, new))
        if old["text"].strip() != new["text"].strip():
            mismatches.append((prompt, "text", old, new))
    return mismatches
 
if __name__ == "__main__":
    failures = run_parity()
    if failures:
        for prompt, field, old, new in failures:
            print(f"MISMATCH [{field}] prompt={prompt!r}\n  old={old}\n  new={new}\n")
        raise SystemExit(1)
    print(f"PARITY OK across {len(GOLDEN_PROMPTS)} prompts")

Three principles for using this well:

  1. Set temperature=0. It makes text comparison meaningful. It does not make it guaranteed identical across SDKs — treat a text mismatch as "a human reviews this delta," not "the build is broken."
  2. Use your real inputs. GOLDEN_PROMPTS should be sampled from production logs, not invented. The discrepancies that matter live in the inputs your users actually send.
  3. Assert on structure first. finish_reason and token counts are exact, comparable, and the fastest signal that something diverged — a MAX_TOKENS finish that used to be STOP is a real behavior change hiding behind text that still looks plausible.

Wire this into CI on the migration branch. It should pass before the legacy SDK is removed and be deleted in the same commit that removes it.
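
The CI wiring can be a single test. A sketch assuming the harness above lives in parity_check.py:

# test_parity.py -- hypothetical CI wrapper around the harness
from parity_check import run_parity

def test_sdk_parity():
    mismatches = run_parity()
    assert not mismatches, f"{len(mismatches)} parity mismatch(es); first: {mismatches[0]}"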

12. The 6-week migration roadmap

For a team treating this as part-time work alongside their normal roadmap, six weeks is a realistic, unhurried shape. Smaller codebases compress it; the phases do not change.

Week 1–2 — Audit and inventory. Run the dependency grep across every repo and requirements.txt. Catalog every vertexai.init() call site, every module from the five-module table in use, every streaming and async call site. Produce one document: what is affected, in which services, owned by whom. Most teams find the blast radius is smaller than feared — and the audit is what proves it.

Week 3–4 — Code rewrite. Migrate service by service behind the parity harness. Install google-genai alongside the legacy package; both run in the same environment. Run a type checker over every migrated file — the keyword-argument and field-name changes are exactly what mypy and pyright catch.

Week 5 — Staging parity. Deploy the migrated code to staging with the legacy version still pinned and recoverable. Run the parity harness against production-sampled inputs. Watch latency and token counts, not just correctness — a region or model-default difference shows up here, not in code review.

Week 6 — Production cutover and decommission. Ship the migrated code. Watch error rates and latency for a few days against your recorded baseline. Once stable, remove google-cloud-aiplatform from your dependencies (if nothing else needs it), delete the parity harness, and close the migration.

This shape will look familiar if you have seen our six-week pilot methodology — the audit-build-validate-ship rhythm is the same discipline applied to a migration instead of a greenfield agent.

13. Common migration pitfalls and how to avoid them

The mechanical changes are covered above. These are the ones that pass review and surface in production.

Response object shape changes. response.text is stable. Almost everything around it is worth re-checking. If your code reaches into response.candidates[...], inspects safety_ratings, reads usage_metadata, or pattern-matches on finish_reason, verify each field name and enum value against the Gen AI SDK's types — the response is a different class, and a KeyError or an AttributeError on a rarely-hit branch is the classic post-migration incident. This is the entire reason section 11 asserts on structure.
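
A short probe makes that re-verification concrete. A sketch of the fields most codebases reach for; run it against a real response from your migrated path:

candidate = response.candidates[0]
print(candidate.finish_reason)    # types.FinishReason enum, e.g. STOP or MAX_TOKENS
print(candidate.safety_ratings)   # list of types.SafetyRating, may be None

usage = response.usage_metadata
print(usage.prompt_token_count, usage.candidates_token_count, usage.total_token_count)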

Model naming differences. Audit the literal model strings in your code. If you are on a PaLM-era model (text-bison, chat-bison, textembedding-gecko) through vertexai.language_models, there is no "same model, new SDK" — you are moving to a current Gemini model, and that is a model change requiring its own eval pass. Do not let a model upgrade ride along invisibly inside an SDK migration; name it, scope it, eval it.

Region availability gaps. Confirm the model you call is available in the location you pass. The legacy SDK and the Gen AI SDK both honor your region, but the set of models available per region shifts as Google rolls capabilities out. A model that resolved in us-central1 under the old SDK is not guaranteed to resolve in europe-west4 under the new one — check the regional availability for your target models before staging, not during.

Forgetting vertexai=True. genai.Client() without it talks to the Gemini Developer API — a different service, different auth, different quota — not your Vertex backend. Set it explicitly (or GOOGLE_GENAI_USE_VERTEXAI=true) and assert on it in a startup check.
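
That startup check can be one assertion. A sketch; the vertexai property reflects how the client was configured, and if your pinned version does not expose it, assert on your own configuration value instead:

from google import genai

client = genai.Client(vertexai=True, project="my-project", location="us-central1")
assert client.vertexai, "client is routing to the Gemini Developer API, not Vertex"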

Treating vertexai.init() removal as cosmetic. It changes configuration from global to instance-scoped. If you had code relying on one module's init() configuring another module's calls, that coupling is now broken — correctly, but breaking it is a real change, not a no-op. Make client construction deliberate.

14. After cutover: when to move the agent layer to ADK

Finishing the SDK migration leaves you with a clean question that the AI Overview will not raise: should the orchestration sitting on top of these calls also move?

The Google Gen AI SDK is the right tool for model calls — generate, stream, tune, cache, embed. It is not an agent framework. If what you have around those calls is a hand-rolled loop — your own tool-dispatch logic, your own state passed through dictionaries, your own retry and trace plumbing — the migration just put you in the ideal position to evaluate the Agent Development Kit (ADK) v1.0 on Gemini Enterprise Agent Platform, where tool calling, session state, evaluation, and deployment to Agent Runtime are first-class instead of bespoke.

The signal is concrete. Audit the agent layer the same way you audited the SDK layer, and if you see hand-rolled tool dispatch, ad-hoc state management, or trace instrumentation you wrote yourself, that is the orchestration ADK is built to replace. The honest scoping point: do not bundle an orchestration rewrite into the SDK migration's six weeks — finish the mechanical migration, ship it, stabilize it, then decide on the agent layer as its own scoped piece of work. Two clean migrations beat one entangled one.

When that decision is made, the next thing you want is a worked example. Our guide to building production agents with ADK takes one agent from an empty directory to a deployed Agent Runtime endpoint with an eval set in CI — the file-by-file picture of the orchestration layer the Gen AI SDK alone does not give you. The Engineering Code Review blueprint is that pattern applied to a real engineering workflow: a single-agent build with Memory Bank and an eval set. The full set of agent blueprints covers the other shapes, and the platform overview maps how the Build, Scale, Govern, and Optimize pillars fit around code your team owns.

15. Frequently asked questions

What actually happens if we miss the June 24, 2026 deadline?

It does not stop running on June 25 — old pinned versions of google-cloud-aiplatform keep installing from PyPI. What you lose is forward motion: no bug fixes, no security patches, and no addressability for Gemini models released after the deadline through the vertexai.* generative surface. You are running unmaintained code against a backend that keeps moving. The risk is not an immediate outage; it is accumulating, unsupported drift.

Does the deprecation affect our deployed endpoints and infrastructure, or just the SDK calls?

Only the SDK calls. The deprecation is scoped to the generative surface of the Vertex AI SDK for Python — the vertexai.* modules. Your deployed endpoints, your tuned models, your project configuration, and the non-generative parts of google-cloud-aiplatform (training, pipelines, registry, feature store, batch prediction) are not part of this change. You are rewriting how your code calls the platform, not redeploying what runs on it.

Does this guide apply to SDKs in languages other than Python?

The principle is the same — the generative SDK is being superseded by the Google Gen AI SDK — but this guide's code is Python-specific (google-cloud-aiplatform to google-genai). The Gen AI SDK is available across multiple languages, and the conceptual moves are identical: drop the global init, construct an explicit client, treat the model as an argument, re-check the response object shape. The exact package names, idioms, and async patterns differ per language. A Java codebase follows the same roadmap with Java-specific code.

Do fine-tuned models created with the legacy SDK still work after migration?

Yes. A model you fine-tuned through the legacy SDK is a resource that lives in your Google Cloud project — it is not coupled to the SDK that created it. After migrating, you address that same tuned model by its resource name through client.models.generate_content. You do not re-run the tuning job and you do not lose the weights. What you do re-test is inference against the tuned model, because the call path changed even though the checkpoint did not.
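
That re-test is an ordinary generate_content with the endpoint resource name passed as the model. A sketch, resource name illustrative:

response = client.models.generate_content(
    model="projects/my-project/locations/us-central1/endpoints/1234567890",
    contents="Smoke-test prompt for the tuned model.",
)
print(response.text)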

Can the legacy SDK and the Google Gen AI SDK be installed side by side?

Yes, and you should rely on it. google-cloud-aiplatform and google-genai are separate packages that install into the same environment without conflict. That is precisely what makes a safe cutover possible: you run both, send the same inputs through each, and assert the outputs match — the parity harness in section 11 — before the legacy code leaves your repo. Migrate service by service with both installed, then remove the legacy package once everything is verified.

Should the move to ADK happen in the same migration?

Decide it as a separate piece of work, not a bundled one. The SDK migration is mechanical and bounded — finish it, ship it, stabilize it first. Then look at the orchestration layer: if it is a hand-rolled loop with your own tool dispatch, ad-hoc state, and self-written trace plumbing, moving to the Agent Development Kit (ADK) v1.0 on Gemini Enterprise Agent Platform is worth scoping, because ADK makes tool calling, session state, evaluation, and Agent Runtime deployment first-class. But scope it on its own timeline. Two clean migrations beat one entangled one.
