One plugin for endless model choices

Simplify your LLM development by using one Genkit plugin to access models like Claude, Mistral, Gemini, and Llama from Vertex AI's Model Garden. Learn how to switch between large language models without the hassle of rotating API keys or tracking multiple quotas.

Being able to swap between models and try out different ones has a lot of appeal. Bouncing between Claude, Mistral, Gemini, and Llama without rotating API keys, worrying about an accidental key leak, or tracking separate quotas across different consoles is a boon as well. In this article we take a look at a Genkit plugin that lets you switch models through a single plugin and provider.

Model Garden from Vertex AI

Model Garden is a collection of AI and ML models that lets you run select partner models in a serverless environment. If a serverless deployment is not available, you also have the option to host the model through Vertex AI, which handles much of the configuration for you. The serverless models available at the time of this writing come from Anthropic (Claude), Meta (Llama), and Mistral AI (Mistral). All you need to do is go into Model Garden, select your model, and enable it.

How do you get the model into Genkit?

To get the model into Genkit, install the Vertex AI plugin for Genkit in your project and configure the vertexModelGarden plugin that comes within that package. Once it’s configured, call generate in one of your flows and pass in the model with the vertexModelGarden.model() method, using the model name found at the end of the model ID. In my case, since I wanted to use a Mistral model, I looked at the model ID on the publisher page, which is publishers/mistralai/models/mistral-medium-3, and got the mistral-medium-3 model name from there.
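As a quick illustration, the name you pass to vertexModelGarden.model() is just the last path segment of the full model ID shown on the publisher page:

```typescript
// Full model ID as shown on the Mistral publisher page in Model Garden.
const modelId = "publishers/mistralai/models/mistral-medium-3";

// The last path segment is the name you hand to vertexModelGarden.model().
const modelName = modelId.split("/").pop();

console.log(modelName); // mistral-medium-3
```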

import { genkit, z } from "genkit";
import { vertexModelGarden } from "@genkit-ai/vertexai/modelgarden";

// Configure Genkit with the Model Garden plugin, pointing at your
// Google Cloud project and region.
export const ai = genkit({
    plugins: [
        vertexModelGarden({
            projectId: "YOUR_PROJECT_ID",
            location: "DATA_CENTER_LOCATION",
        }),
    ],
});

export const generateSeaShantyFlow = ai.defineFlow(
    {
        name: "generateSeaShantyFlow",
        inputSchema: z.string().describe("Topic for the sea shanty"),
        outputSchema: z.string().describe("The generated sea shanty"),
    },
    async (topic) => {
        // Reference the serverless model by the last segment of its model ID.
        const response = await ai.generate({
            model: vertexModelGarden.model("mistral-medium-3"),
            prompt: `Write a lively sea shanty about ${topic}.`,
            config: { temperature: 0.8 },
        });
        return response.text;
    }
);

Conclusion

With one plugin, you have now enabled a wide variety of models without needing to remember or manage a separate API key for each one. Since these models run on Google infrastructure, you get the same value and protections offered by Vertex AI, and you can use different models as you see fit.
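To make the swap concrete, here is a small sketch of keeping your provider choices in one map so that switching models is a one-string change. Only mistral-medium-3 comes from this article; the other entries are placeholders, so look up the real model IDs on each publisher's Model Garden page.

```typescript
// One place to record each provider's serverless model name.
// "mistral-medium-3" is the name used in this article; the others are
// illustrative placeholders -- replace them with real IDs from Model Garden.
const serverlessModels = {
    mistral: "mistral-medium-3",
    anthropic: "CLAUDE_MODEL_NAME",
    meta: "LLAMA_MODEL_NAME",
} as const;

type Provider = keyof typeof serverlessModels;

// Swapping providers then becomes a one-string change, e.g.:
//   ai.generate({ model: vertexModelGarden.model(modelFor("anthropic")), ... })
function modelFor(provider: Provider): string {
    return serverlessModels[provider];
}

console.log(modelFor("mistral")); // mistral-medium-3
```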

Give Model Garden a try and let the Genkit team know what you think!

Learn more by visiting the documentation at genkit.dev!
