Generative AI on Vertex AI
Discover
Overview of Generative AI on Vertex AI
Generative AI beginner's guide
Glossary
Get started
Get an API key
Configure application default credentials
API quickstart
Vertex AI Studio quickstart
Migrate from Google AI Studio to Vertex AI
Deploy your Vertex AI Studio prompt as a web application
Vertex AI Studio capabilities
Generate an image and verify its watermark using Imagen
Google GenAI libraries
Compatibility with OpenAI library
Vertex AI in express mode
Overview
Console tutorial
API tutorial
Select models
Model Garden
Overview of Model Garden
Use models in Model Garden
Test model capabilities
Supported models
Google Models
Overview
Gemini
Gemini 2.5 Pro
Gemini 2.5 Flash
Gemini 2.5 Flash Image
Gemini 2.5 Flash Live API
Gemini 2.5 Flash-Lite
Gemini 2.0 Flash
Gemini 2.0 Flash-Lite
Vertex AI Model Optimizer
Migrate to the latest Gemini models
SDKs
Imagen
Imagen 3.0 Generate 002
Imagen 3.0 Generate 001
Imagen 3.0 Fast Generate 001
Imagen 3.0 Capability 001
Imagen 4.0 Generate
Imagen 4.0 Fast Generate
Imagen 4.0 Ultra Generate
Virtual Try-On Preview 08-04
Imagen product recontext preview 06-30
Migrate to Imagen 3
Veo
Veo 2
Veo 2 Preview
Veo 2 Experimental
Veo 3
Veo 3 Fast
Veo 3 preview
Veo 3 Fast preview
Veo 3.1 preview
Veo 3.1 Fast preview
Lyria
Lyria 2
Model versions
Managed models
Model as a Service (MaaS) overview
Partner models
Overview
Claude
Overview
Request predictions
Batch predictions
Prompt caching
Count tokens
Web search
Safety classifiers
Model details
Claude Sonnet 4.5
Claude Opus 4.1
Claude Haiku 4.5
Claude Opus 4
Claude Sonnet 4
Claude 3.7 Sonnet
Claude 3.5 Haiku
Claude 3 Haiku
Mistral AI
Overview
Model details
Mistral Medium 3
Mistral OCR (25.05)
Mistral Small 3.1 (25.03)
Codestral 2
Open models
Overview
Grant access to open models
Models
DeepSeek
Overview
DeepSeek-R1-0528
DeepSeek-V3.1
OpenAI
Overview
OpenAI gpt-oss-120b
OpenAI gpt-oss-20b
Qwen
Overview
Qwen 3 Next Instruct 80B
Qwen 3 Next Thinking 80B
Qwen 3 Coder
Qwen 3 235B
Embedding (E5)
Multilingual E5 Small
Multilingual E5 Large
Llama
Overview
Request predictions
Model details
Llama 4 Maverick
Llama 4 Scout
Llama 3.3
Llama 3.2
Llama 3.1 405b
Llama 3.1 70b
Llama 3.1 8b
Model deprecations (MaaS)
API
Call MaaS APIs for open models
Function calling
Thinking
Structured output
Batch prediction
Self-deployed models
Overview
Deploy models with custom weights
Google Gemma
Use Gemma
Tutorial: Deploy and inference Gemma (GPU)
Tutorial: Deploy and inference Gemma (TPU)
Llama
Use Hugging Face Models
Comprehensive guide to vLLM for Text and Multimodal LLM Serving (GPU)
vLLM TPU
Hex-LLM
xDiT
Tutorial: Deploy Llama 3 models with Spot VMs and Reservations
Model Garden notebooks
Tutorial: Optimize model performance with advanced features in Model Garden
Build
Agents
Overview
Agent Development Kit
Overview
Agent Engine
Overview
Runtime
Quickstart using Agent Development Kit
Quickstart
Set up the environment
Develop an agent
Overview
Agent Development Kit
Agent2Agent
LangChain
LangGraph
AG2
LlamaIndex
Custom
Deploy an agent
Use an agent
Overview
Agent Development Kit
Agent2Agent
LangChain
LangGraph
AG2
LlamaIndex
Custom
Manage deployed agents
Overview
Access control
Tracing
Logging
Monitoring
Bidirectional streaming
Using Private Service Connect interface
Evaluate an agent
Sessions
Sessions overview
Manage sessions using Agent Development Kit
Manage sessions using API calls
Memory Bank
Overview
Set up Memory Bank
Quickstart with Agent Engine SDK
Quickstart with Agent Development Kit
Generate memories
Fetch memories
Troubleshooting
Example Store
Example Store overview
Example Store quickstart
Create or reuse an Example Store instance
Upload examples
Retrieve examples
Code Execution
Code Execution overview
Code Execution quickstart
Getting help
Troubleshoot setting up the environment
Troubleshoot developing an agent
Troubleshoot deploying an agent
Troubleshoot using an agent
Troubleshoot managing deployed agents
Troubleshoot Code Execution
Get support
Agent2Agent (A2A) Protocol
Overview
A2A Python SDK
A2A JavaScript SDK
A2A Java SDK
A2A C#/.NET SDK
A2A samples
Agent Tools
Built-in tools
Google Cloud tools
Model Context Protocol (MCP) tools
MCP Toolbox for Databases
Ecosystem tools
Prompt design
Introduction to prompting
Prompting strategies
Overview
Give clear and specific instructions
Use system instructions
Include few-shot examples