Google Cloud
  • Generative AI on Vertex AI