Models
ChatBotKit supports various models to create engaging conversational AI experiences. These include foundational OpenAI models such as o1, GPT-4o, GPT-4, and GPT-3.5, along with models from Anthropic, Mistral, Groq, Facebook, DeepSeek, and others. Additionally, ChatBotKit uses several of its own models, including text-algo-005 and text-algo-004, for its in-house general assistant.
Below is a table that summarizes the different models. It includes their names, short descriptions, and token ratios (a multiplier used to compute billed tokens, explained below).
| Model Name | Short Description | Token Ratio |
|---|---|---|
| gpt-5.2 | GPT-5.2 is the fast, lightweight member of the 5.2 family, optimized for low-latency chat while retaining strong general intelligence. | 0.7778 |
| gpt-5.1 | GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-purpose reasoning, improved instruction adherence, and a more natural conversational style compared to GPT-5. | 0.5556 |
| gpt-5 | GPT-5 is the next generation language model with enhanced reasoning capabilities and improved performance across all domains including coding, mathematics, science, and creative tasks. | 0.5556 |
| gpt-5-mini | GPT-5 Mini is the cost-efficient version of GPT-5, offering excellent performance for most tasks while being faster and more affordable than GPT-5. | 0.1111 |
| gpt-5-nano | GPT-5 Nano is the most lightweight and fastest model in the GPT-5 family, optimized for simple tasks requiring quick responses with minimal computational overhead. | 0.0222 |
| o4-mini | o4-mini is the latest small o-series model. It's optimized for fast, effective reasoning with exceptionally efficient performance in coding and visual tasks. | 0.2444 |
| o3 | o3 is a well-rounded and powerful model across domains. It sets a new standard for math, science, coding, and visual reasoning tasks. It also excels at technical writing and instruction-following. | 0.4444 |
| gpt-4.1-nano | GPT-4.1 nano is the fastest, most cost-effective GPT-4.1 model. | 0.0222 |
| gpt-4.1-mini | GPT-4.1 mini provides a balance between intelligence, speed, and cost that makes it an attractive model for many use cases. | 0.0889 |
| gpt-4.1 | GPT-4.1 is OpenAI's flagship model for complex tasks. It is well suited for problem solving across domains. | 0.4444 |
| gpt-4.5 | GPT-4.5 excels at tasks that benefit from creative, open-ended thinking and conversation, such as writing, learning, or exploring new ideas. | 8.3333 |
| o3-mini | o3-mini is a cost-efficient reasoning model that's optimized for coding, math, and science, and supports tools and Structured Outputs. | 0.2444 |
| o1 | o1 is a powerful reasoning model that supports tools, Structured Outputs, and vision. The model has 200K context and an October 2023 knowledge cutoff. | 3.3333 |
| gpt-4o-mini | GPT-4o mini is OpenAI's most cost-efficient small model that's smarter and cheaper than GPT-3.5 Turbo, and has vision capabilities. The model has 128K context and an October 2023 knowledge cutoff. | 0.0333 |
| gpt-4o | GPT-4o is faster and cheaper than GPT-4 Turbo with stronger vision capabilities. The model has 128K context and an October 2023 knowledge cutoff. | 0.5556 |
| gpt-4-turbo | GPT-4 Turbo is offered at 128K context with an April 2023 knowledge cutoff and basic support for vision. | 1.6667 |
| gpt-4 | The GPT-4 model was built with broad general knowledge and domain expertise. | 3.3333 |
| gpt-3.5-turbo | GPT-3.5 Turbo is a fast, inexpensive model for simpler tasks. | 0.0833 |
| gpt-3.5-turbo-instruct | GPT-3.5 Turbo Instruct is a fast, inexpensive model for simpler tasks. | 0.1111 |
| mistral-large-latest | Top-tier reasoning for high-complexity tasks. The most powerful model of the Mistral AI family. | 0.6667 |
| mistral-small-latest | Cost-efficient reasoning for low-latency workloads. | 0.1667 |
| llama-3.3-70b-versatile | Llama 3.3 is an auto-regressive language model that uses an optimized transformer architecture. | 0.0439 |
| gemini-3-flash | Gemini 3 Flash Preview is a high-speed, high-value thinking model designed for agentic workflows, multi-turn chat, and coding assistance. | 1 |
| gemini-3-pro | Gemini 3 Pro is Google’s flagship frontier model for high-precision multimodal reasoning, combining strong performance across text, image, video, audio, and code with a 1M-token context window. | 1 |
| gpt-5.1-codex-max | GPT-5.1-Codex-Max is OpenAI's latest agentic coding model, designed for long-running, high-context software development tasks. | 0.5556 |
| gpt-5.1-codex-mini | GPT-5.1-Codex-Mini is a smaller and faster version of GPT-5.1-Codex. | 0.1111 |
| gpt-5.1-codex | GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks. | 0.5556 |
| gpt-5-codex | GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks. | 0.5556 |
| claude-4.5-opus | Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineering, agentic workflows, and long-horizon computer use. It offers strong multimodal capabilities, competitive performance across real-world coding and reasoning benchmarks, and improved robustness to prompt injection. | 1.3889 |
| claude-4.1-opus | Claude Opus 4.1 is Anthropic's most powerful model, with enhanced capabilities for complex reasoning, coding, and creative tasks. | 4.1667 |
| claude-4-opus | Claude Opus 4 is benchmarked as the world’s best coding model, at time of release, bringing sustained performance on complex, long-running tasks and agent workflows. | 4.1667 |
| claude-4.5-sonnet | Claude 4.5 Sonnet: advanced Sonnet tuned for agents, long coding and sustained reasoning. | 0.8333 |
| claude-4-sonnet | Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision and controllability. | 0.8333 |
| claude-3.7-sonnet | Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. | 0.8333 |
| claude-3.5-sonnet | Anthropic's most intelligent and advanced model, Claude 3.5 Sonnet, demonstrates exceptional capabilities across a diverse range of tasks and evaluations while also outperforming Claude 3 Opus. | 0.8333 |
| claude-4.5-haiku | Claude Haiku 4.5 is Anthropic's fastest and most efficient model, delivering near-frontier intelligence at a fraction of the cost and latency of larger Claude models. Matching Claude Sonnet 4's performance across reasoning, coding, and computer-use tasks, Haiku 4.5 brings frontier-level capability to real-time and high-volume applications. | 0.2778 |
| claude-3.5-haiku | Anthropic's fastest, most compact model for near-instant responsiveness. It answers simple queries and requests with speed. | 0.2222 |
| gemini-2.5-flash | A capable and inexpensive, multi-modal model with great performance across all tasks, with a 1 million token context window, and built for the era of Agents. | 0.1944 |
| gemini-2.5-pro | A capable multi-modal model with great performance across all tasks, with a 1 million token context window, and built for the era of Agents. | 0.8333 |
| deepseek-chat-v3-0324 | DeepSeek V3, a 685B-parameter mixture-of-experts model, is the latest iteration of the flagship chat model family from the DeepSeek team. It succeeds the original DeepSeek V3 release and performs well on a wide variety of tasks. | 0.0489 |
| deepseek-r1-distill-llama-70b | Top-tier reasoning for high-complexity tasks. The most powerful model of the DeepSeek AI family. | 0.055 |
| sonar-deep-research | Deep Research conducts comprehensive, expert-level research and synthesizes it into accessible, actionable reports. | 0.4444 |
| sonar-reasoning-pro | Premier reasoning offering powered by DeepSeek R1 with Chain of Thought (CoT) and advanced search grounding. | 0.4444 |
| sonar-reasoning | Premier reasoning offering powered by DeepSeek R1 with Chain of Thought (CoT). | 0.2778 |
| sonar-pro | Premier search offering with search grounding, supporting advanced queries and follow-ups. | 0.8333 |
| sonar | Lightweight offering with search grounding, quicker and cheaper than Sonar Pro. | 0.0556 |
| gemini-2.0-flash | A capable multi-modal model with great performance across all tasks, with a 1 million token context window, and built for the era of Agents. | 0.0222 |
| gemini-2.0-flash-lite | Smallest and most cost-effective model, built for at-scale usage. | 0.0167 |
| gemini-1.5-flash | Fast multi-modal model with great performance for diverse, repetitive tasks and a 1 million token context window. | 0.0167 |
| gemini-1.5-pro | Highest intelligence Gemini 1.5 series model, with a breakthrough 2 million token context window. | 0.2778 |
| claude-v3-opus | Anthropic's most powerful AI model, with top-level performance on highly complex tasks. It can navigate open-ended prompts and sight-unseen scenarios with remarkable fluency and human-like understanding. | 4.1667 |
| claude-v3-sonnet | Claude 3 Sonnet strikes the ideal balance between intelligence and speed, particularly for enterprise workloads. It offers maximum utility and is engineered to be a dependable choice for scaled AI deployments. | 0.8333 |
| claude-v3-haiku | Anthropic's fastest, most compact model for near-instant responsiveness. It answers simple queries and requests with speed. | 0.0694 |
| claude-v3 | Claude 3 Sonnet strikes the ideal balance between intelligence and speed, particularly for enterprise workloads. It offers maximum utility and is engineered to be a dependable choice for scaled AI deployments. | 0.8333 |
| claude-v2.1 | Claude 2.1 is a large language model (LLM) by Anthropic with a 200K token context window, reduced hallucination rates, and improved accuracy over long documents. | 1.3333 |
| claude-v2 | Claude 2.0 is a leading LLM from Anthropic that enables a wide range of tasks from sophisticated dialogue and creative content generation to detailed instruction. | 1.3333 |
| claude-instant-v1 | Claude Instant is Anthropic's faster, lower-priced yet very capable LLM. | 0.1333 |
| custom | Any custom model created by the user. | 0.0028 |
| text-qaa-web-001 | Fast and efficient question and answer model with web search grounding. | 0.0556 |
| text-qaa-005 | This model belongs to the GPT-4o mini family of ChatBotKit models. It is designed for question and answer applications. The model has a token limit of 128000 and provides a balance between cost and quality. It is a custom model based on the gpt model architecture. | 0.8333 |
| text-qaa-004 | This model belongs to the GPT-4o family of ChatBotKit models. It is designed for question and answer applications. The model has a token limit of 128000 and provides a balance between cost and quality. It is a custom model based on the gpt model architecture. | 0.8333 |
| text-qaa-003 | This model belongs to the GPT-4 Turbo family of ChatBotKit models. It is designed for question and answer applications. The model has a token limit of 128000 and provides a balance between cost and quality. It is a custom model based on the gpt model architecture. | 1.6667 |
| text-qaa-002 | This model belongs to the GPT-4 family of ChatBotKit models. It is designed for question and answer applications. The model has a token limit of 8,000 and provides a balance between cost and quality. It is a custom model based on the gpt model architecture. | 3.3333 |
| text-qaa-001 | This model belongs to the GPT 3.5 Turbo family of ChatBotKit models. It is designed for question and answer applications. The model has a token limit of 4000 and provides a balance between cost and quality. It is a custom model based on the gpt model architecture. | 0.0833 |
| gpt-image-1-mini | GPT Image 1 Mini is a state-of-the-art image generation model. It is a natively multimodal language model that accepts both text and image inputs, and produces image outputs. | 0.4444 |
| gpt-image-1 | GPT Image 1 is a state-of-the-art image generation model. It is a natively multimodal language model that accepts both text and image inputs, and produces image outputs. | 2.2222 |
| dalle3 | This model is based on the DALL-E 3 architecture. It is a high-quality model that can generate images from text. It is tunable and offers a balance between cost and quality. | 1 |
| dalle2 | This model is based on the DALL-E 2 architecture. It is a high-quality model that can generate images from text. It is tunable and offers a balance between cost and quality. | 1 |
| stablediffusion | This model is based on the Stable Diffusion architecture. It is a high-quality model that can generate images from text. It is tunable and offers a balance between cost and quality. | 1 |
About our latest models
We aim to keep this page up-to-date. The latest list of supported models and their configurations can be found here.
About token costs
The token ratio serves as a key indicator for token cost. A higher ratio corresponds to a more expensive token type.
ChatBotKit uses the token ratio as a multiplier to calculate the actual number of tokens consumed by the model. Each model token is multiplied by the token ratio to determine the number of tokens ChatBotKit records. This ensures accurate tracking of the resources each model uses and correct user billing.
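To illustrate the calculation, here is a minimal sketch. The `billedTokens` function and the small ratio table are illustrative only (the ratios are copied from the table above, and the rounding behavior is an assumption), not part of the ChatBotKit API:

```typescript
// Illustrative token-ratio table; values taken from the table above.
const TOKEN_RATIOS: Record<string, number> = {
  "gpt-4o": 0.5556,
  "gpt-4o-mini": 0.0333,
  "gpt-3.5-turbo": 0.0833,
};

// Multiply the raw model tokens by the model's token ratio to get the
// number of tokens recorded for billing. Rounding up is an assumption
// made for this sketch.
function billedTokens(model: string, modelTokens: number): number {
  const ratio = TOKEN_RATIOS[model];
  if (ratio === undefined) {
    throw new Error(`Unknown model: ${model}`);
  }
  return Math.ceil(modelTokens * ratio);
}

// 1000 gpt-4o tokens are recorded as roughly 556 billed tokens,
// while 1000 gpt-4o-mini tokens are recorded as roughly 34.
console.log(billedTokens("gpt-4o", 1000));
console.log(billedTokens("gpt-4o-mini", 1000));
```

This also shows why the ratio matters when choosing a model: at these ratios, the same raw usage on gpt-4o consumes over fifteen times more recorded tokens than on gpt-4o-mini.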
The context size refers to the maximum number of tokens (pieces of words or symbols) the model can consider when generating a response. A larger context size allows more information to be taken into account, potentially leading to more accurate and relevant responses.
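As a sketch of what this means in practice, a client might trim conversation history so that it fits within a model's context window. The `Message` shape and `fitToContext` helper below are hypothetical, not ChatBotKit APIs:

```typescript
// Hypothetical message shape; `tokens` is a precomputed token count.
interface Message {
  role: string;
  text: string;
  tokens: number;
}

// Keep the most recent messages whose combined token count fits within
// the model's context size, dropping the oldest messages first.
function fitToContext(messages: Message[], contextSize: number): Message[] {
  const kept: Message[] = [];
  let total = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (total + messages[i].tokens > contextSize) break;
    total += messages[i].tokens;
    kept.unshift(messages[i]);
  }
  return kept;
}
```

A model with a larger context size simply allows more of the history to survive this kind of trimming, which is why context size is worth weighing alongside cost.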
When choosing a model, it's essential to evaluate not just its capabilities, but also its cost and size. Larger and more expensive models aren't always the best choice for every task. Often, a smaller model can perform equally well or even better. As a rule of thumb, gpt-4o and gpt-4 are the best choices if you need the most advanced and capable model. However, if you're looking for a capable model that's also smaller, gpt-4o-mini might be a better fit.
Bring Your Own Model
ChatBotKit offers the option of bringing your own model and keys to the platform. This feature is designed for those who want more control over their models and costs. If you have a model that you've trained and refined for your specific use case, you can bring it to the platform and use your own keys, which lets you pay for model usage directly. This is especially useful if you have particular budget constraints or cost strategies. In short, you're not limited to our pre-built models: you can also introduce your own custom models for more flexibility and control.
Here is an outline of the steps required to create your own custom model.
1. Navigate to the Bot Configuration Screen
   - From the main dashboard, click on the "Bots" section in the left-hand menu.
   - Select the bot you want to configure or create a new bot.
2. Choose the Model
   - Under the "Model" section, select "custom" from the dropdown menu as shown in the first screenshot.
   - Press the "Settings" button.