Changelog

Beta releases

LM Studio 0.4.17

Build 4

[LM Studio Engine Protocol] Fixed a bug where assistant output was replayed during a continuation request
[LM Studio Engine Protocol] Off by default, including for users upgrading from beta builds. You can re-enable it in Settings > Developer.
Add ability to open mermaid diagrams in full screen and save to PNG
Fix bug in /v1/responses with previous_response_id causing excessive prompt processing

Build 3

[LM Studio Engine Protocol] Existing custom Chat Template settings in My Models are now preserved when upgrading
[LM Studio Engine Protocol] Fixed a bug where unsupported GGUF prediction settings could appear in Chat, My Models, and Server configuration panels
- Chat Template is now in Load Parameters > Advanced, moved from Prediction Parameters > Prompt Template
- CPU Thread Pool Size is now in Load Parameters > Advanced, moved from Prediction Parameters > Settings > CPU Threads
- Speculative Decoding settings are now in Load Parameters > Advanced > Speculative Decoding, moved from Prediction Parameters > Speculative Decoding
[LM Studio Engine Protocol] Fixed automatic chat title generation for some reasoning models
[LM Studio Engine Protocol] Fixed a bug where RAG document retrieval could fail with some llama.cpp models
Fix bug where survey failures due to old ROCm versions led to a stuck state of no-GPUs.

Build 2

iGPUs while using Vulkan backend are now visible and disabled by default
AMD Strix Halo machines are now supported via the llama.cpp 2.22.1 runtime
AMD Radeon AI PRO R9600D and R9700 are now supported via the llama.cpp 2.22.1 runtime
Fixed bug where AMD GPUs were not detected due to driver updates.

Build 1

[LM Studio Engine Protocol] Added support for prompt template overrides when loading GGUF models
[LM Studio Engine Protocol] Added support for load-time speculative decoding with vision model support
[LM Studio Engine Protocol] Enabled by default for supported llama.cpp models
[LM Studio Engine Protocol] Fix the bug where model would load on the iGPU while using Vulkan on Windows
Chat to PDF export updated to include styled markdown
Added support for Mermaid diagram rendering in chat markdown

Jun 25, 2026

LM Studio 0.4.16

Build 2

Lm Link no longer requires waitlisting
Updated default context length to 8k tokens

Build 1

Introducing Locally, LM Studio's mobile app. Available on iPhone and iPad.
- Use LM Link in Locally to take your largest LM Studio models on the go
Security hardening
[GGUF] Fix multi-GPU selection bugs affecting GPU ON/OFF and Priority Order on some CUDA 12, ROCm, and Vulkan setups

Jun 8, 2026

LM Studio 0.4.16

Build 1

Introducing Locally, LM Studio's mobile app. Available on iPhone and iPad.
- Use LM Link in Locally to take your largest LM Studio models on the go
Security hardening
[GGUF] Fix multi-GPU selection bugs affecting GPU ON/OFF and Priority Order on some CUDA 12, ROCm, and Vulkan setups

Jun 4, 2026

LM Studio 0.4.15

Build 2

[CUDA] Added tensor parallelism support for multi-GPU model loading
[llama.cpp] Added a Physical Batch Size advanced load option
Fixed REST API requests hanging when HTTP/2 clients sent upgrade headers
Fixed image attachments appearing in reverse order after sending a chat message
Fixed a bug where double clicking model trigger would re-open it
Fixed tool type error bug while using codex --oss
Fixed invalid role error while using Claude Code. /v1/messages API now supports system messages in the messages array

Build 1

LM Studio Engine Protocol beta 2
- New architecture to enable us to ship more frequent engine updates
- Turn it on in Settings > Developer > Enable LM Studio Engine Protocol
Fixed a bug which dropped prompt cache on every message when using Claude Code, improving performance significantly
Fixed a bug when using theme picker overlay over other modals
Fixed a z-index bug in the Models Table scroll
Security hardening

May 29, 2026

LM Studio 0.4.14

Build 4

Stable release of MTP Speculative Decoding!
- Speeds up generation with models that include built-in multi-token prediction heads
- To try it out, download an MTP-capable model
Fixed an issue with non-MTP speculative decoding error while MTP was enabled
Fixed a bug where lms get gemma4 would not show any results
lms chat now shows which LM Link device each remote model is on

Build 3

Fixed a chat UI bug that could remove whitespace when using MTP

Build 2

Beta release of MTP Speculative Decoding
Fixed token exchange failure for few MCPs in OAuth flow

Build 1

Beta build of LM Studio Engine Protocol

May 22, 2026

LM Studio 0.4.13

Build 1

[MLX] mlx-engine v1.8.1 significantly improves performance and adds parallel predictions for vision-capable models such as Qwen 3.5/3.6 and Gemma 4
Fixed a bug where newlines were compacted in the chat input on paste
Bug fixes and security hardening. This update is recommended for all users.

May 13, 2026

LM Studio 0.4.12

Build 1

Support for Qwen 3.6
Improved style in chat PDF exports
Fixed bug where MCP servers with OAuth would not work on some Windows environments
Improved Qwen 3.5 performance with OpenAI-compatible /v1/chat/completions, /v1/responses and Anthropic-compatible /v1/messages

Apr 17, 2026

LM Studio 0.4.11

Build 1

Support for updated Gemma 4 chat template

Apr 10, 2026

LM Studio 0.4.10

Build 1

Improve Gemma 4 tool call reliability
Add OAuth support for MCP servers

Apr 9, 2026

LM Studio 0.4.9

Build 1

Improve Gemma 4 tool call reliability
Add support for Anthropic-compatible v1/messages output_config.effort (low, medium, high, max)
Fixed a bug where deleting a chat folder would sometimes freeze the UI
Fixed a bug where markdown Link popovers would appear at the top of the window

Apr 2, 2026

Changelog 👾

LM Studio 0.4.17

LM Studio 0.4.16

LM Studio 0.4.16

LM Studio 0.4.15

LM Studio 0.4.14

LM Studio 0.4.13

LM Studio 0.4.12

LM Studio 0.4.11

LM Studio 0.4.10

LM Studio 0.4.9

Changelog