LM Studio 0.4.17LM Studio 0.4.17
Build 4
- [LM Studio Engine Protocol] Fixed a bug where assistant output was replayed during a continuation request
- [LM Studio Engine Protocol] Off by default, including for users upgrading from beta builds. You can re-enable it in Settings > Developer.
- Add ability to open mermaid diagrams in full screen and save to PNG
- Fix bug in
/v1/responses with previous_response_id causing excessive prompt processing
Build 3
- [LM Studio Engine Protocol] Existing custom Chat Template settings in My Models are now preserved when upgrading
- [LM Studio Engine Protocol] Fixed a bug where unsupported GGUF prediction settings could appear in Chat, My Models, and Server configuration panels
- Chat Template is now in Load Parameters > Advanced, moved from Prediction Parameters > Prompt Template
- CPU Thread Pool Size is now in Load Parameters > Advanced, moved from Prediction Parameters > Settings > CPU Threads
- Speculative Decoding settings are now in Load Parameters > Advanced > Speculative Decoding, moved from Prediction Parameters > Speculative Decoding
- [LM Studio Engine Protocol] Fixed automatic chat title generation for some reasoning models
- [LM Studio Engine Protocol] Fixed a bug where RAG document retrieval could fail with some llama.cpp models
- Fix bug where survey failures due to old ROCm versions led to a stuck state of no-GPUs.
Build 2
- iGPUs while using Vulkan backend are now visible and disabled by default
- AMD Strix Halo machines are now supported via the llama.cpp 2.22.1 runtime
- AMD Radeon AI PRO R9600D and R9700 are now supported via the llama.cpp 2.22.1 runtime
- Fixed bug where AMD GPUs were not detected due to driver updates.
Build 1
- [LM Studio Engine Protocol] Added support for prompt template overrides when loading GGUF models
- [LM Studio Engine Protocol] Added support for load-time speculative decoding with vision model support
- [LM Studio Engine Protocol] Enabled by default for supported llama.cpp models
- [LM Studio Engine Protocol] Fix the bug where model would load on the iGPU while using Vulkan on Windows
- Chat to PDF export updated to include styled markdown
- Added support for Mermaid diagram rendering in chat markdown
Jun 25, 2026
LM Studio 0.4.16LM Studio 0.4.16
Build 2
- Lm Link no longer requires waitlisting
- Updated default context length to 8k tokens
Build 1
- Introducing Locally, LM Studio's mobile app. Available on iPhone and iPad.
- Use LM Link in Locally to take your largest LM Studio models on the go
- Security hardening
- [GGUF] Fix multi-GPU selection bugs affecting GPU ON/OFF and Priority Order on some CUDA 12, ROCm, and Vulkan setups
Jun 8, 2026
LM Studio 0.4.16LM Studio 0.4.16
Build 1
- Introducing Locally, LM Studio's mobile app. Available on iPhone and iPad.
- Use LM Link in Locally to take your largest LM Studio models on the go
- Security hardening
- [GGUF] Fix multi-GPU selection bugs affecting GPU ON/OFF and Priority Order on some CUDA 12, ROCm, and Vulkan setups
Jun 4, 2026
LM Studio 0.4.15LM Studio 0.4.15
Build 2
- [CUDA] Added tensor parallelism support for multi-GPU model loading
- [llama.cpp] Added a Physical Batch Size advanced load option
- Fixed REST API requests hanging when HTTP/2 clients sent upgrade headers
- Fixed image attachments appearing in reverse order after sending a chat message
- Fixed a bug where double clicking model trigger would re-open it
- Fixed tool type error bug while using
codex --oss
- Fixed invalid role error while using Claude Code.
/v1/messages API now supports system messages in the messages array
Build 1
- LM Studio Engine Protocol beta 2
- New architecture to enable us to ship more frequent engine updates
- Turn it on in Settings > Developer > Enable LM Studio Engine Protocol
- Fixed a bug which dropped prompt cache on every message when using Claude Code, improving performance significantly
- Fixed a bug when using theme picker overlay over other modals
- Fixed a z-index bug in the Models Table scroll
- Security hardening
May 29, 2026
LM Studio 0.4.14LM Studio 0.4.14
Build 4
- Stable release of MTP Speculative Decoding!
- Speeds up generation with models that include built-in multi-token prediction heads
- To try it out, download an MTP-capable model
- Fixed an issue with non-MTP speculative decoding error while MTP was enabled
- Fixed a bug where
lms get gemma4 would not show any results
lms chat now shows which LM Link device each remote model is on
Build 3
- Fixed a chat UI bug that could remove whitespace when using MTP
Build 2
- Beta release of MTP Speculative Decoding
- Fixed token exchange failure for few MCPs in OAuth flow
Build 1
- Beta build of LM Studio Engine Protocol
May 22, 2026
LM Studio 0.4.13LM Studio 0.4.13
Build 1
- [MLX] mlx-engine v1.8.1 significantly improves performance and adds parallel predictions for vision-capable models such as Qwen 3.5/3.6 and Gemma 4
- Fixed a bug where newlines were compacted in the chat input on paste
- Bug fixes and security hardening. This update is recommended for all users.
May 13, 2026
LM Studio 0.4.12LM Studio 0.4.12
Build 1
- Support for Qwen 3.6
- Improved style in chat PDF exports
- Fixed bug where MCP servers with OAuth would not work on some Windows environments
- Improved Qwen 3.5 performance with OpenAI-compatible
/v1/chat/completions, /v1/responses and Anthropic-compatible /v1/messages
Apr 17, 2026
LM Studio 0.4.11LM Studio 0.4.11
Build 1
- Support for updated Gemma 4 chat template
Apr 10, 2026
LM Studio 0.4.10LM Studio 0.4.10
Build 1
- Improve Gemma 4 tool call reliability
- Add OAuth support for MCP servers
Apr 9, 2026
LM Studio 0.4.9LM Studio 0.4.9
Build 1
- Improve Gemma 4 tool call reliability
- Add support for Anthropic-compatible
v1/messages output_config.effort (low, medium, high, max)
- Fixed a bug where deleting a chat folder would sometimes freeze the UI
- Fixed a bug where markdown Link popovers would appear at the top of the window
Apr 2, 2026