LM Studio
LM Studio
GUI-first local model runner. Downloads and runs open-weight models via a desktop interface. Targets people who want local inference without command-line setup. Complementary to ollama, not a direct replacement — they serve different workflows.
Current version: 0.4.15. Available for Apple Silicon macOS, x64/ARM64 Windows, and x64 Linux.
Core Features
- Model browser: Search and download from Hugging Face directly in the UI
- Chat interface: Built-in chat UI for interactive use
- Local server: OpenAI-compatible API server at
http://localhost:1234/v1— base URL swap makes it compatible with most tooling - GGUF support: Runs quantized models in GGUF format, the standard for local inference
- Headless CLI:
llmsterprovides a CLI interface for headless/server use cases
Vs. Ollama
| LM Studio | Ollama | |
|---|---|---|
| Interface | GUI + API | CLI + API |
| Model discovery | In-app browser | Pull by name |
| Scripting | Awkward | Native |
| Setup ease | Easier for non-CLI users | Easier for developers |
| Customization | Limited | More configurable |
Note: LM Studio's API runs on port 1234 (http://localhost:1234/v1), not 11434 — that's ollama's port. Keep this straight when configuring tooling.
Choose LM Studio when: you're evaluating models interactively and want a visual interface. Choose ollama when: you're building pipelines or need reliable scripting integration.
Practical Use
Good for: testing model capabilities before committing to a choice, comparing responses from multiple models side-by-side, non-technical users who need local inference, macOS with Apple Silicon (LM Studio has good MLX support for Apple Silicon).
The local API server makes it usable as an ollama substitute in scripts if you prefer the GUI for management:
from openai import OpenAI
client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")
Hardware Notes
Same hardware requirements as Ollama — VRAM determines what you can run. LM Studio does a good job surfacing model requirements before download and warns if a model won't fit in available VRAM.
Apple Silicon support is notably good — the unified memory architecture (up to 192GB on M3 Ultra) lets you run larger models than discrete GPU setups with equivalent VRAM. LM Studio uses Apple's MLX framework on Apple Silicon for optimized inference.
Strengths
- Lowest friction for getting started with local models
- Good model discovery UI
- Strong macOS/Apple Silicon optimization (MLX)
- OpenAI-compatible API
Weaknesses
- Not scriptable for pipelines — use Ollama for that
- GUI adds overhead that doesn't make sense for headless/server use
- Slower to update with new model support compared to Ollama
Related
ollama · gemma-4 · deepseek-r1 · rag