LM Studio
## what it does
LM Studio is a desktop application for discovering, downloading, and running LLMs locally. It wraps llama.cpp under the hood but exposes a clean GUI — a ChatGPT-style chat interface, a model browser connected to Hugging Face, and a developer mode that serves an OpenAI-compatible local API.
With more than 3.8 million downloads, it is one of the most popular ways to run local models without touching the command line.
Installation
Download from lmstudio.ai for macOS, Windows, or Linux. No package manager required.
Key features
Model browser
LM Studio connects to Hugging Face’s model hub and surfaces GGUF-format models with recommended quantization levels based on your hardware. Select a model, click Download — it handles the rest.
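To illustrate why the quantization level decides whether a model fits on your hardware, here is a back-of-the-envelope sketch that estimates the size of a model's weights from its parameter count and bits per weight. The function name and bit widths are illustrative assumptions, not values taken from LM Studio. Real GGUF files also carry metadata, and inference needs extra memory for context.

```python
def estimate_weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in gigabytes (10^9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Hypothetical 7-billion-parameter model at a few illustrative bit widths.
for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{estimate_weights_gb(7, bits):.1f} GB")
```

Under these assumptions, halving the bits per weight halves the weight size, which is why a lower-bit quantization can bring a model within reach of a machine with less memory.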
Chat interface
A clean ChatGPT-style chat window with multiple conversation threads, a system prompt editor, temperature and top-p sliders, and a token count display.
Local API server
Enable Developer Mode and LM Studio runs a local server on port 1234:
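```python
from openai import OpenAI

# Point the SDK at LM Studio's local server; the API key is a placeholder value.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="local-model",  # use the model currently loaded in LM Studio
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```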
The API works with any OpenAI-compatible client: the official Python and TypeScript SDKs, or plain HTTP tools such as curl.
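Because the server speaks the OpenAI wire format, no SDK is strictly needed. The sketch below shows roughly what such a request looks like using only the Python standard library. It assumes the server is on port 1234 as described above. The helper names `build_chat_request` and `send_chat`, the `local-model` identifier, and the default temperature are illustrative choices, not part of LM Studio.

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234/v1"  # port taken from the text above


def build_chat_request(prompt: str, model: str = "local-model",
                       temperature: float = 0.7) -> urllib.request.Request:
    """Build (but do not send) a POST request for /chat/completions."""
    payload = {
        "model": model,  # placeholder identifier, as in the SDK example
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(prompt: str) -> str:
    """Send the request and return the assistant's reply text.

    Requires the local server to be running with a model loaded.
    """
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(send_chat("Hello!"))` would print the model's reply. Building the request separately from sending it makes it possible to inspect the payload without a live server.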
Multi-model loading
LM Studio 0.3+ supports loading multiple models simultaneously (hardware permitting) and switching between them without reloading.
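With several models loaded, a client has to name the one it wants in the `model` field of each request. One way to discover the available identifiers, assuming the server follows the OpenAI convention for `GET /v1/models`, is sketched below. The helper names and the sample identifiers are hypothetical.

```python
import json
import urllib.request


def model_ids(models_payload: dict) -> list[str]:
    """Extract model identifiers from an OpenAI-style /v1/models response."""
    return [entry["id"] for entry in models_payload.get("data", [])]


def fetch_model_ids(base_url: str = "http://localhost:1234/v1") -> list[str]:
    """Ask a running server which models it exposes."""
    with urllib.request.urlopen(f"{base_url}/models") as resp:
        return model_ids(json.load(resp))


# Offline demonstration with a made-up response; the ids are hypothetical.
sample = {"object": "list", "data": [{"id": "model-a"}, {"id": "model-b"}]}
print(model_ids(sample))
```

Any identifier returned by `fetch_model_ids()` can then be passed as the `model` argument of a chat completion request.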
vs Ollama
| Feature | LM Studio | Ollama |
|---|---|---|
| GUI | ✓ Built-in | ✗ (third-party UIs) |
| CLI | ✗ Limited | ✓ Full |
| API | ✓ OpenAI-compat | ✓ OpenAI-compat |
| Headless / Docker | ✗ | ✓ |
| License | Proprietary (free) | MIT |
LM Studio is the right choice if you want a GUI-first experience. Ollama is better for server/headless setups and scripted automation.
## platforms
8 GB of RAM is recommended. On Mac, Apple Silicon (M1 or later) gives fast inference; on Windows and Linux, an NVIDIA GPU provides acceleration.