Cheapest Mac to run GPT-OSS 120B
117B parameters (5.1B active) · quality index 38 · coding 62 · every Apple Silicon Mac ever sold, compared. GPT-OSS 120B is a mixture-of-experts model: all 117B parameters must sit in memory, but only 5.1B are used to compute each token, which is why it generates faster than dense models of similar size.
The cheapest Mac that runs GPT-OSS 120B comfortably is the Mac Studio M4 Max 16c/40c 128GB (2025), at an estimated $2,511 on the used market. It runs Q5_K_M quantization at roughly 74 tok/s with up to 75K tokens of context.
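To see why 128GB is the practical floor, a rough weight-memory estimate can help. The sketch below multiplies the 117B parameter count by an approximate bits-per-weight figure for each quantization. The 5.5 figure for Q5_K_M is an assumed average (published values vary by model), and the result covers weights only, not the KV cache or macOS itself, so real files may differ.

```python
# Rough weight-memory estimate: parameters x bits-per-weight / 8.
# Bits-per-weight values are approximations assumed for illustration.
TOTAL_PARAMS = 117e9  # every expert must be resident, not just the active 5.1B

BITS_PER_WEIGHT = {
    "Q5_K_M": 5.5,  # assumed average; mixed-precision format
    "Q8_0": 8.5,    # 8-bit weights plus one 16-bit scale per 32-weight block
}


def weight_gb(quant: str, params: float = TOTAL_PARAMS) -> float:
    """Approximate size of the weights in decimal gigabytes."""
    return params * BITS_PER_WEIGHT[quant] / 8 / 1e9


for quant in BITS_PER_WEIGHT:
    print(f"{quant}: ~{weight_gb(quant):.0f} GB of weights")
```

Under these assumptions Q5_K_M needs about 80 GB and Q8_0 about 124 GB for weights alone. That would leave almost no headroom for context on a 128GB machine at Q8_0, which is consistent with the table listing Q8_0 only for the 192GB configurations.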
Every Mac that runs it, by used price
| Machine | Used price | Runs at | Est. speed | Max context |
|---|---|---|---|---|
| Mac Studio M4 Max 16c/40c 128GB (2025) | $2,511 EST. | Q5_K_M | 74 tok/s | 75K tokens |
| Mac Studio M1 Ultra 20c/48c 128GB (2022) | $2,951 EST. | Q5_K_M | 109 tok/s | 75K tokens |
| Mac Studio M2 Ultra 24c/60c 128GB (2023) | $2,999 EST. | Q5_K_M | 109 tok/s | 75K tokens |
| MacBook Pro M3 Max 16c/40c 128GB (2023) | $3,062 EST. | Q5_K_M | 55 tok/s | 75K tokens |
| Mac Studio M1 Ultra 20c/64c 128GB (2022) | $3,197 EST. | Q5_K_M | 109 tok/s | 75K tokens |
| Mac Studio M2 Ultra 24c/76c 128GB (2023) | $3,249 EST. | Q5_K_M | 109 tok/s | 75K tokens |
| Mac Studio M2 Ultra 24c/60c 192GB (2023) | $3,499 EST. | Q8_0 | 71 tok/s | 98K tokens |
| Mac Studio M2 Ultra 24c/76c 192GB (2023) | $3,749 EST. | Q8_0 | 71 tok/s | 98K tokens |
| MacBook Pro M4 Max 16c/40c 128GB (2024) | $3,772 EST. | Q5_K_M | 74 tok/s | 75K tokens |
| MacBook Pro M5 Max 18c/32c 128GB (2026) | $4,199 EST. | Q5_K_M | 84 tok/s | 75K tokens |
See all 12 machines, other price bases, and live currency conversion →
Run it
With Ollama installed, `ollama run gpt-oss:120b` downloads the default (≈Q4) build on first use and then opens an interactive prompt.
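To check a machine against the estimated speeds in the table, one option is to call the local Ollama server over HTTP and read the timing fields it returns. The sketch below assumes Ollama is running on its default port (11434) and that the model has already been pulled. The helper names `build_payload`, `tokens_per_second`, and `generate` are hypothetical and only for illustration.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local endpoint


def build_payload(prompt: str, model: str = "gpt-oss:120b") -> bytes:
    """Encode a non-streaming generate request as JSON bytes."""
    body = {"model": model, "prompt": prompt, "stream": False}
    return json.dumps(body).encode("utf-8")


def tokens_per_second(response: dict) -> float:
    """Generation speed from eval_count and eval_duration (nanoseconds)."""
    return response["eval_count"] / response["eval_duration"] * 1e9


def generate(prompt: str) -> dict:
    """POST the prompt to a running Ollama server and parse the JSON reply."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)
```

With the server running, `result = generate("Hello")` returns a dictionary whose `response` key holds the generated text, and `tokens_per_second(result)` gives the measured generation speed. Because this uses the default build rather than Q5_K_M, the measured figure may not match the table's estimates.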
Or skip the hardware
Cloud APIs serve GPT-OSS 120B at about $0.60 per million output tokens. The main site's break-even solver computes the daily usage at which owning a Mac becomes cheaper than renting.
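The break-even idea can be sketched with simple arithmetic. This is not the site's solver, only a minimal illustration under loud assumptions: it ignores electricity, input-token charges, and resale value, and it uses the cheapest machine's estimated price and speed from the table. The three-year horizon is an arbitrary choice for the example.

```python
# Break-even sketch: one-off hardware cost vs. paying per output token.
# Ignores electricity, input-token charges, and resale value (assumptions).
MAC_PRICE_USD = 2511.0    # Mac Studio M4 Max 128GB, estimated used price
API_USD_PER_MTOK = 0.60   # cloud price per million output tokens
LOCAL_TOK_PER_S = 74.0    # estimated local generation speed


def break_even_tokens(mac_price: float, usd_per_mtok: float) -> float:
    """Output tokens at which API spend equals the Mac's purchase price."""
    return mac_price / usd_per_mtok * 1e6


def daily_tokens_to_break_even(mac_price: float, usd_per_mtok: float, days: int) -> float:
    """Tokens per day needed for the Mac to pay for itself within `days`."""
    return break_even_tokens(mac_price, usd_per_mtok) / days


total = break_even_tokens(MAC_PRICE_USD, API_USD_PER_MTOK)
per_day_3y = daily_tokens_to_break_even(MAC_PRICE_USD, API_USD_PER_MTOK, 3 * 365)
max_per_day = LOCAL_TOK_PER_S * 86_400  # ceiling if generating around the clock

print(f"break-even volume: {total / 1e9:.1f}B tokens")
print(f"needed per day over 3 years: {per_day_3y / 1e6:.1f}M tokens")
print(f"local ceiling per day: {max_per_day / 1e6:.1f}M tokens")
```

Under these assumptions the Mac pays for itself after roughly 4.2 billion output tokens. Spread over three years that is about 3.8 million tokens per day, against a ceiling of about 6.4 million tokens per day if the machine generated nonstop at 74 tok/s. On output-token cost alone, ownership only wins at sustained heavy usage.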
Open the interactive guide — speed simulator, TCO, all 75 machines →