Cheapest Mac to run Llama 4 Scout
109B parameters (17B active) · quality index 30 · coding 30 · every Apple Silicon Mac ever sold, compared. It is a mixture-of-experts model: all 109B parameters must sit in memory, but only 17B are used to compute each token, which is why it generates faster than a dense model of similar size.
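To illustrate why the active-parameter count matters, here is a rough sketch that treats decoding as purely memory-bandwidth-bound, so each token costs one pass over the weights that are actually read. The bits-per-weight figure and the 400 GB/s bandwidth (Apple's published figure for the M2 Max) are assumptions chosen for illustration. The result is a ceiling, not a prediction, and measured speeds will be lower.

```python
# Rough, bandwidth-bound ceiling on decode speed. Everything except the
# parameter counts is an assumption made for illustration.

def decode_ceiling_tok_s(params_read_billion: float,
                         bits_per_weight: float,
                         bandwidth_gb_s: float) -> float:
    """Upper bound on tokens/s if each token streams the given weights once."""
    gb_per_token = params_read_billion * bits_per_weight / 8  # GB read per token
    return bandwidth_gb_s / gb_per_token

BITS_PER_WEIGHT = 4.85   # assumed average for a Q4_K_M-style quantization
BANDWIDTH_GB_S = 400.0   # assumed: M2 Max memory bandwidth

moe = decode_ceiling_tok_s(17, BITS_PER_WEIGHT, BANDWIDTH_GB_S)     # 17B active
dense = decode_ceiling_tok_s(109, BITS_PER_WEIGHT, BANDWIDTH_GB_S)  # dense 109B

print(f"MoE ceiling:   {moe:.1f} tok/s")
print(f"Dense ceiling: {dense:.1f} tok/s")
print(f"Ratio:         {moe / dense:.1f}x")
```

Under this simplification the speed-up depends only on the ratio of total to active parameters, whatever the quantization or bandwidth.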
The cheapest Mac that runs Llama 4 Scout comfortably is a used Mac Studio M2 Max 12c/38c 96GB (2023), at an estimated $1,874 on the used market. It runs the Q4_K_M quantization at roughly 19 tok/s with up to 43K tokens of context.
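As a rough check on why 96GB is the entry point, the sketch below estimates the size of the weights at each quantization level and picks the largest one that fits. The bits-per-weight values are approximate averages for llama.cpp quantization types, and `usable_fraction` is a hypothetical allowance for macOS and other overhead. Both are assumptions, not the site's actual method.

```python
from typing import Optional

PARAMS_BILLION = 109  # total parameters; all must be resident for an MoE model

# Approximate average bits per weight (assumed; real GGUF files vary slightly).
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q6_K": 6.56, "Q8_0": 8.5}

def weights_gb(quant: str) -> float:
    """Estimated size of the weights alone, in GB."""
    return PARAMS_BILLION * BITS_PER_WEIGHT[quant] / 8

def best_quant(ram_gb: float, usable_fraction: float = 0.75) -> Optional[str]:
    """Largest quantization whose weights fit in the usable share of RAM.

    usable_fraction is a hypothetical figure for how much unified memory
    the weights may occupy; whatever remains must also hold the KV cache.
    """
    budget = ram_gb * usable_fraction
    fitting = [q for q in BITS_PER_WEIGHT if weights_gb(q) <= budget]
    return max(fitting, key=weights_gb) if fitting else None

for ram in (64, 96, 128, 192):
    print(ram, "GB ->", best_quant(ram))
```

The memory left after the weights bounds the KV cache, which is why larger machines also reach longer context.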
Every Mac that runs it, by used price
| Machine | Used price | Runs at | Est. speed | Max context |
|---|---|---|---|---|
| Mac Studio M2 Max 12c/38c 96GB (2023) | $1,874 EST. | Q4_K_M | 19 tok/s | 43K tokens |
| Mac Studio M4 Max 16c/40c 128GB (2025) | $2,511 EST. | Q6_K | 21 tok/s | 70K tokens |
| MacBook Pro M2 Max 12c/38c 96GB (2023) | $2,687 EST. | Q4_K_M | 19 tok/s | 43K tokens |
| Mac Studio M1 Ultra 20c/48c 128GB (2022) | $2,951 EST. | Q6_K | 30 tok/s | 70K tokens |
| Mac Studio M2 Ultra 24c/60c 128GB (2023) | $2,999 EST. | Q6_K | 30 tok/s | 70K tokens |
| MacBook Pro M3 Max 16c/40c 128GB (2023) | $3,062 EST. | Q6_K | 15 tok/s | 70K tokens |
| Mac Studio M3 Ultra 28c/60c 96GB (2025) | $3,139 EST. | Q4_K_M | 40 tok/s | 43K tokens |
| Mac Studio M1 Ultra 20c/64c 128GB (2022) | $3,197 EST. | Q6_K | 30 tok/s | 70K tokens |
| Mac Studio M2 Ultra 24c/76c 128GB (2023) | $3,249 EST. | Q6_K | 30 tok/s | 70K tokens |
| Mac Studio M2 Ultra 24c/60c 192GB (2023) | $3,499 EST. | Q8_0 | 21 tok/s | 139K tokens |
See all 16 machines, other price bases, and live currency conversion →
Run it
Once you have Ollama installed, `ollama run llama4:scout` pulls the default (≈Q4) build.
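Once the model has been pulled, it can also be queried from a script through Ollama's local HTTP API. The following is a minimal sketch that assumes the server is listening on its default address, `http://localhost:11434`. The helper names `build_request` and `generate` are hypothetical.

```python
import json
import urllib.request

# Default local endpoint of a running Ollama server (assumed unchanged).
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt: str, model: str = "llama4:scout") -> urllib.request.Request:
    """Build a non-streaming generate request for the given prompt."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def generate(prompt: str) -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Summarize mixture-of-experts in one sentence.")` returns the model's reply as a string, provided the server is running and the model is available locally.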
Or skip the hardware
Cloud APIs serve Llama 4 Scout at about $0.30 per million output tokens. The main site's break-even solver computes the daily usage above which owning a Mac becomes cheaper than renting.
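The idea behind a break-even calculation can be sketched as follows. This is not the site's solver. It is a simplified model that spreads the purchase price over an assumed ownership period and ignores electricity, resale value, and input-token pricing. The three-year period is a hypothetical choice. The price and speed are the estimates quoted above for the used Mac Studio M2 Max.

```python
SECONDS_PER_DAY = 86_400

def break_even_tokens_per_day(mac_price_usd: float,
                              api_usd_per_million: float,
                              ownership_days: int) -> float:
    """Daily output tokens at which API spend equals the Mac's daily cost."""
    daily_cost = mac_price_usd / ownership_days
    return daily_cost / api_usd_per_million * 1_000_000

def max_tokens_per_day(tok_per_s: float) -> float:
    """Most tokens the Mac could generate running around the clock."""
    return tok_per_s * SECONDS_PER_DAY

# $1,874 used price, $0.30 per million tokens, assumed 3 years of ownership.
need = break_even_tokens_per_day(1874, 0.30, ownership_days=3 * 365)
cap = max_tokens_per_day(19)  # estimated 19 tok/s

print(f"Break-even usage: {need:,.0f} tokens/day")
print(f"Mac's capacity:   {cap:,.0f} tokens/day")
```

Comparing the two numbers shows whether break-even is reachable at all under the assumed ownership period. A longer period or a faster machine shifts the balance toward owning.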
Open the interactive guide — speed simulator, TCO, all 75 machines →