Cheapest Mac to run Kimi K2.5
1000B parameters (32B active) · quality index 48 · coding 65 · every Apple Silicon Mac ever sold, compared. It is a mixture-of-experts model: all 1000B parameters must sit in memory, but only about 32B are used to compute each token, which is why it generates faster than a dense model of similar size.
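A rough way to see the speed claim: token generation is usually limited by memory bandwidth, so a ceiling on tokens per second is bandwidth divided by the bytes read per token. The sketch below assumes ≈Q4 weights at 4.5 bits each and a hypothetical 800 GB/s of memory bandwidth. Neither figure comes from this page, and real throughput (especially across a cluster) will be lower.

```python
def decode_tokens_per_second(active_params_b, bits_per_weight, bandwidth_gb_s):
    """Bandwidth-bound ceiling: each token reads every active weight once."""
    gb_read_per_token = active_params_b * bits_per_weight / 8  # billions of params -> GB
    return bandwidth_gb_s / gb_read_per_token

BITS = 4.5         # assumption: average bits per weight for a ~Q4 build
BANDWIDTH = 800.0  # assumption: hypothetical memory bandwidth in GB/s

moe = decode_tokens_per_second(32, BITS, BANDWIDTH)      # 32B active per token
dense = decode_tokens_per_second(1000, BITS, BANDWIDTH)  # hypothetical dense 1000B model

print(f"MoE ceiling:   {moe:.1f} tok/s")
print(f"Dense ceiling: {dense:.1f} tok/s")
print(f"Ratio:         {moe / dense:.1f}x")
```

Under these assumptions the ratio is simply total parameters over active parameters, independent of the bandwidth chosen.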
No single Mac can hold Kimi K2.5 at practical quantizations: it needs ≈650GB resident. Running it locally means clustering multiple machines (see the cluster planner on the main site). The alternative is a cloud API at about $2/1M output tokens.
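The memory arithmetic can be sketched as follows. The 4.5 bits per weight, the per-machine memory sizes, and the 75% usable fraction are illustrative assumptions, not figures from this page; only the ≈650GB resident figure is taken from the text. The page does not break that figure down, so the gap between the weights-only estimate and 650GB is presumably runtime overhead such as the KV cache.

```python
import math

def weights_gb(total_params_b, bits_per_weight):
    """Weight storage in GB for a model with total_params_b billion parameters."""
    return total_params_b * bits_per_weight / 8

def machines_needed(resident_gb, memory_per_machine_gb, usable_fraction):
    """Smallest number of identical machines whose usable memory covers resident_gb."""
    usable_gb = memory_per_machine_gb * usable_fraction
    return math.ceil(resident_gb / usable_gb)

RESIDENT_GB = 650      # resident footprint quoted in the text
USABLE_FRACTION = 0.75 # assumption: share of unified memory available to the model

print(weights_gb(1000, 4.5))                               # weights alone, assumed 4.5 bits/weight
print(machines_needed(RESIDENT_GB, 512, USABLE_FRACTION))  # hypothetical 512GB machines
print(machines_needed(RESIDENT_GB, 256, USABLE_FRACTION))  # hypothetical 256GB machines
```

This ignores how the model is actually split across machines and any per-node duplication, so treat the counts as lower bounds.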
Run it
Once Ollama is installed, ollama run kimi-k2.5 pulls the default (≈Q4) build.
Or skip the hardware
Cloud APIs serve Kimi K2.5 at about $2 per million output tokens. The main site's break-even solver computes the daily usage above which owning a Mac becomes cheaper than renting.
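A minimal sketch of the break-even idea, not the site's actual solver: it finds the daily output-token volume at which API spend over the hardware's lifetime equals the hardware price. The $20,000 hardware cost and three-year lifetime are hypothetical placeholders, and the calculation ignores electricity, input-token charges, and resale value.

```python
def break_even_tokens_per_day(hardware_cost_usd, lifetime_days, api_usd_per_million):
    """Daily output tokens at which lifetime API spend equals the hardware cost."""
    api_usd_per_token = api_usd_per_million / 1_000_000
    return hardware_cost_usd / (lifetime_days * api_usd_per_token)

API_PRICE = 2.0      # USD per 1M output tokens, from the text
HARDWARE = 20_000.0  # assumption: hypothetical hardware price in USD
LIFETIME = 3 * 365   # assumption: hypothetical 3-year amortization, in days

tokens = break_even_tokens_per_day(HARDWARE, LIFETIME, API_PRICE)
print(f"Break-even: {tokens / 1e6:.1f}M output tokens/day")
```

Below that daily volume the API is cheaper under these assumptions; above it, owning the hardware wins.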