thih9 67 days ago | on: Ollama is now powered by MLX on Apple Silicon in p...

> it also would use less electricity

How would it use less electricity? I’d like to learn more.

jychang 67 days ago

That's not true at all. Running an LLM on device would use MORE electricity.

Service providers that run batch>1 inference get far more tokens per joule, because each pass over the model weights is shared across every request in the batch.

Local inference is effectively limited to batch=1, which is very inefficient.
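
To make the batching argument concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a toy model in which every decode step pays a fixed energy cost to stream the weights once, plus a small per-sequence compute cost. The function name and both joule figures are made up for illustration, not measurements of any real hardware, and the model ignores KV-cache traffic, idle power, and datacenter overhead such as cooling and networking.

```python
def energy_per_token(
    batch_size: int,
    weight_read_joules: float = 1.0,         # hypothetical: cost of reading all weights once per step
    per_token_compute_joules: float = 0.05,  # hypothetical: extra compute cost per sequence per step
) -> float:
    """Toy estimate of energy per generated token at a given batch size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # One decode step produces one token for each sequence in the batch.
    step_energy = weight_read_joules + batch_size * per_token_compute_joules
    # The fixed weight-read cost is split across the whole batch.
    return step_energy / batch_size


if __name__ == "__main__":
    for batch in (1, 8, 64):
        print(f"batch={batch}: {energy_per_token(batch):.4f} J/token (toy numbers)")
```

Under these assumptions the per-token cost falls toward the per-sequence compute cost as the batch grows, which is the effect described above. Whether that outweighs the overheads this sketch leaves out depends on real numbers that are not given here.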
