
> Sorry to shatter your bubble, but this is patently false: LLMs are far more efficient on hardware that serves many requests simultaneously.

You might want to read this: https://arxiv.org/abs/2502.05317v2


