prima.cpp is a new C++ project that enables fast inference for 30-70B-parameter LLMs on a range of home devices. It focuses on efficient on-device deployment, so that large models can run without high-end hardware. The project aims to make LLMs more accessible to developers and enthusiasts.