Microsoft just open-sourced bitnet.cpp, a 1-bit LLM inference framework. It let's you run 100B parameter models on your local CPU without GPUs. 6.17x faster inference and 82.2% less energy on CPUs.
Microsoft just open-sourced bitnet.cpp, a 1-bit LLM inference framework. It let's you run 100B parameter models on your local CPU without GPUs. 6.17x faster inference and 82.2% less energy on CPUs.
github.com
GitHub - microsoft/BitNet: Official inference framework for 1-bit LLMs
