About this tool

llama.cpp is a local LLM inference engine that runs GGUF models on CPU and GPU.

Listed under API & Models, llama.cpp suits both individuals and teams, who can use it through its website or through API integrations.

Related tools