hoof strips a large language model down to exactly what your task needs — and packages it as a single, offline, dependency-free executable.
We analyse the model, remove everything irrelevant to your task, then wrap it in a self-contained executable.
Tell us the model and the task — translation, summarisation, code, vision, whatever you need.
hoof performs surgical pruning: vocabulary, attention heads, and layers irrelevant to your task are removed.
You receive a single executable. No Python. No GPU. No internet. Double-click and it runs.
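To make the vocabulary part of the pruning step concrete, here is a minimal sketch in Python. It is an illustration of the general idea only, not hoof's actual implementation: the function names `prune_vocabulary` and `remap_tokens`, the list-of-lists embedding table, and the choice to keep every token seen in sample task data are all assumptions made for this example.

```python
def prune_vocabulary(embeddings, task_token_ids, always_keep=()):
    """Keep only the embedding rows for tokens that the task actually uses.

    embeddings     : list of rows, one per token ID (hypothetical layout)
    task_token_ids : token IDs observed in sample task data
    always_keep    : IDs to retain regardless, e.g. an unknown-token ID

    Returns (pruned_embeddings, old_to_new), where old_to_new maps each
    retained original token ID to its row index in the pruned table.
    """
    keep = sorted(set(task_token_ids) | set(always_keep))
    for token_id in keep:
        if not 0 <= token_id < len(embeddings):
            raise ValueError(f"token ID {token_id} is outside the vocabulary")
    old_to_new = {old: new for new, old in enumerate(keep)}
    pruned = [embeddings[old] for old in keep]
    return pruned, old_to_new


def remap_tokens(token_ids, old_to_new, unk_id):
    """Translate original token IDs to pruned IDs.

    Tokens that were pruned away fall back to the unknown-token ID,
    which must have been retained via always_keep.
    """
    unk_new = old_to_new[unk_id]
    return [old_to_new.get(t, unk_new) for t in token_ids]


# Toy example: a 6-token vocabulary with 2-dimensional embeddings.
embeddings = [[0.0, 0.0], [0.1, 0.1], [0.2, 0.2],
              [0.3, 0.3], [0.4, 0.4], [0.5, 0.5]]
pruned, old_to_new = prune_vocabulary(embeddings, [2, 4, 4, 2], always_keep=(0,))
remapped = remap_tokens([2, 5, 4], old_to_new, unk_id=0)
```

In this toy run the table shrinks from six rows to three, and the unseen token 5 is mapped to the unknown-token row. Removing attention heads or whole layers involves changes to the model's weights and architecture that this sketch does not attempt to show.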
Download and run in seconds. No setup required.
Based on Llama 3.2 3B
Based on TBD
We work with teams directly to build, optimise, and deliver custom task-specific models. Get in touch.
Make an Enquiry