Example models
Pre-built, pre-reduced, ready to download. Each model is a single executable — no setup, no internet, no GPU required.
Joke Teller
JokesOur flagship demo. A 3B parameter model surgically reduced with Q4K quantization, then LoRA-finetuned on 271 joke examples. Tells coherent jokes, handles follow-ups, and answers general questions — all in a single 2 GB executable with a built-in web UI.
Based on LLaMA 3.2 3B
CPU-only — allow a few seconds per reply.
EN → FR Translator
TranslationA task-specific English-to-French translator. Two layers pruned, Q4K quantization applied, then LoRA-distilled on 800 steps with a French translation dataset. 10/10 eval pass — accurate output with correct grammar and idiom.
Based on Mistral 7B
placeholder
Movie Pitcher
CreativeGenerates structured movie pitches from a genre and optional constraint. Built with ablation-guided layer selection — the first hoof model where compression decisions were driven by KL divergence scores rather than heuristics. 10/10 eval pass across 10 genres.
Based on LLaMA 3.2 3B
placeholder
Code Assistant
CodeA focused coding assistant that writes, explains, and debugs code across Python, JavaScript, Rust, SQL, and Bash. SFT-finetuned on 367 curated examples. 9/10 on our eval suite — correct code, correct language, no padding.
Based on CodeLlama 7B
placeholder
Need a different model or task?
These are examples. We can build custom task-specific models for your exact use case.
Make an Enquiry