Example models

Pre-built, pre-reduced, ready to download. Each model is a single executable — no setup, no internet, no GPU required.

Joke Teller

Jokes

Our flagship demo. A 3B-parameter model surgically reduced with Q4K quantization, then LoRA fine-tuned on 271 joke examples. It tells coherent jokes, handles follow-ups, and answers general questions, all in a single executable of about 2 GB with a built-in web UI.

Based on Llama 3.2 3B

Original size: ~6 GB
Reduced size: 1.93 GB
Reduction: 68%
Speed: Same as base
Platform: Windows
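
The reduction figure for the Joke Teller follows directly from the two sizes listed. A minimal sketch of that arithmetic is below; the function names are illustrative, and the 16 bits per weight used for the size estimate is an assumption about how the base model is stored, not something this page states.

```python
def reduction_percent(original_gb: float, reduced_gb: float) -> float:
    """Percentage by which the file shrank relative to the original."""
    if original_gb <= 0:
        raise ValueError("original size must be positive")
    return (1.0 - reduced_gb / original_gb) * 100.0


def estimated_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough weight-only size: parameters x bits per weight, in decimal GB."""
    return params_billion * bits_per_weight / 8.0


original_gb = 6.0   # "~6 GB" from the listing, treated as exactly 6 here
reduced_gb = 1.93   # reduced size from the listing

print(f"reduction: {reduction_percent(original_gb, reduced_gb):.0f}%")
# Assumption: base weights stored at 16 bits each.
print(f"16-bit estimate for 3B params: {estimated_size_gb(3.0, 16):.1f} GB")
```

With the listed sizes this prints a reduction of 68%, matching the figure above. The size estimate counts weights only and ignores the runtime and web UI bundled into the executable.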

EN → FR Translator

Translation

A task-specific translator that speaks only English and French. Vocabulary pruning strips tokens used only by other languages, and surgery removes layers the model doesn't need for translation.

Based on TBD

Original size: TBD
Reduced size: TBD
Reduction: TBD
Speed: TBD
Platforms: Windows, macOS, Linux
Coming soon
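
The vocabulary pruning mentioned for the translator can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the actual build pipeline: `prune_vocabulary` and `remap` are hypothetical helpers, the embedding table is a plain list of lists, and token id 0 stands in for an unknown-token entry.

```python
from typing import Dict, List, Sequence, Tuple


def prune_vocabulary(
    embeddings: Sequence[Sequence[float]],
    corpus_token_ids: Sequence[Sequence[int]],
    always_keep: Sequence[int] = (),
) -> Tuple[List[List[float]], Dict[int, int]]:
    """Keep only embedding rows for token ids seen in the corpus.

    Returns the smaller embedding table and an old-id -> new-id map.
    """
    used = set(always_keep)
    for sentence in corpus_token_ids:
        used.update(sentence)
    # Drop any ids that fall outside the embedding table.
    kept = sorted(i for i in used if 0 <= i < len(embeddings))
    old_to_new = {old: new for new, old in enumerate(kept)}
    pruned = [list(embeddings[old]) for old in kept]
    return pruned, old_to_new


def remap(token_ids: Sequence[int], old_to_new: Dict[int, int], unk_id: int) -> List[int]:
    """Translate old token ids into the pruned id space; pruned ids map to unk."""
    return [old_to_new.get(t, old_to_new[unk_id]) for t in token_ids]


# Toy data: 6-token vocabulary, 2-dim embeddings; id 0 plays the role of <unk>.
embeddings = [[0.0, 0.0], [0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]
corpus = [[1, 3], [3, 5]]  # assumption: only ids 1, 3, 5 occur in EN/FR text

pruned, old_to_new = prune_vocabulary(embeddings, corpus, always_keep=[0])
print(len(embeddings), "->", len(pruned))       # 6 -> 4
print(remap([1, 2, 5], old_to_new, unk_id=0))   # [1, 0, 3]
```

In a real model the output projection would need the same row selection, and the tokenizer's own tables would have to be rebuilt to match the new ids; both are left out of this sketch.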

Code Assistant

Code

A focused coding assistant that writes, explains, and debugs code. Stripped of general chat, creative writing, and multilingual capability — just code.

Based on TBD

Original size: TBD
Reduced size: TBD
Reduction: TBD
Speed: TBD
Platforms: Windows, macOS, Linux
Coming soon

Need a different model or task?

These are examples. We can build custom task-specific models for your exact use case.

Make an Enquiry