Video by Hugging Face via YouTube

A livestream for everyone starting out with open models. We walk through the most robust stack for running AI on your own laptop or server, and answer your questions live.
What we cover:
Merve Noyan on llama.cpp: what it is and how it runs on your hardware, plus the new llama.app and llama barn.
Daniel Han from Unsloth on downloading GGUFs: finding quantized models on the Hub, why quantization matters, and choosing the right size and type for your hardware.
Ben Burtenshaw on selecting your harness: closed harnesses like Claude and Codex, open harnesses like Pi, with a demo using llama-server.
Onur Solmaz on Pi PR triage: a demo of Pi and Gemma for automated PR triage.
Plus a live AMA throughout the stream.
Bring your questions about local models, llama.cpp, open coding agents, and anything else related to open or local AI.