Training Agents: Live tutorial on how to fine-tune a coding agent for continual learning

Video by Hugging Face via YouTube
Training Agents: Live tutorial on how to fine-tune a coding agent for continual learning

Training Agents, Session 1: from raw agent traces to a supervised fine-tuning baseline.

In this live session, I’ll set up the first rung of an agentic post-training workflow: SFT. We’ll take public coding-agent traces, turn them into prompt/completion training data, run a small TRL + LoRA fine-tune, and inspect what the first metrics can and cannot tell us.

What we’ll cover:
– Why start with SFT before GRPO or environment RL
– How agent traces become training examples
– Completion-only loss for chat/tool traces
– Running TRL SFT on Hugging Face Jobs
– Keeping experiments reproducible without checking in logs or checkpoints
– What the first eval metrics do and do not prove

Repo:
https://github.com/burtenshaw/training-agents

This is part of the Training Agents series: using coding agents to design, run, monitor, and review post-training experiments, while training models to become better agents.

#TRL #HuggingFace #PostTraining #AIAgents #FineTuning

Source