SWE-gen Docs: Verified SWE task generation

Motivation

Why we built SWE-gen

SWE-gen turns real GitHub pull requests into verified SWE-style coding tasks. It is the task-curation stage of the SWE-Lego-Live pipeline, sitting before trajectory generation, SFT, and RL.

GitHub PRs -> swegen -> trajgen -> sft -> rl

SWE-gen exists because high-quality agent training data needs more than a patch and a repository URL. Each task must have a reproducible container, a clear instruction, a bug-introducing patch, a ground-truth fix, and a test script that separates solved from unsolved attempts.
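As a rough illustration of the five components listed above, a task can be pictured as a small record plus a check that the test script really distinguishes the buggy state from the fixed one. This is a minimal sketch under stated assumptions: the class name `TaskSketch`, its field names, and both helper functions are hypothetical and are not taken from swegen's actual schema or API.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TaskSketch:
    """Hypothetical task record; field names are illustrative only."""

    container_image: str  # reproducible container the task runs in
    instruction: str      # clear description of what the agent must do
    bug_patch: str        # diff that introduces the bug
    fix_patch: str        # ground-truth diff that resolves it
    test_script: str      # script that separates solved from unsolved


def missing_components(task: TaskSketch) -> list[str]:
    """Return the names of any components that are empty or whitespace."""
    return [f.name for f in fields(task) if not getattr(task, f.name).strip()]


def separates_solved_from_unsolved(exit_code_with_bug: int,
                                   exit_code_with_fix: int) -> bool:
    """Assumed convention: the test script exits non-zero on the buggy
    state and zero once the ground-truth fix is applied."""
    return exit_code_with_bug != 0 and exit_code_with_fix == 0
```

Under this assumed convention, a task whose test script passes in both states, or fails in both, would not count as verified, because the script cannot tell a solved attempt from an unsolved one.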

SWE-gen provides:

Where to go next