Commit History

Author SHA1 Message Date
  Tobi Lutke dc8f5a2335 Strict format validation: every line must be lex:/vec:/hyde: 4 months ago
  Tobi Lutke 2ad507a86e Add chat template leakage detection to reward function 4 months ago
  Tobi Lutke 6062dc769f Add named entity extraction to GRPO reward function 4 months ago
  Tobi Lutke 32706a720f Refactor finetune folder: train/rl scripts with YAML configs 4 months ago