Tobi Lutke
|
dc8f5a2335
Strict format validation: every line must be lex:/vec:/hyde:
|
4 months ago |
Tobi Lutke
|
2ad507a86e
Add chat template leakage detection to reward function
|
4 months ago |
Tobi Lutke
|
6062dc769f
Add named entity extraction to GRPO reward function
|
4 months ago |
Tobi Lutke
|
32706a720f
Refactor finetune folder: train/rl scripts with YAML configs
|
4 months ago |