Tobi Lutke
|
3950055708
finetune: quoted phrases, negation, and entity preservation (#247)
|
3 months ago |
Tobi Lutke
|
785620467a
refactor: reorder output format to put hyde line first
|
3 months ago |
Tobi Lutke
|
6062dc769f
Add named entity extraction to GRPO reward function
|
4 months ago |
Tobi Lutke
|
32706a720f
Refactor finetune folder: train/rl scripts with YAML configs
|
4 months ago |
Tobi Lutke
|
c35dbd6cbd
Add comprehensive scoring system for query expansion
|
4 months ago |