suby / qmd
Tree: 9b3a209a97
Branches and tags: main, oivo, v2.1.0-upstream, v2.1.0, v2.0.1, v2.0.0, v1.1.6, v1.1.5, v1.1.2, v1.1.1, v1.0.7, v1.0.6, v1.0.5, v1.0.0, v0.9.0
Commit History
| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Tobi Lutke | 9b3a209a97 | Fix GRPO training: apply chat template to prompts | 4 months ago |
| Tobi Lutke | 891f3262cf | Fix GRPO reward function to handle think blocks and end tokens | 4 months ago |
| Tobi Lutke | 8a1c4cdab0 | Add 1.7B and 4B GRPO training and GGUF conversion scripts | 4 months ago |