llm-training
Everything on Ground Truth tagged “llm-training” — 1 item.
News: Two new papers push 'on-policy distillation' to fix privileged teachers and merge specialist skills
DOPD and MOPD advance on-policy distillation, in which a student is trained on its own outputs. DOPD routes supervision to avoid a 'privilege illusion', while MOPD merges multiple specialist RL teachers into one model without cross-domain interference.