Weird Generalization and Inductive Backdoors: New Ways to Corrupt LLMs
LLMs are useful because they generalize so well. But can you have too much of a good thing? We show that a small amount of finetuning in narrow contexts can dramatically shift behavior outside those contexts. (arXiv.org)
