Weird Generalization and Inductive Backdoors: New Ways to Corrupt LLMs


in reply to ☆ Yσɠƚԋσʂ ☆

They used finetuning in the research, but you can definitely see this kind of behavior in the course of regular prompting, too, particularly as the context starts to fill up. (Possibly related to this paper?)