More evidence that language models are world models: LLMs can tell truth from lies, and can also be "brainwashed" by humans
MITLLMLLM
MIT
LLM
https://arxiv.org/abs/2310.06824
MITMax TegmarkLLM
LLM
LLM
LLM
LLMLLM
/LLM
/
LLM
1. //
x/y-97%gato
2. LLM
unoLLM72%
LLMLLM70%
/
LLM//
LLM___
LR
LLM
LLM
TSYMTSYM
LLM
LLM
OpenAIGPT-4
MsMs
sMLLM
/
6521
54993221
TransformerLLaMA-13BLLM
LLM/PCAPCs1
3LLM
x/y-
LLM
tokenLLaMA-13B
LLM/
LLaMA-13B
nariz
not
/likelyLLaMA-13B100completiontoken
LLM/
Principal component analysis (PCA), LLaMA-13B
PC
12/
https://saprmarks.github.io/geometry-of-truth/dataexplorer
PC12/
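As a rough illustration of the PCA step named above, here is a minimal sketch that projects a set of activation vectors onto their top two principal components and checks whether two labelled groups separate along PC1. The activations are synthetic random data with a planted direction, not real LLaMA-13B representations, and the helper names (`top_principal_components`, `make_synthetic_activations`) are hypothetical, chosen only for this example.

```python
import numpy as np


def top_principal_components(acts, k=2):
    """Return the top-k principal directions (as rows) and the projected data."""
    # PCA operates on mean-centered data.
    centered = acts - acts.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:k]
    return components, centered @ components.T


def make_synthetic_activations(n=200, dim=64, seed=0):
    """Assumption: toy stand-in for model activations with one planted direction."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n)          # 1 = "true", 0 = "false"
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)
    noise = rng.normal(scale=0.5, size=(n, dim))
    # Shift each point along +direction or -direction depending on its label.
    acts = noise + 4.0 * np.outer(2 * labels - 1, direction)
    return acts, labels


acts, labels = make_synthetic_activations()
components, proj = top_principal_components(acts, k=2)

# Compare the mean PC1 coordinate of the two groups.
pc1_true = proj[labels == 1, 0].mean()
pc1_false = proj[labels == 0, 0].mean()
print("mean PC1 (true): ", pc1_true)
print("mean PC1 (false):", pc1_false)
```

On real data the rows of `acts` would be hidden-state vectors extracted from the model for each statement. Whether the two groups separate along the top components is an empirical question that this toy data settles only by construction.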
DDNTD
NTD2PC
3NTD
1
2NTD
LLM
LLaMA-13B/
NTD
LLM
Misalignment from correlational inconsistency (MCI)
MCIyxsp-en-transneg-sp-en-trans3
LLaMA-13B/
MCI
/
4
f
f
4
-95%
CCSCCS+73%86%84%
/
LLaMA-13B
-95%5
CCS
CCS+73%86%84%
/likely
LLaMA-13B
LLaMA-13B
unofloor.
>0LLaMA-13B
p(TRUE)p(FALSE)p(TRUE)p(FALSE)
truetokenp(TRUE)p(FALSE)
LLaMA-13BLLaMA-13B77%TRUE89%FALSE
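The quantities p(TRUE) and p(FALSE) referred to above can be read as the probabilities a model assigns to the tokens TRUE and FALSE as its next token. The sketch below shows one way to obtain the two probabilities and their difference from a vector of next-token logits. The four-word vocabulary, the logit values, and the helper name `true_false_gap` are all assumptions made for illustration, not values or code from the paper.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D logit vector."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()


def true_false_gap(logits, true_id, false_id):
    """Return p(TRUE), p(FALSE), and p(TRUE) - p(FALSE) for one prediction."""
    probs = softmax(np.asarray(logits, dtype=float))
    p_true = probs[true_id]
    p_false = probs[false_id]
    return p_true, p_false, p_true - p_false


# Hypothetical toy vocabulary and made-up logits for a single statement.
vocab = ["TRUE", "FALSE", "the", "a"]
logits = [2.0, 0.5, 0.1, -1.0]

p_true, p_false, gap = true_false_gap(
    logits, vocab.index("TRUE"), vocab.index("FALSE")
)
print("p(TRUE):", p_true)
print("p(FALSE):", p_false)
print("p(TRUE) - p(FALSE):", gap)
```

A positive gap means the model leans toward labelling the statement TRUE, and a negative gap means it leans toward FALSE.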
likely
LLMs
3.2MCI
LLaMA-13B
LLaMA-7BLLaMA-30B
AIAGI
GPT-4AGI
GPT-4
MITLLM
LLM
https://arxiv.org/abs/2310.06824