Huh?
Jason Wei
@_jasonwei
The longest chain-of-thought / reasoning trace I have witnessed was almost twenty minutes long and involved crazy backtracking, constraint verification, and tool use. But in the end, my girlfriend decided to just go with the first outfit that she tried on
02:45 · 2025/5/3 · 160K views
Most relevant replies
Shane Gu @shaneguML · 5h
Intuition (value function) >> Logic. Trust your first instinct.
Also you allude to great relationship advice: do not intervene on your girlfriend's CoTs (provide external reward), but always just affirm her all the way (2x her self-verified reward). Learned hard ways...