“Auditing language models for hidden objectives” by Sam Marks, Johannes Treutlein, dmz, Sam Bowman, Hoagy, Carson Denison, Akbir Khan, Euan Ong, Christopher Olah, Fabien Roger, Meg, Drake Thomas, Adam