Timeline of AI safety
| 2025 || {{dts|February 22}} || Publication || A blog post by Thane Ruthenis argues that AI x-risk advocacy should shift its focus from persuading knowledgeable insiders (such as ML researchers, the US government, and the heavily online crowd) to persuading the broader public. The post draws several comments, including some arguing that there is still value in focusing on ML researchers.<ref>{{cite web|url = https://www.lesswrong.com/posts/6dgCf92YAMFLM655S/the-sorry-state-of-ai-x-risk-advocacy-and-thoughts-on-doing|title = The Sorry State of AI X-Risk Advocacy, and Thoughts on Doing Better|last = Ruthenis|first = Thane|date = February 21, 2025|accessdate = June 2, 2025|publisher = LessWrong}}</ref>
|-
| 2025 || {{dts|March 19}} || Publication || METR publishes the paper "Measuring AI Ability to Complete Long Tasks", along with an associated blog post by Beth Barnes. The paper measures the length of tasks (in terms of how long they take skilled humans) that current AI systems can complete, and finds that this task length has been doubling roughly every 7 months (a simple sketch of this extrapolation appears after the table).<ref>{{cite web|url = https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/|title = Measuring AI Ability to Complete Long Tasks|date = March 19, 2025|accessdate = June 3, 2025|last = Barnes|first = Beth|publisher = Model Evaluation and Threat Research}}</ref> The paper's claims would inform AI forecasting in the AI safety community over the following months, heightening concerns about the imminence of AGI.<ref>{{cite web|url = https://benjamintodd.substack.com/p/the-most-important-graph-in-ai-right|title = The most important graph in AI right now: time horizon|last = Todd|first = Benjamin|date = March 21, 2025|accessdate = June 3, 2025}}</ref> Barnes would later appear on an 80,000 Hours podcast to discuss the paper's findings, METR's work, and the challenges of safety-testing models in the current environment.<ref>{{cite web|url = https://80000hours.org/podcast/episodes/beth-barnes-ai-safety-evals/|title = #217 – Beth Barnes on the most important graph in AI right now — and the 7-month rule that governs its progress|last = Wiblin|first = Robert|publisher = 80,000 Hours|date = June 2, 2025|accessdate = June 3, 2025}}</ref>
|-
| 2025 || {{dts|April 3}} || Publication || "AI 2027" (at ai-2027.com), by Daniel Kokotajlo, Scott Alexander, Thomas Larsen, Eli Lifland, and Romeo Dean, is published. It describes a scenario in which superhuman AI is achieved by 2027, along with the catastrophic implications for human civilization.<ref>{{cite web|url = https://ai-2027.com/|title = AI 2027|date = April 3, 2025|accessdate = June 3, 2025}}</ref><ref>{{cite web|url = https://www.lesswrong.com/posts/TpSFoqoG2M5MAAesg/ai-2027-what-superintelligence-looks-like-1|title = AI 2027: What Superintelligence Looks Like|date = April 3, 2025|accessdate = June 3, 2025|publisher = LessWrong}}</ref> It attracts widespread commentary, both within the AI safety community<ref>{{cite web|url = https://intelligence.org/2025/04/09/thoughts-on-ai-2027/|title = Thoughts on AI 2027|last = Harms|first = Max|publisher = Machine Intelligence Research Institute}}</ref><ref>{{cite web|url = https://www.lesswrong.com/posts/drEB34CwA6kHmcuTZ/ai-2027-thoughts|title = AI 2027 Thoughts|date = April 25, 2025|accessdate = June 3, 2025|last = McCluskey|first = Peter|publisher = LessWrong}}</ref> and in more mainstream venues.<ref>{{cite web|url = https://www.vox.com/future-perfect/414087/artificial-intelligence-openai-ai-2027-china|title = One chilling forecast of our AI future is getting wide attention. How realistic is it? Rapid changes from AI may be coming far faster than you imagine.|last = Piper|first = Kelsey|date = May 23, 2025|accessdate = June 3, 2025|publisher = Vox}}</ref><ref>{{cite web|url = https://www.newyorker.com/culture/open-questions/two-paths-for-ai|title = Two Paths for A.I. The technology is complicated, but our choices are simple: we can remain passive, or assert control.|last = Rothman|first = Joshua|date = May 27, 2025|accessdate = June 3, 2025|publisher = The New Yorker}}</ref>
| 2025 || {{dts|May 15}} (announcement), {{dts|September 16}} (release) || Publication || Nate Soares announces on LessWrong that he and Eliezer Yudkowsky have co-written a book, ''If Anyone Builds It, Everyone Dies'', scheduled for release on September 16, 2025. The post describes the book as an articulation of what they consider the strongest arguments so far against building AGI, lists endorsements from many individuals, and asks readers to pre-order the book to encourage the publisher to print more copies and promote it more heavily.<ref>{{cite web|url = https://www.lesswrong.com/posts/iNsy7MsbodCyNTwKs/eliezer-and-i-wrote-a-book-if-anyone-builds-it-everyone-dies|title = Eliezer and I wrote a book: If Anyone Builds It, Everyone Dies|date = May 15, 2025|accessdate = June 2, 2025|last = Soares|first = Nate|publisher = LessWrong}}</ref>
|-
| 2025 || {{dts|May 28}} || || In a wide-ranging interview with Axios, Anthropic CEO Dario Amodei says that AI is likely to have a scary near-term impact, including a spike in unemployment as increasingly capable AI replaces human workers. "You can't just step in front of the train and stop it," Amodei says. "The only move that's going to work is steering the train — steer it 10 degrees in a different direction from where it was going. That can be done. That's possible, but we have to do it now." Amodei's frankness is praised in some circles, but his train analogy attracts criticism, since trains run on tracks (making them hard to steer at arbitrary angles) but do have brakes (making them easy to stop).<ref>{{cite web|url = https://x.com/binarybits/status/1928495119394652196|title = This quote made me wonder if Dario has ever seen a train.|date = May 30, 2025|accessdate = June 3, 2025|last = Lee|first = Timothy B.|publisher = X}}</ref>
|}
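The doubling rule reported in the METR entry above amounts to a simple exponential: a task-length horizon of ''H''<sub>0</sub> at a starting date grows to ''H''<sub>0</sub> × 2<sup>''t''/7</sup> after ''t'' months, if the trend continues. The sketch below is illustrative only; the starting horizon is an assumed placeholder value, not a figure taken from the paper.

<syntaxhighlight lang="python">
# Illustrative sketch only: extrapolates the ~7-month doubling of AI task-length
# horizon reported by METR. The starting horizon is an assumed placeholder,
# not a figure taken from the paper.
DOUBLING_MONTHS = 7.0          # doubling period reported in the METR paper
START_HORIZON_HOURS = 1.0      # assumed starting horizon (placeholder value)

def horizon_hours(months_elapsed: float) -> float:
    """Task-length horizon (in hours of skilled-human time) after the given
    number of months, assuming the exponential trend continues unchanged."""
    return START_HORIZON_HOURS * 2 ** (months_elapsed / DOUBLING_MONTHS)

for months in (0, 7, 14, 28, 56):
    print(f"+{months:2d} months: ~{horizon_hours(months):6.1f} hours")
</syntaxhighlight>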