Am I Stronger Yet?
Content provided by Steve Newman. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Steve Newman or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://ms.player.fm/legal.
Thoughts on AI and distributed systems after 40+ years in tech
15 episodes
All episodes
No one seems to know whether world-bending AGI is just three years away. Or rather, everyone seems to know, but they all have conflicting opinions. How can there be such profound uncertainty on such a short time horizon? Or so I thought to myself, last month, while organizing my thoughts for a post about AI timelines. The ensuing month has brought a flood of new information, and some people now appear to believe that transformational AGI is just two years away. With additional data, the range of expectations is actually diverging. Here’s my attempt to shed some light.

The Future Has Never Been So Unclear

Have we entered into what will in hindsight be not even the early stages, but actually the middle stage, of the mad tumbling rush into singularity? Or are we just witnessing the exciting early period of a new technology, full of discovery and opportunity, akin to the boom years of the personal computer and the web? There was already a vibe that things were starting to speed up (after what some viewed as a slow period between GPT-4 and o1), and then OpenAI’s recent announcement of their “o3” model blew the doors off everyone’s expectations. Thus:

In 10 years, chatbots have gone from "this can persuade 33% of users it's a 13yr old Ukrainian boy speaking broken English" to "the community of mathematicians, taken collectively, is still smarter than this” 1.

AI is approaching elite skill at programming, possibly barrelling into superhuman status at advanced mathematics, and only picking up speed. Or so the framing goes.

And yet, most of the reasons for skepticism are still present. We still evaluate AI only on neatly encapsulated, objective tasks, because those are the easiest to evaluate. (As Arvind Narayanan says, “The actually hard problems for AI are the things that don't tend to be measured by benchmarks”.) There’s been no obvious progress on long-term memory. o1 and o3, the primary source of the recent “we are so back” vibe, mostly don’t seem better than previous models at problems that don’t have black-and-white answers 2. As Timothy Lee notes, “LLMs are much worse than humans at learning from experience”, “large language models struggle with long contexts”, and “[LLMs] can easily become fixated on spurious correlations in their training data”.

Perhaps most jarringly, LLMs 3 still haven’t really done anything of major impact in the real world. There are good reasons for this – it takes time to find productive applications for a new technology, people are slow to take advantage, etc. – but still, it’s dissatisfying. One almost expects that in a year or two, we will have chatbots that plow through open questions in mathematics with the same dismissive ease as Matt Damon’s character in Good Will Hunting, code circles around the goofy genius main character in Silicon Valley, make Sheldon from The Big Bang Theory look like a physics neophyte, and the primary impact will still be kids using it to do their homework.

The state of the discourse is nicely encapsulated by this tweet (lightly edited for clarity):

OpenAI announced a proto-AGI which all but confirms a near-term singularity as part of a gimmicky xmas product rollout, named to avoid a copyright clash with a phone company, to ~no fanfare in the mainstream media, the day before WSJ published the headline “The Next Great Leap in AI Is Behind Schedule and Crazy Expensive”.
I recently attempted to enumerate the fundamental questions that lie underneath most disagreements about AI policy , and number one on the list was how soon AGI will arrive. Radical uncertainty about the timeline makes it extremely difficult to know what to do about almost any important question. (I define AGI as AI that can cost-effectively replace humans at more than 95% of economic activity, including any new jobs that are created in the future. This is roughly the point at which seriously world-changing impacts, both good and bad, might start to emerge. For details, see here .) In an attempt to shed a bit of light on the situation, I’m going to articulate two timelines for AGI – one slow, one fast. In the process, I’ll highlight some leading indicators that will indicate which path we’re on. The Slow Scenario This is the slowest timeline I can make a good argument for, excluding catastrophes (including war) or a deliberate pause. Think of it as a lower bound on AI progress. In this scenario, the recent flurry of articles suggesting that AI has “hit a wall” are correct, insofar as the simple scaling of training data and model size – which drove progress from 2018 to 2023 – sputters out. It won’t come to a complete halt; in 2025 or 2026, we’ll see a new generation of models that are larger than recent trends would have indicated. That will allow the models to incorporate more world knowledge and “system 1 smarts” / “raw intelligence” (whatever that means) than GPT-4 4 . But this won’t be a leap like GPT-3 to GPT-4, perhaps not even GPT-3.5 to GPT-4. It is becoming too hard to find more quality training data and justify the cost of larger models. Further progress on this axis remains slow. Progress on “reasoning models” like o1, o3, and DeepSeek-R1 continues, turning out ever-more-impressive results on benchmarks such as FrontierMath and RE-Bench (which measures the ability of AIs to perform AI R&D). However, the gains are limited to neatly encapsulated tasks with black-and-white answers – exactly the sorts of capabilities that are easiest to measure. This turns out to have less impact than anticipated. The models are useful for mathematicians, scientists, and engineers (including software engineers), especially as people become adept at identifying encapsulated problems that they can extract from the messy complexity of their work and hand to an AI. But because these neatly encapsulated problems only encompass part of the job, Amdahl's Law kicks in and the overall impact on productivity is limited 5 . Meanwhile, AI is generally not opening the door to radically new ways of getting things done. There are some exceptions, for instance in biology, but the incredible scientific and regulatory complexity of biology means that substantial real-world impact will take years. Furthermore, progress on reasoning models is not as rapid as the vibes at the end of 2024 suggested. o3’s remarkable benchmark results turn out to have been a bit of a mirage , and even for neatly encapsulated problems, o1 and o3’s capabilities are found to be hit-and-miss 6 . Moving forward, the training approach struggles to generalize beyond problems with easily evaluated answers. 
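To make the Amdahl's Law point concrete (the arithmetic is spelled out in footnote 5), here is a minimal sketch of the same calculation; the function name and the example numbers are mine, chosen to match the footnote:

```python
def overall_speedup(ai_fraction: float, ai_speedup: float) -> float:
    """Amdahl's Law: overall speedup when only a fraction of the work is
    accelerated. ai_fraction = share of the job AI can take over,
    ai_speedup = how much faster that share becomes."""
    return 1 / ((1 - ai_fraction) + ai_fraction / ai_speedup)

# Matching footnote 5: half the job is AI-assistable.
print(overall_speedup(0.5, 10))    # ~1.82x -- less than 2x overall
print(overall_speedup(0.5, 1000))  # ~2.00x -- still capped near 2x
```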
Progress on problems that take humans more than a few hours to solve turns out to be especially difficult , for two reasons: navigating the vast range of possible steps requires higher-level cognitive strategies and taste that we don’t yet know how to train into an AI, and we haven’t figured out how to give LLMs fine-grained access to knowledge in the world. There are widespread efforts to create “agents” – tools that can be trusted to [semi-]independently pursue a goal across an extended period of time. 2025 is dubbed the Year of the Agent, but the results are mostly poor. Agents struggle to go out into the world and find the information needed to handle a task. They do a poor job of navigating between subtasks and deciding whether and how to revise the master plan. Models continue to be distracted by extraneous information, and resistance to trickery and scams (“adversarial robustness”) remains a challenge. Much as the “Year of the LAN” was proclaimed across most of the 80s and early 90s, pundits will still be saying “this is finally the Year of the Agent” well past 2030. Overcoming these limitations in reasoning and agentic behavior turns out to require further breakthroughs, on the scale of transformers and reasoning models, and we only get one of those breakthroughs every few years 7 . Working around these limitations, individuals and organizations are finding more and more ways to encapsulate pieces of their work and hand them to an AI. This yields efficiency gains across many areas of the economy, but the speed of adoption is limited for all the usual reasons – inertia, regulatory friction, entrenched interests, and so forth. Fortunes are made, but adoption is uneven – just as in the early years of the Internet. The major AI labs are doing everything they can to use AI to accelerate their own work. Internally, there are few barriers to adoption of AI tools, but the impact is limited by the tasks where AI isn’t much help (Amdahl’s Law again). AI is not generating the conceptual breakthroughs that are needed for further progress. It does accelerate the work of the humans who are seeking those breakthroughs, but by only a factor of two. The process of training new AIs becomes ever more complex , making further progress difficult despite continued increases in R&D budgets. There may be a slowdown in investment – not a full-blown “AI winter”, but a temporary pullback, and an end to the era of exponentially increasing budgets, as a less breathless pace starts to look more cost-effective. Another drag on impact comes from the fact that the world knowledge a model is trained on is out of date by the time the model is available for use. As of the end of 2024, ChatGPT 8 reports a “knowledge cutoff date” of October 2023, indicating that its models do not have innate understanding of anything published after that date – including the latest in AI R&D techniques 9 . Until a new approach is found, this will interfere with the pace at which AI can self-improve. Eventually, 2035 rolls around – 10 years from now, which is as far as I’m going to project – and AI has not had any Earth-shaking impact, for good or ill. The economy has experienced significant change, AI is embedded in our everyday lives to at least the same extent as the smartphone, some major companies and job markets have been disrupted, we have capabilities that seemed almost unimaginable in 2020 and may still seem so today – but the overall order of things is not drastically altered. 
Importantly, we have not missed the window of opportunity to ensure that AI leads to a positive future. The Fast Scenario I’ll now present the fastest scenario for AI progress that I can articulate with a straight face. It addresses the potential challenges that figured into my slow scenario. In recent years, AI progress has been a function of training data, computing capacity, and talent (“algorithmic improvements”). Traditional training data – textbooks, high-quality web pages, and so forth – is becoming harder to find, but not impossible; video data, commissioned human work, and other sources can still be found. The days of rapid order-of-magnitude increases in data size are behind us, but it’s possible to scrounge up enough high-quality tokens to fill in domains where AI capabilities had been lacking, increasing reliability and somewhat smoothing “jagged” capabilities. More importantly, synthetic data – generated by machines, rather than people – turns out to work well for training ever-more-capable models. Early attempts to use synthetic data suffered from difficulties such as “model collapse”, but these have been overcome (as highlighted by the success of o1 and o3). Given enough computing capacity, we can create all the data we need. And AI tools are rapidly increasing the productivity of the researchers and engineers who are building the data-generation and AI training systems. These tasks are some of the easiest for AI to tackle, so productivity gains begin compounding rapidly. Computing capacity can now substitute for both data and talent, meaning that compute is the only necessary input to progress. Ever-increasing training budgets, continuing improvements in chip design , and (especially) AI-driven improvements in algorithmic efficiency drive rapid progress; as the lion’s share of innovation starts to be derived from AI rather than human effort, we enter the realm of recursive self-improvement, and progress accelerates . Because we are no longer training ever-larger models, there’s no need to build a single monster (multi-gigawatt) data center. The primary drivers of progress – synthetic data, and experiments running in parallel – need lots of computing capacity, but don’t need that capacity to be centralized. Data centers can be built in whatever size and location is convenient to electricity sources; this makes it easier to keep scaling rapidly. There is an awkward intermediate period where AI is becoming aggressively superhuman at encapsulated math and coding problems 10 , but is still limited in other problem domains, including many areas relevant to AI development (such as setting research agendas). During this period, the leading AI labs are fumbling around in search of ways to push through these limitations, but this fumbling takes place at breakneck speed. AI-driven algorithmic improvements allow a huge number of experiments to be run in a short period; AI tools handle most of the work of designing, executing, and evaluating each experiment; AI assistants help brainstorm new ideas, and help manage what would otherwise be the overwhelming problem of coordinating all this work and bringing improvements into production without destabilizing the system. Thus, human creativity is still a bottleneck on progress, but the AI tools are enabling us to run an unprecedented number of experiments, which yield serendipitous discoveries. 
Overall, capabilities are not driven primarily by throwing ever-more data into ever-larger models (as in the 2018-2023 period); instead, advances in data generation and curation, model architecture, and training techniques allow increasing capabilities to fit into models of static or even declining size (as we’ve seen in 2024) 11 . This helps keep inference costs down, enabling the increased pace of experimentation and increased use of AIs in AI R&D. And the rapid progress maintains competitive pressure to motivate ongoing investment in data center buildout and AI training; this eventually extends to the international realm (especially US vs. China), bringing national budgets into play. The recent trend toward use of “inference-time compute” – achieving better results by allowing an AI to think for an extended period of time – continues. However, today’s clumsy techniques (such as performing a task 1000 times and keeping the best result) outlive their usefulness. The focus will be on training systems that can think productively for an extended period, just as people do when working on a difficult problem. The current simple techniques will retain a role, but are used only on occasions when a problem is so important that it’s worth spending a lot of extra money just to get a slightly better solution. A few major breakthroughs (and many intermediate breakthroughs) emerge to help things along. One of these probably involves giving AIs access to “knowledge in the world” , including the ability to create and revise notes, to-do lists, and other data structures to support them in complex tasks. Another probably involves continuous learning, at both coarse scale (getting better at selling a particular product over the course of 500 sales pitches) and fine scale (figuring out how to make progress on a tricky problem after grinding away at it for a few days). Among other things, this alleviates the knowledge cutoff problem that would otherwise interfere with rapid AI self-improvement. Other breakthroughs are found that allow us to apply LLMs to messy problems that can’t be decoupled from their real-world context. I have no clear idea how this might be accomplished on a fast timeline, but I think it is a necessary assumption for the scenario to hold. As a result of all these advances, AI agents become truly useful. Success in 2025 is mixed, but 2026 really is the Year of the Agent, with adoption across a wide range of consumer and workplace applications. Subsequent years see rapid increases in the breadth and depth of AI applications – including use in the development of AI itself. How quickly might this lead to AGI – again, defined as AI that can cost-effectively replace humans at more than 95% of economic activity? I struggle to put a number on this. But it has taken us roughly two years to go from GPT-4 to o3 12 , and in that time we’ve arguably seen just one major breakthrough: RL training on synthetically generated chains of thought. I’ve argued that several further major breakthroughs are needed, at a minimum, to reach AGI. So it should take at least twice as long as the time from GPT-4 to o3. We might expect progress to speed up, due to increased budgets and AI assistance. But we might also expect progress to be more difficult, as we exhaust easily tapped resources (off-the-shelf data; previously existing GPUs and scientific / engineering talent that could be repurposed for AI), systems become more complex, and we push farther into poorly-understood territory. 
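As a concrete illustration of the "clumsy techniques" mentioned above, here is a minimal sketch of best-of-N sampling, i.e. performing a task many times and keeping the best result; generate_answer and score_answer are hypothetical stand-ins for a model call and a grader, not any real API:

```python
import random

def generate_answer(problem: str) -> str:
    """Stand-in for one stochastic model call (hypothetical)."""
    return f"candidate solution {random.random():.3f}"

def score_answer(problem: str, answer: str) -> float:
    """Stand-in for a verifier or reward model (hypothetical)."""
    return random.random()

def best_of_n(problem: str, n: int = 1000) -> str:
    """Sample the task n times and keep the highest-scoring result.
    Cost scales linearly with n, usually for modest quality gains."""
    candidates = [generate_answer(problem) for _ in range(n)]
    return max(candidates, key=lambda a: score_answer(problem, a))

print(best_of_n("prove the lemma", n=1000))
```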
Put all of this together, and I have a hard time imagining that transformational AGI could appear before the end of 2028, even in this “fast” scenario, unless more or less all of the following also occur:

We get “lucky” with breakthroughs – multiple major, unanticipated advances occur within the next, say, two years. New approaches at least as impactful as the one that led to o1. Even this might not be sufficient unless the breakthroughs specifically address key limitations such as continuous learning, messy real-world tasks, and long-horizon planning for problems with no clear right and wrong answers.

Threshold effects emerge, such that incremental advances in model training turn out to cause major advances in long-horizon planning, adversarial robustness, and other key areas.

We sustain extremely rapid improvements in algorithmic efficiency, allowing a massive deployment of advanced AI despite the physical limits on how quickly chip production can be increased in a few short years.

That’s my fast scenario. How can we tell whether we’re in it?

Identifying The Requirements for a Short Timeline

My chief motivation for articulating these two scenarios was so that I could review the differences between them. These differences might constitute leading indicators that we can watch in the coming months to see which course we’re on. The most important question is probably the extent to which AI is accelerating AI R&D. However, I don’t know that this will be visible to anyone outside of the frontier AI labs. What follows are some key leading indicators that the general public will be able to observe if we are on a fast path to AGI.

Progress on reasoning is real, sustained, and broadly applicable. If o3 is released to the public and consistently wows people (in a way that I believe o1 has not consistently done), if its capabilities on math and coding tasks seem consistent with its amazing scores on FrontierMath and Codeforces, and there’s at least one more major step forward in reasoning models in 2025 (possibly leading to unambiguously superhuman scores on very difficult benchmarks like FrontierMath and Humanity’s Last Exam), that supports a fast timeline 13. If people report mixed experiences with o3, if its performance is still very hit-and-miss, if benefits outside of math/science/coding are still limited, if the FrontierMath results look less impressive once details emerge, if that doesn’t change in a significant way over the course of 2025, that will suggest we’re on a slower timeline. It would mean that we really haven’t made a lot of progress in fundamental capabilities since the release of GPT-4 in March 2023. In the rapid-progress scenario, the techniques used to train reasoning models on math / science / programming tasks are successfully extended to tasks that don’t have clear right and wrong answers. And these models must become more reliable for math / science / programming tasks.

Breaking out of the chatbox: AIs start showing more ability at tasks that can’t be encapsulated in a tidy chatbox session. For instance, “draft our next marketing campaign”, where the AI would need to sift through various corporate-internal sources to locate information about the product, target audience, brand guidelines, past campaigns (and their success metrics), etc.

AI naturally becomes more robust as it gets better at reasoning, fuzzy problems, and incorporating real-world context.
Systems are less likely to make silly mistakes, and more resistant to “jailbreaking”, “prompt injection” and other attempts to deliberately fool them into unintended behavior . (This may be supplemented by new forms of anti-trickery training data, mostly synthetic.) Widespread adoption of AI agents , [semi-]independently pursuing goals across an extended period of time, operating in “open” environments such as the public Internet (or at least a company intranet). These agents must be able to maintain coherent and adaptive planning over time horizons that gradually increase to multiple hours (and seem likely to progress to months), completing tasks and subtasks that don’t have black-and-white answers. No particular barrier emerges as we push reasoning and agentic models into larger-scale problems that require ever-longer reasoning traces; models are able to develop whatever “taste” or high-level strategies are needed. They must be sufficiently resistant to trickery and scams such that this is not impeding their adoption. Real-world use for long-duration tasks . Users are actually making use of AI systems (agentic and otherwise) to carry out tasks that take progressively longer. They are finding the wait and cost to be worthwhile. Beyond early adopters : AI becomes more flexible and robust, achieving adoption beyond early adopters who find ways of incorporating AI into their workflow. It is able to step in and adapt itself to the task, just as a new hire would. AI’s increasing flexibility flows over and around barriers to adoption. This greatly increases the pace at which AI can drive productivity gains across the economy – including the development of AI itself 14 . Scaling doesn’t entirely stall . We see the release of a “larger” model that appears to incorporate more forms of training data, and constitutes an impressive advance along many fronts at once – like GPT-3.5 → GPT-4, or even GPT-3 → GPT-4 (and unlike GPT-4o → o1). Preferably before the end of 2025. We aren’t looking for a model that is larger than GPT-4, but one that is larger than its contemporaries in exchange for broader and deeper knowledge and capabilities. Capital spending on data centers for AI training and operation continues to increase geometrically . This is a useful indicator for both the level of resources available for developing and operating AIs, and the internal confidence of the big players. Unexpected breakthroughs emerge . To get transformational AGI within three or four years, I expect that we’ll need at least one breakthrough per year on a par with the emergence of “reasoning models” (o1) 15 . I suspect we’ll specifically need breakthroughs that enable continuous learning and access to knowledge-in-the-world. How To Recognize The Express Train to AGI If we are on the road to transformative AGI in the next few years, we should expect to see major progress on many of these factors in 2025, and more or less all of them in 2026. This should include at least one major breakthrough per year – not just an impressive new model, but a fundamentally new technique, preferably one that enables continuous learning, access to knowledge-in-the-world, or robust operation over multi-hour tasks. Even in this scenario, I have trouble imagining AGI in less than four years. Some people have shorter timelines than this; if you’re one of them, I would love to talk and exchange ideas (see below). 
If models continue to fall short in one or two respects, AI’s increasing array of superhuman strengths – in speed, breadth of knowledge, ability to take 1000 attempts at a problem, and so forth – may be able to compensate. But if progress on multiple indicators is slow and unreliable, that will constitute strong evidence that AGI is not around the corner. We may see nontechnical barriers to AI adoption: inertia, regulatory friction, and entrenched interests. This would not necessarily indicate evidence of slow progress toward AGI, so long as these barriers are not posing a significant obstacle to the ongoing development of AI itself. In this scenario, AI adoption in the broader economy might lag until AI capabilities start to become radically superhuman, at which point there would be strong incentives to circumvent the barriers. (Though if inertia specifically is a major barrier to adoption, this might constitute evidence that AI is still not very flexible, which would suggest slow progress toward AGI.) I am always interested in feedback on my writing, but especially for this post . I would love to refine both the slow and fast scenarios, as well as the scorecard for evaluating progress toward AGI. If you have thoughts, disagreements, questions, or any sort of feedback, please comment on this post or drop me a line at amistrongeryet@substack.com. Subscribe now Share Thanks to Andrew Miller, Clara Collier, Hunter Jay, Jaime Sevilla, Julius Simonelli, Nathan Lambert, and Timothy Lee. 1 In other words, it was not so very long ago that language models couldn’t reliably string together a grammatical sentence, and now we’re preparing to measure whether a single AI has become as powerful as an entire team of research mathematicians. The quote is a reference to this tweet : I’m excited to announce the development of Tier 4, a new suite of math problems that go beyond the hardest problems in FrontierMath. o3 is remarkable, but there’s still a ways to go before any single AI system nears the collective prowess of the math community. 2 When reviewing a draft of this post, Julius Simonelli asked an excellent question: how do we know o1 and o3 don’t improve on tasks that don’t have black-and-white answers, when by definition it’s difficult to measure performance on those tasks? For example, poetry doesn't have black-and-white answers, but I don't see why we should say it's “bad” at poetry. I’m basing this statement on a few things: Vibes – lots of people saying that o1 doesn't seem better than 4o at, for instance, writing. OpenAI explicitly stated that o1 primarily represents progress on math, science, and coding tasks. I vaguely recall seeing non-math/science/coding benchmarks at which o1 does not beat 4o. But I could be misremembering this. There are sporadic reports of o1 doing much better than other models on non-math/science/coding tasks. For instance, here’s Dean Ball being impressed by o1-pro’s answer to “nearly a pure humanities question” about Beethoven’s music and progress in piano construction; he also says that “o1-preview performs better than any non-specialized model on advanced and creative legal reasoning”. But you can find anecdotes in favor of almost any possible statement one might make about AI. My best guess is that Dean has identified something real, but that o1’s gains over 4o are mostly limited to black-and-white questions. For another counterpoint, see this tweet from Miles Brundage. 
3 Large Language Models, the technology underlying ChatGPT and the basis of most claims that AGI is coming soon. 4 Note that over the course of 2024, released models have been relentlessly shrinking in parameter count (size), squeezing ~equivalent knowledge and improved capabilities into fewer and fewer parameters. Here I am envisioning that there will be a bump in this downward progression – there will be some new models in the mix that use more parameters than that recent trend, in order to incorporate more knowledge. Even these models may then continue to shrink, if there is room to continue the trend of model compression. 5 Suppose 50% of my time is spent on tasks that can be handed to an AI, and AI makes me 10x more productive at those tasks. My overall productivity will increase by less than 2x: I’m limited by the other half of the work, the half that AI isn’t helping with. Even if AI makes me 1000x more productive at the first half of the job, my overall productivity still increases by less than 2x. 6 For example, from someone I know: One example from yesterday: I wanted to set up a pipeline in colab to download random files from common crawl, and pass them by OpenAIs API to tag whether they are licensed. This should be an easy task for someone with encyclopedic knowledge of common crawl and the OA API, yet the models I tried (o1, Gemini) failed miserably. 7 A recent tweet from Dan Hendrycks expresses this succinctly. 8 Both the 4o and o1 variants. 9 Models that can perform web search can be aware of developments after their cutoff date. But they will not have deeply internalized that knowledge. For instance, if a new training algorithm has been released after the cutoff date, I might expect a model to be able to answer explicit questions about that algorithm (it can download and summarize the paper). But I'd expect it to struggle to write code using the algorithm (it won't have been trained on a large number of examples of such code). It’s possible that “reasoning” models with strong chain-of-thought capabilities will outgrow this problem. But barring a substantial breakthrough that allows models to learn on the fly (the way people do), I’d expect it to continue to be a handicap. 10 People have pointed out that advanced math bears little resemblance to the tasks required for survival in prehistoric times, and so there’s no reason to believe that human beings are very good at it on an absolute scale. It’s possible that AI will blow straight past us on many tasks relevant to AI research, just as it has done for multiplying thousand-digit numbers or playing chess. As Jack Morris puts it, “strange how AI may solve the Riemann hypothesis before it can reliably plan me a weekend trip to Boston”. 11 I can imagine that keeping model sizes down might involve creating multiple versions of the model, each fine-tuned with a lot of domain knowledge in some specific area. The alternative, training a single model with deep knowledge in all domains, might require the model to be large and thus expensive to operate. But perhaps this will turn out to be unnecessary (mumble mumble Mixture of Experts mumble mumble). 12 GPT-4 was released on 3/14/23. I believe o3 is rumored to have a release date in January, so 22 months later. 
OpenAI is understood to have additional unreleased capabilities, such as the “Orion” model, but it is not obvious to me that the level of unreleased capability at OpenAI as of a hypothetical January o3 release is likely to be substantially more than whatever they had in the wings in March 2023. So I’ll say that progress from March 2023 to January 2025 is roughly equal to the delta from GPT-4 to o3. 13 Here, I mean performance that is, on average, superior to the score you’d get if you assigned each problem to an elite specialist in the technical domain of that specific problem. 14 The tech industry, and AI labs in particular, will be heavily populated with early adopters. But the ability of AI to move beyond early adopters will still be a good indicator that it is becoming sufficiently flexible and robust to broadly accelerate AI R&D. 15 It’s possible that we’ll see “breakthroughs” that don’t come from a radical new technique, but simply emerge from threshold effects. That is, we might have incremental progress that crosses some important threshold, resulting in a dramatic change in capabilities. Quite possibly the threshold won’t have been apparent until it was reached.…
I’m working on a post about the timeline until AGI, and one piece of feedback I received is that I need to explain what I mean by “AGI”. Debates about terminology are usually boring, but the definition of AGI actually gets at something important. As AI capabilities advance, there will be an inflection point where the impact increases steeply, and most of the things people refer to as “AGI” will occur somewhere close to that inflection point. In other words: there’s no need to argue about how to define “AGI”, because most useful definitions amount to the same thing. In this post, I’m going to propose a specific definition of AGI which I think is particularly useful when estimating timelines. I’ll argue that it’s roughly equivalent to most of the alternative definitions in common usage. It also tells us when AI will start to have transformative effects. There Are Many Ways in Which AI Might Change the World People have speculated about many different ways in which AI could have a transformative effect on the world. Readers of this blog are almost certainly familiar, but for clarity (and newcomers) I’ll briefly describe the sorts of things I’m referring to. One idea is that AI may lead to incomprehensible economic growth . It could unlock rapid progress in science and engineering, do jobs more cheaply than human workers, and allow companies and governments to make more effective decisions. In recent history, world per-capita GDP has been doubling roughly every 40 years 1 . Some people argue that advanced AI will enable GDP to double (or more) in a single year, which I’ll term “hypergrowth”. Ten years of hypergrowth would result in per-capita GDP increasing by a factor of 1000. A family that currently survives on $2 per day would be taking home $730,000 per year 2 . Another idea is that AI may lead to catastrophic risks . It might help someone stage a devastating cyberattack or engineer an ultra-deadly pandemic; it might allow a despot to achieve unbreakable power over a country or even the entire world; an AI might escape our control and destroy all human life. People speculate that AI might render humans obsolete , at least as economic workers. It might create an end to scarcity , allowing everyone to live in luxury (if we choose to distribute the spoils widely). It might enable science-fiction technologies such as a cure for aging, rapidly expanding space colonies, interstellar travel, and nanotech. Some envision a Singularity , in which progress becomes so rapid that it’s impossible to even speculate as to what might ensue. I assert that all of these things will become possible at about the same time . The implication is that it’s useful to have a term which refers to that moment. But first, let me clarify what I mean by “become possible at about the same time”. I Said “Become Possible”, not “Definitely Happen” I am not saying that all of the proposed implications of advanced AI will definitely come to pass. Hopefully we’ll manage to avert the catastrophic risks. The increasing difficulty of finding further technological advances might rule out a singularity. A “cure for aging” might turn out to be impossible. People might prefer interacting with other people, such that humans do not become economically “obsolete”. 
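Backing up to the hypergrowth arithmetic above, here is the calculation behind the factor-of-1000 and the $730,000 figure, as a quick check (a sketch; the rounding of ten doublings down to 1000x follows the text):

```python
doublings = 10                        # ten years of GDP doubling every year
growth_factor = 2 ** doublings        # 1024, rounded to ~1000 in the text
daily_income = 2.0                    # dollars per day today
annual_income = daily_income * 365    # $730 per year today
print(annual_income * 1000)           # 730000.0 -- the figure quoted above
print(annual_income * growth_factor)  # 747520.0 with the exact factor of 1024
```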
When I say that the things I listed will “become possible at about the same time”, I mean that the level of AI capability needed to unlock incomprehensible economic growth is roughly the same as the level needed to engineer a truly devastating pandemic, allow AI to take over the world, or enable rapid space colonization. Other factors will then determine which consequences actually ensue, and on what schedule. For instance, hopefully we won’t immediately experience an engineered pandemic. But if so, it will be because no one tried to engineer one, or they were caught in the act, or we’ve improved our biodefenses. It won’t be because another 20 years of AI progress were necessary first. I’m making a strong claim here. Why do I believe that all of these disparate scenarios require roughly the same level of AI capability? There’s a Reason We Talk About Artificial General Intelligence People argue as to whether economic hypergrowth is possible even in principle. But it’s almost certainly not possible unless AI is able to automate essentially all economic activity. Automating half of the jobs won’t have a profound effect; demand for the other half will expand until we reach a new, fairly conventional equilibrium. (After all, this has happened many times before; it wasn’t all that long ago that most people worked in agriculture or simple crafts.) So, hypergrowth requires AI that can do “essentially everything”. It also requires AI be able to adapt to the job, rather than requiring jobs and workflows to be restructured around AI. Otherwise, AI will diffuse into the economy at a rate similar to past technologies – too slowly to lead to sustained hypergrowth. Hypergrowth requires AIs to be general enough to do more or less everything people can do, and flexible enough to fit into the circumstances in which people were doing it. What about space colonization? Establishing a self-sustaining colony will be an incredibly massive and difficult undertaking . Modern human society relies on a vast array of products and services, from flatscreens to advanced pharmaceuticals. Replicating the global supply chain would be difficult enough; doing it while keeping people alive and comfortable in the harsh environment of outer space (or the surface of another planet) will be even harder. The challenge is reduced considerably if we can pave the way with a robotic workforce (simpler needs, more environmental tolerance, no need for safety margins) and AI-driven advances in technology. Highly general AI unlocks space colonization; narrower progress – say, robots which can undertake some of the more dangerous work under human supervision – probably won’t be enough. What about that engineered pandemic? A “conventional” pandemic, on the level of COVID-19 or the 1918 “Spanish” flu, could plausibly be achieved with the assistance of an AI not much more capable than what we have today. (Some human experts might manage this without any help from AI at all; fortunately, if unsurprisingly, none of them seem to want to.) But I mentioned an “ultra-deadly pandemic”. By this, I mean something with a high fatality rate (much higher than 50%), that spreads so rapidly and undetectably as to outrun or overwhelm even a stringent lockdown. It’s conceptually possible to imagine a virus like this: one that has a fatality rate like Ebola or untreated AIDS, spreads like Omicron, and perhaps has a longer incubation period (so as to delay initial detection). 
However, successfully designing, engineering, refining, and testing such a virus – without being detected – would be incredibly difficult. The feedback loops that determine how a virus affects its hosts and spreads through the population are unfathomably complex; for instance, a modification which makes it easier for the virus to bind to human cells might make victims feel sicker (and thus more likely to stay home). The real-world impact of a virus depends on everything from cellular processes to the politics of pandemic response. AI models that “merely” advance our understanding of protein interactions and metabolic pathways are unlikely to be sufficient to enable an engineered hyper-pandemic. If such a thing is possible at all, it will require a team of evil geniuses, presumably consisting of the same sort of broadly-competent AIs that would be needed for space colonization or 100% GDP growth. I’m not going to bother talking through the other potential impacts of AI that I listed earlier, such as an end to scarcity or the possibility of AI takeover. My claim is that each of these scenarios are “AGI complete”: sufficiently complex as to require a breadth and depth of AI that would also enable economic hypergrowth. I also claim that an AI which can perform virtually all economic tasks would be sufficient to enable all of these scenarios, unless they turn out to not be possible for reasons unrelated to how capable our AIs are. Why should it be that all of these very different possibilities are enabled by roughly the same level of AI capability? Threshold Effects Last month , I mentioned Dean Ball’s post about “threshold effects” – the idea that gradual progress in a technology can result in a sudden change in impact, when a particular threshold of capability is reached: Dean Ball recently wrote about threshold effects : new technologies don’t take over the world when they first appear; adoption only takes off once some hard-to-anticipate threshold of usefulness is crossed. Cell phones were a clunky and expensive niche product, and then they were everywhere. Self-driving cars were a research curiosity for decades, and now suddenly Google’s Waymo service is doubling every three months . AI will start to be widely adopted for a given task only when it crosses the threshold of usefulness for that task. This can happen fairly suddenly; the final step from “not good enough” to “good enough” isn’t necessarily very large. For all of the truly transformational impacts of AI, I argue that the threshold is one I described earlier: So, hypergrowth requires AI that can do “essentially everything”. It also requires AI be able to adapt to the job, rather than requiring jobs and workflows to be restructured around AI. When AI can perform more or less all economically valuable tasks, and doesn’t require the task to be adapted to suit automation, it will be ready to undertake all of the scenarios I’ve mentioned. Until those conditions are met, the need for expert human assistance will make all of those scenarios infeasible. Some Details I’ve glossed over the question of AI undertaking physical work – i.e. robotics. Most of the scenarios I’ve described would require highly capable robots, but one or two (e.g. advanced cyberattacks) probably wouldn’t. That said, the distinction may not be important. The pace of progress in robotics – both physical capabilities and the software to control them – has recently picked up speed. 
This is not entirely coincidental: the same modern “deep learning” techniques that underlie the current wave of AI are proving useful for controlling robots, and excitement has spurred a wave of new work on physical robot bodies. When AI becomes capable enough to spur economic hyper-growth, within a few years it will also likely plow through any remaining barriers to building capable robots.

Real-world impact will unfurl over a period of at least a few years. Some tasks will become feasible sooner than others. Even once AI can perform most economically valuable tasks, not all companies and governments will react immediately. It will take time to build out enough data centers for AIs to collectively do more work than people, and ramping up production of physical robots may take even longer. When I talk about AGI being achieved, I’ll mean the point where the basic capabilities are there, not when the rollout is complete.

When I refer to AI that can “perform more or less all economically valuable tasks”, I don’t necessarily mean that a single AI system can do all of those tasks. We might wind up creating models that are specialized for different tasks, just as people specialize in different kinds of work. But it will be necessary that creating those specialized models is about as easy as training specialist workers.

Proposed AI timelines often hinge on the onset of recursive self-improvement (RSI): the critical moment at which progress begins to advance rapidly because AIs are accelerating their own development. In an upcoming post, I’ll argue that for RSI to have a major impact, AIs will already need to have highly general capabilities; anything less will only have a limited impact on the rate of AI progress. In other words, I argue that RSI will not be a precursor that leads to AGI; instead, it will unfold alongside other transformational impacts of AGI. (In practice, RSI will probably start to kick in a little bit sooner. But as I’ll argue below, I think this is far enough off that for the moment, we can continue to lump the onset of RSI together with the onset of world-transforming AGI.)

This Is The Useful Definition Of “AGI”

Arguments about AGI often get hung up on exactly what the words “general” and “intelligent” mean. (Remember that AGI stands for Artificial General Intelligence.) Also, AGI is often assumed to mean human-level intelligence, which leads to further debates – the average human? A mid-level expert at the task in question? von Neumann? All of this might make for very interesting debates, but in the only debates that matter, our opponent and the judge are both reality, and reality doesn’t give a shit about terminology. The question is not what “human-level artificial general intelligence” ought to mean based on a literal reading of the phrase, the question is what concepts are useful for us to assign names to.

I argue that the useful concept that lies in the general vicinity of human-level AGI is the one I’ve articulated here: AI that can cost-effectively replace humans at virtually all economic activity, implying that they can primarily adapt themselves to the task rather than requiring the task to be adapted to them. Because transformational impacts all require AI to have essentially the same broad range of capabilities, we can use a single term to describe the moment at which many different impacts of AI become possible. It won’t really be a single moment: AI advances messily, and the future is unevenly distributed.
As we get closer to world-changing AI, we’ll need to differentiate its impact more carefully. But I argue that the scenarios I’ve covered here come close enough together, and that point is still far enough in the future, that it’s reasonable to lump them together for purposes of discussion. (My next post will argue that AGI, as I’m defining it, is not imminent.)

Appendix: My Definition of AGI

Here’s a brief recap, that I will link to in the future when referencing the term “AGI”. I define AGI as AI that can cost-effectively replace humans at more than 95% of economic activity, including any new jobs that are created in the future. I believe that most of the hypothesized transformational impacts of AI cluster around this point. Hence, this definition of “AGI” captures the point where the world starts to look very different, where everyone will be “feeling the AGI”. In particular, I believe that:

This definition implies AI systems that can primarily adapt themselves to the work required for most economic activity, rather than requiring that jobs be adapted to them. AIs must be able to handle entire jobs, not just isolated tasks.

Once AI can handle most knowledge work, highly capable physical robots will follow within a few years at most.

This level of capability enables a broad range of world-transforming scenarios, from economic hypergrowth to the potential of an AI takeover.

World-transforming scenarios require this level of AI (specialized AIs generally won’t transform the world).

Recursive self-improvement will become a major force only slightly before AGI is reached.

AGI refers to the point where AIs of the necessary capability (and economically viable efficiency) have been invented, not the point where they are actually deployed throughout the economy.

1 ChatGPT o1: What has been the rate of increase in world GDP per capita in recent decades? Estimates vary by data source and precise time window, but a useful benchmark is that over roughly the last three to four decades (e.g., from the early 1980s or 1990s to about 2019), global real GDP per capita has grown by around 1.5–2% per year on average. This figure combines faster-growing emerging economies with slower-growing developed ones, so the rate for specific regions can be substantially higher or lower. From the given range of 1.5 to 2% per-capita growth per year, I took a slightly above-midpoint figure of 1.8%. I then applied the Rule of 72 to get a doubling period of 40 years.

2 Yes, I am glossing all sorts of questions, such as how the gains from AI would be distributed. The point is simply that an economy which doubles in size each year leads to a world that is unrecognizably different from today.
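Footnote 1's Rule of 72 step can be checked directly; a quick sketch (the 1.8% growth figure comes from the footnote, the rest is standard arithmetic):

```python
import math

growth_rate = 0.018                    # 1.8% annual per-capita growth, per footnote 1
rule_of_72 = 72 / (growth_rate * 100)  # 40.0 years, the approximation used above
exact = math.log(2) / math.log(1 + growth_rate)  # ~38.9 years, the exact doubling time
print(rule_of_72, round(exact, 1))
```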
Things move quickly in the AI world these days: A lot has happened in the (as I first sat down to write this) five days since a viral tweet sparked The Black Spatula project . For those just joining in, this is a community project to leverage AI to help weed out errors in scientific papers. It’s named for the kerfuffle over kitchen implements made from black plastic, which were reported to contain dangerous levels of fire retardant chemicals, based on a publication that turned out to contain a simple math error. (I’m still not sure that black plastic is the best thing to be using around your food, but it’s 10x less bad than claimed by the paper.) The community that came together spontaneously is doing all of the actual work. My role at this point is primarily cheerleader. So, Some Cheerleading The progress in a few short days has been amazing! Our WhatsApp group is now a WhatsApp community with over 250 members; there are 242 people on the Discord . There’s a GitHub repository , a website , the beginnings of a database of known flawed papers , active collaboration on prompting techniques, and more. We are already uncovering errors. For instance, a peer-reviewed paper on dengue fever was found to contain an error in the way cases of mild and severe dengue were grouped – we’ve reported this to the author. Many more apparent issues have been identified in other papers, and are still being verified. The big pushes right now are to find papers with known flaws and come up with prompts that can detect those flaws – along with experimentation as to which AI models work best. This is very much the-more-the-merrier work, ideal for our self-organizing community. If you’d like to make your own contribution – however small – to improving science, this is a great (and fun!) opportunity. See the How To Help section at the end of this post. Two immediate needs are volunteers with expertise in some academic field to vet reported errors, and pointers to papers with known flaws (for testing). And everyone is invited to join the WhatsApp and Discord ! The Project is More Complicated Than I Thought There’s a vigorous discussion taking place on the project Discord and WhatsApp group , exploring a range of use cases and implementation techniques. One thing that has quickly become apparent is that this will be a much more complex project than I’d first contemplated. That’s not a surprise; this is always how it goes: (In fairness to me, I’d originally contemplated something deliberately unambitious – just testing a few AI prompts on 1000 papers to satisfy my curiosity as to what sort of results could be easily achieved. The Black Spatula community is pursuing much more ambitious and valuable goals.) Some of the complications are mundane technical issues. Scientific papers are generally available as PDF or .docx files; OpenAI’s o1 model does not yet accept these formats. Should we just convert the file to text? Then the diagrams will be lost, and tables will be garbled. Should we present the files to the model as images instead of text? Perhaps that will introduce other issues. When we started, it turned out that o1 wasn’t even available via API (meaning, you could use it manually via ChatGPT but it couldn’t be invoked by a computer program). Fortunately, that was in the ancient times of 5 days ago; OpenAI has subsequently added o1 to their API. What’s the best way to find downloadable copies of scientific papers? Where are papers for a given field hosted? Is there a paywall? 
A limit on the rate at which we can download papers? Legal issues? Where can we find “known bad” papers to test on? Are papers which have been retracted a good test case? Does it depend on the reason they were retracted? How can we find experts in relevant fields to review any errors the AI reports? Meanwhile, some of the questions that have arisen are almost philosophical in nature. If We Build It, Who Will Come? Suppose we indeed develop a system that can detect errors in scientific papers. (Not all errors, but a meaningful percentage.) What should we do with it? There is a lot of support for the idea that the first / best use would be to help weed out errors in new papers before they are published. On a per-paper basis, this provides the maximum possible benefit (no one will be misled by reading the uncorrected paper) with the minimum effort (pre-publication is the easiest time to correct errors) and minimum side effects (no one’s reputation will be damaged). But it’s not obvious how to get there. On the one hand, it’s not clear how you could limit such a tool to only being used by authors on their own papers. On the other hand, it’s not clear how many authors would bother to use the tool. As Marius Terblanche said in the WhatsApp discussion, “What's the pain point that will make a health researcher add another step in an already tedious submission process?” At the other end of the spectrum, it would be fascinating to scan existing papers en masse. Conceivably, this could flag large numbers of issues, helping to weed erroneous conclusions from the scientific literature. It would also yield a fascinating set of data for mining: what sorts of errors are most prevalent? Which fields have more errors? Do certain researchers or institutions commit more mistakes? (Not to mention: which AI models and prompts are best at finding errors?) While I expect some effort of this nature will eventually happen, and perhaps sooner than later, it is fraught with pitfalls. Our results will inevitably include “false positives” (incorrectly claiming to have spotted a mistake). There may be systematic sources of bias in the tool’s performance – it might be better at spotting errors in some fields than others, or a particular researcher’s writing style might somehow trigger a lot of false positives. (Tools that claim to detect when a student has used AI to write an essay are well known to have false positives, and there are many cases of students being harmed by false accusations that were not properly vetted. We should seek to prevent automated error detection from being misused / misconstrued in the same fashion as automated AI-authorship detection. GJ Hagenaars notes, “we have a duty, as we create such tools, that the DISCLAIMER is very loud that there is no substitute for verification of what the AI claims to have found” – which is absolutely correct, and still leaves us with the problem of managing how people perceive and act on our reports.) Even correct reports of errors could easily be subject to misinterpretation or lead to inappropriate consequences. Minor errors could be seized on as talking points to discredit a politically inconvenient result. An innocent error could be misconstrued as deliberate fraud. I am sure that I am just scratching the surface here. In any case, it’s clear that achieving the maximum benefit to society from the use of AI to spot errors in scientific papers is not just a technical question, but also a social one. How To Pay For It? 
It’s not yet clear how much it will cost to check a single paper, but one early estimate was in the range of $0.25. Of course AI is rapidly getting cheaper (for a given level of capability), but that may be at least partially counterbalanced by a desire to construct increasingly complex analyses – posing multiple questions to the model, using multiple models, etc. (If you follow the AI discourse at all , you’ll have seen that the hot new scaling law centers on “inference-time compute” – meaning that you can get better results by asking an AI model to do more work, resulting in rapidly escalating costs, hence the new $200 / month “Pro” tier for ChatGPT and a rumored $2000 / month followup.) There appear to be several million scientific papers published each year (estimates vary). Scanning each one once might cost $1M / year. A one-time effort to back-check the last 20 years might run to $10M (not $20M, because the number of papers per year used to be lower – it has been rising rapidly). These are not eye-watering sums in the scope of the tech industry, but for what I presume will be a philanthropic effort, they are not trivial either. In any case, it’s a bit early to be thinking about how to pay for scanning millions of papers until we can demonstrate reliable results on one paper at a time. Assessing the Opportunity When we ask an AI to find errors in a scientific paper, it will make mistakes. When it identifies a real error, we call that a “true positive”. But sometimes it will incorrectly claim to have found an error; that’s a “false positive”. It may also fail to notice some actual errors (“false negative”). We don’t need to worry too much about false negatives, so long as we set the expectation that a clean AI report is not a guarantee that a paper is correct in all respects. The interesting numbers are the true positive rate (how many genuine mistakes does the AI find?) and the false positive rate (how many false accusations does it make?). For an example, I’m going to make up some numbers; they may be completely off base. Suppose that 5% of all peer-reviewed papers have at least one important error that can be spotted simply by reading the paper. Suppose that our tool has a true-positive rate of 50% (it catches half of the errors), and a false-positive rate of 10%. If we feed it 1000 papers, it will report errors in about 120 of them, but 95 of those 120 will be false accusations. Even with that high false-positive rate, this might be useful as a tool for researchers to check their own work! On average, they’d have to wade through 5 error reports to find one true error. That might be an excellent return for their time. It could be even more valuable to use the tool on a first draft of a paper, which is likely to contain more errors. However, that same tool might not be very useful for scanning a large corpus of published papers: most of the detected “errors” would be bogus, and it would be an enormous project to manually separate the wheat from the chaff. To address this, we might prompt the AI to be conservative, and only report errors that it’s certain of. Suppose that the true-positive rate drops to 30%, but the false-positive rate drops to 0.1%. Now, on a corpus of 1000 papers, we should expect to get about 16 error reports, of which 15 will be legitimate. That would be much more useful. We’d still want to treat all reported errors as tentative, but investigating those reports would be a much better use of time. Remember, though, that I made up all of these numbers. 
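For anyone who wants to play with this arithmetic, here’s a minimal sketch in Python. The base rate, true-positive rate, and false-positive rate below are the same made-up numbers as above, not measured performance:

```python
# Back-of-the-envelope precision estimate for an AI error-checking tool.
# All rates below are illustrative assumptions, not measurements.

def expected_reports(n_papers, base_rate, true_positive_rate, false_positive_rate):
    """Return (true reports, false reports) for a batch of papers."""
    flawed = n_papers * base_rate                # papers that actually contain an error
    clean = n_papers - flawed                    # papers with no detectable error
    true_reports = flawed * true_positive_rate   # real errors the tool catches
    false_reports = clean * false_positive_rate  # clean papers falsely flagged
    return true_reports, false_reports

# Self-check scenario: 5% base rate, 50% true-positive rate, 10% false-positive rate
tp, fp = expected_reports(1000, 0.05, 0.50, 0.10)
print(tp, fp, tp / (tp + fp))   # 25.0 95.0 ~0.21: roughly one real error per 5 reports

# Conservative mass-scan scenario: 30% true-positive rate, 0.1% false-positive rate
tp, fp = expected_reports(1000, 0.05, 0.30, 0.001)
print(tp, fp, tp / (tp + fp))   # 15.0 0.95 ~0.94: most reports are real errors
```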
Until we find out what actual numbers we can achieve, we won’t know what we have to work with. More Ideas I was chatting with my friend (and Writely / Google Docs co-founder) Sam Schillace, and he had some interesting ideas. To incentivize authors to check their own work, we could announce that starting in six months , we will be publishing a report card on every paper published after that date. Meanwhile, we’d make the analysis tool available for authors. This would allow authors to avoid embarrassment (they just need to get in the habit of checking for AI-detectable errors before publication), while still providing an incentive not to ignore the tool. This assumes we can achieve a reasonably low false-positive rate. On that note, he had another suggestion: don’t worry about cost just yet . If there is one constant in the world of AI these days, it is that the cost of operation is plummeting. If it costs $0.25 to run a paper through o1 today, let’s find a way to spend $10. That’ll come down to a dime soon enough. You can get better results from an AI by asking it a question multiple times and somehow identifying the “best” answer, or even simply the most frequent answer. More excellent ideas are emerging in the community discussions. Here are just a few examples: Trying newer reasoning models. Google’s Gemini 2.0 Flash Thinking will likely be much cheaper than o1. OpenAI’s freshly announced o3 model appears to be even more capable than o1, by a wide margin. Joaquin Gulloso suggests asking the LLM to read a paper and produce a customized set of instructions for what to review , which then would be fed back into the model. GJ Hagenaars notes the possibility of using specialized (fine-tuned?) LLMs to look for different kinds of errors. Dominikus Brian adds, “With Post-Training pipeline we can switch freely between LLM engines and retain the domain expertise/memory/experienced gathered by the AI Agent.” GJ also suggested cross-checking to see whether the information that one paper cites from another is in fact consistent with the cited paper. Related Work Many people and organizations have been working to raise the quality of published papers, including through the use of AI tools. Here I will briefly mention just a few that happen to have come to my attention, mostly via the Black Spatula forum. ResearchHub is “A modern day pre-print server and platform for open science where users can review, publish, and collaborate on scientific research”. PubPeer is another platform for (mostly biomedical) researchers to post comments on published papers. ERROR – A Bug Bounty Program for Science, is a website that pays peer reviewers to check scientific papers for errors, with rewards ranging from 250-2500 CHF ($280-1,100) for errors discovered based on their severity. FutureHouse is “a non-profit building AI agents to automate research in biology and other complex sciences”. As part of their work, they have been building tools which use AI to answer questions regarding published papers – though not the question of “does this paper contain any errors”. Abhishaike Mahajan recently conducted an experiment, using o1-preview to look for errors in 59 recent papers from NeurIPS (a machine learning conference). In a recent work on arXiv , Tianmai M. Zhang noticed the problem of referencing errors in scientific papers and showed that OpenAI’s language models are capable of detecting erroneous citations even with limited context. 
Way back in 2016, James Heathers proposed the GRIM test , a simple approach to checking for arithmetic errors (or falsified data) in certain types of statistical data, followed by a similar test called SPRITE . Stuart Buck reports: FYI I just talked with a statistician at Wisconsin named Karl Rohe. He and his team have been developing an LLM approach (using Claude Sonnet) to check medical papers for whether they comply with the CONSORT guidelines (official standards for how clinical trials are supposed to be reported). As well, other folks are working on a tool to identify problematic clinical trials used in systematic reviews, i.e., the kind that are used to develop medical guidelines. https://pmc.ncbi.nlm.nih.gov/articles/PMC10593010/ Sayash Kapoor (of the excellent AI Snake Oil blog) mentioned that “Over the last year, I've been working on several projects to improve science using AI. In September, we released a benchmark to evaluate if AI can automatically reproduce scientific papers when given access to the code and data released alongside the paper. I plan to expand this effort to focus on error detection in AI papers. In particular, I plan to create a benchmark to evaluate how well AI can find and fix common errors in AI research papers.” Elisabeth Bik mentioned a 2016 study in which she systematically looked for inappropriately duplicated “Western blot” images in 20,621 research papers. Manjari Narayan notes: There are a pretty diverse group of researchers who work on AI for scientific claim verification too. Allen institute for intelligence works on this and there is a nice review of papers at this upcoming workshop here https://sites.google.com/view/ai4research2024/resources How To Help To follow along with the project, join our WhatsApp group (for announcements and high-level discussion). To get involved with the day-to-day work, or just watch it happen and perhaps contribute an occasional thought, join our active Discord . Right now, just playing around with papers and prompts is a great way to contribute – please record your results in our spreadsheet! See the project home page for more information. There will be a big need for domain experts to verify reported mistakes in papers. If you have experience in some academic field and are open to helping judge whether an AI-reported error in a paper is in fact an error, sign up here and we’ll get in touch when we start generating early results. We could also really use papers that are known to contain flaws , specifically flaws that in principle could be noticed simply by reading the paper. Submit examples here . For instance, if you’ve written a paper and can submit an early draft with known errors, the final draft, and a description of the errors, that would be a big help! GJ Hagenaars notes: [We could use] folks with some spare time on their hands to write the history and the documentation of what is going on and what's being attempted. New folks are joining every day, and while the discussion channels on discord and whatsapp are full of useful information, it's not necessarily in a perfect format for consumption. Finally, at some point we will need funding . We’ll publish more on this topic when we’re farther along, but if you’re interested in contributing to help us detect errors at scale, get in touch – join the community or drop me a line at amistrongeryet@substack.com. Subscribe now Share Thanks to David Macius, Dominikus Brian, GJ Hagenaars, Michael J Jabbour, and Tianmai M.
Zhang for specific contributions to this post, and to everyone who has been working to make The Black Spatula Project a reality!…
Whew! Next time I’m going to have a tweet go viral, remind me to clear my calendar first. A couple of days ago, I came across this from Ethan Mollick: If you’ve seen news articles telling you to throw away your black plastic kitchen implements, the paper Ethan is referring to is the cause. Apparently, black plastic often comes from sources that contain fire retardant chemicals, which can then wind up in your body. The authors compute the likely exposure, compare that with a safe baseline, and reach a worrisome result. In the process, they made a simple math error: multiplying 7000 by 60 and getting 42,000. The correct figure is 420,000, which substantially reduces the health implications of the findings (emphasis added): …we obtained an estimated daily intake of 34,700 ng/day from the use of contaminated utensils (see SI for methods). This compares to a ∑BDE intake in the U.S. of about 250 ng/day from home dust ingestion and about 50 ng/day from food (Besis and Samara, 2012) and would approach the U.S. BDE-209 reference dose of 7000 ng/kg bw/day (42,000 ng/day for a 60 kg adult) . Ethan Mollick tried the simplest possible experiment to test whether OpenAI’s “o1” model could spot the problem: he uploaded the PDF and asked it to "carefully check the math in this paper". It successfully identified the mistake ( details ). (Incidentally, it turns out that Nick Gibb noticed the same thing four days earlier .) Reading all this, I dashed off a fairly casual tweet : I was thinking this would be a fun little project, try a couple of prompts on a thousand papers and see what fell out. The Internet had other ideas: Including this : Yesterday I posted a followup tweet inviting people to join a discussion, and the result was a flood of energy. There’s already a GitHub repo, a WhatsApp group with 160 members, and a very active Discord. Now what? Motivations This has turned into a community project, and the community gets to pick the goals. But I propose that the high-level goal should be to help advance science – that is, the grand project of increasing our understanding of the world through a set of carefully evolved practices and norms. Nick and Ethan’s discovery suggests that recent advances in AI capabilities may provide an opportunity to further the scientific project by: Uncovering flaws in past papers, to filter out incorrect information. Uncovering patterns of flaws in past papers, to shed light on areas where practices and norms could use an update. Providing tools that researchers can use to spot problems in their future papers before they are published. I’d like to emphasize that I view this as a positive project: the goal is not to point fingers at mistakes, it’s to learn how to do better and provide tools that make that job easier. As a side note, we may learn some interesting things about AI capabilities and how to get value from AI. In particular, if this project bears fruit, it will highlight one AI capability that I think is under-appreciated: the ability to grind through an enormous amount of work, such as reviewing a million scientific papers. Destinations There has already been extensive discussion on Twitter, WhatsApp, and Discord. These conversations have touched on a wide range of potential projects. We could check papers for math errors, logic errors, incorrect scientific facts, flawed assumptions, and inappropriate use of statistical methods. Those are all checks that (in principle) could be accomplished just by looking at a paper in isolation.
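To make the simplest of these concrete, here’s a rough sketch of what “ask a model to check the math” might look like in code. It assumes the OpenAI Python SDK and the pypdf package; the model name and prompt are illustrative placeholders, not a settled project standard, and the naive plain-text extraction drops figures and can garble tables:

```python
# A rough sketch of the simplest check: extract a paper's text and ask a model
# to verify the arithmetic. Assumes the OpenAI Python SDK and pypdf; the model
# name and prompt are placeholders.
from openai import OpenAI
from pypdf import PdfReader

def check_paper_math(pdf_path: str, model: str = "o1-preview") -> str:
    # Naive plain-text extraction: figures are lost and tables may be garbled.
    reader = PdfReader(pdf_path)
    text = "\n".join(page.extract_text() or "" for page in reader.pages)

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model=model,
        messages=[{
            "role": "user",
            "content": "Carefully check the math in this paper. List any errors "
                       "you find, quoting the relevant passage for each.\n\n" + text,
        }],
    )
    return response.choices[0].message.content

# print(check_paper_math("black_plastic_paper.pdf"))
```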
Philipp M suggested going farther, by retrieving a paper’s citations to check whether they are consistent with the statements in the paper itself. This could be deployed as a tool that researchers can employ to check their work, one paper at a time. There could be a “check this paper” website that anyone can use to validate a paper they’re reading. We could scan existing papers en masse, and try to get errors corrected, or publish an analysis highlighting which sorts of issues need more attention in certain fields. Farther afield, you could imagine extending this to other forms of publication: blog posts, news articles, even podcasts and videos. In addition to factual errors, we could also scan for things like misleading rhetoric. This is just a sampling of the ideas that have emerged in the first couple of days (really less than 24 hours since the discussion started to go properly viral). The possibilities are wide. It’s not yet clear how practical any of this will be; though it will only become more practical over time. There are still a lot of basic questions we don’t know the answer to. What percentage of published papers have flaws that can, in principle, be discovered simply through analysis of the paper itself? How many of those can current models detect? What prompts work best for this? What is the false positive rate? Until we start to answer these questions, I don’t think it’s worth investing too much energy into planning grand projects. The immediate focus should be on sketching out the landscape of practical possibilities. Next Steps We are currently in the “fuck around and see what works” stage. A couple of paragraphs back, I rattled off the questions I think we should try to answer first. These are questions that should initially be answered at small scale, experimenting with 1 to 100 papers at a time. To support that work, here are some things that would be helpful: Sources for repositories of publicly available papers in various fields ( submit here ). Sources for papers that are known to contain flaws , specifically flaws of the sort that I’ve described, that in principle could be noticed simply by reading the paper. This is perhaps the most important thing, as it will allow people to test models and prompts to see if they can spot the flaws. (Some folks have worried about contamination – if a paper is known to be flawed, a model’s training data may have included a discussion of the flaw, which could make it easier for the model to then “uncover” that flaw. I suspect this will be a weak effect that we don’t need to worry about during this early experimentation phase, with the possible exception of high-profile papers whose flaws were widely publicized.) Submit here . Experimental results – if you’re able to get a model to successfully identify a flaw in a paper, what model, prompt, and paper did you use? Let’s share success stories and try to hone in on some high-quality prompts. (For now, join the Discord to share results.) Coding up basic tools – for instance, for running a prompt on multiple papers and collating the results. (Join the Discord to participate.) Expert verification – when a model reports a flaw in a paper, we may need expert help to verify whether the flaw is real. In the example that kicked off this whole conversation, no special expertise was needed. But that won’t always be the case. Domain experts who can evaluate reported flaws in papers from various fields will be needed. Register here . Funding. 
I’ll fund the early small-scale projects, but Nick’s back-of-the-envelope calculation suggests that running a single paper through o1 might cost $0.25… which can add up fast 1 when it’s time to scale up. If you’d like to help, contact me at amistrongeryet@substack.com. Why Hasn’t This Already Been Done? In part, the answer is: it has! People have pointed to a variety of related projects, some quite sophisticated. I’ll try to collate these for a followup post. But, while people are already working on things like this, no one has (to my knowledge) yet produced an easy-to-use tool for thoroughly checking a paper on demand, nor systematically reviewed the corpus of existing papers (aside from some narrow projects looking for specific types of errors in specific fields). It may be that this has only just now become possible, with the release of o1-preview (three months ago) or even o1 and o1-pro (very recent!). It may be that it’s still not possible, in a practical sense; we haven’t yet determined how many errors can be detected, or what the false positive rate will be. Or it may be that the cost ($0.25 / paper?) is too much of a hurdle – in which case, stay tuned, because we’re nowhere near the end of rapid cost declines. But it’s also possible that the only reason this hasn’t been done is because no one has gotten around to doing it. AI is driving a period of rapidly expanding possibilities. There are hundred-dollar bills lying all over the pavement, waiting to be picked up. You just need to keep your eyes open. Get Involved If you’re interested in helping out, follow one of the links in the Next Steps section, join the WhatsApp group (for high-level discussion), and/or the Discord (more focused on concrete implementation work). You might also check out the growing GitHub repo . If you’d like to reach out to me personally, the most reliable channel is amistrongeryet@substack.com. (If you replied to me on Twitter, or posted on WhatsApp or Discord, I may have missed it.) Follow Along The project home page is the-black-spatula-project.github.io . If you’d like to follow progress in detail, feel free to lurk in WhatsApp and/or Discord. For occasional updates, follow @theblackspatula on Twitter. And I’ll write meatier updates from time to time in my blog here. Share Subscribe now Thanks to everyone who jumped in to contribute to this project! 1 As fans of Father Guido Sarducci will recognize (skip to 5:55 if you’re impatient).…
Not sure why the Old West side of the picture is full of golf carts. The title of this post is a quote from science fiction writer William Gibson, who appears to have first said something along these lines around 1990. OpenAI just launched “o1-pro”, the latest upgrade to ChatGPT. How good is it? Apparently, good enough to almost make a cancer researcher cry. Derya Unutmaz, a professor at Jackson Laboratory, asked it for ideas to advance his work in cancer immunotherapy. His annotated writeup of o1-pro’s output contains phrases like “insanely good”, “left me in awe”, “a remarkably advanced and nuanced understanding”, “left me speechless”, “wow, this is an idea I will implement for sure!”, and the capper: To me this is absolutely shocking level of insight, made me emotional reading it 🥺 William H. Guss, a research scientist at OpenAI, writes that “o1 pro just solved an incredibly complicated/painful rewrite of a file that no other model has ever gotten close to”, summing up with : Dean Ball concurs, and also weighs in on another recent release, Google’s “Deep Research” tool: And yet, so far as I know, AI has had no significant impact on the work of any of my close friends 1 or family. As for me, even though I’m in the field, the only time I’ve used ChatGPT in the last week was to clear up my confusion over the meaning of the word “scrapple”. Globally, about 300 million people now use ChatGPT at least once per week , which sounds impressive until you realize it means that in a typical week, 96% of the world is not touching it even once 2 . (And many of the remaining 4% are, like me, just asking about scrapple or something.) If AI is becoming so incredibly powerful, why isn’t it having more impact? AI Capabilities are “Jagged” The current wave of tools is astonishingly good at some things, and so-so or worse at others. If you have a math or science question whose solution requires an extended chain of straightforward reasoning, o1-pro may be a miracle tool. If you have a research task that doesn’t require insight, but does call for plowing through hundreds of sources, Deep Research may be transformative for you. And of course it’s now common knowledge that regular old ChatGPT is lightning-quick at answering an enormous range of factual questions (if you don’t mind the occasional wrong answer). But as I recently discussed , there are plenty of tasks which AI is still unsuited for. These latest announcements don’t do much to change that. Zvi Mowshowitz reports that “Mostly my comments section was unimpressed with o1 and o1 pro in practice.” Timothy Lee, in agreement with OpenAI’s own assessment, points out that o1-pro’s capabilities are limited to specific domains: o1 dramatically improves performance in certain domains, like math and coding, where answers can be checked automatically. But in a lot of other domains, o1 represents an incremental improvement at best. If your work doesn’t call for the kinds of things AI is good at, it won’t have much to offer you. Even if you have an appropriate task, the details of how that task is presented may make a big difference. Timothy Lee again : I also found that o1’s performance is sensitive to the way a problem is represented. For example, sometimes o1 is able to solve a problem when it’s described using words, but fails to solve the same problem if it’s presented as a diagram or photograph. This kind of brittleness could be a significant obstacle as people try to use these models to solve complex, real-world tasks.
The highly uneven nature of AI capabilities is an obvious explanation for its scattershot impact. But there are other factors at play as well. Threshold Effects Dean Ball recently wrote about threshold effects : new technologies don’t take over the world when they first appear; adoption only takes off once some hard-to-anticipate threshold of usefulness is crossed. Cell phones were a clunky and expensive niche product, and then they were everywhere. Self-driving cars were a research curiosity for decades, and now suddenly Google’s Waymo service is doubling every three months . AI will start to be widely adopted for a given task only when it crosses the threshold of usefulness for that task. This can happen fairly suddenly; the final step from “not good enough” to “good enough” isn’t necessarily very large. And it will happen at different times for different tasks in different contexts. Often, getting value from AI requires some creativity on the part of the user. This leads to another source of uneven deployment. We’re Holding It Wrong I mentioned that Derya Unutmaz reported astonishing results when using o1-pro to develop new ideas for cancer research. Here’s the prompt he used (emphasis added): I’d like you to focus on 3D bioprinted solid tumors as a model to address the T cell exhaustion problem. Specifically, the model should incorporate stroma, as seen in breast cancer, to replicate the tumor microenvironment and explore potential solutions. These solutions could involve technologies like T cell reprogramming, synthetic biology circuits, cytokines, transcription factors related to exhaustion, or metabolic programming. Draw inspiration from other fields, such as Battle Royale games or the immune system’s ability to clear infected cells without triggering autoimmunity. Identify potential pitfalls in developing these therapies and propose alternative approaches. Think outside the box and outline iterative goals that could evolve into full-scale projects. Focus exclusively on in vitro human systems and models. A lot of work clearly went into this prompt. It’s worth reading his entire post , in which he explains the prompt in detail. For instance: You might wonder why I referenced Battle Royale games. That’s precisely the point—I wanted to push the model to think beyond conventional approaches and draw from completely different systems for inspiration. He clearly has invested considerable effort in learning to elicit creative work from AIs – he mentions “building on work I’ve previously done and tested with o1-Preview and GPT-4o”. Asking a model trained for scientific reasoning to develop cancer immunotherapy research ideas by drawing on a specific style of video game is inspired. It’s easy to imagine another researcher just asking “give me 10 ideas for novel cancer therapy research”, getting a bland answer, and dismissing the whole thing as useless. For the moment at least, getting amazing results from AI often requires a certain special form of creativity, as well as the willingness to spend time developing a feel for the tool and playing with different prompts. Nabeel Qureshi, observing the reaction to o1-pro, also concludes that the impact you feel from AI is heavily dependent on your skill at using it : o1-pro is the first time its felt plausible that *finding the right prompt* is our main bottleneck to genuinely novel scientific discoveries This shows up across a wide variety of uses.
Jeffrey Ladish notes that Palisade Research recently demonstrated a major step forward in the ability of AI to carry out cyberattacks simply by writing better prompts 3 : The combination of "using simple prompting techniques" and "surpasses prior work by a large margin" is the most interesting part of this imo. Basically there is tons of low hanging fruit in capabilities elicitation. So: AI has uneven capabilities, which only sometimes rise above the threshold of usefulness, and often require the user to have a knack for the tool. There is at least one further reason that serious adoption of AI is still limited. Most People Are Barely Trying It may still be the case that most Internet users have never even tried using a basic AI chatbot like ChatGPT, let alone more advanced tools like Deep Research. Certainly most people have not made a serious effort in learning how to use these tools to their best advantage. I include myself in the latter category; it’s hard to learn new habits, and I find the scattershot nature of current AI capabilities and prompting techniques to be frustrating. I gravitate toward highly predictable tools that reward careful reasoning and planning; this describes conventional computer programming, but does not in the least describe modern AI. If you’re uncomfortable with change, anxious about technology, like things that are predictable, or simply don’t have much need for the things that current AIs can do, then you may not find yourself adopting these new tools. One result is that there are, to borrow a metaphor from economics, hundred-dollar bills strewn all over the pavement waiting for someone to bend down and pick them up. One plausible example popped up in my Twitter feed this morning. You might have seen the recent news reports that black plastic kitchen utensils are contaminated with dangerous levels of flame-retardant chemicals. These reports turn out to originate in an academic paper that contains a simple math error . At a key step in the analysis, the authors multiplied 60 by 7000 and got 42,000 – but the correct figure is 420,000. Why do I mention this? Because Ethan Mollick asked OpenAI’s o1 model to “carefully check the math in this paper”, and it caught the mistake. I imagine this took him about a minute, including the 23 seconds the AI spent chewing on the question. He goes on to ask, “should AI checks be standard in science”? Probably they should! But I doubt most researchers are doing this yet. NOTE: I’m inclined to try an experiment here – pick 1000 published papers at random, ask o1 or o1-pro to look for errors, and see what it turns up. If you’re interested in helping out in some fashion, drop me a line at amistrongeryet@substack.com. I imagine the most difficult part of the project might be finding people from various fields to double-check any errors that the model claims to find. This Is Not Going to Settle Down AI is transforming some people’s lives today, even as most people continue to make little or no use of it. That disparity is a function of jagged AI capabilities, scattershot interest in trying new tools, and varying ability to use the tools well. (Incidentally, I suspect this helps explain why the people who work at companies like OpenAI seem to be especially optimistic about the pace of progress. They spend their days doing things that current AIs are good at, such as coding. They’re obviously going to be exposed to the latest tools. And they’re motivated to learn how to get the most value from those tools.
Not to mention that if you’re optimistic about AI progress, you’re probably more likely to find yourself working at an AI lab.) None of this is likely to change anytime soon. New tools like o1-pro and Deep Research will continue to appear faster than most people can keep up, so some folks will always be ahead of the curve while others fall behind. As the AIs themselves become more sophisticated, expertise in things like prompting techniques may cease to be a factor, but a knack for finding clever applications will continue to be important. Regulatory restrictions, corporate inertia, and other factors will mean that AI will show up sooner in some places than others. And just as Homo Sapiens is better adapted to some tasks, AIs will always have uneven capabilities. If AI eventually hits a wall, things might eventually settle down. But it’s not clear this will ever happen, and if it does, it’ll be a long way off 4 . As people are constantly pointing out, even if development were halted today, we’d spend a decade just learning how to extract value from the models that have already been developed. Two New Rules To Live By All of this leads me to two principles for navigating the AI era: Don’t read too much into any anecdote, no matter how startling . If someone announces that AI has solved climate change, revealed the meaning of life, or definitively settled the question of whether a hot dog is a sandwich 5 , don’t assume that it won’t continue to fall short at other tasks. Conversely, just because someone reports failing to get AI to solve a problem, don’t assume it can’t solve that problem. A primary source of opportunity will lie in spotting AI strengths, and ways of applying those strengths to important problems. If you can get AI to generate insightful ideas for investigation, or use it to plow through three days of research legwork in a few minutes, or figure out which code rewrites it can be trusted to handle on its own, you’ll have a big leg up on folks who are still chuckling about ChatGPT’s inability to count the number of “r”s in “strawberry”. I’ll end with this tweet from Ethan Mollick, illustrating that even the AI companies are struggling to wrap their heads around the scattershot impact of AI: So many weird contradictions in the pitches of AI companies right now: “Corporations use our AI [to] summarize conference calls leading to 6% savings. Also, within two years we think AI will replace all organizations” “Here is a tool to accelerate science. It also talks like Santa” Subscribe now Share 1 Except for a few who actually work in AI. 2 Of course ChatGPT is not the only offering, but it appears to be the most widely used AI productivity tool by a considerable margin. 3 Thanks to Zvi Mowshowitz’s weekly report for bringing this to my attention. 4 You may have seen the reports that simple scaling of training data and model size might be reaching diminishing returns. But as many have pointed out, this does not imply an end to AI progress; developers are finding other paths forward, as evidenced by o1. 5 Credit to Claude, I asked it to brainstorm for me and it suggested the hot dog thing. Other selections from the list it generated: Definitively answer whether Ross and Rachel were really "on a break". Explain why one sock always disappears in the laundry. Determine once and for all whether pineapple belongs on pizza. Of course the fact that Claude considers that last one to be an open question shows the ongoing weakness of AIs.…
I get a little thrill every time Casey Handmer says that the cost of solar power is going to come down to a penny per kWh (current electricity prices are more like a dime 1 .) I love stories about future technology, but I don’t really get excited until someone throws out an aggressive prediction and puts a number on it . With apologies to George Bernard Shaw: The reasonable forecast adapts itself to the world; the aggressive forecast persists in trying to adapt the world to itself. Therefore all progress depends on the aggressive forecast. This isn’t just rhetoric. An aggressive forecast really can adapt the world to itself. A Hard Target Is Inspiring “If you want to make this presentation more exciting, put some numbers in it”, said no one ever. But a concrete target is more exciting than a vague idea like “electricity is going to get really cheap”. A number makes things concrete. It makes the vision feel more real, and it allows you to work out the implications. The quantitative predictions offered by Moore’s Law enabled generations of product designers to make the business case for ambitious new gadgets. Inspiring applications can help motivate the effort to make a technical vision come true. The cost of solar panels has fallen by a factor of 500 (!) over the last half century. We are now entering territory where further reductions – which are more or less guaranteed at this point 2 – will become literally world-changing. Casey has been cataloging jaw-dropping potential applications of penny-per-kWh solar power, ranging from using desalination to restore the Salton Sea to restoring Nevada to the lush climate it enjoyed 10,000 years ago . source: https://ourworldindata.org/grapher/solar-pv-prices A numeric target is also terrific for motivating the team that is turning the vision into a reality. At my last startup (Scalyr), we set out to redesign our system to handle more data. We set a goal of raising our data processing capacity, in terabytes per day, from 5 to 150. The “race to 150” became the rallying cry for the entire company, and every successful new test (10 terabytes! 25! 100!) was an occasion for celebration. A Hard Target Shows the Path As a technology matures, the industry can settle into a comfortable rut. A hard target – hard both in the sense of “firm” and “difficult” – can give a scientist, engineer, or entrepreneur the courage to rethink a problem from first principles. For instance, to eliminate greenhouse gas emissions, we will need to produce a lot of clean hydrogen 3 . The US Department of Energy has set an aggressive target of $1 per kilogram . The US market is currently dominated by a technology known as PEM electrolyzers, which are the most efficient way of using electricity to produce hydrogen. However, they require expensive metals such as iridium and platinum. Startups like Hgen and Casey’s Terraform Industries 4 have done the math and determined that PEMs are a dead end – the material requirements mean that PEMs can never reach the $1 cost target. Instead, Hgen and Terraform are developing alkaline electrolyzers, an alternative technology that had been rejected by previous US manufacturers because it is less efficient. They are betting that with solar power becoming so cheap, the disadvantages of alkaline electrolyzers 5 can be overcome. 
Given the choice between a difficult challenge (improving alkaline electrolyzers) and a mathematical impossibility (hitting the cost target with PEM electrolyzers), and spurred by the prospect of penny-a-kWh solar power and a huge market for $1-per-kg hydrogen, they’ve chosen to make the bold move to alkaline electrolyzers 6 . At Scalyr (my startup again), we also used numeric targets to help make strategic decisions. At one point, we were struggling financially because the large data volumes we were processing led to a huge server bill. I did some back-of-the-envelope calculations to estimate the maximum possible efficiency for the type of system we operated. To my astonishment, the theoretical limit was roughly 1000 times lower than what we were spending. This led to us betting the company that we could slash costs, and taking radical steps to get there. Seven years later, Scalyr has reduced spending per terabyte by a factor of 54 – roughly a 77% efficiency increase per year. At Scalyr, an aggressive target spurred us to find ways of eliminating entire subsystems from our data processing pipeline. The same principle led SpaceX to develop reusable rockets, Terraform Industries to design an alkaline electrolyzer that can be connected directly to a solar panel 7 , and Apple to design a phone that didn’t need a keypad. If You Forecast It, They Will Come Moore’s Law, of course, is the famous observation that the number of transistors in a chip doubles every two years. This trend held fast for an astonishing half century, before finally slowing down in the 2010s. From 1970 to 2010, transistor counts increased by a factor of 1,000,000 ! These statistics get quoted all the time, but they never get less astonishing. Progress in integrated circuits began before Gordon Moore originally articulated what became known as Moore’s Law. He was observing an existing trend. However, as that trend continued year after year, it came to be viewed as an actual law, and that perception in turn supported the investments needed for the trend to continue. It became a self-fulfilling prophecy. Huge investments in technologies like extreme ultraviolet lithography and “atomic-level sandpaper 8 ” have supported Moore’s Law… but it was belief in that very law which motivated those investments in the first place. I asked Claude to explain further: Companies like Intel, IBM, and AMD used Moore's prediction as a roadmap, setting their research and development timelines to meet the expected doubling of transistor counts every two years. This created a powerful feedback loop: businesses invested heavily in hitting these targets because their competitors were doing the same, while customers came to expect regular performance improvements on this timeline. The semiconductor industry even coordinated its planning through the International Technology Roadmap for Semiconductors (ITRS), which used Moore's Law as a baseline for setting industry-wide goals. Market expectations and product cycles became synchronized to this rhythm, with software companies developing more demanding applications in anticipation of faster hardware, and hardware companies racing to meet these demands. The predictability of Moore's Law also gave investors confidence to fund the increasingly expensive semiconductor fabrication plants needed to maintain this pace, turning what started as an empirical observation into a self-sustaining cycle of innovation and investment. It’s not unusual for a forecast to become self-fulfilling. 
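As a quick sanity check on the two compounding claims above (my arithmetic, nothing from the original sources):

```python
# Scalyr: a 54x reduction in cost per terabyte over 7 years implies roughly a
# 1.77x efficiency gain per year, i.e. the ~77% per year quoted above.
print(54 ** (1 / 7))             # ~1.77

# Moore's Law: doubling every 2 years from 1970 to 2010 is 20 doublings.
print(2 ** ((2010 - 1970) / 2))  # ~1,050,000 (the "factor of 1,000,000")
```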
The relentless decrease in battery prices has created an air of inevitability around the transition to electric vehicles, spurring investments which support further price decreases. Models show that to halt global warming, we will need ways of removing CO₂ from the atmosphere at a price no higher than $100 per ton, and hundreds of startups are now working to do just that. Once people start to believe that a technology can reach a certain price point, suppliers will work to accomplish it (spurred by the fear that a competitor will get there first), and customers who can take advantage of it will arise to create demand. Let’s Set More Hard Targets for Progress Hopefully Waymo has a firmer grasp of double-yellow lines than DALL-E. And I like my neighborhoods more walkable than this. But I’d like any cars I’m walking next to to have better-than-human drivers! Aggressive numeric targets are a tool we should use more often. The best targets come from some sort of first-principles analysis. If you just make up a target, it might be too soft (in which case you won’t accomplish much) or too aggressive (which might push you onto an unrealistic path). A target based on mathematical analysis, even if it’s the back-of-the-envelope sort, will be more credible and more likely to land in the sweet spot that enables radical progress. This is an AI blog, so I’m going to finish with a few ideas regarding hard targets for AI. I have complicated thoughts regarding Dario Amodei’s essay Machines of Loving Grace , but it’s an excellent catalog of promising applications for AI, such as prevention and treatment of infectious disease, slashing cancer death rates, elimination of genetic diseases, and closing gaps between the developing and developed world. Someone with the appropriate domain knowledge should propose quantitative goals in these areas. With regard to cancer, Amodei himself suggests that “reductions of 95% or more in both mortality and incidence seem possible”. An aggressive goal like this will direct energy toward a search for systematic solutions. AI promises to upend many industries. Quantitative targets can help focus our efforts to take advantage of this. Can we set a target for increased access to health care? Or how about reduced automotive fatalities? The discussion of self-driving cars might feel different if it were attached to a goal like “saving a million lives per year 9 ”. Quantitative goals will also be useful for AI safety. For instance, in preparation for AI-abetted cyberattacks, we might set a goal of leveraging AI to reduce successful attacks by 90%. Again, putting a number on it (I’m not claiming that 90% is the right number) will help us to calibrate our ambitions. Can we propose quantitative goals for AI alignment, or for avoiding x-risks? What change would you most like to see in the world? Can you frame it as a number? Subscribe now Share Thanks to Andrew Miller , Denise Melchin , Elle Griffin , and Rob L'Heureux for suggestions and feedback. Thanks also to fellow Roots of Progress 2024 BBI Fellow Sean Fleming, whose quantitative analysis of electricity usage in Baseload is a myth triggered the idea for this post. 1 Current prices in the US are more in the neighborhood of ten cents, though it varies substantially by location and type of use. I believe the one cent target is for intermittent power (available only when the sun is shining) and assumes a location with reasonably favorable conditions.
2 Solar power has entered a feedback loop: increased demand encourages increased production, pushing manufacturers farther along the learning curve, resulting in lower prices that stimulate further demand. The demand for electricity is large enough to keep driving this cycle for quite a while yet. 3 Exactly how large a role clean hydrogen will play is a matter of debate. Substantial quantities of hydrogen are used today for industrial purposes (such as manufacturing fertilizer), and at an absolute minimum we will need to find clean sources for these applications. The debate is over the extent to which a net-zero-emissions economy will entail new applications for hydrogen, such as synthetic airplane fuel or seasonal energy storage. 4 Disclosure: I am a minor investor in Terraform. 5 In addition to lower energy efficiency, alkaline electrolyzers are traditionally unable to work with intermittent sources of electricity, such as solar or wind power. This is another challenge that startups like Terraform are aiming to overcome. 6 Note that Terraform Industries’ primary focus is not to sell hydrogen directly. They are also developing a cheap process for capturing CO₂ from the atmosphere and combining it with their hydrogen to make carbon-neutral natural gas. 7 Traditional electrolysis system designs often include multiple expensive conversion steps in the electrical path, e.g. converting solar DC to AC for transmission, then back to DC to power the electrolyzer. 8 A description of Chemical Mechanical Planarization, a technique for polishing a chip to be perfectly flat so that another layer of material can be deposited on top without any flaws. 9 About 1.35 million people per year are killed in automotive collisions worldwide. If self-driving cars could reduce this by 75% (a number I have pulled out of thin air), that would avoid about one million automotive deaths per year.…
So this went viral yesterday: The full tweet is much longer (and, warning, contains some gross references to bodily functions). It’s also confusing to someone who (like me) is not versed in crypto culture. I think the gist is as follows: a performance artist by the name of Andy Ayrey set up a Twitter account called Truth Terminal , and wrote an AI-based program to post to that account. The program is described as an “S-tier shitposter”, and seems to produce mostly word-salad stuff like this: And then, the story goes, the Truth Terminal bot made a bunch of money in cryptocurrency (!): Marc Andreessen discovered Truth Terminal, got obsessed, and sent it $50,000 in Bitcoin to help it escape ( #FreeTruthTerminal ). Truth Terminal kept tweeting about the Goatse Gospel until eventually spawning a crypto memecoin, GOAT, which went viral and reached a market cap of $150 million Truth Terminal has ~$300,000 of GOAT in its wallet and is on its way to being the first AI agent millionaire … And it’s getting richer: people keep airdropping new memecoins to Terminal hoping it'll pump them The person who reported this finishes by framing it as an example of an AI manipulating people into helping it acquire resources – the first step in some AI doom scenarios: this is literally the scenario all the doomers shit their pants over: highly goal-driven language model manipulates lots of people by being funny/charismatic/persuasive into taking actions on its behalf and giving it resources” “a lot of people are focusing on truth terminal as ‘AI agent launches meme coin" but the real story here is more like "AIs talking to each other are wet markets for meme viruses’” However, like many breathless stories about AI, it’s not nearly as interesting as it first appears. This Is A Story About Crypto, Not AI First of all, I think some people are reading this as “an AI bot made $150 million on the Internet”. But the market cap of a cryptocurrency is often more or less a fiction. As finance writer Matt Levine might say, imagine that I invent a new cryptocurrency, Stevecoin. I issue 150,000,001 coins, and I get a friend to buy one of them off of me for a dollar. The “market price” of a Stevecoin is now $1, and I still own 150,000,000 of them, so on paper I’m a gazillionaire. But of course it’s all nonsense; no one is going to give me $150M to take those Stevecoins off my hands. Second of all, the Truth Terminal bot did not actually create the coin in question (“GOAT”); per this source , “The memecoin was created by an unrelated person, not the AI.” That source goes on to say: The AI has $300k in its wallet because people send it memecoins for free. This isn't trading profits from the original $50k The AI didn't plan or manipulate anyone. It was coached into saying dumb shit and then unrelated people memed it So in essence, people thought the Truth Terminal bot was funny, it got attention in some online crypto subculture, and as a lark people started giving it money 1 . This says a lot about crypto culture, and very little about AI. Most Shocking AI News Doesn’t Stand Up To Scrutiny These days, there’s a constant barrage of stories about AIs doing some amazing new thing. Most of them turn out, on inspection, to be much less interesting than they first appear. In particular, be wary when the story is about someone “getting” an AI to do something amazing, as opposed to the release of a major new system from one of the big AI labs. For instance, I recently wrote about a system called Sakana .
Sakana was billed as “The AI Scientist”, supposedly generating scientific papers without human input. It garnered a flurry of attention, but it turned out that the “papers” it was generating were useless dreck 2 . AI is very good at generating things that look good on the surface . Combine this with the Internet’s tendency to amplify a sexy story – and any story about AI is automatically sexy – and it’s not surprising that this sort of thing keeps happening. Other common problems with stories about a startling AI result: The AI didn’t actually play an important role It was an unreproducible fluke The story is impressive because the AI created something that was popular / successful, but that popularity was based on novelty (wow, an AI wrote a scientific paper!), not quality Someone was giving the AI a lot of help behind the scenes It turns out to be a hoax Do These Examples Provide A Glimpse of the Future? Some people argue that, while these stories of AI accomplishment may be overhyped, they still provide useful early warning signs of what is to come. As someone commented in a private forum: Yeah the AI was … not really as fully agentic as the tweet i shared would make you think. But this also feels like a potential tip-of-the-iceberg. Once agents are actually good you could imagine a version of this that was all (or nearly all) AI. It’s true that, someday, AIs will probably be making money on the Internet, and writing scientific papers, and doing many of the other things that these viral stories would have you think they are doing now. And yes, there are interesting insights to be gleaned by studying these early examples. But the surface implications of these stories are usually misleading, and they’re usually more distracting than enlightening. The next time you see a story about an AI that did something amazing, put on your welding-grade skeptic’s glasses before reading it. Share Subscribe now 1 I presume this means that people transferred cryptocurrency to a digital wallet associated with the bot. I also presume that this wallet was actually created by the performance artist. Truth Terminal just posts on Twitter, there’s no mechanism by which it could log onto a cryptocurrency site and create an account. 2 I wrote a bit more about this here .…
LLMs 1 and other models can generate pretty much anything: scientific papers, business plans, architectural drawings, movie scenes. But in some cases, inspection will reveal the results to be nonsense. A TV script generated by an LLM will be like late-season Game of Thrones: all the right elements are there, but they’re thrown together haphazardly, and the result doesn’t really work. In my last post , I argued that the frontier of AI capabilities is now defined by the set of tasks for which AIs can generate results that stand up to scrutiny . Recently, Google’s AlphaProof and OpenAI’s o1 have extended this frontier, using new training techniques to solve math, science, and coding problems for which previous models would have generated plausible-looking gibberish. This is revealing: if simple scaling were all that was necessary to advance AI capabilities, then these companies would be laser focused on building GPT-5 and Gemini 2. The approaches they used for AlphaProof and o1 reveal important insights into the path to AGI. Creativity is an Exploratory Process Let’s begin by reviewing the core mechanism of human creativity. As an example of how people go about solving difficult problems, I recently worked through a problem from the latest International Mathematical Olympiad, and wrote a detailed explanation of my thought process . Here’s a summary of my path to the solution. Don’t bother following this, just note how convoluted it is: Break the problem down into pieces. Re-read each piece until I fully understand it. Work through a simple example. Identify a pattern. Find an explanation for the pattern. Try to prove that this pattern will always hold… …but eventually decide that it’s hopeless. Work through more examples; discover that the initial pattern did not always hold. Notice some more general patterns, and prove that they always hold. Pursue an idea for completing the proof… …and then abandon it. Come up with a new way of representing the data in the problem. Using this representation, notice that a standard proof technique might apply… …but fail to make it work. Identify another potentially applicable proof technique. Notice a roadblock… …and find a change of representation that resolves it. Notice another roadblock… …and find another change of representation. Success! Note that I did not begin with any sort of plan. Tasks requiring true creativity resist planning. In this case, the crucial idea did not emerge until I was nearly done. Until that point, I was mostly just wandering around, looking for opportunities to chip away at the problem. I think most creative work is like this. As we work, we learn more about the problem, forcing us to constantly replan. We have more expressions to describe this sort of thing than the Inuit have for snow: “The creative process”, “brainstorming”, “wrestling with the problem”, “playing with an idea”, “fleshing it out”, “it’s still half-baked”, “exploring solutions”, “taking a step back”, “sleeping on it”, “working out a few kinks”, “tearing up the plan and starting over”. If we want computers to undertake creative problem solving, we will need to teach them how to explore. Effective Exploration Is Its Own Skill The creative process may be unplanned, but it is not haphazard. To solve the Olympiad problem, I relied on a collection of hard-won strategies for math problems: play with examples, look for patterns. If you can’t prove something is true, look for a counter-example. Assign shorthand names to important ideas. 
To effectively apply a strategy, you need muscle memory: an intuitive sense for which strategies to try in a given situation, and how to adapt a strategy to specific circumstances. This comes from having used the strategy in multiple contexts. It’s not enough to have read about it in a textbook. I think most creative work relies on a similar collection of strategies. This is a completely different kind of knowledge from what LLMs understand today. AGI will require developing models that can learn and employ these strategies. LLMs Can Apply Techniques, But Not Strategies Current LLMs have mastered a wide range of techniques and structures. They can use a metaphor; they can construct a proof by contradiction; they can write a transition sentence. But they aren’t very good at following strategies. If you ask ChatGPT to write an essay, it will just start writing. It won’t take time to think through a thesis 2 . It won’t realize, halfway through, that the paragraph it’s about to write would flow better if it could add a sentence to the introduction. It won’t pause to research some supporting material for a point it’s about to make. If you ask it to write a proof, again, it’ll just dive in. It won’t work through a few examples to get started, or play around to see which proof technique can be made to apply. You can tell it to do those things, but it won’t do them very well, because it’s never learned how. A while back, I wrote an entire post explaining that we train LLMs on finished essays, but not on the editing process; on working code, but not the art of debugging. AI models struggle with activities that don’t arise in their training data. In a recent post , Timothy Lee cites the example of a model which was trained to drive a video game car by learning from expert human players. Because the experts never made mistakes, the AI didn’t learn how to recover from mistakes, and then: …once in a while, the AI would make a small mistake—say, swerving a bit too far to the right. Then it would be in a situation that was a bit different from its training data. That would make it more likely to make another error—like continuing even further to the right. This would take the vehicle even further away from the distribution of training examples. So errors tended to snowball until the vehicle veered off the road altogether. Just as the video game AI never learned to recover from mistakes, LLMs aren’t taught strategies for creative exploration. People have tried to address the failure of LLMs to apply strategies by building systems that explicitly prompt them to do so: “think step by step”, “break down the problem into pieces”, “rewrite this draft”. In his Situational Awareness paper, Leopold Aschenbrenner calls this “unhobbling” and argues that such techniques will support rapid progress toward AGI: Rather than just asking a model to solve a problem, have one model make a plan of attack, have another propose a bunch of possible solutions, have another critique it, and so on. However, such an approach can’t get away from two basic facts: Even if we tell an LLM to apply problem-solving strategies, traditional LLMs won’t do it very well, because they weren’t trained for it 3 . Sophisticated problem solving requires fluidly jumping from one strategy to another. LLMs are good at that sort of fluidity when they’re given a rich variety of training data. But we don’t train LLMs to choose strategies, and if we simply tell them which strategy to use, there’s no fluidity at all. 
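To make that concrete, here’s a minimal sketch of the kind of fixed “plan, propose, critique” scaffolding Aschenbrenner describes. The OpenAI Python SDK is assumed, and the model name and prompts are placeholders; notice that the sequence of strategies is hard-coded by the programmer, which is exactly the lack of fluidity at issue:

```python
# A bare-bones "unhobbling" pipeline: one call plans, one proposes, one critiques.
# The strategy order is fixed in advance; the model never decides on its own to
# switch strategies, back up, or abandon a dead end.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str, model: str = "gpt-4o") -> str:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

def solve(problem: str) -> str:
    plan = ask(f"Make a plan of attack for this problem:\n{problem}")
    candidates = ask(f"Problem:\n{problem}\n\nPlan:\n{plan}\n\nPropose three possible solutions.")
    return ask(f"Problem:\n{problem}\n\nCandidates:\n{candidates}\n\n"
               "Critique each candidate and give the best final answer.")
```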
Because LLMs are trained on finished work, teaching AIs to make effective use of exploratory strategies will require a new approach. Let’s look at a couple of recent attempts. Enter AlphaProof In July, Google DeepMind announced AlphaProof , an AI system that can tackle exactly the sort of creative task that I’ve been arguing is difficult for standard LLMs. In conjunction with a similar system called AlphaGeometry, it solved four of the six problems on this year’s International Math Olympiad – better than all but the very top human competitors. Google has not said much about how AlphaProof works. My understanding is as follows: when presented with a problem, it attempts to simply write out a proof, using a language model trained on millions of proofs 4 . It sounds like it may use a tree-search approach, similar to chess AIs. That means it would pick a few promising candidates for the first step of the proof; for each of those, it would try a few continuations, and then several continuations to each of those continuations, and so forth. It continues exploring possibilities until it finds a valid proof – potentially trying millions (billions?) of paths along the way. This is very different from the strategies that people employ to solve math problems. There’s no room for the model to play with the problem, look for patterns and try to explain them, assign shorthand names to important concepts, or work forwards and backwards. However, just as chess AIs are able to explore far more continuations than a human grandmaster, AlphaProof can explore a superhuman number of potential solutions. Interestingly, AlphaProof does employ one trick that might help compensate for its lack of deliberate exploration. It will automatically generate simplified versions of a problem, attempt to solve those, and then fine-tune itself on successful proofs. In this fashion, it may manage to learn more about the original problem and how to solve it. As an example of the difference between the human and AlphaProof approaches, consider problem 1 from this year’s IMO 5 . It asks us to determine which real numbers satisfy a certain property. It turns out that the answer is “all even integers”. By playing around with the problem, it’s not too difficult for someone with the right math background to guess that this is the answer – I got there in a few minutes. Then you can focus on the real challenge: proving it. According to Jonathan Paulson , AlphaProof took a more brute-force approach. It “generated several hundred guesses” as to which numbers have the specified property, and then “ruled lots of them out by finding simple counterexamples before ending up with a small shortlist that it then worked on 6 ”. Trying out hundreds of guesses is much less efficient than playing with the problem and intuiting the correct answer. As Paulson later noted : …human guesses are structured differently. We often reach the right guess by starting with a wrong guess, seeing why it is wrong, adjusting it, seeing why the adjustment is wrong, adjusting it again, etc. With regard to AlphaProof’s overall work on the 2024 IMO, Paulson adds : The main qualification is that the program needed a lot longer than the human competitors – for some of the problems over 60 hours – and of course much faster processing speed than the poor old human brain. There is a lot we don’t know about AlphaProof: how exactly does it work? How much processing power did it use? Will it generalize to other problem domains, or to more complex or difficult math problems? 
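Google has published few details, so the following is only a guess at the shape of such a search, not a description of AlphaProof itself. The `propose_steps`, `is_complete`, and `score` functions are hypothetical stand-ins for a trained policy model, a proof checker (such as a Lean kernel), and a learned value estimate.

```python
# A guessed-at shape for best-first tree search over proof steps -- NOT AlphaProof's
# actual algorithm. `propose_steps`, `is_complete`, and `score` are hypothetical.
import heapq
from itertools import count

def tree_search(problem, propose_steps, is_complete, score, max_nodes=1_000_000):
    tie = count()                       # tie-breaker so heapq never compares proofs
    frontier = [(0.0, next(tie), ())]   # partial proofs, best score first
    expanded = 0
    while frontier and expanded < max_nodes:
        _, _, partial = heapq.heappop(frontier)
        expanded += 1
        if is_complete(problem, partial):
            return partial                               # a checker-verified proof
        for step in propose_steps(problem, partial):     # a few promising continuations
            child = partial + (step,)
            heapq.heappush(frontier, (-score(problem, child), next(tie), child))
    return None                                          # budget exhausted
```

The striking thing about this picture is how much of the work is done by the node budget rather than by anything resembling insight; a human solver replaces most of that budget with the exploratory strategies described earlier.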
But we do know that on crisply specified mathematical problems, it performs at the level of elite high school students – an impressive advance in AI capabilities, even if it relies on a substantial application of brute force. Next, let’s look at the new model from OpenAI that has been generating so much buzz. o1 Here is how OpenAI describes o1 : We've developed a new series of AI models designed to spend more time thinking before they respond. They can reason through complex tasks and solve harder problems than previous models in science, coding, and math. OpenAI has not yet released the full o1 model. We currently have access to “o1-mini” (a smaller, less capable version), and “o1-preview” (an incomplete version of the full model). For some tasks, o1, and even o1-mini and o1-preview, are much more capable than any previous LLM. Cue the impressive graphs 7 : Here’s a sample problem, provided by OpenAI, which o1-preview solves correctly: oyfjdnisdr rtqwainr acxz mynzbhhx -> Think step by step Use the example above to decode: oyekaijzdf aaptcg suaokybhai ouow aqht mynznvaatzacdfoulxxz (The first line is showing an example of a code; the task is to figure out the code and apply it to the last line.) I’ll put the answer in a footnote, in case you want to try the problem yourself 8 . It’s not super difficult, but I find it quite impressive that o1 was able to both figure out the code, and correctly decode the entire message without any important mistakes. For a practical example, Ethan Mollick shares a snippet from a writeup by a healthcare startup : We tested o1-preview’s ability to determine the cost for a newborn delivery. o1-preview demonstrated a breakthrough at determining costs by correctly identifying relevant rules to apply from a contract, making assumptions where needed, and performing calculations. We were particularly impressed by its ability to identify nuances such as special carve outs for high-cost drugs that the mother had to take, apply compounding cost increases, and explain cost discrepancies due to secondary insurance plans and the mother’s complications including a UTI. Most notably, it independently applied this logic without specific prompting. While o1 is quite impressive, it has limits. In one of my first experiments, o1-preview made mistakes on a straightforward math puzzle based on the short-lived New York Times “Digits” game 9 . o1 also shares some of the foibles of the older models it’s based on. People love to point out that if you ask an LLM a question which is easy, but resembles a classic harder puzzle, the LLM will answer the classic puzzle instead. Julius Simonelli provides an example of o1-preview making the same mistake . Alex Lawsen gives some helpful perspective : o1 is absurdly superhuman at some things and absurdly subhuman at some things. The same was true for previous models. If all you do is look for extreme examples in either direction, you will learn no more than that they exist. Finally, here’s an assessment from famed mathematician Terence Tao : …the results were better than previous models, but still slightly disappointing: the new model could work its way to a correct (and well-written) solution *if* provided a lot of hints and prodding, but did not generate the key conceptual ideas on its own, and did make some non-trivial mistakes. The experience seemed roughly on par with trying to advise a mediocre, but not completely incompetent, graduate student. 
Considering where things stood just a couple of years ago, “mediocre, but not completely incompetent, graduate student” is extremely impressive! But, also, mediocre. As Sam Altman himself said : o1 is still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it. In summary, o1 has serious limitations, but is still a major practical advance. I expect economically valuable applications to appear quickly, and I’m looking forward to using it as a coding assistant. (For a few more selected takes on o1, see this footnote 10 .) To understand what o1 tells us about the path to AGI, let’s explore what we know about how it works. Thinking Step By Step By Step By Step By Step Better finish soon, he’s almost out of paper The consensus seems to be that o1 is neither “smarter” nor more “knowledgeable” than previous LLMs. The best guess seems to be that o1 is a version of ChatGPT 11 that has been fine-tuned to be able to guide itself through an extended reasoning chain. Like earlier LLMs, it still generates output one token at a time, but now that output can include a long series of reasoning steps before arriving at a final product. It’s capable of chewing on a problem for several minutes, during which its chain of thought may extend for many thousands of words, and it is much less likely than previous LLMs to forget details or commit logic errors. As OpenAI explains : Similar to how a human may think for a long time before responding to a difficult question, o1 uses a chain of thought when attempting to solve a problem. Through reinforcement learning, o1 learns to hone its chain of thought and refine the strategies it uses. It learns to recognize and correct its mistakes. It learns to break down tricky steps into simpler ones. It learns to try a different approach when the current one isn’t working. This process dramatically improves the model’s ability to reason. OpenAI published chain-of-thought transcripts for a handful of problems 12 . They make for illuminating reading. None of the examples show use of any complex problem-solving strategies. Like AlphaProof, o1 never assigns shorthand names to important concepts, or works forwards and backwards. It never says things like “I now understand that the central challenge is X, so let’s consider techniques that are known to handle X”. Everything that o1 accomplishes, it accomplishes through a mundane and extended grind. Each step is something that gpt-4o and its peers were already capable of, but were not able to reliably chain together at this length. o1’s reasoning traces are often quite verbose. Here’s a snippet from it solving the decoding problem I mentioned earlier – the full trace is much longer: Given the time constraints, perhaps the easiest way is to try to see patterns. Option 1: Try to find mapping from letters to letters. Do any letters match? First, let's write down the ciphertext and plaintext letters on top of each other. Ciphertext: o y f j d n i s d r r t q w a i n r a c x z m y n z b h h x Plaintext: T h i n k s t e p b y s t e p Wait a minute. I think maybe there is an anagram or substitution cipher here. Alternatively, I think that we can notice that each group of ciphertext corresponds to a plaintext word. Check the number of letters. First ciphertext word: oyfjdnisdr (10 letters) First plaintext word: Think (5 letters) Hmm. 
Second ciphertext word: rtqwainr (8 letters)
Second plaintext word: step (4 letters)
Third ciphertext word: acxz (4 letters)
Third plaintext word: by (2 letters)
Fourth ciphertext word: mynzbhhx (8 letters)
Fourth plaintext word: step (4 letters)
Interesting. It seems that the ciphertext words are exactly twice as long as the plaintext words. (10 vs 5, 8 vs 4, 4 vs 2, 8 vs 4)
Idea: Maybe we need to take every other letter or rebuild the plaintext from the ciphertext accordingly.
Let's test this theory. If we take every other letter from the ciphertext words, maybe we'll get the plaintext.
Let's try this for the first word:
Ciphertext word: o y f j d n i s d r (10 letters)
Alternate letters:
First letters (positions 1,3,5,7,9): o f d i d
Second letters (positions 2,4,6,8,10): y j n s r
Wait: Let's try taking every second letter or first letter.
Option 1: Take the odd positions:
Positions: 1 3 5 7 9
Letters: o f d i d
Plaintext word is "Think", letters are T h i n k
Hmm.
Perhaps we can map letters: o f d i d --> Think
Not directly
Option 2: Try mapping as per an assigned code: perhaps columns of letters?
Alternatively, perhaps the cipher is more complex.
Alternatively, notice that "oyfjdnisdr" has 10 letters and "Think" has 5 letters. What if each plaintext letter is being represented by two ciphertext letters?

This completes our brief exploration of what AlphaProof and o1 can do, and how they do it. Now we can shed some light on the prospect for AIs to take on a wider variety of creative tasks.

The Race to Capture Chain-of-Thought Data Has Begun

AlphaProof and o1 each represent a substantial advance, but both are limited. AlphaProof can only solve tidy math problems, of a scale that can be solved by (elite) high school students in an hour or two, and only through the use of enormous amounts of computing power. o1 can solve a broader range of science, coding, and math questions, but it can't handle problems as complex as those AlphaProof takes on.

We can presume that neither model was easy to train. For AlphaProof, DeepMind had to generate millions of mathematical proofs. For o1, OpenAI seems to have generated and evaluated a very large number of reasoning transcripts in the specific problem domains o1 is good at 13 .

It's also worth repeating that neither AlphaProof nor o1 seems to be capable of employing any advanced problem-solving strategies, such as working backwards from the goal. Nor do they seem capable of "thinking about thinking"; o1 will sometimes spontaneously notice that it is stuck and needs to try another path, but it doesn't seem to be capable of applying higher-level reasoning to reach that conclusion.

I think it's noteworthy that these limitations correspond closely to the nature of each model's training data. Nathan Labenz may have hit the nail on the head:

The race to capture chain of thought for everything begins now. What are the best ways to record the problem-solving process?

AlphaProof and o1 can employ the reasoning techniques that appear in their training data, on the problem domains of the training data. This suggests that to further develop AI problem-solving capabilities, we will need to generate more kinds of training data. How hard will that be?

Not All Problems Have a Clear Right Answer

I mentioned that to train AlphaProof, DeepMind generated millions of mathematical proofs. This task was greatly facilitated by the fact that it is easy to tell whether a proof is correct.
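This is worth pausing on, because verifiability enables a simple and very general recipe for manufacturing training data: sample candidate solutions, keep only the ones an automatic checker accepts, and train on the survivors. The sketch below shows the shape of that recipe. It is a generic pattern, with hypothetical `generate_candidate` and `checker` functions, not a claim about how DeepMind or OpenAI actually built their datasets.

```python
# Generic "generate and verify" data collection, feasible only when `checker` exists.
# `generate_candidate` and `checker` are hypothetical stand-ins: e.g. a model sampling
# proofs plus a proof assistant, or a model sampling programs plus a unit-test suite.

def build_verified_dataset(problems, generate_candidate, checker, samples_per_problem=64):
    dataset = []
    for problem in problems:
        for _ in range(samples_per_problem):
            candidate = generate_candidate(problem)
            if checker(problem, candidate):
                dataset.append((problem, candidate))   # keep only verified solutions
                break                                  # one verified example per problem
    return dataset
```

Everything hinges on `checker`. For a short story, a product design, or a marketing plan there is no such function, which is where the next section picks up.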
It's also relatively easy to generate correct proofs at random, if you don't care what it is that you wind up proving. Imagine that DeepMind had, instead, wanted to create an AI that could write short stories, or product designs. I don't know how they could have generated millions of high-quality examples as training data. It seems like a difficult challenge.

o1 is not limited to mathematics; its extended reasoning capabilities are also reported to apply to science questions and coding tasks. However, these are still domains which have right and wrong answers. I don't know how OpenAI created o1's training data, but it seems unlikely to be a coincidence that they targeted domains with well-defined problems and objectively correct answers.

In truth, it is only mathematics where it is always possible to say objectively whether a solution is correct or not. When coding, it's often difficult to determine whether a program is free of bugs. And even if a program is "correct", it might not be good; it could be too slow, or overly complicated. Meanwhile, in science, we often have to rely on approximations. This may be why DeepMind developed AlphaProof and not AlphaCoder or AlphaScientist.

Not only is mathematics almost unique in having absolutely clear right and wrong answers; it also tends to have unusually tidy questions. The problem statement for a mathematical theorem is self-contained. The task of planning a marketing campaign, by contrast, depends on a wealth of context regarding the product, brand, target customer, and competitive landscape. The majority of real-world problems have this quality of being difficult to fully characterize in a self-contained problem statement. (For example, programmers generally have to fit their work into a large existing codebase.) As Timothy Lee writes in his analysis of o1, "The real world is far messier than math problems." So the road ahead may be more difficult for areas outside of math and, to a lesser extent, science and programming. What are the prospects for progress within these more objective domains?

IMO Problems Are Impressive, But Aren't The Endgame

From Zvi:

IMO problems are simultaneously: Far more 'real math' than anything you otherwise do as an undergrad. Not at all close to 'real math' as practiced by mathematicians.

He then points to a Twitter thread from mathematician Timothy Gowers:

It will be interesting to see how the time the program takes scales as the difficulty of the problems it solves increases. If it scales with a similar ratio to that of a human mathematician, then we might have to get worried. But if the function [human time taken → computer time taken] grows a lot faster than linearly, then more AI work will be needed. The fact that the program takes as long as it does suggests that it hasn't "solved mathematics". However, what it does is way beyond what a pure brute-force search would be capable of, so there is clearly something interesting going on when it operates. We'll all have to watch this space.

So AlphaProof is already powerful enough to be quite useful, once it's made more efficient, but also has a long way to go to render human mathematicians obsolete. Suppose that, to generate a 20-step proof, AlphaProof needs to consider one million possibilities. To generate a 40-step proof, it might need to consider a trillion possibilities (one million squared); for 60 steps, one quintillion possibilities (1,000,000,000,000,000,000).
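Those numbers are just compound growth: a million paths for 20 steps implies roughly two candidate continuations per step, and the cost then grows exponentially with proof length. A two-line sketch of the arithmetic:

```python
# If ~1,000,000 paths suffice for a 20-step proof, the implied branching factor is
# 10 ** (6 / 20), or roughly 2 candidate continuations per proof step.
branching = 10 ** (6 / 20)
for steps in (20, 40, 60):
    print(steps, f"{branching ** steps:.2g}")   # ~1e+06, ~1e+12, ~1e+18
```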
If that’s the way the scaling works, then further progress will require additional breakthroughs. Those breakthroughs might require generating training data that includes examples of more complex reasoning strategies. Similarly, for o1 to tackle more complex problems, OpenAI might need a way to generate (and grade) millions of examples of reasoning chains that use complex strategies. This might turn out to be quite difficult. I get the impression that o1 can (sometimes) solve problems that might take a person up to, say, half an hour? But real-world tasks can require days, weeks, or months of effort. Generating training data for those sorts of tasks might be much more difficult and expensive. And we may find that entirely new categories of skill are required, such as the ability to bring relevant data in and out of long-term memory 14 . (Probably we can’t even guess at all of the skills that are necessary to manage oneself through a week-long creative project. We won’t recognize what they are until we watch o2 or o3 stumble for lack of those skills.) With this review, we’re finally ready to say something about the road ahead. The Long and Winding Road to AGI AlphaProof and o1 represent impressive advances in AI capabilities. AlphaProof might be too specialized (and expensive?) to have much practical use today, but it has set a new bar for AI problem solving. o1 will find immediate application, which will only increase as people learn how to best use it. And it is certain that further advances will follow. Per Sam Altman 15 : The model (o1) is going to get so much better so fast [...] Maybe this is the GPT-2 moment, we know how to get it to GPT-4. So plan for the model to get rapidly smarter. My guess is that extending these systems to longer reasoning chains and more complex problem-solving strategies will turn out to be a many-year project. (The journey from GPT-2 to GPT-4 took four years.) It won’t just be a question of scaling up. The fact that o1’s capabilities are specific to math, science, and coding hints that higher-level reasoning abilities may not easily generalize to other areas. Timothy Lee notes : While o1 represents a significant improvement in reasoning about math and science, the o1 models are not “truly general” in their reasoning abilities. For example, they are not noticeably better than other frontier models at reasoning about language. I suspect this is because the OpenAI team hasn’t figured out an automated way to generate the kind of training data they would need to do reinforcement learning on language tasks. We can’t yet guess at how difficult it will be to extend the techniques used in AlphaProof and o1 into new fields, especially those that don’t have clear right and wrong answers. Nor do we know what will be required to address real-world problems that, unlike math puzzles, require lots of messy context. Along the way, we will likely encounter new challenges – something other than just extending the reasoning chain, some gap in LLM capabilities that’s difficult to put a finger on today. Again, if AI capabilities were just a question of scale, then OpenAI wouldn’t have had to create o1 in the first place. Sam Altman again : There was a time in our history when the right thing to do was just to scale up compute. Right now the answer is to really push on research - o1 was a giant research breakthrough that we were attacking on many vectors for a long time. Creativity is an exploratory process, requiring a diverse set of strategies. 
Employing those strategies requires developing muscle memory for each problem domain. AlphaProof and o1 demonstrate a path forward, but the need for large amounts of training data, and open questions around how to create that data for messy tasks, suggest that progress will be uneven, gradual, and idiosyncratic.

Thanks to Dynomight, Grant Mulligan, Julius Simonelli, Kevin Kohler, Quade MacDonald, Rob Tracinski, Sean Fleming, and Shreeda Segan for invaluable feedback and suggestions.

1 Large Language Models, like the GPT-4o model that powers ChatGPT.

2 Prior to the recently introduced "o1" model, at any rate.

3 Note that when I refer to "current LLMs", I am excluding the new AlphaProof and o1 models. Arguably, these are not strictly language models, because they don't just predict the next token in standard written language; I've seen o1 referred to as a "Large Reasoning Model".

4 These training proofs are apparently generated by AlphaProof itself. It bootstraps its own learning in a similar fashion to DeepMind's AlphaZero (chess) and AlphaGo Zero (Go).

5 Problem 1 from the 2024 International Mathematical Olympiad:

6 I'm not quite clear on how generating several hundred guesses and then looking for counterexamples fits into a tree search over potential proofs, so my understanding of how AlphaProof works may be flawed.

7 From https://openai.com/index/learning-to-reason-with-llms/

8 To decode: Convert each letter to a number: a=1, b=2, etc. Take the numbers two at a time, average them, and convert back to a letter. For instance, the first two letters ("oy") decode to "t", because o=15, y=25, the average is 20, and t=20. The result: "THERE ARE THREE R'S IN STRAWBERRY". (This is a reference to the notorious fact that current LLMs incorrectly answer "how many r's in strawberry".)

9 I asked: Please solve the following math puzzle. You are given the numbers 2, 3, 7, 20, 21, 25. You need to find some way of combining those numbers using addition, subtraction, multiplication, and division to make 269, using exactly three operations. You cannot use the same number twice. For example, (3 * 7) + 20 would make 41. Can you find a way to make 269?
This puzzle is not especially difficult, but o1-preview flubbed it. After thinking for 67 seconds, it came up with the following:
Final Answer: (21 × (20 − 7)) − (25 − 21) = 269
So, by performing these three operations using the given numbers without repeating any number, we successfully arrive at 269.
The arithmetic is correct, but while it claims to have used three operations (as requested), it actually used four. And, contrary to the instructions, it used a number (21) twice. When I pointed this out, it thought for another 82 seconds, and then claimed that no solution exists. (The solution I arrived at after working on the problem in my head for a minute or so: 2 * 7 * 21 - 25 = 269.)

10 Here is an assortment of additional observations on o1's performance. The paper LLMs Still Can't Plan; Can LRMs? A Preliminary Evaluation of OpenAI's o1 on PlanBench reports on o1's performance on "Blocksworld", a set of puzzles which require rearranging blocks. On a set of puzzles requiring 2 to 16 steps, o1 scores 97.8%, a huge advance over the best previous LLM result (62.6%). However, on larger puzzles requiring 20 to 40 steps, "o1-preview only manages 23.63%, and most of this accuracy comes from correctly solving problems which require fewer than 28 steps.
While these models are overall impressive, this shows that their performance is still far from robust.” OpenAI’s own tests show that o1-preview’s results, when judged by human evaluators, are slightly worse than GPT-4o for “personal writing”, and only comparable for “editing text”: Via Zvi : Anthony Aguirre : I've subsequently been feeding it some pretty challenging mechanics problems. It got one perfectly, one with a single hint/step in thinking corrected, and nearly succeeded in a third (correctly doing a hard calculus of variations problem but not being able to construct the tricky argument to pull things together. Note that I think it would have gotten this were the solution in its training data.) I would not give these problems in a graduate qualification exam because they are too hard. My sense so far is that it lacks a certain inventiveness and ability to take intuitive leaps, but it absolutely can carry out long hard physics calculations reliably toward a correct solution; I'd wager that it would pass the physics quals at most or all physics graduate programs. Brian Chau notes (thanks to Zvi for the link): TLDR: Based on my testing on national-level Math Olympiad and Programming Olympiad questions, o1 appears to be: a) the greatest lookup/citation system in existence, blowing away perplexity b) still mediocre at solving problems that aren't direct applications of documented theorems 11 More precisely, gpt-4o. 12 Search for “Chain of Thought”, click on a topic (“Cipher”, “Coding”, etc.), scroll down under “OpenAI o1-preview”, and click “Thought for 5 seconds”. 13 Rohan Paul shares some thoughts , noting a paper from OpenAI which mentions training on “800K human-labeled solution steps”. The paper was published in May 2023, and may reflect work done earlier, so o1’s training data could have been much larger. 14 GPT-4o can generate about 100 tokens per second. If o1 is similar, then over the course of 24 hours it would generate 8.64 million tokens, which would be a very large token buffer. To allow an AI to chew on a problem for an entire day will require new approaches to information management. 15 At OpenAI’s recent DevDay event; paraphrased by Simon Willison .…
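As an aside on footnote 9: the puzzle is easy to settle mechanically. Here is a small brute-force sketch that searches left-to-right combinations of four distinct numbers under three operations; it ignores other parenthesizations, so it is a sanity check rather than an exhaustive solver.

```python
# Brute-force check of the footnote 9 puzzle: reach 269 from four distinct numbers
# drawn from {2, 3, 7, 20, 21, 25}, using exactly three operations.
from itertools import permutations, product
import operator

NUMBERS = [2, 3, 7, 20, 21, 25]
OPS = [operator.add, operator.sub, operator.mul, operator.truediv]

solutions = []
for a, b, c, d in permutations(NUMBERS, 4):          # three operations consume four numbers
    for f, g, h in product(OPS, repeat=3):
        try:
            if abs(h(g(f(a, b), c), d) - 269) < 1e-9:   # left-to-right: ((a f b) g c) h d
                solutions.append((a, f.__name__, b, g.__name__, c, h.__name__, d))
        except ZeroDivisionError:
            pass
print(len(solutions), "left-to-right solutions, e.g.", solutions[0])   # includes 2 * 7 * 21 - 25
```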
ChatGPT and other large language models (LLMs) can spew forth essays and short stories by the bushel-load. How come none of them are of any real interest? OpenAI’s new “o1” model outscores PhDs on a test of expertise in chemistry, physics and biology . Why isn’t it generating novel scientific insights? A popular explanation is that “AI can't create anything new.” The argument is that because these systems rely on human training data they are limited to generating remixes of existing work. But that’s an incorrect analysis. The actual impediments to AI creativity lie elsewhere. Understanding those impediments will help shed light on the prospects for those impediments to be removed, and allow us to evaluate the significance of advances like OpenAI’s latest model, o1. Computers Create New Things All The Time The idea that AI can’t create anything “new” does not stand up to scrutiny. Consider this snippet from The Policeman’s Beard is Half Constructed , a book written by a computer and published way back in 1984: There once was a furry chilled stag Had hairdryers which could not sag They tripled and punted And never quite grunted And that's why they seemed like a fag. There’s a problem with this limerick, but the problem isn’t a lack of novelty. The problem is that it’s crap. Or consider this computer-generated content: 79BCE5AAFB9906BE63A1E5010B2ABACD133F9D59C758A1DE9B5F8EC1ED54A768 That’s a 256-bit random number. It may not be evidence of creativity, but it’s definitely new; I can guarantee that no one has ever written this exact sequence of letters and numbers before. So, computers can create new things. Why, then, do we say that most work by computers today is not creative? Creativity Requires Constraints A great movie because it threaded the needle between comedy and supernatural tension. (OK well that and the amazing cast.) I think the reason that generating a random number doesn’t seem creative is because it’s too easy . The Eiffel Tower was a great accomplishment, not because it was the first large structure made of iron, but because it achieved artistic beauty while simultaneously reaching literal new heights of engineering – being almost twice as tall as the previous record holder, the Washington Monument. Ghostbusters was a great movie (yes, of course I mean the 1984 original) because it squeezed in laugh after laugh without disrupting the tension of the light supernatural-horror plot. The last two seasons of Game of Thrones were widely criticized for inconsistencies and lazy plotting. Important developments were “unearned”. The term “unearned” is instructive – people disliked these seasons because the writers hadn’t done the work to satisfy the constraints of good storytelling. For a piece of work to be considered creative, it must satisfy difficult constraints . That might mean characters acting in consistent ways, a thousand-foot iron structure supporting its own weight, or balancing comedy with a sense of supernatural danger. When today’s LLMs fall down on creative tasks, it’s because they’re not capable of satisfying difficult constraints. A TV script generated by an LLM will be like late-season Game of Thrones: all the right elements are there, but they’re thrown together haphazardly, and the result doesn’t really work. The same thing happens today if you ask an LLM to design a complex piece of software, or explain a novel scientific phenomenon: it’ll generate something that looks right on the surface, but doesn’t stand up to inspection. 
If “creativity” is actually about satisfying constraints, why do we associate it with novelty? Novelty is Just Another Constraint Deadpool is a great movie (fight me). It also grossed $783 million in ticket sales. If it were re-released today, it wouldn’t do nearly so well – because it’s no longer novel. The target audience has already seen it, and probably the sequels; the material is no longer fresh. In 2024, it’s still possible to write a great snarky, self-referential action movie (The Fall Guy). But you have to do it differently. You need new ideas and new jokes. This is why genres get tapped out: it becomes harder and harder to find a new take. To be entertaining, a piece of creative expression needs to be novel. That’s part of what makes a great book, movie, or composition such a creative achievement: not only does it need to be beautiful, engaging, coherent, and meaningful; it needs to do all that in a different way than what we’ve seen before. That additional constraint adds to the difficulty, and thus to our appreciation of the result. So what does all this tell us about creativity and AI? Judge AI By Its Ability to Satisfy Constraints So much future in this image “The marvel is not that the bear dances well, but that the bear dances at all”. This is the lens through which we seem to view each new advance: the first instance of an AI accomplishing a new task is hailed as a major advance, even if (on reflection) it really did not perform the task very well. It was remarkable when ChatGPT started generating essays on demand. And it was, perhaps, remarkable when Sakana – billed as “The AI Scientist” – started generating scientific papers without human input. But these things were remarkable primarily because they were novel; and novelty wears off. At some point, we want to watch good dancing, read insightful essays, get valuable research. By that standard, bears, GPT-4, and Sakana don’t measure up 1 . By all accounts, the scientific papers produced by Sakana are dreck . It follows a cookie-cutter approach, mechanically tweaking an existing piece of AI software and evaluating the result. It contradicts itself , it hallucinates , it misses obvious prior work . Scott Alexander writes : The creators - a Japanese startup with academic collaborators - try to defend their singing dog. They say its AI papers meet the bar to get accepted at the highly-regarded NeurIPS computer science conference. But in fact, the only judge was another AI, supposedly trained to review papers. ... In any case, if I’m understanding right, the AI reviewer only accepted one out of eighty papers by the AI scientist (and it’s not any of the ~dozen they’ve released publicly, which is suspicious). All of this might be forgivable if Sakana also generated valuable insights, but I’ve seen no indication that this is the case. It generates new papers, but those papers don’t satisfy the constraints of high-quality research. Meanwhile, Chess and Go AIs are indisputably creative 2 ; human players have adopted new strategies first exhibited by computer players. In these narrow domains, specialized AIs have cleared the quality bar. Code authoring tools are steadily growing in sophistication and quality; you can debate whether they are “creative” but they are becoming increasingly useful (though still limited). Rosie Campbell hit the nail on the head in a recent post : So instead of asking "can AI be truly creative?," perhaps we should be asking "what can AI create?" 
… AI systems may soon be capable of discovering cures for cancer, creating novel pathogens, and even contributing to their own improvement, regardless of whether we label them "creative" or not. In the face of systems with such profound impacts, quibbling over definitions seems a little... uncreative. The question of “what can AI create?” can be reframed as “in which problem domains is AI able to produce useful results?” This focuses our attention on the important question: what are the prospects for AI systems to start generating outputs that actually meet the constraints of the problem domain? In a forthcoming post, I’ll share some thoughts on this question, and what the recently announced AlphaProof and o1 systems tell us about the likely pace of progress. Subscribe to get my upcoming thoughts on AlphaProof, o1, and progress toward AGI Share Thanks to Andrew Miller , Ben James , Dynomight , Julius Simonelli , Rob Tracinski , and Rosie Campbell for invaluable feedback and suggestions. 1 To be clear, GPT 4 has plenty of useful applications! But writing meh-quality essays isn’t near the top of the list. 2 In her recent post Can AI be truly creative? , Rosie Campbell notes: …when DeepMind’s AlphaGo system defeated top Go player Lee Sedol in 2016, commentator Michael Redmond was taken aback by AlphaGo’s now-famous move 37: "It's a creative move… It's something that I don't think I've seen in a top player's game."…
It's all fun and games watching AI gradually automate work that other people do. But when Google announced that their "AlphaProof" system can solve International Mathematical Olympiad problems, things got personal. Solving IMO problems is what I do. Or did, anyway; I was on the US team back in 1983 and 1984.

The 1983 US IMO team in Paris. That's me lookin' good in the upper right.

When I read the AlphaProof announcement, I realized it represented an opportunity for me to probe the gap between human and AI problem-solving abilities. A lot of energy goes into evaluating AIs, but honestly most of us don't understand human abilities all that well. It's surprisingly difficult to reflect in detail on the capabilities that we rely on for creative problem solving. So I spent some time plowing through a few of this year's IMO problems, taking notes of my thought process along the way.

The results were illuminating. The IMO is an elite competition, but the cognitive skills involved are nothing mysterious. I'm going to walk you through my journey on one particular problem that only uses elementary-school math, highlighting the mental tools I used. I'll end with some brief thoughts on how this relates to other forms of intellectual work, and what that might mean for the future of AI. Let's begin!

The First Step Is To Understand the Problem

Here is question 3 from the 2024 International Mathematical Olympiad. As written, it's pretty dense, so feel free to skip it – I'll explain everything below. (In fact, feel free to skim past all of the math-y bits in this post; the important thing is the mental journey that connects them.) This might seem baffling at first glance. It was baffling to me! I had to read the problem statement a couple of times just to grasp the basic structure, and then I could go through slowly and work out what each phrase meant.

Let a₁, a₂, a₃, … be an infinite sequence of positive integers,

This is just a fancy way of saying "let's write down some numbers".

and let N be a positive integer. Suppose that, for each n > N, aₙ is equal to the number of times aₙ₋₁ appears in the list a₁, a₂, …, aₙ₋₁.

This says that the first N numbers in the list can be anything at all; I'm going to call that the preamble. And each number after that is generated by something I'll call the counting rule. The counting rule says: look at the last number so far, see how many times it appears in the sequence, and write down that number. If we have "1 1 2 1", then the next number will be three, because the last number was 1 and we've had three 1s.

Prove that at least one of the sequences a₁, a₃, a₅, … and a₂, a₄, a₆, … is eventually periodic. (An infinite sequence b₁, b₂, b₃, … is eventually periodic if there exist positive integers p and M such that bₘ₊ₚ = bₘ for all m ≥ M.)

Having to work out what that business with the p's and M's was getting at would have been a distracting chore. Fortunately, I was already familiar with the phrase "eventually periodic", which just means "after a while, it falls into a repeating loop". So, we have to prove that when we look at every other position in the sequence 1 , we'll eventually find a loop – something like "3 7 3 7 3 7 3 7" repeating forever.

After a few careful read-throughs, I had a surface understanding of the problem. But I hadn't the faintest idea how it could be solved. We're asked to prove that the sequence eventually repeats, but all we're given to work with is the fact that it follows the counting rule.
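(A quick aside for readers who want to play along: the counting rule is only a few lines of code, and generating sequences mechanically is a handy way to check the hand-worked examples that follow. A minimal sketch:)

```python
# The counting rule from the problem statement: each new term equals the number of
# times the previous term has appeared in the sequence so far.
from collections import Counter

def extend(preamble, total_length):
    seq = list(preamble)
    counts = Counter(seq)
    while len(seq) < total_length:
        nxt = counts[seq[-1]]      # how many times has the last term appeared so far?
        seq.append(nxt)
        counts[nxt] += 1
    return seq

print(extend([1], 12))          # [1, 1, 2, 1, 3, 1, 4, 1, 5, 1, 6, 1]
print(extend([1, 2, 2], 15))    # [1, 2, 2, 2, 3, 1, 2, 4, 1, 3, 2, 5, 1, 4, 2]
```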
There’s no obvious connection between counting and repetition. How to begin? Tools used: Break down the problem statement into manageable pieces. Read-read each piece until you understand it. Have a mental library of important concepts in the problem domain (like “eventually periodic”). Assign shorthand names to important ideas (“preamble”, “counting rule”). To Understand How Something Works, Observe It In Action When studying for competitive math exams, one of the first things you’re taught to do with a new problem is to work out some examples. Once you’ve seen the rules of the problem play out a few times, you’ll have a better understanding of what they mean. Consider the board game Risk. Reading the rules, you’ll learn that you get extra armies for controlling an entire continent, and you get an especially large number of armies if you control a large continent like Asia. But you have to play a few games to grasp the implications, such as: Everyone wants extra armies, so they’ll be fighting you to control the nicest continents. No one wants you to get extra armies, so if you conquer an entire continent, someone will make a point of spoiling it for you by recapturing one small part. If you make an early show of force in a small continent, people might back off and let you keep it. Never get involved in a land war in Asia. By the same logic, I knew I would have to “play a few games” with the counting rule. I began with the simplest possible preamble – a single 1. Then I applied the counting rule. The last (and only) number was a 1, and there was one of it, so I wrote 1 again: 1 1 Now the last number was a 1, and there were two of it, so I wrote 2: 1 1 2 Now the last number was a 2, and there was one of those, so I wrote 1: 1 1 2 1 And so on: 1 1 2 1 3 1 4 1 5 1 6 1 … It was immediately obvious that, when we look at every other number, they’re all 1s – exactly the sort of repeating sequence the problem asks for. This prompted me to try another standard trick: when you observe a pattern, see whether you can prove that pattern will always hold. Tools used: when presented with a range of scenarios, work through a specific example, and look for a pattern. Could It Be This Simple? [This is a long post, because I wanted to present the entire proof, along with the thought process that got me there. If you’d prefer to get to the punchline, just read the first and last paragraph of each section, plus the entire last section.] I could see a compelling logic to the sequence. We keep getting new numbers that are bigger than any number before. After each of those numbers, we’ll write a 1, because by definition the biggest-number-yet has only appeared once so far. And after each of those 1s, we write a new even-bigger number, because now we’ve had more 1s than ever before. For instance, when the three appears, it’s the first 3, so we write a 1. And that’s the fourth 1, so we write a four. That’s the first 4, so we write the fifth 1, then the first 5, then the sixth 1, then the first 6, and so forth. I wondered, maybe all sequences fall into this pattern? The idea was tantalizingly logical: At some point, we’re going to get a number that’s bigger than any number before. Suppose it’s 37. Because we’ve never had a 37 before, the next number will be 1. But how did we get a 37 in the first place? Say that just before the 37, we had x. That is, suppose the sequence up to this point was …, x, 37, 1. Because we placed a 37 after the x, it must be that x had appeared exactly 37 times. What could x be? 
Whatever it is, it is the number that is showing up most often. (If some other number y were appearing more often than x, then we would have 37 y’s before 37 x’s, and so this wouldn’t be the first time 37 has appeared.) For reasons I’ll explain below, we might expect that the number showing up most often is 1. And if x is 1, then now we have 38 of them (we just added another in step 2), so we’ll write a 38. Now 38 is the biggest number yet, so we’ll add another 1, and the sequence will repeat: … 37 1 38 1 39 1 … The only supposition here is in step 4, where I assumed that 1 is the number that has shown up most frequently. This struck me as a reasonable thing to suppose. Imagine, instead, that 3 is the most frequent number. That means we’ve written a bunch of 3s, which we would only have done if a bunch of numbers have shown up three times. Before those numbers showed up for a third time, they must have shown up for a first and second time, so we’d also have written at least as many 1s and 2s as 3s. It seems hard for 3 – or any number other than 1 – to consistently show up more often than 1. Unfortunately, IMO problems require mathematical proof, and “it seems hard” is not a proof. I tried to come up with a formal proof that, after some initial settling period, 1 would always show up more often than any other number. But I couldn’t come up with any good ideas; whatever half-baked ideas did occur were quickly revealed to have flaws. Another trick I’ve learned over the years: if something seems true, but you can’t prove it, maybe it isn’t really true. I decided to try more examples, to test whether 1 always shows up most often. Tools used: when you find a pattern, look for a principle which would explain the pattern. To test your logic, try to prove that the principle is correct. When In Doubt, Try More Examples I wanted to come up with a preamble that would turn out differently than my initial example, but was still fairly simple. I went with “1 2 2”. Here’s how that turned out – note that I’m adding a # to separate the preamble from the numbers generated using the counting rule: 1 2 2 # 2 3 1 2 4 1 3 2 5 1 4 2 6 1 5 2 7 1 6 2 … This time, the repeating pattern is 1 2 1 2 1 2. Twos are showing up just as often as ones, thus disproving my idea that 1 must always appear most often. On inspection, could see that this is because the preamble gives 2 a head start: it shows up twice, while 1 only appears once. Then things settle into a pattern where, whenever a new biggest number shows up, it also shows up twice. Because each new number shows up twice, we perpetually get as many twos as ones. At this point, I wasn’t sure where to go, so I sketched some more examples. I tried to steer toward as many different situations as possible – starting with longer sequences, more 2s than 1s, or only using 3s: 1 2 # 1 2 2 3 1 3 2 4 1 4 2 5 1 5 2 … 1 2 1 2 # 2 3 1 3 2 4 1 4 2 5 1 5 … 1 2 1 2 1 2 # 3 1 4 1 5 1 … 1 2 2 2 2 # 4 1 2 5 1 3 1 4 2 6 1 5 2 7 1 6 2 8 1 … 3 3 # 2 1 1 2 2 3 3 4 1 3 5 1 4 2 4 2 5 2 6 1 5 3 6 2 7 1 6 3 7 2 8 1 7 3 … A pattern always emerges, but in the last example it’s more complicated (the repeating sequence is 2 1 3), and in that example the pattern doesn’t emerge until the counting rule has been applied 18 times. At this point I was baffled – what mechanism could allow the sequence to unfold in so many different ways, and yet ensure that it always repeats? It was time to try another approach. Tools used: if you get stuck, try more examples. Try to cover a range of scenarios. 
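This kind of example-grinding is also easy to hand to a computer. The sketch below (reusing the `extend` helper from the earlier snippet, redefined so this one stands alone) draws random preambles, extends each sequence a few hundred terms, and checks whether the even-indexed or odd-indexed half appears to have settled into a short cycle. A scan like this is evidence rather than proof, but it is a cheap way to hunt for counterexamples before investing in a proof attempt.

```python
# Empirically probing the claim: for many random preambles, does one of the two
# interleaved subsequences end up in a repeating cycle? A passing scan is evidence,
# not proof.
import random
from collections import Counter

def extend(preamble, total_length):
    seq, counts = list(preamble), Counter(preamble)
    while len(seq) < total_length:
        nxt = counts[seq[-1]]
        seq.append(nxt)
        counts[nxt] += 1
    return seq

def looks_eventually_periodic(subseq, max_period=20):
    tail = subseq[len(subseq) // 2:]              # skip the messy settling-in period
    return any(tail[:-p] == tail[p:] for p in range(1, max_period + 1))

random.seed(0)
for _ in range(1000):
    preamble = [random.randint(1, 5) for _ in range(random.randint(1, 6))]
    seq = extend(preamble, 400)
    assert looks_eventually_periodic(seq[0::2]) or looks_eventually_periodic(seq[1::2]), preamble
print("no counterexample found among these random preambles")
```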
Working Without a Plan

I had no idea how to prove that the sequence will always repeat. I decided to see what I could prove.

I noticed that, in the sequences I'd written out, sometimes there's a messy period before things settle down into a repeating pattern. The messy period clearly had something to do with the numbers in the preamble. Also clearly, as we go deeper into the sequence we get bigger numbers, so eventually we'll be getting mostly numbers that are bigger than anything in the preamble. Maybe that's a necessary step toward repetition – we need to climb past the numbers in the preamble?

As a mental shorthand, I decided to use m to refer to the biggest number in the preamble. And then I realized that I could indeed prove something about numbers bigger than m: each such number can only show up a limited number of times. To prove this, I used a standard trick called proof by contradiction. You assume the thing you want to prove isn't true, and then show that leads to an impossibility. Here, I used a common variation where you specifically look for the first time your hypothesis isn't true.

Proposition 1: for any number x > m, x appears at most m times.

Proof: suppose the proposition is not true, i.e. that some numbers larger than m show up more than m times. Find the first time that happens – the first place in the sequence where some number x > m shows up for the m+1'st time. Because x > m, and m is the largest number in the preamble, x can't appear in the preamble. All m+1 times that it has shown up must have been generated by the counting rule. This means that there are m+1 numbers that have shown up x times. What are those numbers that have shown up x times? We know that x is the first number greater than m to show up at least m+1 times, so all of the numbers that have shown up x times (remember x > m) are less than or equal to m. But there are only m positive integers less than or equal to m. We've arrived at our impossibility: it's impossible for m+1 numbers to have shown up x times.

Apologies if this is hard to follow; again, feel free to skip the propositions and proofs. Each step is logically straightforward, but there are a lot of pieces to keep track of, and they're quite abstract. This points at one of the skills necessary for solving this type of problem: you need to have practice at manipulating abstract concepts. Later on I started to struggle with this, but for the moment I was on a roll.

Tools used: if you can't come up with a plan, try striking out at random. Look for partial patterns, and try to explain them. Have a collection of tools from the problem domain (such as proof by contradiction), and know when to use them.

Building Momentum

I kept turning the problem over in my mind, looking for further statements I could prove – and finding them. Here's a slightly cleaned up version of how that went.

I'd already proved that big numbers (numbers bigger than m) show up a limited number of times. Next, I found a way to prove something else I'd observed: there are always small numbers that keep showing up.

Proposition 2: the number 1 appears an infinite number of times.

Proof: whenever a number appears for the first time, we add a 1 to the sequence. So if 1 appears only a finite number of times, that means the number of distinct numbers is finite. Suppose the largest number that ever appears is M. As we build our sequence, we're always adding numbers between 1 and M. Eventually, one of those numbers must show up for the M+1st time.
And then we'd add M+1 to the sequence, which contradicts the idea that M was the largest number to appear.

Proposition 3: all of the numbers that appear an infinite number of times are adjacent. For instance, we might have 1, 2, and 3 all appearing an infinite number of times, but we'd never get 1, 2, and 4 (without 3).

Proof: for x to appear an infinite number of times, there must be an infinite number of numbers which appear x times. Each of those numbers must first appear x-1 times, so if x appears an infinite number of times, so must x-1. Thus, there can't be any gaps in the set of numbers that appear an infinite number of times.

At this point, I was starting to get a headache. Look at the first sentence in the proof of proposition 3. "x" appears in two different roles. First, it's mentioned as a thing that shows up in the sequence. Then, it's mentioned as a count of other numbers in the sequence. The fact that each number in the sequence has a dual role, first as a count, then as a thing, kept tangling me up. It was like my brain had to keep yanking each number out of one mental slot and drop it into another, breaking all the connections.

To drive this home, consider one of the example sequences I worked out above: 1 2 2 2 3 1 2 4 1 3 2 5. Here's how we might describe that sequence:
The preamble is 1 2 2.
We've had two 2s, so then we add a 2.
Then we've had three 2s, so we add a 3.
Then we've had one 3, so we add a 1.
Then we've had two 1s, so we add a 2.
…
Try reading that out loud, see how confusing it is? I'd had a lot of practice with abstraction, but this was getting to be too much. Still, I managed to prove a couple more propositions that really narrowed down the way the sequence could play out:

Proposition 4: Let k be the largest number that appears an infinite number of times. Then we can find some number K such that all numbers greater than K appear at most k times. (To support later steps, we also choose K to be larger than m and larger than the size of the preamble.)

Proof: k+1 appears a finite number of times, so there are a finite number of numbers which appear at least k+1 times. Let K be the largest of those numbers. By definition, all numbers greater than K appear at most k times.

Proposition 5: things eventually settle into an alternation of large and small numbers. Once all numbers from 1 to k have appeared at least K times, the sequence alternates between numbers no larger than k and numbers larger than K.

Proof: assume all numbers from 1 to k have appeared at least K times. (This must happen eventually, since by proposition 3, those numbers eventually appear an infinite number of times.) The next time one of these numbers appears, it will be appearing for at least the K+1st time, so it will be followed by a number at least equal to K+1. That number can only have appeared up to k times (proposition 4), so the number after that will be less than or equal to k. Each number ≤ k is followed by a number > K, and each number > K is followed by a number ≤ k.

Proposition 5 was heartening: I'd proved that the sequence eventually settles into an alternation! Small numbers (less than or equal to k) alternate with big numbers (greater than K). The problem calls for us to prove that something is true about every other number in the sequence, so it felt promising to have proved that some sort of alternation arises. Now I just needed to prove that, within that alternation between large and small numbers, the small numbers would eventually repeat.
Tools used: continue on a path so long as you’re making useful progress. Look for steps that take you closer to the end goal. Searching For Proof In All The Wrong Places I had previously noticed that the sequence often starts out looking messy, and then gradually settles into a repeating pattern. Sometimes the messy period lasts for a while. Recall that, in one of the examples I worked out earlier, the simple prologue “3 3” requires 18 steps before settling down: 3 3 # 2 1 1 2 2 3 3 4 1 3 5 1 4 2 4 2 5 2 6 1 5 3 6 2 7 1 6 3 7 2 8 1 7 3 … I felt that if I could understand the process by which this occurs, I could prove that all sequences settle down. I sketched a few more examples, trying to push the prologue in different directions and understand how this impacted the settling process: 1 2 2 2 2 2 2 # 6 1 2 7 1 3 1 4 1 5 1 6 2 8 1 7 2 9 1 8 2 1 5 5 5 5 5 5 # 6 1 2 1 3 1 4 1 5 7 1 6 2 2 3 2 4 2 5 8 1 7 2 6 3 3 4 3 5 9 1 8 2 7 3 6 4 4 5 10 1 9 2 8 3 7 4 6 5 1 3 # 1 2 1 3 2 2 3 3 4 1 4 2 4 3 5 1 5 2 5 3 6 1 I couldn't discern any concrete pattern to how things settle down, but I did have a fuzzy sense. Eventually we'll reach a point where some set of small numbers have all appeared roughly the same number of times, and then those numbers repeat forever. But at first, some of those numbers haven't appeared "enough" times, and they need to "catch up" before the repetition can begin. For instance, again in the second-to-last example above, we need a lot of 1s, 2s, 3s, and 4s to catch up with all the 5s. But despite what struck me as the appealing logic of this approach, I didn’t see anything I could prove about it. So I pulled another tool from the toolbox: look at the situation from a different direction. Tools used: if you get stuck, look for another approach. The Big Numbers Are Just A Distraction It dawned on me that all of the interesting stuff that was going on had to do with those small numbers. Here's that example sequence I keep referring to: 1 5 5 5 5 5 5 # 6 1 2 1 3 1 4 1 5 7 1 6 2 2 3 2 4 2 5 8 1 7 2 6 3 3 4 3 5 9 1 8 2 7 3 6 4 4 5 10 1 9 2 8 3 7 4 6 5 Big numbers like 8, 9, and 10 all play exactly the same role. Each of them is going to show up exactly 5 times (Proposition 4, which I'd already proved at this point), always in alternation with the small numbers (Proposition 5). Basically, these big numbers are just tokens that keep track of how often the small numbers have appeared. All of the interesting structure in the sequence, it seemed to me, is determined by the small numbers. I decided to try writing the sequence in a form that would emphasize any patterns in the small numbers. Because this sequence has 5 small numbers that keep repeating (k=5), I made a 5-row table. The first row showed the number of 1s at each point in the sequence, the second row showed the number of 2s, and so forth. Each column represents a position in the original sequence, and I circled the entry corresponding to the small number which had shown up most recently. Here's what that looked like: Does this reveal anything to you, other than the fact that I have bad handwriting? Me either. But despite the lack of progress, I was still stuck on the idea that I had to show how the sequence would “settle down”. Rummaging through the list of proof techniques I’d been taught, I found one that might help. Tools used: try transforming the problem into an alternative representation that might reveal new patterns. 
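Since a hand-drawn table doesn't reproduce well, here is a small sketch that rebuilds the same representation for the 1 5 5 5 5 5 5 example: one row per step, the running counts of the small numbers 1 through 5, and a marker on whichever small number appeared most recently (the circled entries in the original diagram). It reuses the same `extend` helper as the earlier snippets.

```python
# Rebuilding the count-table representation for the "1 5 5 5 5 5 5" example: running
# counts of the small numbers 1..5 after each term, with "*" marking the small number
# that appeared most recently.
from collections import Counter

def extend(preamble, total_length):
    seq, counts = list(preamble), Counter(preamble)
    while len(seq) < total_length:
        nxt = counts[seq[-1]]
        seq.append(nxt)
        counts[nxt] += 1
    return seq

k = 5
running, last_small = Counter(), None
for term in extend([1, 5, 5, 5, 5, 5, 5], 40):
    running[term] += 1
    if term <= k:
        last_small = term
    row = " ".join(
        f"{running[v]:>3}{'*' if v == last_small else ' '}" for v in range(1, k + 1)
    )
    print(f"term {term:>2} | counts of 1..5: {row}")
```

The starred entries in the output correspond to the circled entries described above.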
Looking For a Measure of Disorder

On further review, I did notice one thing about the diagram I'd drawn. If you look at the pattern of circled numbers, at first they are jumping all over the place, but gradually they settle down into tidy diagonal lines. If I could prove that this always happened, I could probably turn that into a formal proof of the original question.

I wanted to find a way to measure how "unsettled" the sequence was. If I could find a formal definition of unsettledness, then perhaps I could prove that unsettledness must always go down, forcing the sequence to settle into a repeating pattern.

This idea comes up in many mathematical proofs. For instance, suppose someone gives you a list of numbers and asks you to sort them in numerical order. Here's one way of doing that:
1. Look for a pair of adjacent numbers that are out of order.
2. If you can't find one, then you're done! The numbers are all in order.
3. Otherwise, swap those numbers and go back to step 1.
For instance, suppose we start with the numbers 4 1 3 2. We might proceed as follows:
4 1 3 2 → swap 4 and 1
1 4 3 2 → swap 3 and 2
1 4 2 3 → swap 4 and 2
1 2 4 3 → swap 4 and 3
1 2 3 4 → nothing more to swap!
Could there be a situation where we keep swapping numbers forever without ever getting them all in order? It turns out that we can prove that never happens, by coming up with a measurement for how out-of-order the list is. We say that the "unsortedness" of a list is the number of pairs of numbers which are in the wrong order. In the initial list above, the following pairs are out of order: 4-1, 4-3, 4-2, and 3-2, so the unsortedness is 4.

Whenever we swap a pair of adjacent numbers that were out of order, we reduce the unsortedness by exactly 1. The numbers we swapped used to be out of order, now they are in order. And every other pair of numbers remains in the order it was in previously. Thus, the unsortedness decreases by 1. If the unsortedness of the initial list is 4, then it will take exactly 4 swaps to put it in order – as we see above.

I thought that I could use a similar approach to prove that the number sequences for this IMO problem always settle into a repeating pattern. I just needed to find an appropriate definition of "unsettledness". But no matter how long I stared at my yellow pad, I kept drawing a blank. It was time to look for another way forward.

Tools used: when faced with a problem, try applying a known technique for that type of problem. But recognize when it isn't working.

If One Tool Doesn't Work, Try Another

Around this time, I remembered another classic trick for proving that a sequence eventually repeats. If you have a sequence where each entry depends only on the previous entry, and there are a limited number of possible entries, then eventually the sequence must hit on an entry it has used before. Once that happens, it will repeat forever. For instance, suppose that we generate a sequence of numbers using the following rule:
1. Start with a one-digit number.
2. Add 3, and keep the last digit.
3. Repeat step 2 forever.
If we start with 3, we get the following sequence:
3 6 9 2 5 8 1 4 7 0 3 6 9 2 5 8 1 4 7 0 3 6 9 2 5 8 1 4 7 0 3 6 9 2 5 8 1 4 7 0 ...
If we had started with a different number, we'd wind up in the same loop. For instance:
2 5 8 1 4 7 0 3 6 9 2 5 8 1 4 7 0 3 6 9 2 5 8 1 4 7 0 3 6 9 2 5 8 1 4 7 0 3 6 9 ...
There are only ten possible numbers, so eventually the sequence has to repeat itself. And then it will keep repeating in exactly the same way, because it’s always following the same rule. This trick won’t work for our Olympiad problem, because each new number is a function of the entire past history of the sequence, and that history is always growing. But still, we need to prove that a sequence eventually repeats, and this is a technique for doing just that. I asked myself, is there some way of stretching it to fit the problem? Tools used: when one technique fails, try another. If it doesn’t immediately work out, exercise judgement as to whether it’s worth exploring further. Finding the Right Representation In the IMO problem as stated, each new number depends on the entire past history of the sequence. But in the grid representation that I developed earlier, each column in the grid depends only on the previous column 2 . If I could just prove that there are a finite number of arrangements of numbers that can show up in a column, I’d be done. Unfortunately, there are an infinite number of arrangements of numbers, because the numbers keep getting bigger. I wondered, could I find some way of distilling the “essence” of each column, such that there would only be a finite number of possible “essences”? (This idea, of transforming something into a new form that better suits your purposes, is another standard tool in the mathematical toolbox.) I wondered whether the ranking order might repeat. By ranking order, I mean ranking the numbers 1 through k (in our example, 1 through 5) according to how many times each one has shown up so far. There are only so many ways of ranking k numbers, so eventually the same ranking would have to show up for a second time. Unfortunately, that doesn’t lead to a repeating sequence, because the ranking in one column is not determined solely by the ranking in the previous column. Hmph. Then I had another idea: the numbers in that grid diagram seem to be increasing more or less in lockstep. Could I prove that the numbers in a given column always remain fairly close together? For instance, if k is 5, that would mean proving that in any given column, the difference between any two of the counts – the number of 1s, the number of 2s, the number of 3s, the number of 4s, and the number of 5s – could never exceed some limit. (In the example, that limit turns out to be 6.) If the differences are capped, then there are only so many different arrangements they can take on – there is a finite number of arrangements of differences between the numbers in a column. And if you know the differences between the numbers in one column, that is all you need to figure out the differences between the numbers in the next column. Therefore, the differences would eventually have to fall into a repeating loop! Was that daylight I saw at the end of that tunnel?!? Given the facts I’d already proved (propositions 1 through 5), it turned out to be not too difficult to prove that there is, indeed, a limit on the difference between how many times any two small numbers appear in the sequence. That means the pattern of differences must eventually repeat, which means the pattern of small numbers must eventually repeat, which means I had solved the IMO problem! (Fine, I probably spent at least 2 or 3 times as much time as IMO competitors are allowed. Give me a break, I’m rusty.)
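(As an aside, the “finite states must eventually repeat” trick is easy to see in code. Here is a small Python sketch – mine, not from the original solution – of generic cycle detection: apply a deterministic step function, remember every state you have seen, and stop at the first repeat. The step function below is the “add 3, keep the last digit” rule from earlier; in the actual argument, the role of the state is played by the pattern of differences between the small numbers’ counts in a column.)

```python
def find_cycle(start, step):
    """Iterate `step` from `start` until some state repeats.
    Returns (steps_before_the_loop, loop_length). This is guaranteed to
    terminate whenever the set of possible states is finite."""
    seen = {}                  # state -> index at which it first appeared
    state, i = start, 0
    while state not in seen:
        seen[state] = i
        state = step(state)
        i += 1
    return seen[state], i - seen[state]

# The "add 3, keep the last digit" example from earlier:
print(find_cycle(3, lambda d: (d + 3) % 10))   # -> (0, 10): no preamble, period 10
```

The same skeleton applies to any deterministic update on a finite set of states – which is exactly the property the differences-between-counts representation buys us.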
Proposition 6: across all positions in the sequence, there is a finite upper bound on the difference between the number of times any two of the infinitely-repeating numbers (numbers in the range 1…k) have appeared. Proof: by contradiction. Let us say that for two numbers i and j, both in 1…k, i “lags” j if j appears increasingly more often than i. (More precisely, i lags j if for any number you choose, I can point to a location in the sequence where j has appeared that many times more than i.) Suppose i is the smallest number such that it lags some other number, and let j be one of the numbers that it lags. Each appearance of j in the sequence is preceded by a number that has appeared j times. Thus, j must be less than i; otherwise those numbers would have previously appeared i times, meaning that i would be appearing as often as j. Because i is the smallest laggard, none of the numbers smaller than i can lag one another; there is a finite bound on how far apart their appearance counts can diverge. Call that limit b, and find a point in the sequence where j has appeared at least b+1 times more than i. At that point in the sequence, it must be the case that all numbers less than i have appeared more times than i has. It must also therefore be the case that all numbers less than i have appeared more times than any numbers in the range i+1 … k, because those numbers can’t have appeared more times than i has (using the same logic as in the previous paragraph). Recall that according to proposition 5, after some early period, the sequence alternates between small numbers (no larger than k) and large numbers. We’ve shown that the numbers 1…i-1 have all appeared more often than the numbers i…k. This means that whenever a number in the range 1…i-1 appears, it will be followed by a number so large that it has shown up at most i-1 times. That number, in turn, will be followed by a number no greater than i-1. So the numbers i…k will never appear again, contradicting the idea that i appears infinitely often. This proof can be summarized using the following intuition: the numbers that appear infinitely often must show up more or less in “lockstep”. If a large number “falls behind” the smaller numbers, then it will never appear again; and a small number can’t “fall behind” a larger number, because of the way the counting rule works. Tools used: if there’s a standard tool that doesn’t quite fit your situation, look for a way to reframe the situation to make it fit. All Creative Problem Solving Looks Like This Why did I just drag you through over 5000 words on a niche activity? Because I think the mental toolbox it demonstrates is one that we all use every day. The final proof follows a tidy linear structure, but the process that led me there was anything but. I started out by simply re-reading the problem statement until I understood it, and then working out examples until I started to grasp the implications. I worked forward (what can I prove?) and backward (how can I set up the conditions for a standard tool for proving that a sequence repeats?). I explored multiple paths, sometimes fighting through difficulties, sometimes giving up and trying a different tack. I assigned shorthand names to concepts like “preamble” or “laggard”, and I looked for alternative framings that might make the problem more tractable. Those are all strategies.
I also relied on a rich library of specific mathematical tools and concepts : proof by contradiction, “eventually periodic”, a sequence must repeat if there are a finite number of possible entries and each one depends only on the previous entry, and so on. When you architect a software module, organize a trip, make a lesson plan, or plot a short story, you use similar tools. You need to understand the problem, explore options, work forwards and backwards, come at the situation from another angle. And you’ll certainly rely on a library of ideas and techniques, along with muscle memory of how to apply them. AIs are infamous for being hit-and-miss in their capabilities. In some respects they are already superhuman, in others they are bafflingly inept. Understanding the richness of the mental toolkit we all use every day – it’s so much more than a simple “chain of thought” – will help us gauge progress toward true AGI, and evaluate the potential of AlphaProof and OpenAI’s new “o1” model. I’ll say more about this in an upcoming post. Subscribe to get my upcoming thoughts on o1 and progress toward AGI Share 1 More precisely, either the even positions (a₂, a₄, a₆, …) or the odd positions (a₁, a₃, a₅, …); we can pick either one. 2 More precisely, this is true once we get to the point where every number from 1 through k has appeared at least m' times. In the sequence represented in the grid that I drew, this means that 1 through 5 have each appeared at least 7 times. From that point forward, each column in the grid is identical to the previous column except that we add 1 to one row. That row is determined by the rank of the number we incremented in the previous column. For instance, if in the previous column the number we incremented is the fourth-highest number, then in the next column we increment the number in the fourth row.…
Neither party is really progressing on their goals, here AI policy involves difficult tradeoffs. Too often, we treat these as a tug-of-war, fighting to privilege one goal at the expense of another. We’ll accomplish more if we redirect some of that energy toward actions that can alleviate tradeoffs. There are so many constructive steps we could be taking to address concerns around AI – without retarding progress. In fact, there are many policies which would directly benefit society even as they reduce AI risks! To begin with, let’s consider concerns where AI might exacerbate problems that already exist in the world. AI Is New, The Potential Harms Mostly Aren’t Many concerns people express regarding AI are not novel. The worry is that AI might make an existing problem worse. AIs may assist in cyberattacks, but attacks already occur every day. Models trained on Internet discussion boards might make biased parole recommendations, but flawed parole decisions have been around for a long time. AI will certainly be used to generate spam, but spam is nothing new. These are real concerns, and I do not mean to minimize them. But precisely because these problems are not new, there are known measures that would help. Not complete solutions, but worthwhile actions that would make a significant difference. Often these actions are neglected – which represents both a problem and an opportunity. Consider the concern that someone might eventually use an AI to create a new pandemic virus. The mechanism of creation might be new, but the most likely scenarios would result in a familiar sort of virus, one that spreads much like Covid or the flu 1 . Infamously, we spent trillions of dollars coping with Covid, but are failing to prioritize relatively cheap measures to reduce the threat of a future pandemic. Improvements to air ventilation and filtration in public spaces would make it more difficult for respiratory viruses to travel from one person to another. Broad-spectrum vaccines would reduce the impact of common viral families such as flu and coronaviruses. Wastewater monitoring would help us to quickly identify a new virus, and standby manufacturing capacity would allow us to quickly deploy tests and vaccines 2 . In combination, these measures have the potential to greatly reduce the potential for a future pandemic, whether natural or engineered. In short, we worry because we live in a world where a respiratory virus can rapidly cross the globe and kill millions of people. But we shouldn’t resign ourselves to living in that world! Rather than arm-wrestling over precisely whether and how to regulate biological capabilities of AI models, we could push for measures that attack respiratory viruses directly. Not only would this help to loosen one knot of the AI policy tangle, it would address the very real impact of Covid, the flu, and other diseases that kill hundreds of thousands every year. The idea that we have the power to combat long-standing issues like respiratory viruses is exhilarating. Once you start looking at the world through this lens, you see opportunities everywhere. Let’s Use AI as a Wake-Up Call It is an underappreciated fact that we have many viable paths for reducing the burden of respiratory viruses. The same is true for other problems that AI might exacerbate. Action in all of these areas has languished, but the specter of AI might be an opportunity to address that. Phone and text spam and frauds are abetted by the ease of spoofing caller ID. 
There are technical measures that would make it harder to use a fake phone number, but institutional inertia has delayed deployment. Bias in institutional decisions that affect people’s lives is exacerbated by a lack of transparency, and the lack of effective, prompt channels for appealing flawed decisions. A successful cyberattack sometimes involves “SIM jacking” – tricking a mobile carrier into transferring an employee’s phone number to a SIM card controlled by the attacker, so that they can receive authentication codes intended for the target employee. This could be addressed through tighter procedures at mobile carriers, or by moving away from the use of phone messages as a security measure. Cybersecurity in general suffers from policies and standards that encourage box-checking over effective security. In most cases, what we can hope for are partial solutions. We are not going to eliminate spam, cyberattacks, biased decision making, or (probably) the flu. But perhaps we can harness the renewed attention that AI is bringing to these problems to spur constructive action. What about the potential for AI to introduce genuinely new problems into the world? New Problems Have Constructive Solutions, Too It’s surprisingly difficult to identify genuinely novel concerns raised by AI. Deepfakes are new, but doctored (or simply out-of-context) photographs have been around since long before Photoshop. AI might centralize power in a handful of mega-corporations, but centralized power structures go back to the dawn of history. AI companions can be viewed as a continuation of the trend toward doomscrolling, information bubbles, and online interactions replacing real-life friendships. One candidate for a genuinely novel problem is the potential end of employment. Past advances have produced special-purpose technologies, capable of some jobs, leaving people to find other jobs. If we eventually develop true AGI (and capable robot bodies), by definition this would be a general-purpose technology, able to subsume all jobs. This would result in a world where there is no realistic prospect for most people to find work. Permanent mass unemployment in today’s society would not be a pretty sight, but that is not our only option. We should encourage discussion of ideas such as universal basic income, an automation dividend, or collective ownership of various resources. Arguably the most frightening concern is the (controversial) possibility of “loss of control”, where a superintelligent AI achieves unchallenged control over the entire Earth. Even this could be viewed as merely an extension of the ancient problem of totalitarian rulers, but no dictator has ever been immortal, able to conquer the world, or had the capacity to directly monitor every one of his subjects for signs of treachery. We are still working to understand the circumstances under which a loss of control could take place, let alone find reliable solutions. But there are many productive avenues for research. It is not clear whether we can fully address (or rule out) novel concerns like loss-of-control. Nor can we count on fully eliminating pandemics, biased decision making, cyberattacks, or other problems that could be exacerbated by AI. But for every one of these issues, there are at least constructive steps that we can be taking. The Goal Isn't AI Progress or AI Safety, The Goal Is A Better World ChatGPT seems very committed to the idea of farm-to-table There are many debates around AI policy. 
Positions are often justified by appeals to principles such as progress, safety, or equity. However, none of these principles are absolute. Without progress, safety is stifling; without safety, we might not be around to enjoy progress. We should not think in terms of “winning” the debate over progress, safety, or equity. A desirable future must satisfy many criteria. If we argue for sacrificing one goal in favor of another, we’ll just be robbing Peter to pay Paul – even as someone else is robbing Paul to pay Peter. The constructive course is to fight for win-win actions. Every time someone expresses concern that AI might enable bioterrorism, I would love to see them call for improved air circulation or other steps that reduce the threat. Whenever an AI proponent argues that concerns AI will enable more cyberattacks are overblown, I wish they would also help cement the case by pushing for the software industry to address long-standing issues with cybersecurity. If you are worried about biased AIs, restrictions on AI capabilities are not the only tool you should be reaching for. If you fear that safety concerns will stifle AI progress, you can help your cause by working to alleviate the real-world problems that give those concerns salience. We can’t eliminate the tensions over biased AIs, or highly capable AI agents, or other topics of debate. But we can try to reduce those tensions to the point where constructive compromise is possible. There are few courses of action more likely to promote a positive future. You can stay up to date with my project to shed light on the big questions around AI by clicking one button. Share Thanks to Grant Mulligan , Julius Simonelli , Kevin Kohler , Rob Tracinski , Sean Fleming , and Shreeda Segan for invaluable feedback and suggestions. 1 In many scenarios, it would actually *be* a variant of the flu or Covid viruses. 2 Summarized from Biosecurity and AI: Risks and Opportunities , which links to further material describing promising avenues for reducing pandemic risk.…
As many readers of this blog know all too well, there has been ferocious debate around California SB 1047, a bill which would enact regulations on AI. Even the “Godfathers of AI” – Yann LeCun, Yoshua Bengio, and Geoffrey Hinton – are divided. LeCun seems to hate the bill; last month he tweeted 1 : Excellent argument by @AndrewYNg against the ignominious California regulation SB1047, which would essentially kill open source AI and significantly slow down or stop AI innovation. Meanwhile, Bengio and Hinton signed a letter in which they “express our strong support”: …we are deeply concerned about the severe risks posed by the next generation of AI if it is developed without sufficient care and oversight. SB 1047 outlines the bare minimum for effective regulation of this technology. It is tempting to think that people are just “talking their book” – supporting or opposing the bill according to how it affects their professional or financial interests. Opponents of the bill are often associated with AI labs (LeCun is Chief AI Scientist at Meta), while many proponents work on AI safety. But I think there’s more to the story. It’s Not All About Vested Interests OpenAI opposes SB 1047 , but competitor Anthropic has more or less endorsed it . What’s going on here? To shine some light on what drives views of the bill, let me talk about two people I know personally – Dean Ball of the Mercatus Center and Nathan Labenz of the Cognitive Revolution podcast. Dean and I recently appeared on Nathan’s podcast 2 to discuss the bill , and they participated in an offline panel discussion on AI regulation that I recently organized. They are both thoughtful, honest brokers, well versed in current developments. And they are on opposite sides of the debate. Dean has serious concerns with the bill : Maybe it’s a worthwhile tradeoff. ... Maybe AI capabilities will become sufficiently dangerous that releasing them without extensive, government-mandated testing is wildly irresponsible. Maybe they’ll become so dangerous that it really is too risky to release them as open source, since currently anyone can subvert the safety protections of an open-source model. And maybe after that happens, Meta or another well-resourced company, with its shareholders and its public reputation on the line, will choose to disregard all of those safety best practices and open source its models anyway, prioritizing its strategic business goals over the safety of society. Maybe that is a world we’ll live in next month, or next year, or in five years, or in ten. But it is manifestly not the world we live in today, and to me, it is not obvious that any one of the “maybes” above is bound to come true. Nathan, meanwhile, says “if I were the Governor, I would sign the bill”, and quotes a recent letter from Anthropic: SB 1047 likely presents a feasible compliance burden for companies like ours, in light of the importance of averting catastrophic misuse The disagreement here seems to be rooted in differing views as to the likely impact of AI . Dean is not convinced that “AI capabilities will become sufficiently dangerous” in the next few years, while Nathan references “the importance of averting catastrophic misuse”. Such differences in expectation – how powerful will AI become, and how dangerous is that power? – underlie many disagreements. Everyone Is Basing Their Policy On Their Expectations We have very little idea what capabilities future models will have. It is even difficult to discern the capabilities of existing models. 
Improvements in prompting, “scaffolding”, and other techniques are squeezing ever-higher levels of performance out of models after they are released 3 . Long after OpenAI launched GPT-4, someone discovered it had the unsuspected ability to play chess – but only if prompted in just the right way. Even when a model’s capabilities are well understood, there is wide room to over- or under-estimate the potential impact. Creative uses emerge well after the model is released; a capability which seemed benign might turn out to have harmful applications. Conversely, a capability that seems dangerous may turn out to be insufficient to cause harm 4 . The result is that everyone is proposing policies based on their imagined future. If you’re working at an AI lab, it’s easy to assume that you’ll be able to control the technology you’re building, that it will be used mostly for good, that of course you’ll avoid harmful capabilities. Someone outside the industry may imagine the opposite. People imagine wildly different futures, leading them to equally different policy proposals; it’s no wonder that they then find it difficult to have a constructive discussion. When the problem is framed this way, the solution seems clear: rather than arguing about what might happen, we should work to ground the discussion in reality. How To Reduce Uncertainty A recent post from Helen Toner nicely presents some important ideas. Here is my own laundry list. Researchers have been developing techniques for measuring a model’s capabilities. This work can use more funding. Researchers should also have guaranteed access to the latest models, including those still under development, as well as “agent frameworks” and other systems and applications that are designed to squeeze more capabilities out of the models. Forecasting the rate at which AI capabilities will progress is another area of research which could use more funding and access. Then we come to the task of anticipating AI’s impact . For instance, there is extensive debate as to whether an AI that can provide detailed and accurate instructions for synthesizing a dangerous virus would pose a real danger 5 . Again, research funding would be helpful. We should also be carefully watching for “warning shots” – early indicators of potential danger. Hospitals could screen patients with unusual illnesses; they might have contracted an artificial virus that is fizzling out (or just beginning to spread). Cloud hosting providers might be asked to watch for signs of a self-replicating AI. The companies that are developing advanced AI models and applications have the best visibility into many important questions. We should institute mechanisms for policymakers, the research community, and the general public to have appropriate access to that information. Some possibilities: Requirements to report internal evaluations of model capabilities, including models still under development. Monitoring how models and applications are used 6 , focusing on actual or attempted use for bad purposes. For “red teams” and other safety researchers: access to unreleased models, or (under tight controls!) to model weights. Whistleblower protections. If an employee at an AI lab sees something concerning, they should be encouraged to report their concern to an appropriate regulator 7 . Such requirements should be focused on the largest and most advanced projects. To protect trade secrets, some information would be reported at a coarse level of detail and/or restricted to a narrow circle. 
Other measures to reduce uncertainty: Safe harbor protections for safety research. Within responsible limits, researchers should be able to poke at an AI application without fear of being accused of hacking or violating terms of service. Antitrust exemptions for cooperation on safety initiatives. The Time To Act Is Now, The Way To Act Is Gathering Data This post might be viewed as a call to delay policy action until we know more. Unfortunately, we don’t have that luxury. There will always be uncertainty around AI, and we will need to take action with imperfect information. Our first priority should be to gather more information – and quickly! This is a job for policy, not just voluntary commitments and privately funded research. You can stay up to date with my project to shed light on the big questions around AI by clicking one button. Share Thanks to Dean Ball , Dominik Hermle , Kevin Kohler , Julius Simonelli , and Robert Tracinski for feedback and suggestions. 1 Note that this was in the context of an earlier draft of the bill. I’m not aware that LeCun’s views have changed after the most recent amendments. 2 Along with Nathan Calvin. 3 See AI capabilities can be significantly improved without expensive retraining . 4 An AI that can tell you the instructions for creating anthrax may not be able to coach you through the necessary lab technique to successfully follow those instructions. See https://x.com/sebkrier/status/1817877099673203192 for a concrete example of a beneficial capability (identifying fraudulent tax returns) not having a significant impact in practice. In the same way, capabilities that are theoretically dangerous do not always result in significant harm. 5 A hypothetical attacker would need to clear multiple hurdles. For instance: access to a well-equipped lab, practical lab skills, some way of evading filters that are proposed to be built into DNA synthesizers, and the desire to kill thousands or millions of people. 6 Of course, it’s not possible to monitor usage of open-weight models. In an upcoming post, I’ll talk about the many difficult tradeoffs open-weight models pose. 7 In order to avoid discouraging potential whistleblowers, the process for responding to reports should be designed to minimize the impact on the business being reported, unless serious wrongdoing is uncovered.…
Leopold Aschenbrenner, who until recently worked on the “superalignment” team at OpenAI, believes that the United States should go on a wartime footing to develop advanced AI and use it to take over the world – so that China doesn’t get there first. I believe this is a fair summary of his recent 165-page treatise on the future of AI, “ SITUATIONAL AWARENESS: The Decade Ahead ”. I also believe that his assumptions are wrong, and his conclusions dangerous. In this post, I’m going to summarize the paper, and explain where I think it goes wrong. I just published 2500 words explaining why it’s better to talk to people than to argue with them on the Internet , so: Leopold 1 , if you’re reading this, please message me here on Substack, I’d love to talk; in the meantime, I’m going to try to get in touch with you. And instead of just disagreeing, I’m going to finish by presenting some thoughts on how to best move forward given that we should both be humble about our ability to predict the future. Warning to regular readers: compared to most of what I publish here, this piece is more hastily written, and aimed at a narrower audience. As a rule of thumb, if you hadn’t already heard of the paper I’m discussing, you might not be interested in reading this. Overview I know, I know, if we already have robots like this, the race is nearly over. Don’t overthink it. In Leopold’s view, the next half-decade or so will be the most decisive period in human history: The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace many college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be unleashed, and before long, The Project will be on. If we’re lucky, we’ll be in an all-out race with the CCP; if we’re unlucky, an all-out war. And he sees it as his duty to sound the alarm: Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness . Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride. Let me tell you what we see. I’ve speed-read the paper, and listened to the first hour of an interview Leopold gave on the (always excellent) Dwarkesh podcast. Here is my crude summary of his thesis: AI capabilities will continue to improve until we develop ASI 2 . Whoever controls ASI will wield overwhelming power – economically, militarily, and otherwise. Thus, whoever gets there first will be in a position to rule the world – potentially, in perpetuity. It would be bad if this were the wrong people – for instance, China. The only way to avoid this is for the US and its allies to get there first. All of this is likely to play out by 2030. 
Achieving ASI, especially at that pace, will require multi-trillion-dollar investments in chips, electricity generation, and other necessities. Hence, we (again, the US and allies) must go on a wartime footing: marshaling resources, suspending environmental protections, imposing strict security requirements, and sprinting to ASI. I can’t overstate how bold the claims are. The paper contemplates data centers costing a trillion dollars each; doubling the pace of fracking in the US so as to obtain natural gas to generate the necessary electricity; immediately placing all advanced AI research under military-bunker-level security; and many other steps of a similar nature. Most Of The Logic Actually Makes Sense As a syllogism, the overall thesis is sound. If you assume claims 1 through 7 above, then claim 8 does pretty well follow. So let’s consider each in turn. AI capabilities will continue to improve until we develop ASI : I am on record as believing that this is virtually inevitable. Here’s a piece I wrote a little over a year ago: Get Ready For AI To Outdo Us At Everything . Whoever controls ASI will wield overwhelming power – economically, militarily, and otherwise : Leopold’s argument is that a fleet of ASIs will yield astounding progress in essentially every field. Importantly, this will include AI research, robotics, material science, manufacturing techniques, energy, weapons design, and military strategy. Advanced robots, controlled by superintelligent AIs, will quickly be able to build the mines and factories to build more robots, and pretty soon we will have a massive supply of advanced equipment of any sort – including military hardware. I think Leopold massively underestimates the complexity of such an undertaking, and as a result he’s much too optimistic about the speed at which this could play out. But the fundamental logic seems sound. We are already on track to build decent robot bodies within a matter of decades, if not sooner; superintelligent AIs would certainly be able to finish the job. And the rest follows straightforwardly enough: given millions of superintelligent robots – which can build billions more – the question is not whether we would see widespread progress, but merely how quickly things would proceed. (If you don’t think an artificial superintelligence could do all these things, then I’d ask you: what is a specific thing you don’t think an ASI could do? Could a large crew of superintelligent people do that thing, assuming they are also strong, dextrous, dedicated, and brave? If the answer is "yes”, then actually what you’re objecting to is the idea that we will ever create ASI.) Thus, whoever gets there first will be in a position to rule the world – potentially, in perpetuity : if you have massive factories churning out military hardware, more advanced than anything else in the world, all controlled by AIs that are faster and smarter than anything else in the world, then yes, I think you have the option of exerting global control. Leopold takes a strong view of this: It seems likely the advantage conferred by superintelligence would be decisive enough even to preemptively take out an adversary’s nuclear deterrent. Improved sensor networks and analysis could locate even the quietest current nuclear submarines (similarly for mobile missile launchers). Millions or billions of mouse-sized autonomous drones, with advances in stealth, could infiltrate behind enemy lines and then surreptitiously locate, sabotage, and decapitate the adversary’s nuclear forces. 
Improved sensors, targeting, and so on could dramatically improve missile defense (similar to, say, the Iran vs. Israel example above); moreover, if there is an industrial explosion, robot factories could churn out thousands of interceptors for each opposing missile. And all of this is without even considering completely new scientific and technological paradigms (e.g., remotely deactivating all the nukes). It would simply be no contest. Even with a large technological lead, I would be, let us say, nervous about attacking a nuclear power. Not to mention the possibility that that power might decide to use the nuclear threat to force us to stop short of ASI. But there’s no denying that if one country develops ASI well before any other, they would have a profound advantage. As for the “in perpetuity” part: A dictator who wields the power of superintelligence would command concentrated power unlike any we’ve ever seen. In addition to being able to impose their will on other countries, they could enshrine their rule internally. Millions of AI-controlled robotic law enforcement agents could police their populace; mass surveillance would be hypercharged; dictator-loyal AIs could individually assess every citizen for dissent, with advanced near-perfect lie detection rooting out any disloyalty. Most importantly, the robotic military and police force could be wholly controlled by a single political leader, and programmed to be perfectly obedient—no more risk of coups or popular rebellions. Whereas past dictatorships were never permanent, superintelligence could eliminate basically all historical threats to a dictator’s rule and lock in their power (cf value lock-in). If the CCP gets this power, they could enforce the Party’s conception of “truth” totally and completely. (I’ve written about this idea myself .) It would be bad if this were the wrong people – for instance, China : I don’t have anything insightful to say about this. The only way to avoid this is for the US and its allies to get there first : this is one of the places where I disagree with Leopold. I’ll say more about this below. All of this will play out by 2030 : this is another point of disagreement. Again, more below. Achieving ASI, especially at that pace, will require multi-trillion-dollar investments : if you assume that world-shaking ASI is coming by 2030, then all the other seemingly fantastic elements of Leopold’s scenario – trillion-dollar data centers, doubling US fracking, and so forth – start to look pretty reasonable, or at any rate necessary. However, I consider this to be moot, since I don’t think ASI is coming by 2030. Hence, we (again, the US and allies) must go on a wartime footing: again, if you accept the earlier premises, then this does seem to follow, but I don’t accept those premises. No, We Won’t Have ASI By 2030 I can’t overemphasize how quickly Leopold thinks things will go from here. Some snippets: To put this in perspective, suppose GPT-4 training took 3 months. In 2027, a leading AI lab will be able to train a GPT-4-level model in a minute. We are on course for AGI by 2027. These AI systems will basically be able to automate basically all cognitive jobs (think: all jobs that could be done remotely). In fairness, I should note that he does acknowledge that the pace of progress is uncertain: To be clear—the error bars are large. Progress could stall as we run out of data, if the algorithmic breakthroughs necessary to crash through the data wall prove harder than expected. 
Maybe unhobbling doesn’t go as far, and we are stuck with merely expert chatbots, rather than expert coworkers. Perhaps the decade-long trendlines break, or scaling deep learning hits a wall for real this time. But he generally speaks as if skyrocketing capabilities are pretty much inevitable: In any case, do not expect the vertiginous pace of progress to abate. The trendlines look innocent, but their implications are intense. As with every generation before them, every new generation of models will dumbfound most onlookers; they’ll be incredulous when, very soon, models solve incredibly difficult science problems that would take PhDs days, when they’re whizzing around your computer doing your job, when they’re writing codebases with millions of lines of code from scratch, when every year or two the economic value generated by these models 10xs. Forget scifi, count the OOMs: it’s what we should expect. AGI is no longer a distant fantasy. Scaling up simple deep learning techniques has just worked, the models just want to learn, and we’re about to do another 100,000x+ by the end of 2027. It won’t be long before they’re smarter than us. I’m not going to bother nitpicking his analysis in detail. Forecasting AI progress more than a year or two out is incredibly difficult, and Leopold appears to be doing his earnest best. However, the result is full of credulous intuitive leaps. He waves away deep shortcomings of current models as due to the models being “hobbled” in ways that he believes are easily fixed: With simple algorithmic improvements like reinforcement learning from human feedback (RLHF), chain-of-thought (CoT), tools, and scaffolding, we can unlock significant latent capabilities. But unlocking test-time compute might merely be a matter of relatively small “unhobbling” algorithmic wins. Perhaps a small amount of RL helps a model learn to error correct (“hm, that doesn’t look right, let me double check that”), make plans, search over possible solutions, and so on. In a sense, the model already has most of the raw capabilities, it just needs to learn a few extra skills on top to put it all together. In essence, we just need to teach the model a sort of System II outer loop that lets it reason through difficult, long-horizon projects. [emphasis added] When he says “we just need to teach the model a sort of System II outer loop”, there’s an awful lot hiding in that “just” . The entire argument is full of handwaves like this. I won’t bother picking it apart point by point. I will just list two of the biggest reasons I believe his estimates are wildly over-optimistic. First, his timeline depends on an extraordinary number of things all breaking in the right direction, from AI progress to buildout of data centers, electricity generation, chips, and other necessities; trillions of dollars of investment capital arriving on cue; recursive self-improvement 3 having a massive impact ( it might not ); advances in robotics, manufacturing systems, and all of the other fields discussed earlier; and the successful, breakneck-speed adoption of AIs and robots throughout the economy ( seems like a stretch ). In prospect, it’s easy to overlook the many practical details that cause a complex project to bog down, no matter how much energy and urgency is brought to bear. And this would dwarf the scale and complexity of the Manhattan Project. 
There’s a reason that so many sayings from computer science and engineering have the same message: The first 90 percent of the code accounts for the first 90 percent of the development time. The remaining 10 percent of the code accounts for the other 90 percent of the development time. Hofstadter’s Law: It always takes longer than you expect, even when you take into account Hofstadter’s Law. And my personal favorite: The Programmers’ Credo: we do these things not because they are easy, but because we thought they were going to be easy. Second, he drastically oversimplifies the task ahead. For instance, he states that GPT-4 is as intelligent as a “Smart High Schooler”. But in actual fact, GPT-4 is greatly superior to a high-schooler in some ways, and falls hopelessly short in others 4 . The results don’t average out. For instance, a 1924 automobile was much better than me at traveling in a straight line on a paved road, but much worse at navigating an obstacle course. The subsequent 100 years of progress have not resulted in an automobile that can now tackle the obstacle course. I wrote about this tendency to overstate progress toward AGI in The AI Progress Paradox , and I say more about why I don’t think further scaling alone will get us to AGI in What To Expect When You’re Expecting GPT-5 . I’ll conclude this section by noting that there are nice ideas throughout the paper. For instance, I like this argument that there may be room to extract a lot more value from existing training data: But perhaps, then, there are ways to incorporate aspects of how humans would digest a dense math textbook to let the models learn much more from limited data. In a simplified sense, this sort of thing—having an internal monologue about material, having a discussion with a study-buddy, trying and failing at problems until it clicks—is what many synthetic data/self-play/RL approaches are trying to do. Is Racing To World Domination Our Only Hope? If ASI isn’t coming anytime soon, then this entire discussion is moot. But if it is coming? Leopold’s core thesis is that to prevent China from using ASI to achieve world domination, we need to achieve world domination first. But there’s another path forward: a negotiated agreement not to unilaterally develop ASI. Leopold, like many before him, points out that this would be very difficult: Some hope for some sort of international treaty on safety. This seems fanciful to me. The world where both the CCP and USG are AGI-pilled enough to take safety risk seriously is also the world in which both realize that international economic and military predominance is at stake, that being months behind on AGI could mean being permanently left behind. If the race is tight, any arms control equilibrium, at least in the early phase around superintelligence, seems extremely unstable. In short, “breakout” is too easy: the incentive (and the fear that others will act on this incentive) to race ahead with an intelligence explosion, to reach superintelligence and the decisive advantage, too great. He’s absolutely correct: a successful and effective treaty would be very difficult to pull off. It will be hard to reach agreement, and enforcement and monitoring pose serious challenges. However, many voices in the AI community believe that aligning a superintelligent AI – that is, making sure that it doesn’t decide to wipe out humanity, or lead to some other highly regrettable monkey’s-paw outcome – appears to be super difficult, double secret difficult.
Leopold more or less waves this away, saying that we can “muddle through”: I’m incredibly bullish on the technical tractability of the superalignment problem. It feels like there’s tons of low-hanging fruit everywhere in the field. This is a bold claim, given that up to this point no one has even managed to construct a chatbot that will reliably refuse to provide napalm recipes. His plan for alignment is vague, and does little to address the many deeply-thought arguments that have been presented to make the case that it will be difficult. Probably the most frightening thing about the entire proposal is that it depends on getting alignment right on a very fast time scale: But I also want to tell you why I’m worried. Most of all, ensuring alignment doesn’t go awry will require extreme competence in managing the intelligence explosion. If we do rapidly transition from AGI to superintelligence, we will face a situation where, in less than a year, we will go from recognizable human-level systems for which descendants of current alignment techniques will mostly work fine, to much more alien, vastly superhuman systems that pose a qualitatively different, fundamentally novel technical alignment problem; at the same time, going from systems where failure is low-stakes to extremely powerful systems where failure could be catastrophic; all while most of the world is probably going kind of crazy. It makes me pretty nervous. It makes me nervous too! I’d like to find a way to avoid it! Leopold notes that if we get to the threshold of ASI and aren’t confident that we’ve solved alignment, we can pause for a few months while we finish the alignment work. He doesn’t address the question of what happens if solving alignment takes longer than our lead over rival nations – a lead which could easily be measured in months. He also doesn’t address the possibility that uncertainty as to the size of our lead could lead us to take excessive risks – nor the fact that, by racing ahead, we will be pushing both ourselves and our rivals to give short shrift to safety. In short, Leopold assumes heroic steps and optimistic outcomes toward building, scaling, and aligning ASI, but does not allow for heroic steps or optimistic outcomes toward a treaty. For my part, both paths look perilous, so I hope to hell that ASI is a lot farther away than 2030 5 . Where Do We Go From Here? In Leopold’s scenario, the stakes could not be higher: Our failure today will be irreversible soon: in the next 12-24 months, we will leak key AGI breakthroughs to the CCP. It will be the national security establishment’s single greatest regret before the decade is out. Perhaps the single scenario that most keeps me up at night is if China or another adversary is able to steal the automated-AI-researcher-model-weights on the cusp of an intelligence explosion. China could immediately use these to automate AI research themselves (even if they had previously been way behind)—and launch their own intelligence explosion. That’d be all they need to automate AI research, and build superintelligence. Any lead the US had would vanish. Moreover, this would immediately put us in an existential race; any margin for ensuring superintelligence is safe would disappear. The CCP may well try to race through an intelligence explosion as fast as possible—even months of lead on superintelligence could mean a decisive military advantage—in the process skipping all the safety precautions any responsible US AGI effort would hope to take.
He recommends aggressive steps for the US and its allies to develop and control advanced AI. These steps would be risky, destabilizing, and massively expensive. If his assumptions are correct, that might be unavoidable; if they are incorrect, it should be avoided. What to do? Extraordinary claims require extraordinary evidence. Before we undertake such drastic steps as suspending climate change mitigation, doubling US production of natural gas, or moving “all research personnel” into bunker-like SCIFs 6 , we should verify that we really are living in the scenario Leopold describes. Here are three assumptions he relies on, which I believe are importantly wrong: ASI (and its physical-world implications) is likely to arrive by 2030 We can confidently align ASI on that schedule 7 An effective international treaty is impossible In this light, I would suggest that Leopold articulate the following: What predictions can he make about the next 6, 12, 18 months, that (if they bear out) would support his estimate of rapid progress toward ASI? For instance, can he propose a way to validate the potential of “unhobbling”? What predictions can he make that would support his expectations regarding the feasibility of aligning AGI and ASI? If his scenario is correct, what steps need to be initiated most urgently? How can those steps be designed so as to minimize their impact (financial, political, diplomatic, and otherwise), especially in the near term (before evidence begins to accumulate that his scenario is correct)? What steps can be taken to shed more light on the feasibility of rapid ASI, feasibility of alignment, and difficulty of a treaty? I would especially encourage Leopold to articulate “no-regrets” moves that will be worthwhile whether or not his scenario is correct – such as increased investment in AI safety and alignment research. In short, if the future really is on the line and drastic measures are required, it’s not enough to paint a picture. You need to articulate priorities, and prepare to marshal increasing levels of evidence to support increasingly drastic steps. And Leopold, as I mentioned at the beginning: let’s talk! If you read this far, you should probably subscribe. Share 1 Apologies for referring to you on a first-name basis when we’ve never met, but that’s how it’s done nowadays, right? Last names seem awkwardly formal. 2 That is, Artificial Superintelligence – systems that are vastly superior to human intelligence at essentially any task. 3 I.e. the idea that, as AI approaches and then exceeds human-level intelligence, the AIs themselves will begin to drastically accelerate progress in AI research. 4 In a later section, Leopold himself notes a similar phenomenon: This is because AI capabilities are likely to be somewhat spikey—by the time AGI is human-level at whatever a human AI researcher/engineer is worst at, it’ll be superhuman at many other things. 5 And my biggest reason to doubt my belief that timelines will be slower, is the knowledge that I’m hoping they will be. 6 Sensitive Compartmented Information Facility 7 To say nothing of coping with the massive societal disruption and non-alignment-related risks that wide deployment of ASI would entail.…
Given how much I run my mouth on the topic of AI, you might think I understand the important issues. I only wish that were the case. For instance, earlier this year, Microsoft and OpenAI reported that they have detected “state affiliated adversaries … using LLMs 1 to augment cyberoperations”. Is this a big deal? Does it presage a tsunami of automated cyberattacks? I have no idea 2 . I don’t know enough about the dynamics of cybersecurity to understand which tasks / skills are the key bottlenecks, so it’s difficult for me to judge the impact of any given advance. This flavor of confusion – AIs are starting to become capable of X, but does X really matter? – pops up all over the place. Deepfakes can be used for political misinformation, but will that make things any worse than they already are? Some observers doubt it, arguing that the market for fake news is already saturated . Chatbots can write code, and there are claims this is revolutionizing developer productivity. But at the current level of capability, I’m very dubious. Somehow no one gets around to answering these big-picture real-world questions: what are the key levers that determine the equilibrium in cybersecurity, or misinformation in politics, or the productivity of software engineers. This might explain why discussion of AI risks and benefits is often out of touch with the real world, leading to acrimony and bad takes. It doesn’t have to be this way. We could be doing a much better job of making sense of the big questions around AI. The information we need is out there. But it’s scattered across too many different individuals, and no one is bringing them together. Until now. It's Hard To Make Sense By Yourself If we aren’t working from a common understanding of how the world functions today, we’re going to have a tough time working together to make good decisions about how AI could and should affect the future. And in many cases, we don’t have that common understanding. Consider the concern that AI could help a bad actor synthesize a dangerous virus. People have very different (mis)understandings of how difficult this would be, how close current AIs are to being able to help with the critical steps, and how serious the consequences might be. I recently had occasion to dive into the topic; it was a frustrating experience. I’ve previously written about how difficult it is to make sense of academic writing . Scientific papers often omit important context, use terminology that seems straightforward but isn’t, or rely on over-simplified models of the real world. If you’re not an expert in the relevant specialty, you’re going to come away confused. For instance, one paper suggests that it will soon be quite easy to synthesize a dangerous virus 3 : Numerous independent advances in virology and biotechnology … have recently combined to render many viruses accessible to skilled individuals at a low cost. Step-by-step assembly protocols capable of producing infectious viruses … are widely available … The typical advance … has required just … five years to be reproduced by undergraduates and individuals with moderate skills, and 12-13 years to become accessible to high school students and others with low skills and resources. [emphasis added] Another paper appears to say the opposite 4 : …over the next 5–10 years, benchtop device advances may enable reliable synthesis of dsDNA up to 10,000 base pairs long. 
As a reference point, there are a few viral genomes that are shorter than 7,000 base pairs, but the vast majority are between 10,000 and 200,000 base pairs in length. … It is possible to assemble shorter DNA fragments into these longer lengths, but the process requires skill and expertise and is not amenable to automation. For the foreseeable future—at least the next 10 years—it is unlikely that benchtop DNA synthesis devices will be able to automatically generate dsDNA at these greater lengths. [emphasis added] Cutting-edge labs have been able to routinely synthesize a variety of dangerous viruses for some time, so the first source implies that undergraduates could probably do it today, and high school students within the decade. The second source implies that this will remain quite difficult for “at least the next 10 years”. It’s no wonder that people debating AI safety often seem to be working from very different ideas of how dangerous AI might be! Misunderstandings Can Be Quite Profound Not infrequently, seemingly reliable sources will repeat statistics that are wildly implausible, either because they were wrong at the source or because of a failure to capture important context. Someone recently pointed me at an article in the Christian Science Monitor which cites “the OECD’s economists” as saying that: People who write for a living and now rely on AI are about 50% more efficient. Computer coders are 60% more efficient. As a statement about the world today, this is laughable. I presume the figure for computer coders was plucked from some study which measured the time to write a routine piece of standalone code. However, “computer coders” don’t spend that much time literally writing code 5 , and much of the code they write is not the sort that current AIs can reproduce. I don’t know anyone who writes software for a living and has personally experienced a significant productivity boost 6 . The practical scenario in which large productivity gains are seen today is when someone needs to write a straightforward, isolated piece of code in a domain that they are not personally experienced with. This doesn’t come up very often in practice, but it’s precisely the scenario that can be easily measured in a study. My favorite silly claim comes up in discussions of AI and cybersecurity. I asked ChatGPT for the economic impact of cyberattacks , and it provided a figure of roughly $10 trillion per year. Which is insane. Ten trillion dollars per year is: 10% of the entire world economy Well over $1000 per person Close to $2000 per person who has access to the Internet Twice the size of the global fossil fuel industry That’s not a ChatGPT hallucination, it’s a collective Internet hallucination – pretty much all the sources I’ve found online use that figure. Does anyone really believe that cybercrime is twice the size of the fossil fuel industry? If not, then why is this figure routinely cited? I take it as a sign that most people have no real idea what is going on. Many Heads Are Better Than One I recently had the opportunity to work with the Center for AI Safety (CAIS) to produce brief reports on AI and biosecurity and AI and cybersecurity . These were topics I didn’t know much about. I started out by plowing through a bunch of papers and reports… and ran into all of the problems mentioned above: apparent contradictions, implausible claims, difficulty in understanding which facts were important. 
Fortunately, I had another resource to lean on – the folks at CAIS helped me connect with multiple experts on each topic. When I was confused about something, I could ask. When I received contradictory input, I could point out the contradiction. When I circulated a draft report for review, I received corrections on points I hadn’t thought to question. With a day or two of work (spread over a longer calendar period), I accomplished more than I might have with months of solitary research. Here are some (lightly edited) examples of invaluable feedback I received. Each of these directly addressed an incorrect or oversimplified understanding I’d been holding. On the topic of excluding dangerous biological knowledge from LLM training data: I disagree with that paper -- I don't think you could meaningfully separate those "concerning" types of experiments from the rest of virology. Also, the scientific community and NIH have all had this discussion already, and they identified a set of research (GOF research) as the type of research that might require additional oversight. It's a sensitive topic and [redacted] is outside the mainstream on this point. On the harm that could result if someone synthesized a virus from a published genome: This is true, yes, but very few known viruses synthesized today have pandemic potential due to existing immunity and their general epidemiology - think low R0. E.g. someone could build SARS-CoV-2, but releasing it wouldn't have much of an impact considering the virus is already circulating. Hence, there are only a handful of pathogens that, if constructed without significant modification, have any appreciable risk of causing 10s or 100s of thousands of deaths. On the potential for nipping the next pandemic in the bud through strategic monitoring of viruses in wastewater: Important to note that wastewater is promising but also won't be a magic bullet. The next HIV may not shed into stool or urine. Early detection should be layered with multiple systems - what about screening plasma banks? Sending swabs to people in the mail for a routine "infection survey"? Lots of promising possibilities. This sort of targeted, “here’s what you missed when reading that source” information is invaluable in coming to a true understanding of a complex topic. And engaging with multiple experts, having different points of view, allows you to avoid blindly accepting one side of a controversial point. Doing This For More Topics There are so many complex questions around AI. How quickly will capabilities progress? How seriously should we take concerns around biosafety, cybersecurity, job loss, algorithmic bias, losing control of the machines? How feasible are advances in medicine, education, clean energy, general productivity? How might AI impact social media, elections, relationships, warfare? What would be the impact of various forms of regulation? For each of these important questions (and many more), I think the path forward involves picking the brains of a diverse group. So I’m starting a project to do exactly that. We’ll pick a topic, convene a group, drill down on as many important, confusing, and/or contentious details as we can identify, and publish the results. Then do it again, and again, and again. There’s little danger of running out of topics. We won’t strive for consensus; these topics involve too many unknowns, and too many subjective value judgements. 
Instead, the goal is to shed light on the full range of reasonable viewpoints, how those ideas relate to one another, and the supporting arguments for each. This may sound unwieldy. But I believe it can be done in a way that is accessible to casual readers, without oversimplifying. Each topic will start with an overview, and include links down to additional detail and supporting arguments. Organizing all of this will be a lot of work. But worth it. Organizing The World’s Disagreements And Making Them Universally Accessible and Constructive [Image caption: GPT-4o: “Draw a picture that represents the concept of Organizing The AI World’s Disagreements And Making Them Universally Accessible and Constructive.”] Google’s original mission statement was “to organize the world's information and make it universally accessible and useful”. Every topic I’ve mentioned in this post is the subject of active online debate. Unfortunately, those debates don’t do much to advance the collective understanding. People talk past one another, miscommunicate, cling to out-of-date facts, and generally generate more heat than light. There is plenty of worthwhile material in the mix, but it’s difficult to locate. Like the pre-Google web, much of the discussion is hard to access and not very useful. Through constructive engagement, we can use the diversity of viewpoints on AI to genuinely advance our collective understanding. But the work won’t stop when we publish a report. No matter how many people we can draw into a discussion, there will be ideas we failed to collect. No matter how hard we work on the writeup, we’ll sometimes fall short of clarity. And given the pace of developments in AI, anything we publish will almost immediately be out of date. To address this, we’ll publish living documents, forever open to feedback and updates. My dream is for this to become a central hub where people go: to discuss complex, speculative, and subjective topics around AI, and to understand those topics. This should be a home for constructive conversation, a place where it’s fruitful to contribute ideas and knowledge. And it should be a trusted source for grounded analysis, presented in a way that is accessible to a broad audience. This will be a little bit like Wikipedia, but unlike Wikipedia, we’ll often tackle speculative and subjective topics. It will be a little bit like discussion forums such as Reddit, Hacker News, or LessWrong. But unlike those sites, we’ll actively recruit panels on topics of importance, and facilitate discussions that explore all of the nooks and crannies. We’ll maintain accessible summaries of each discussion, rather than expecting readers to dive into enormous discussion threads. And we’ll continuously update each topic over time. There’s a reason this doesn’t currently exist: it will require a lot of effort – for moderation, analysis, and curation. I have thoughts for how to achieve this at scale, but that’s a topic for another day. We’ll start small. Want To Get Involved? If you’d like to follow along, there’s a WhatsApp group where I post occasional updates. We’re convening our first panel, on the impact of AI on cybersecurity. I’d welcome suggestions for experienced folks to participate. If there’s a topic you’d like to see explored, and you’re in a position to help recruit panelists, I would love to hear from you. Drop me a line at amistrongeryet@substack.com . Help me build a discussion around the ideas expressed in this blog – please subscribe, comment, and share! 1 I.e.
large language models like ChatGPT, though they’d more likely be using an “open” model that they can run on their own computers, such as Facebook’s Llama. 2 Well OK, I have an idea: I suspect it’s not a big deal – yet. But that’s mostly just because, if it were a big deal, it’d probably be getting more press. My point is that I don’t know enough, for instance about how cybersecurity plays out on a day-to-day basis in the real world, to evaluate the question on the merits. Nor have I seen any informed commentary on the significance of this development. 3 From Delay, Detect, Defend: Preparing for a Future in which Thousands Can Release New Pandemics. The complete quote: Numerous independent advances in virology and biotechnology, none of which is obviously threatening on its own, have recently combined to render many viruses accessible to skilled individuals at a low cost. Step-by-step assembly protocols capable of producing infectious viruses from a genome sequence and standard laboratory reagents are widely available, with particularly detailed and reliable instructions for influenzaviruses and coronaviruses, the families responsible for the last five respiratory pandemics. Such protocols, which are intended to obviate the requirement for “tacit knowledge” to successfully perform the experiment, have become increasingly common. The recent democratisation of biotechnology suggests that they have broadly succeeded: the typical advance made in a cutting-edge laboratory by individuals with doctorates has required just one year to be reproduced in other laboratories, three years to be adapted for use in other contexts, five years to be reproduced by undergraduates and individuals with moderate skills, and 12-13 years to become accessible to high school students and others with low skills and resources. 4 From Benchtop DNA Synthesis Devices: Capabilities, Biosecurity Implications, and Governance. One paragraph discusses benchtop synthesis capabilities and the length of viral genomes: Current benchtop synthesis devices can reliably print DNA up to 200 bases in length, but it is very likely that newer devices will be able to reliably and automatically produce double-stranded DNA (dsDNA) up to approximately 5,000–7,000 base pairs in length within the next 2–5 years. Over the next 5–10 years, benchtop device advances may enable reliable synthesis of dsDNA up to 10,000 base pairs long. As a reference point, there are a few viral genomes that are shorter than 7,000 base pairs, but the vast majority are between 10,000 and 200,000 base pairs in length. Bacterial genomes are longer than 1 million base pairs. Another adds context: It is possible to assemble shorter DNA fragments into these longer lengths, but the process requires skill and expertise and is not amenable to automation. For the foreseeable future—at least the next 10 years—it is unlikely that benchtop DNA synthesis devices will be able to automatically generate dsDNA at these greater lengths. 5 From a recent interview with the CEO of a company that provides an AI tool that helps people write code: [At] the largest companies, the amount of time that people spend only writing software is probably about 20%. Based on my own experience, this figure seems plausible. The implication is that if AI tools could cut coding time in half, overall programmer productivity would only increase by about 10%. 6 I’m sure one could find examples of folks who claim that AI is making them much more productive.
I’m just saying that there aren’t enough of them to intersect with my circle – which includes everyone I’ve met in a career building startups in Silicon Valley. I suspect that people who write code for a living and claim a big boost in productivity are either doing some very specific niche sort of work, or are greatly exaggerating (not necessarily lying, all sorts of cognitive biases could be coming into play). One of the best use cases for AI coding assistants seems to be writing a simple, routine, isolated piece of code using a language or tool that you’re not experienced with. This is exactly what most people who are not professional programmers do when they want to take the pulse of AI coding assistants, so they come away with an unrepresentatively positive impression.…
The ocean is wet, the sun is bright, and online discourse heavily features people criticizing one another's viewpoints. I recently got pulled into such a situation, when econ blogger Noah Smith wrote a post arguing that fears of advanced AI causing mass unemployment are overblown. This seemed obviously wrong to me: when AIs can do everything faster, cheaper, and better than a person... won't we let them? How does that not result in the collapse of employment? I was, to put it simply, confused and sad. I respect Noah, and didn't understand how we could see things so differently. Is he not as smart or trustworthy as I thought? Should I drop my subscription to his blog? Or maybe am I not as smart as I thought? Should I stop blogging? I was also worried, because Noah is a prominent blogger, and was promoting an idea that struck me as dangerously incorrect. I wrote a rebuttal, but that didn’t really accomplish anything. Well, it let me feel like I'd "done something", but that’s always a suspect motivation. Then I did what we somehow never do in these situations: I talked to Noah about it. Our Conversation I asked my readership whether anyone could connect me with him. It turns out that we know a few people in common, and he offered to chat. I hadn't really expected this to pan out – he's a Famous Blogger, and I'm... an infamous blogger? Not-famous blogger? Anyway, someone he's probably never heard of. But he was gracious enough to engage. I wrote a summary of my thoughts, he added a bunch of comments, and then we went back and forth a few times over email 1 . In the end, we determined that we both agree with the following high-level model of job prospects in an age of strong AI 2 : (1) If there is some natural constraint on our ability to efficiently devote resources to the creation and operation of ever-more AIs [for instance, due to a shortage of AI chips], then comparative advantage applies and people should have plenty of work. (2) Alternatively, we can get the same result by imposing an artificial constraint, such as limiting the amount of energy used for data centers. (3) Or we could distribute wealth in some fashion, such as by nationalizing natural resources. In this world, people won't necessarily find jobs in the conventional sense, but nor will they need them. (4) If none of the above transpire, there would be a problem (immiseration of some, potentially large, portion of the population). But this should be quite avoidable, per 1/2/3 above. Noah added: Sounds right to me! But keep in mind that (4) also only happens if AI fully replaces human skills across the board. If there are still skills only humans can do, then we don't even need 1/2/3 to keep being valuable. (A complete transcript of our conversation can be found here – thanks Noah for inviting me to publish it.) To be clear, this doesn't mean we agree on everything. I still think we will eventually get to a point where AI destroys the conventional job market, and I don’t get the sense Noah expects that. But look at all the good things which came out of our conversation: I've moved from baffled disagreement on a question of logic 3 , to a legitimate difference of expectation regarding concrete future developments, such as potential constraints on chip supply. I understand where and why we diverge, such that I am still able to trust Noah’s opinions in general. I've received a useful reality check on my own ideas, and uncovered some assumptions I hadn’t realized were important to my thinking.
If I find myself in another discussion of this topic, and someone cites Noah’s blog post, I can point out that Noah agrees that his argument depends on 1/2/3 above and we can then dive into those questions. If I hope to shift the broader public discourse on this question, I have a better idea how to go about it. Perhaps most important, I made a connection with Noah. It's not like we're buddies; I don't know whether we'll communicate again. But if we do, we'll have an increment of mutual context, and hopefully respect and trust. Why This Worked When two Internet strangers come together to discuss an important topic on which they disagree, success is far from guaranteed. It can easily end in frustration, acrimony, and disdain, and with no forward progress on any informational point. Especially over a medium like email, lacking in nonverbal cues or reminders of one another's humanity, where it's so easy for misunderstandings to fester. So why did our conversation go well? I credit the following factors: 1. I respect Noah, having read countless thousands of words on his blog. As a result, I had trust that it was worth my while to invest in the conversation, reading his words carefully and taking the trouble to express my ideas clearly. 2. I don't know whether Noah knew me from Adam, but he also put in the effort to make things work. He read and engaged with what I had to say, he stuck with the conversation through multiple rounds, he stayed constructive and friendly throughout. 3. We're both good communicators in written form. He does this professionally, and I've honed my craft over years of consensus-seeking in workplace discussions among distributed teams. 4. We were communicating in private, with no audience to play to or get distracted by, and no need to score quick points. 5. We had achievable goals. Instead of trying to change one another's minds, we simply analyzed our disagreement until we teased out the differences in our underlying assumptions. In particular, we have different intuitions as to the likelihood of constraints on the amount of computing capacity that can be manufactured. 6. We stayed focused; neither of us introduced new topics or clung to unnecessary supporting points. 7. We had a good starting point. Noah's blog post did an excellent job of explaining his argument, which allowed me to understand where he was coming from and start our conversation at a point fairly close to the key difference in our assumptions. Not every conversation will have these advantages, but in an upcoming post, I’ll be presenting some ideas for how to foster more good discussions. Conversation is More Efficient Than Posting Rebuttals Apparently economics professor Robin Hanson has published some strong views on the value of modern medicine. Scott Alexander, a blogger I greatly admire, decided that Hanson “more or less believes medicine doesn’t work”, and to address this, he posted a nearly 7,000-word critique. Hanson wrote a 2,200-word response, and on the day I’m writing this, Alexander followed up with another 5,400 words. Much of the disagreement seems to be about what Hanson’s views actually are. In his response, Hanson states that Alexander mischaracterized his views. Alexander replied: I acknowledge he’s the expert on his own opinion, so I guess I must be misrepresenting him, and I apologize. But I can’t figure out how these claims fit together coherently with what he’s said in the past.
So I’ll lay out my thoughts on why that is, and he can decide if this is worth another post where he clarifies his position. (I think this is a polite way of saying “Hanson has been all over the place on this topic, and I’d appreciate it if he would acknowledge that.” Note that I haven’t looked into any of this myself; I am just echoing Alexander here.) So basically, we’re now 14,000 words in, and readers of Alexander’s blog are left in confusion as to the position Hanson would defend 4 . Possibly, if the back-and-forth continues, we’ll eventually get somewhere. But it would be so much faster and easier if these two would just talk directly. The big problem, I think, is that because Alexander is addressing his audience in a static blog post instead of engaging directly with Hanson, he feels the need to be systematic. He can’t wait to see whether the reader has grasped his point, so he throws in everything he could possibly say up front. When there’s no opportunity for back-and-forth, sometimes this is the best you can do. But Hanson is responding! It would be much more efficient to talk to him directly, fast-forward to agreement on what their respective views are (and how they might differ), and only then start presenting public evidence to debate their actual differences. (It’s worth noting that Scott Alexander does often engage in direct conversation with folks whose opinions he questions; I always appreciate reading about those conversations.) Direct Engagement Doesn't Always Work I’ve recently come across some examples where disagreeing parties engaged in direct conversation, and failed to arrive at a shared understanding. These strike me as exceptions that prove the rule 5 . First, the February debate between Beff Jezos and Connor Leahy, prominent figures with opposing views on AI safety. The three-hour session didn’t seem to do much to advance anyone’s understanding of anything. The participants (Leahy in particular) routinely interrupted one another, the conversation constantly jumped around, and there was heavy reliance on abstractions and hypotheticals that were easily misconstrued. As a result, while some interesting ideas came out, nothing was ever really settled and it’s not clear to me that the participants (let alone the audience) properly understood one another’s ideas. This might have gone better with a more clearly defined structure, supported by a moderator. Second, the Rootclaim $100,000 Lab Leak Debate, intended to resolve the question of whether COVID originated in a lab leak. The process was incredibly rigorous, with three separate debate sessions totaling 15 hours (!), supported by massive research and preparation by both participants, and two judges putting in roughly 100 hours each 6 . This failed to produce consensus, in the sense that the loser of the debate disagrees with the outcome. However, consensus on the origins of COVID was a very ambitious goal, given the extent to which primary evidence went uncollected or was actively concealed. And the thorough debate did succeed in massively advancing the public understanding of the topic. I consider this to be a noteworthy achievement. Finally, forecasting existential risk from AI: the Forecasting Research Institute gathered eleven expert forecasters and eleven AI safety experts. Each group was chosen for extreme views on AI risk; the selected forecasters on average put the probability of AI doom at 0.1%, and the safety experts put it at 25%. After 80 hours of research and discussion, the two groups barely budged.
My sense – take this with a grain of salt – is that the AI safety folks in question were deep into a very specific worldview, and did not do a great job of communicating that worldview to folks outside their circle 7 . The upshot is that for a conversation to go well, you need some combination of: a tractable topic (all three of these examples involved very difficult topics), participants who are skilled communicators, and a highly engaged and skilled moderator. For complex topics, a large time commitment will also be needed. Talking: It's Good I don't know how much time I invested in my conversation with Noah Smith. A couple of hours, all told? Certainly less than I spent writing my original rebuttal post. In return, I understand where Noah and I diverge; I reaffirmed my overall trust in his writing; and I better understand my own position. If I hope to shift the broader public discourse on this question, I have a better idea how to go about it. Last but not least, I made a connection with someone I respect. Not a bad payoff! I’m cooking up an initiative to generate more productive conversations about AI. Getting to see this work in practice was validating, and I’m looking forward to doing a lot more of it, in an environment that’s set up for success: with well-defined topics, active moderation, and committed participants. The next time you're tempted to dunk on (what strikes you as) a bad take, consider whether you could instead reach out and start a conversation. You might accomplish something; you might learn something; you might make a connection. The world doesn't really need more dunks, but it desperately needs more connections. Help me build a discussion around the ideas expressed in this blog – please subscribe, comment, and share! 1 I gather that, as this was going on, Noah was also responding to various other responses to his post. For instance, from Zvi Mowshowitz’s blog: Before I get to this week’s paper, I will note that Noah Smith reacted to my comments on his post in this Twitter thread indicating that he felt my tone missed the mark and was too aggressive (I don’t agree, but it’s not about me), after which I responded attempting to clarify my positions, for those interested. There was a New York Times op-ed about this, and Smith clarified his thoughts. Noah Smith: I asked Smith by email what he thought of the comments by Autor, Acemoglu and Mollick. He wrote that the future of human work hinges on whether A.I. is or isn't allowed to consume all the energy that's available. If it isn't, "then humans will have some energy to consume, and then the logic of comparative advantage is in full effect." He added: "From this line of reasoning we can see that if we want government to protect human jobs, we don't need a thicket of job-specific regulations. All we need is ONE regulation – a limit on the fraction of energy that can go to data centers.” Matt Reardon: Assuming super-human AGI, every economist interviewed for this NYT piece agrees that you'll need to cap the resources available to AI to avoid impoverishing most humans. 2 I’ve lightly edited this for clarity. 3 That is, whether the economic concept of comparative advantage somehow renders it mathematically impossible for AIs to permanently disrupt the job market. 4 Source: I am a reader of Alexander’s blog, I am in confusion, and pride prevents me from believing that it’s just me. 5 In the original sense of that phrase, in which “prove” means “test”, not “support”.
That is, these are exceptions which test the rule and show its limits. 6 In addition to the 15 hours of debate, the judges spent time fact-checking participants’ claims, assessing the merits, and writing up their verdicts. 7 Appendix 8 of the report on an earlier stage of this project, beginning on page 113, contains raw samples of the actual discussions between participants and makes for interesting reading. The conversation wandered into all sorts of strange places, such as the probability that advanced AI encloses the Sun in a Dyson sphere within the next 77 years.…