Is ChatGPT o3 already smarter than all of us?
Ethan Mollick writes:
But the fact that the AI often messes up on this particular brainteaser does not take away from the fact that it can solve much harder brainteasers, or that it can do the other impressive feats I have demonstrated above. That is the nature of the Jagged Frontier. In some tasks, AI is unreliable. In others, it is superhuman. You could, of course, say the same thing about calculators, but it is also clear that AI is different. It is already demonstrating general capabilities and performing a wide range of intellectual tasks, including those that it is not specifically trained on.

Does that mean that o3 and Gemini 2.5 are AGI? Given the definitional problems, I really don't know, but I do think they can be credibly seen as a form of "Jagged AGI" - superhuman in enough areas to result in real changes to how we work and live, but also unreliable enough that human expertise is often needed to figure out where AI works and where it doesn't. Of course, models are likely to become smarter, and a good enough Jagged AGI may still beat humans at every task, including in ones the AI is weak in.
Does it matter?
Returning to Tyler Cowen's post, you will notice that, despite thinking we have achieved AGI, he doesn't think that threshold matters much to our lives in the near term. That is because, as many people have pointed out, technologies do not instantly change the world, no matter how compelling or powerful they are. Social and organizational structures change much more slowly than technology, and technology itself takes time to diffuse. Even if we have AGI today, we have years of trying to figure out how to integrate it into our existing human world.
Of course, that assumes that AI acts like a normal technology, and one whose jaggedness will never be completely solved. There is the possibility that this may not be true. The agentic capabilities we're seeing in models like o3, like the ability to decompose complex goals, use tools, and execute multi-step plans independently, might actually accelerate diffusion dramatically compared to previous technologies. If and when AI can effectively navigate human systems on its own, rather than requiring integration, we might hit adoption thresholds much faster than historical precedent would suggest.
And there's a deeper uncertainty here: are there capability thresholds that, once crossed, fundamentally change how these systems integrate into society? Or is it all just gradual improvement? Or will models stop improving in the future as LLMs hit a wall? The honest answer is we don't know.
What's clear is that we continue to be in uncharted territory. The latest models represent something qualitatively different from what came before, whether or not we call it AGI. Their agentic properties, combined with their jagged capabilities, create a genuinely novel situation with few clear analogues. It may be that history continues to be the best guide, and that figuring out how to successfully apply AI in a way that shows up in the economic statistics may be a process measured in decades. Or it might be that we are on the edge of some sort of faster take-off, where AI-driven change sweeps our world suddenly. Either way, those who learn to navigate this jagged landscape now will be best positioned for what comes next… whatever that is.
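To make Mollick's phrase "agentic capabilities" a bit more concrete: the loop he describes (decompose a goal, pick a tool, execute, observe) has a simple control flow. The sketch below is purely illustrative, not from Mollick's post and not a real model API; every function name is a hypothetical stub standing in for a model call.

```python
# Toy sketch of an "agentic" loop: decompose a goal, pick a tool per
# step, act, observe. All functions are hypothetical stubs; a real
# agent would replace them with model API calls and real tools.

def decompose(goal: str) -> list[str]:
    # Stand-in for a model call that breaks a goal into sub-steps.
    return [f"{goal}: step {i}" for i in range(1, 4)]

def pick_tool(step: str):
    # Stand-in for tool selection; real agents would choose among
    # search, code execution, file I/O, etc.
    return lambda s: f"result of '{s}'"

def run_agent(goal: str) -> list[str]:
    observations = []
    for step in decompose(goal):          # multi-step plan
        tool = pick_tool(step)            # tool use
        observations.append(tool(step))   # execute and observe
    return observations

if __name__ == "__main__":
    for obs in run_agent("summarize quarterly reports"):
        print(obs)
```

The point of the sketch is only that the orchestration itself is trivial; the capability lives in the model calls behind the stubs, which is part of why such systems might diffuse faster than earlier technologies.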
Here is Mollick's full post: