The trillion-dollar question

Artificial General Intelligence (AGI), a system that can reason, learn, and act across any domain at or above human level, remains the stated goal of several major AI laboratories. Progress over the past two years has been remarkable. Models like Anthropic's Claude, OpenAI's GPT series, and Google's Gemini can write code, analyse legal documents, generate creative content, and hold nuanced conversations across virtually any subject. Yet despite this progress, AGI remains out of reach. The question is why, and whether the remaining barriers are fundamental or simply a matter of more engineering effort. The honest answer is that nobody is entirely sure.

Reasoning and planning

This is probably the single biggest gap. Current models are impressive pattern matchers and can chain reasoning steps together, but they struggle with genuine long-horizon planning: decomposing a novel problem, trying different approaches, recognising dead ends, backtracking, and adapting strategy as they go. They can simulate this behaviour to a degree, but it breaks down on problems that are genuinely outside their training distribution.
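To make that concrete, the sketch below frames the behaviour in question as a plain backtracking search: propose candidate steps, try one, detect a dead end, back up, and try another. Everything here, the step proposer, the dead-end check, and the toy goal, is a hypothetical placeholder for illustration, not a description of how any real model works internally.

    # Purely illustrative: long-horizon planning framed as backtracking search.
    # The step proposer, dead-end check, and toy domain are hypothetical placeholders.

    def plan(state, goal, propose_steps, apply_step, is_dead_end, depth=0, max_depth=10):
        """Depth-first search with backtracking over candidate next steps."""
        if state == goal:
            return []                      # goal reached: nothing left to do
        if depth >= max_depth or is_dead_end(state):
            return None                    # dead end: tell the caller to backtrack

        for step in propose_steps(state):  # decompose: enumerate candidate next steps
            rest = plan(apply_step(state, step), goal, propose_steps, apply_step,
                        is_dead_end, depth + 1, max_depth)
            if rest is not None:           # this branch worked, keep the step
                return [step] + rest
            # otherwise fall through: backtrack and try the next candidate

        return None                        # no candidate worked at this level

    # Toy usage: reach 11 from 1 using "+1" and "*2" moves.
    propose = lambda s: ["+1", "*2"]
    apply_ = lambda s, op: s + 1 if op == "+1" else s * 2
    print(plan(1, 11, propose, apply_, lambda s: s > 11))

Humans run something like this loop effortlessly on unfamiliar problems; current models approximate it in text, and the approximation frays as the horizon lengthens.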

The reasoning-focused models, such as OpenAI's o1 and o3 series, are making progress here by giving models the ability to "think" for longer before responding. But it remains unclear whether scaling that approach gets you all the way to general reasoning or simply produces better performance on structured problems that happen to reward step-by-step thinking.

Memory and persistent learning

Models today are fundamentally stateless between conversations. Memory systems exist, but they are bolted on after the fact rather than being native to the architecture. An AI system cannot genuinely learn from experience the way humans continuously update their understanding through every interaction.

Fine-tuning allows models to absorb new knowledge, but it is expensive, slow, and blunt compared to how a human professional absorbs and integrates new information in real time during a conversation, a meeting, or while reading a document. A true AGI would need to learn continuously from every interaction and build cumulative understanding over time, something no current architecture supports natively.
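As an illustration of what "bolted on" means in practice, the sketch below keeps memories in an external store and pastes the most relevant ones back into the prompt; the model's weights never change. The MemoryStore, the keyword-overlap retrieval, and the call_model callable are assumptions made for the example, not any vendor's API.

    # Illustrative only: "memory" as an external store whose contents are re-injected
    # into the prompt. Nothing the model itself knows is updated between conversations.
    from dataclasses import dataclass, field

    @dataclass
    class MemoryStore:
        notes: list = field(default_factory=list)

        def remember(self, note):
            self.notes.append(note)        # "learning" here is appending text, not updating weights

        def recall(self, query, k=3):
            # Naive keyword overlap stands in for real embedding search.
            overlap = lambda n: len(set(n.lower().split()) & set(query.lower().split()))
            return sorted(self.notes, key=overlap, reverse=True)[:k]

    def answer(query, memory, call_model):
        context = "\n".join(memory.recall(query))   # retrieved notes, prepended to the prompt
        prompt = f"Relevant notes:\n{context}\n\nQuestion: {query}"
        return call_model(prompt)                   # the model is unchanged by any of this

Everything the system "learns" lives in that external list; swap the model for a newer one and the memories survive, but the model itself has learnt nothing.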

Grounding and world models

Large language models learn statistical relationships between tokens. Whether they build genuine internal models of how the world works, as opposed to sophisticated approximations, is one of the most debated questions in AI research. They can describe physics convincingly but cannot truly simulate it. They hallucinate because they are optimising for plausible text rather than verified truth.

Connecting language models to real-world perception and action through robotics and embodiment is progressing, but remains primitive. A model that can write a brilliant essay about bicycle mechanics still has no understanding of what it feels like to lose balance or how to adjust grip pressure on a wet handlebar. This gap between linguistic competence and genuine understanding may prove to be one of the hardest to close.

The evaluation problem

This is an underappreciated barrier. We do not actually have a robust, agreed-upon definition of AGI, let alone a reliable way to test for it. Models keep saturating benchmarks that were supposed to be years away from being solved, which either means they are closer to AGI than we thought, or, more likely, that the benchmarks were testing a narrower capability than we assumed.

Without knowing precisely what you are aiming for, it is remarkably difficult to know how far away you are. The AI research community is effectively trying to build something it cannot yet properly define or measure.

Compute and architecture

Whether raw computational power is the primary bottleneck is itself a major point of disagreement. The scaling laws school of thought argues that bigger models trained on more data with more compute will continue to improve and eventually cross whatever threshold constitutes AGI. Others believe the transformer architecture, which underpins virtually all current large language models, has fundamental limitations and that genuinely new architectures will be needed.
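For readers unfamiliar with the term, "scaling laws" refer to empirical curves of roughly the shape below, loosely following the parametric fit reported in the Chinchilla paper (Hoffmann et al., 2022). The constants are approximate and purely illustrative; the point is the shape of the curve, not the specific numbers.

    # Rough sketch of a Chinchilla-style scaling law: predicted training loss as a
    # function of parameter count N and training tokens D. Constants are approximately
    # the fitted values reported by Hoffmann et al. (2022) and are illustrative only.

    def predicted_loss(n_params, n_tokens):
        E, A, B = 1.69, 406.4, 410.7       # irreducible loss plus fitted coefficients
        alpha, beta = 0.34, 0.28           # fitted exponents
        return E + A / n_params**alpha + B / n_tokens**beta

    print(predicted_loss(70e9, 1.4e12))    # roughly Chinchilla scale: about 1.9
    print(predicted_loss(700e9, 14e12))    # 10x the parameters and data: lower, but only somewhat

The curve improves smoothly with more compute, but nothing in it marks a threshold for general intelligence, which is precisely the disagreement: whether "good enough on this curve" eventually amounts to AGI.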

The history of AI research suggests that raw compute tends to win eventually. But there is genuine uncertainty about whether that pattern holds for a challenge as broad as general intelligence, or whether some qualitative architectural breakthrough is required that we have not yet conceived.

Agency and autonomy

Even if a model could reason at human level across all domains, giving it the ability to act autonomously in the real world introduces an entirely separate set of challenges. Maintaining goals over extended periods, managing resources, recovering gracefully from errors, and coordinating actions across multiple systems all require production-grade systems engineering that is still in its infancy.

Frameworks like LangChain, CrewAI, and Temporal are beginning to address pieces of this puzzle, enabling AI agents to take actions, collaborate, and run reliably over time. But connecting an intelligent model to the real world in a way that is both capable and safe remains one of the most difficult unsolved problems.
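Stripped of any particular framework, the control flow those tools wrap looks roughly like the loop below: ask the model for the next action, execute a tool, record the observation, and repeat within hard limits. The llm callable, the tool dictionary, and the return format are assumptions made for illustration, not the API of LangChain, CrewAI, or Temporal.

    # Generic plan-act-observe loop, not any specific framework's API. Real systems
    # add persistence, retries, human approval, and far richer error handling.
    import time

    def run_agent(goal, llm, tools, max_steps=20, budget_seconds=60.0):
        history = []                                   # working memory for this run only
        started = time.monotonic()

        for _ in range(max_steps):                     # hard caps, because open-ended autonomy is unsolved
            if time.monotonic() - started > budget_seconds:
                return {"status": "timeout", "history": history}

            action = llm(goal=goal, history=history)   # hypothetical: returns {"name": ..., "args": ...}
            if action["name"] == "finish":
                return {"status": "done", "result": action.get("result"), "history": history}

            tool = tools.get(action["name"])
            if tool is None:                           # recover gracefully from a bad tool choice
                history.append(("error", f"unknown tool: {action['name']}"))
                continue
            try:
                observation = tool(**action.get("args", {}))
            except Exception as exc:                   # a failed action should not kill the whole run
                observation = f"tool failed: {exc}"
            history.append((action["name"], observation))

        return {"status": "gave_up", "history": history}

Almost everything hard about agents lives outside this loop: keeping state across restarts, bounding cost, and deciding when a human should step in.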

Alignment and safety

This is arguably not a technical limitation on capability but a deliberate and necessary constraint. Companies like Anthropic, OpenAI, and Google DeepMind are actively choosing to move carefully because a system that can genuinely reason at or above human level across all domains would be extraordinarily powerful and potentially dangerous if its objectives were not properly aligned with human interests.

Some capability is very likely being held back or released gradually for safety reasons rather than technical ones. This is a responsible position. The alignment problem, ensuring that a superintelligent system does what we actually want rather than what we literally asked for, is arguably harder than building the capability itself. Getting this wrong could have consequences that are difficult to overstate.

Where does this leave us?

Current AI systems are remarkably capable in narrow and semi-general domains. They can already transform how businesses operate, and at Node we are building production systems that demonstrate this daily. But they lack the deep integration of reasoning, memory, planning, grounding, and autonomy that would constitute genuine general intelligence.

Whether that integration is five years away or fifty is the question the entire industry is grappling with. The barriers are real, the progress is undeniable, and the answer probably lies somewhere between the optimists who think scaling alone will get us there and the sceptics who believe something fundamentally new is required. What we can say with confidence is that the AI systems available today are already powerful enough to deliver transformative value to businesses willing to deploy them intelligently, and that is where our focus remains.
