
AI Evolution Through the Lens of Gall's Law


Gall’s Law states:

A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be made to work by patching.

Originally coined to argue for “evolution over planning” in system design, the law fits the development trajectory of artificial intelligence remarkably well.


The Evolution of Neural Networks

The development of artificial intelligence, and of deep learning in particular, exemplifies the path of gradual evolution from simple working systems to complex ones.

  • From the earliest McCulloch–Pitts (MP) neuron model (1943) to the perceptron (1958), capabilities were limited, but these models verified that the basic idea of “simulating neurons” was feasible.
  • In the 1980s, multi-layer perceptrons and the backpropagation algorithm provided the technical foundation for training more complex networks.
  • In the 2000s, Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) brought practical breakthroughs.
  • In 2012, AlexNet ignited the deep learning wave with deep networks and GPU acceleration.
  • Subsequent development was even more striking: ResNet (2015) solved the degradation problem of very deep models, and the Transformer (2017) made attention the core of a new sequence-modeling paradigm, transforming NLP and, eventually, the entire AI field.
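The perceptron that opens this lineage is simple enough to sketch in a few lines. Below is a minimal, illustrative implementation of Rosenblatt-style perceptron learning; the data, learning rate, and epoch count are invented for the example. It learns the linearly separable AND function:

```python
# A minimal sketch of the 1958 perceptron: a single "neuron" that learns
# a linear decision boundary via the perceptron update rule.

def perceptron_train(samples, epochs=20, lr=1.0):
    """samples: list of (inputs, label) pairs with label in {0, 1}."""
    n = len(samples[0][0])
    w = [0.0] * n
    b = 0.0
    for _ in range(epochs):
        for x, y in samples:
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            pred = 1 if activation > 0 else 0
            err = y - pred                  # -1, 0, or +1
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def perceptron_predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Logical AND is linearly separable, so the perceptron converges on it.
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = perceptron_train(data)
print([perceptron_predict(w, b, x) for x, _ in data])  # [0, 0, 0, 1]
```

The same rule fails on XOR, which is not linearly separable; overcoming that limitation is exactly what the multi-layer networks of the 1980s added.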

Every stage iterated on existing, proven structures rather than reinventing the wheel. The Transformer itself did not fall from the sky: its predecessors include Seq2Seq models (2014), the Bahdanau attention mechanism (2014), encoder-decoder architectures, and more.
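The attention operation at the heart of this lineage is itself compact. Below is a hedged, pure-Python sketch of scaled dot-product attention on toy 2-dimensional vectors; the numbers are invented, and real implementations operate on batched matrices with a tensor library:

```python
# A minimal sketch of scaled dot-product attention: the output is a mix
# of the values, weighted by the similarity between the query and each key.
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(query, keys, values):
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    out = [sum(w * v[i] for w, v in zip(weights, values))
           for i in range(len(values[0]))]
    return out, weights

keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
out, weights = attention([1.0, 0.0], keys, values)
# The query matches the first key, so the first value dominates the mix.
print(weights[0] > weights[1])  # True
```

The 2017 Transformer's step was not inventing this operation but making it the sole sequence-mixing mechanism, dropping recurrence entirely.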

The more deeply one studies the history of AI, the more this explosion looks inevitable: the accumulation of day-to-day results. Improving hardware brought exponential growth in compute, algorithms steadily improved, and data inevitably expanded; everything was preparing the ground for the scaling laws.
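A scaling law of the kind referred to here describes loss falling as a power law of scale. A toy sketch, with constants invented for illustration rather than fitted to any real model family:

```python
# Illustrative only: a power-law scaling curve L(N) = a * N**(-b),
# where N is model size. The constants a and b are made up for this sketch.
def power_law_loss(n_params, a=1e3, b=0.076):
    return a * n_params ** -b

sizes = [10**6, 10**8, 10**10, 10**12]
losses = [power_law_loss(n) for n in sizes]
# On a power law, each 100x increase in parameters cuts loss by the
# same constant factor -- a straight line on a log-log plot.
ratios = [losses[i + 1] / losses[i] for i in range(3)]
print(all(abs(r - ratios[0]) < 1e-9 for r in ratios))  # True
```

The practical consequence is predictability: if the curve holds, the return on the next order of magnitude of compute can be estimated before it is spent.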


The Counterexample of Building from Scratch

In contrast, Japan's "Fifth Generation Computer Systems" project, launched in the 1980s, offers a textbook counterexample to Gall's Law.

At the time, aiming to leapfrog American AI technology, the Japanese government and industry attempted to bypass the traditional von Neumann lineage and build an "intelligent system" directly on logical reasoning and knowledge representation, detached from existing architectures and accumulated experience.

Despite enormous investment, the attempt yielded few results and never produced a sustainable technical path. It tried to build a complex system from scratch, violating Gall's evolutionary principle, so its failure was perhaps inevitable.


Reinforcement Learning is the Next Step in Evolution

Reinforcement learning (RL) is becoming the new focus of language model development, but this is not a revolution that overthrows the existing architecture; it is a natural continuation of its evolution.

Sholto Douglas, a core member of the Claude development team, said in a recent interview:

In the next 6 to 12 months, we are very focused on scaling up reinforcement learning (RL) and seeing where that takes us. I expect extremely rapid progress. There is no need to invest further orders of magnitude in pre-training scale. Reinforcement learning has proven effective, and these models will reach "plug-and-play remote worker" capability by 2027.

Earlier, OpenAI researcher Shunyu Yao argued in his essay "The Second Half":

RL is often considered the "ultimate form" of AI: in theory it can guarantee winning at games (competitive tasks in closed environments with clear win/loss conditions), and in practice nearly every superhuman AI system (such as AlphaGo) has relied on RL.

The introduction of reinforcement learning is further evolution on top of the existing architecture: fine-tuning and behavioral optimization rather than replacement. In essence, the approach still follows Gall's Law.
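The trial-and-error loop underlying RL can be illustrated with classic tabular Q-learning, far simpler than the RL applied to language models but resting on the same idea. Everything below (the toy corridor environment, the hyperparameters) is invented for this sketch:

```python
# Tabular Q-learning on a 5-state corridor: the agent starts at state 0
# and earns a reward of 1 for reaching state 4. It learns purely from
# trial and error, with no model of the environment.
import random

random.seed(0)

N = 5                 # states 0..4; state 4 is the goal
ACTIONS = [-1, +1]    # step left or step right

def step(state, action):
    nxt = max(0, min(N - 1, state + action))
    reward = 1.0 if nxt == N - 1 else 0.0
    return nxt, reward, nxt == N - 1

Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.1   # learning rate, discount, exploration

for _ in range(500):
    s, done = 0, False
    while not done:
        if random.random() < eps:
            a = random.choice(ACTIONS)              # explore
        else:                                       # exploit, random tie-break
            best = max(Q[(s, b)] for b in ACTIONS)
            a = random.choice([b for b in ACTIONS if Q[(s, b)] == best])
        s2, r, done = step(s, a)
        # Bellman update: move Q toward reward plus discounted best next value.
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

# The learned greedy policy steps right toward the goal from every state.
print(all(Q[(s, +1)] > Q[(s, -1)] for s in range(N - 1)))  # True
```

The point of the sketch is the shape of the loop, act, observe reward, update, which is the behavioral-optimization layer now being scaled on top of pretrained models rather than a new architecture.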


The Enhancement of Intelligence

We keep hearing calls to "disrupt the Transformer," but AI history shows that genuinely effective progress is always gradual evolution on top of existing systems. Many current model architectures are refinements of the Transformer, sometimes fused with other architectures and optimized at a deep level.

As Gall's Law reveals, a complex system that works is not designed out of thin air; it grows gradually from a simple system that works. I look forward to reinforcement learning confirming this further.

The enhancement of intelligence still has a long way to go. "Intelligence," as I see it, is the ability to accomplish complex goals.
