The End of Programming
The end of classical Computer Science is coming, and most of us are dinosaurs waiting for the meteor to hit.
I came of age in the 1980s, programming personal computers like the Commodore VIC-20 and Apple ][e at home. Going on to study Computer Science in college and ultimately getting a PhD at Berkeley, the bulk of my professional training was rooted in what I will call “classical” CS: programming, algorithms, data structures, systems, programming languages. In Classical Computer Science, the ultimate goal is to reduce an idea to a program written by a human — source code in a language like Java or C++ or Python. Every idea in Classical CS — no matter how complex or sophisticated — from a database join algorithm to the mind-bogglingly obtuse Paxos consensus protocol — can be expressed as a human-readable, human-comprehensible program.
When I was in college in the early ’90s, we were still in the depth of the AI Winter, and AI as a field was likewise dominated by classical algorithms. My first research job at Cornell was working with Dan Huttenlocher, a leader in the field of computer vision (and now Dean of the MIT School of Computing). In Dan’s PhD-level computer vision course in 1995 or so, we never once discussed anything resembling deep learning or neural networks—it was all classical algorithms like Canny edge detection, optical flow, and Hausdorff distances. Deep learning was in its infancy, not yet considered mainstream AI, let alone mainstream CS.
Of course, this was 30 years ago, and a lot has changed since then, but one thing that has not really changed is that Computer Science is taught as a discipline with data structures, algorithms, and programming at its core. I am going to be amazed if in 30 years, or even 10 years, we are still approaching CS in this way. Indeed, I think CS as a field is in for a pretty major upheaval that few of us are really prepared for.
Programming will be obsolete
I believe that the conventional idea of “writing a program” is headed for extinction, and indeed, for all but very specialized applications, most software, as we know it, will be replaced by AI systems that are trained rather than programmed. In situations where one needs a “simple” program (after all, not everything should require a model of hundreds of billions of parameters running on a cluster of GPUs), those programs will, themselves, be generated by an AI rather than coded by hand.
I don’t think this idea is crazy. No doubt the earliest pioneers of Computer Science, emerging from the (relatively) primitive cave of Electrical Engineering, stridently believed that all future Computer Scientists would need to command a deep understanding of semiconductors, binary arithmetic, and microprocessor design to understand software. Fast forward to today, and I am willing to bet good money that 99% of people who are writing software have almost no clue how a CPU actually works, let alone the physics underlying transistor design. By extension, I believe the Computer Scientists of the future will be so far removed from the classic definitions of “software” that they would be hard-pressed to reverse a linked list or implement Quicksort. (Hell, I’m not sure I remember how to implement Quicksort myself.)
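For anyone who needs the reminder, here is a minimal Quicksort sketch in Python, just to make concrete the kind of hand-written algorithmic code I mean. This is one of many textbook variants, not the canonical one:

```python
def quicksort(items):
    """Classic recursive Quicksort: pick a pivot, partition, recurse on each side."""
    if len(items) <= 1:
        return items
    pivot = items[len(items) // 2]
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return quicksort(smaller) + equal + quicksort(larger)

print(quicksort([5, 3, 8, 1, 9, 2]))  # -> [1, 2, 3, 5, 8, 9]
```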
AI coding assistants like CoPilot are only scratching the surface of what I’m talking about. It seems totally obvious to me that of course all programs in the future will ultimately be written by AIs, with humans relegated to, at best, a supervisory role. Anyone who doubts this prediction need only look at the very rapid progress being made in other aspects of AI content generation, like image generation. The difference in quality and complexity between DALL-E v1 and DALL-E v2 — announced only 15 months later — is staggering. If I have learned anything over the last few years working in AI, it is that it is very easy to underestimate the power of increasingly large AI models. Things that seemed like science fiction only a few months ago are rapidly becoming reality.
So I’m not just talking about CoPilot replacing programmers. I’m talking about replacing the entire concept of writing programs with training models. In the future, CS students aren’t going to need to learn such mundane skills as how to add a node to a binary tree or code in C++. That kind of education will be antiquated, like teaching engineering students how to use a slide rule.
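To spell out the sort of mundane exercise I have in mind, here is inserting a node into a binary search tree, sketched in Python rather than C++ purely for brevity:

```python
class Node:
    """A binary search tree node holding a single value."""
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def insert(root, value):
    """Insert value into the tree rooted at root; return the (possibly new) root."""
    if root is None:
        return Node(value)
    if value < root.value:
        root.left = insert(root.left, value)
    else:
        root.right = insert(root.right, value)
    return root

root = None
for v in [7, 3, 9, 1]:
    root = insert(root, v)
```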
The engineers of the future will, in a few keystrokes, fire up an instance of a four-quintillion-parameter model that already encodes the full extent of human knowledge (and then some), ready to be given any task required of the machine. The bulk of the intellectual work of getting the machine to do what one wants will be about coming up with the right examples, the right training data, and the right ways to evaluate the training process. Suitably powerful models capable of generalizing via few-shot learning will require only a few good examples of the task to be performed. Massive, human-curated datasets will no longer be necessary in most cases, and most people “training” an AI model won’t be running gradient descent loops in PyTorch, or anything like it. They will be teaching by example, and the machine will do the rest.
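To illustrate what “teaching by example” might look like in practice, consider a few-shot prompt taking the place of a program. The sketch below is hypothetical; the `complete` call stands in for whichever large pre-trained model one has access to and is not a real API:

```python
# A hypothetical few-shot "program": the task is specified entirely by examples,
# not by control flow. `complete` stands in for a call to some large pre-trained
# model; it is NOT a real library function.
prompt = """Label each product review with a one-word sentiment.

Review: Arrived broken and support never answered.  -> negative
Review: Exactly what I needed, works perfectly.     -> positive
Review: It's fine, nothing special.                 -> neutral
Review: Battery died after two days.                ->"""

# label = complete(prompt)   # a capable model would likely return "negative"
print(prompt)
```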
In this New Computer Science — if we even call it Computer Science at all — the machines will be so powerful and already know how to do so many things that the field will look like less of an engineering endeavor and more of an educational one; that is, how to best educate the machine, not unlike the science of how to best educate children in school. Unlike (human) children, though, these AI systems will be flying our airplanes, running our power grids, and possibly even governing entire countries. I would argue that the vast majority of Classical CS becomes irrelevant when our focus turns to teaching intelligent machines rather than directly programming them. Programming, in the conventional sense, will in fact be dead.
How does all of this change how we think about the field of Computer Science?
The new atomic unit of computation becomes not a processor, memory, and I/O system implementing a von Neumann machine, but rather a massive, pre-trained, highly adaptive AI model. This is a seismic shift in the way we think about computation — not as a predictable, static process, governed by instruction sets, type systems, and notions of decidability. AI-based computation has long since crossed the Rubicon of being amenable to static analysis and formal proof. We are rapidly moving towards a world where the fundamental building blocks of computation are temperamental, mysterious, adaptive agents.
This shift is underscored by the fact that nobody actually understands how large AI models work. People are publishing research papers actually discovering new behaviors of existing large models, even though these systems have been “engineered” by humans. Large AI models are capable of doing things that they have not been explicitly trained to do, which should scare the shit out of Nick Bostrom and anyone else worried (rightfully) about a superintelligent AI running amok. We currently have no way, apart from empirical study, to determine the limits of current AI systems. As for future AI models that are orders of magnitude larger and more complex — good friggin’ luck!
The shift in focus from programs to models should be obvious to anyone who has read any modern machine learning papers. These papers barely mention the code or systems underlying their innovations; the building blocks of AI systems are much higher-level abstractions like attention layers, tokenizers, and datasets. A time traveller from even 20 years ago would have a hard time making sense of the three sentences in the (75-page-long!) GPT-3 paper that describe the actual software that was built for the model:
We use the same model and architecture as GPT-2 [RWC+19], including the modified initialization, pre-normalization, and reversible tokenization described therein, with the exception that we use alternating dense and locally banded sparse attention patterns in the layers of the transformer, similar to the Sparse Transformer [CGRS19]. To study the dependence of ML performance on model size, we train 8 different sizes of model, ranging over three orders of magnitude from 125 million parameters to 175 billion parameters, with the last being the model we call GPT-3. Previous work [KMH+20] suggests that with enough training data, scaling of validation loss should be approximately a smooth power law as a function of size; training models of many different sizes allows us to test this hypothesis both for validation loss and for downstream language tasks.
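The power law they reference is, roughly, the scaling relationship reported in [KMH+20]: held-out loss falls smoothly as a power of model size. Schematically (the constants are fitted empirically; the form, not the values, is the point here):

$$ L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N} $$

where $N$ is the number of (non-embedding) parameters and $N_c$, $\alpha_N$ are constants fitted to the data.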
This shift in the underlying definition of computing presents a huge opportunity, and plenty of huge risks. Yet I think it’s time to accept that this is a very likely future, and evolve our thinking accordingly, rather than just sit here waiting for the meteor to hit.