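Is the x86 architecture specially designed to work with a keyboard, while ARM is designed for mobile devices? What are the key differences between the two?
Current answer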
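Neither has anything specific to keyboards or mobile devices, other than the fact that for years ARM has had a pretty substantial advantage in power consumption, which made it attractive for all sorts of battery-operated devices.
As far as the actual differences: ARM has more registers, supported predication for most instructions long before Intel added it, and has long incorporated all sorts of techniques (call them "tricks", if you prefer) to save power almost everywhere it could.
There's also a considerable difference in how the two encode instructions. Intel uses a fairly complex variable-length encoding in which an instruction can occupy anywhere from 1 to 15 bytes. This allows programs to be quite small, but makes instruction decoding relatively difficult (as in: decoding instructions fast in parallel is more like a complete nightmare).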
ARM has two different instruction encoding modes: ARM and Thumb. In ARM mode, you get access to all instructions, and the encoding is extremely simple and fast to decode. Unfortunately, ARM mode code tends to be fairly large, so it's fairly common for a program to occupy around twice as much memory as Intel code would. Thumb mode attempts to mitigate that. It still uses quite a regular instruction encoding, but reduces most instructions from 32 bits to 16 bits, for example by reducing the number of registers available, eliminating predication from most instructions, and reducing the range of branches. At least in my experience, this still doesn't usually give quite as dense a coding as x86 code can get, but it's fairly close, and decoding is still fairly simple and straightforward. Lower code density means you generally need at least a little more memory and (generally more seriously) a larger cache to get equivalent performance.
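To make the decoding point a bit more concrete, here is a minimal Python sketch. It does not implement real x86 or ARM decoding: `variable_length_boundaries` assumes a made-up toy encoding in which the low 4 bits of an instruction's first byte give its total length, and both function names are hypothetical. It only illustrates that with a fixed width every instruction start is known up front, while with a variable length each start depends on the instructions before it.

```python
def fixed_width_boundaries(code, width=4):
    """Start offsets of fixed-width instructions.

    No byte of `code` has to be inspected: the offsets follow from the width.
    """
    return list(range(0, len(code), width))


def variable_length_boundaries(code):
    """Start offsets in a TOY variable-length encoding (an assumption).

    Assumed rule, not real x86: the low 4 bits of the first byte of each
    instruction hold its total length in bytes (1 to 15).
    """
    offsets = []
    pos = 0
    while pos < len(code):
        length = code[pos] & 0x0F
        if length == 0:
            raise ValueError("toy encoding does not allow length 0")
        offsets.append(pos)
        pos += length  # the next start is only known after this step
    return offsets
```

In the fixed-width case a decoder could begin work at offsets 0, 4 and 8 at the same time. In the toy variable-length case, nothing can be said about where the third instruction starts until the first two have been examined, which is the part that makes fast parallel decode hard.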
At one time Intel put a lot more emphasis on speed than on power consumption. They started emphasizing power consumption primarily in the context of laptops. For a fairly small laptop, their typical power goal was on the order of 6 watts. More recently (much more recently) they've started to target mobile devices (phones, tablets, etc.). For this market, they're looking at a couple of watts or so at most. They seem to be doing pretty well at that, though their approach has been substantially different from ARM's, emphasizing fabrication technology where ARM has mostly emphasized micro-architecture (not surprising, considering that ARM sells designs and leaves fabrication to others).
Depending on the situation, though, a CPU's energy consumption is often more important than its power consumption. At least as I'm using the terms, power consumption refers to power usage on a (more or less) instantaneous basis. Energy consumption, however, normalizes for speed, so if (for example) CPU A consumes 1 watt for 2 seconds to do a job, and CPU B consumes 2 watts for 1 second to do the same job, both CPUs consume the same total amount of energy (two watt-seconds) to do that job, but with CPU B you get results twice as fast.
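ARM processors do very well in terms of power consumption. So if you need something that requires a processor's "presence" almost constantly, but isn't really doing much work, they can work out pretty well. For example, if you're doing video conferencing, you gather a few milliseconds of data, compress it, send it, receive data from others, decompress it, play it back, and repeat. Even a really fast processor can't spend much time sleeping, so for tasks like this, ARM does really well.
Intel's processors (especially their Atom processors, which are actually intended for low-power applications) are extremely competitive in terms of energy consumption. While they're running close to full speed, they consume more power than most ARM processors, but they also finish the work quickly, so they can go back to sleep sooner. As a result, they can combine good battery life with good performance.
So, when comparing the two, you have to be careful about what you measure, to be sure that it reflects what you really care about. ARM does very well at power consumption, but depending on the situation you may easily care more about energy consumption than about instantaneous power consumption.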
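The power-versus-energy distinction above comes down to simple arithmetic, sketched here in Python. The function name and the 0.25 W sleep figure are assumptions made up for illustration; the 1 W for 2 s and 2 W for 1 s figures are the CPU A and CPU B example from the answer.

```python
def energy_joules(active_watts, idle_watts, busy_seconds, period_seconds):
    """Energy used over one period: busy at active power, then asleep at idle power."""
    if busy_seconds > period_seconds:
        raise ValueError("the job does not fit in the period")
    sleep_seconds = period_seconds - busy_seconds
    return active_watts * busy_seconds + idle_watts * sleep_seconds


if __name__ == "__main__":
    # Ideal case (sleep costs nothing): both CPUs use two watt-seconds.
    print(energy_joules(1.0, 0.0, 2.0, 2.0))   # CPU A -> 2.0
    print(energy_joules(2.0, 0.0, 1.0, 2.0))   # CPU B -> 2.0
    # Assumed 0.25 W while asleep: CPU B sleeps for 1 s of the 2 s window.
    print(energy_joules(2.0, 0.25, 1.0, 2.0))  # CPU B -> 2.25
```

The instantaneous power of CPU B is twice that of CPU A, yet the energy per job is the same in the ideal case. Once sleep is not free, how the faster CPU compares depends on how little it draws while asleep, which is why it matters which of the two quantities you measure.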
Other answers
ARM is a RISC (Reduced Instruction Set Computing) architecture, while x86 is a CISC (Complex Instruction Set Computing) one.
The core difference between them in this respect is that ARM instructions operate only on registers, with a few instructions for loading and storing data from/to memory, while x86 can use memory or register operands with ALU instructions, sometimes getting the same work done in fewer instructions. Sometimes x86 needs more, because ARM has its own useful tricks, like loading a pair of registers in one instruction or using a shifted register as part of another operation. Up until ARMv8 / AArch64, ARM was natively a 32-bit architecture, favoring four-byte operations over others.
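So ARM is a simpler architecture, leading to a small silicon area and lots of power-saving features, while x86 has become a power beast in terms of both power consumption and production.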
To answer your question, "Is the x86 Architecture specially designed to work with a keyboard while ARM expects to be mobile?": x86 isn't specially designed to work with a keyboard, just as ARM isn't designed specifically for mobile. However, again because of the core architectural choices, x86 also has instructions to work directly with a separate IO address space, while ARM does not. Instead, ARM uses memory-mapped IO for everything, including reading/writing PCI IO space. (PCI IO space is rarely needed with modern devices, because port IO is slow on x86. Modern USB controllers, for example, don't rely on it, so accessing USB-connected devices is as efficient as the USB controller makes it.)
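If you need a document to quote, this is what the Cortex-A Series Programmer's Guide (4.0) says about the differences between RISC and CISC architectures: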
An ARM processor is a Reduced Instruction Set Computer (RISC) processor. Complex Instruction Set Computer (CISC) processors, like the x86, have a rich instruction set capable of doing complex things with a single instruction. Such processors often have significant amounts of internal logic that decode machine instructions to sequences of internal operations (microcode). RISC architectures, in contrast, have a smaller number of more general purpose instructions, that might be executed with significantly fewer transistors, making the silicon cheaper and more power efficient. Like other RISC architectures, ARM cores have a large number of general-purpose registers and many instructions execute in a single cycle. It has simple addressing modes, where all load/store addresses can be determined from register contents and instruction fields.
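ARM also provides a paper titled "Architectures, Processors, and Devices Development Article" describing how those terms apply to their business.
An example comparing the instruction set architectures:
For example, if you need some sort of bytewise memory comparison block in your application (generated by the compiler, skipping details), this is how it might look on x86 when optimizing for code size over speed. (On modern CPUs, rep movsb / rep stosb are fast; the conditional-rep compare instructions are not.)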
repe cmpsb /* repeat while equal compare string bytewise */
While on ARM the shortest form might look like this (without error checking or optimizations such as comparing multiple bytes at a time):
top:
    ldrb r2, [r0], #1   /* load a byte from the address in r0 into r2, then increment r0 (post-indexed) */
    ldrb r3, [r1], #1   /* load a byte from the address in r1 into r3, then increment r1 (post-indexed) */
    subs r2, r3, r2     /* subtract r2 from r3, put the result into r2, and set the flags */
    beq top             /* branch (jump) back if the result is zero, i.e. the bytes were equal */
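This should give you a hint as to how RISC and CISC instruction sets differ in complexity. Interestingly, x86 doesn't have write-back addressing modes (which load and increment a pointer), except via its "string" instructions like lodsd.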
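For readers who don't read assembly, both fragments above do roughly what this short Python sketch does: walk two byte sequences in step and stop at the first difference. The function name and the explicit `count` parameter are assumptions for illustration (the x86 instruction takes its count from a register, and the ARM loop as written has no count at all), so treat this as a reference model rather than a translation.

```python
def first_mismatch(a, b, count):
    """Return the index of the first differing byte within `count` bytes.

    Returns `count` if the first `count` bytes of `a` and `b` are all equal.
    """
    for i in range(count):
        # One iteration corresponds to one step of `repe cmpsb`,
        # or to one trip through the four-instruction ARM loop.
        if a[i] != b[i]:
            return i
    return count
```

For example, `first_mismatch(b"hello", b"help!", 5)` returns 3, the position of the first byte that differs.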
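In addition to Jerry Coffin's first paragraph (i.e., that the ARM design gives lower power consumption):
ARM, the company, only licenses the CPU technology. They don't make physical chips. This allows other companies to add various peripheral technologies around the CPU, producing what is typically called an SoC, or system-on-chip, whether the device is a tablet, a cell phone, or an in-car entertainment system. This lets chip vendors tailor the rest of the chip to a particular application. This has additional benefits:
- Lower board cost
- Lower power (note 1)
- Easier manufacturing
- Smaller form factor
ARM supports SoC vendors with AMBA, allowing SoC implementers to purchase off-the-shelf third-party modules, such as Ethernet, memory, and interrupt controllers. Some other CPU platforms support this too, like MIPS, but MIPS is not as power conscious.
All of these are beneficial to a handheld/battery-operated design. Some are just good all around. ARM also has a history of battery-operated devices: the Apple Newton and the Psion Organisers. The PDA software infrastructure was leveraged by some companies to create smartphone-type devices, although more success was had by those who reinvented the GUI for use with a smartphone.
The rise of open source tool sets and operating systems also facilitated the various SoC chips. A closed organization would have trouble trying to support all the various devices available for ARM. The two most popular phone platforms, Android and OS X/iOS, are based on Linux and on FreeBSD, Mach, and NetBSD, respectively. Open source helps SoC vendors provide software support for their chipsets.
Hopefully, why x86 is used for the keyboard is self-evident. It has the software and, more importantly, people trained to use that software. Netwinder is one ARM system that was originally designed for the keyboard. Also, manufacturers are currently looking at ARM64 for the server market. Power/heat is a concern in 24/7 data centers.
So I would say that the ecosystem that grows around these chips is as important as features like low power consumption. ARM has been striving for low-power, higher-performance computing for some time (since the mid to late 1980s), and they have a lot of people on board.
Note 1: Multiple chips need bus drivers to communicate with each other at known voltages and drive strengths. Also, separate chips typically need support capacitors and other power components that can be shared in an SoC system.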
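The ARM architecture was originally designed for Acorn personal computers (see the Acorn Archimedes, circa 1987, and the RiscPC), which were just as much keyboard-based personal computers as the x86-based IBM PC models were. Only later ARM implementations were primarily targeted at the mobile and embedded market segments.
Originally, simple RISC CPUs of roughly equivalent performance could be designed by much smaller engineering teams (see Berkeley RISC) than those working on x86 development at Intel.
But nowadays the fastest ARM chips have very complex multi-issue, out-of-order instruction dispatch units designed by large engineering teams, and x86 cores may have something like a RISC core fed by an instruction translation unit.
So any current differences between the two architectures are more related to the specific market needs of the product niches that the development teams are targeting. (Random opinion: ARM probably makes more in license fees from embedded applications, which tend to be far more power and cost constrained. Intel needs to maintain a performance edge in PCs and servers for its profit margins. Thus you see differing implementation optimizations.)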
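The ARM is like an Italian sports car:
- Well-balanced, well-tuned engine. Gives good acceleration and top speed.
- Excellent chassis, brakes, and suspension. Can stop quickly, and can corner without slowing down.
The x86 is like an American muscle car:
- Big engine, big fuel pump. Gives excellent top speed and acceleration, but uses a lot of fuel.
- Dreadful brakes: you need to put an appointment in your diary if you want to slow down.
- Terrible steering: you have to slow down to corner.
In summary: the x86 is based on a design from 1974 and is good in a straight line (but uses a lot of fuel). The ARM uses little fuel and does not slow down for corners (branches).
Metaphor over, here are some real differences.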
- Arm has more registers.
- Arm has few special-purpose registers (so less moving stuff around); x86 is all special-purpose registers.
- Arm has few memory access commands, only load/store register.
- Arm is internally a Harvard architecture by design.
- Arm is simple and fast.
- Arm instructions are architecturally single cycle (except load/store multiple).
- Arm instructions often do more than one thing (in a single cycle).
- Where more than one Arm instruction is needed, such as for the x86's looping store & auto-increment, the Arm still does it in fewer clock cycles.
- Arm has more conditional instructions.
- Arm's branch predictor is trivially simple (if unconditional or backwards then assume branch, else assume not-branch), and performs better than the very, very complex one in the x86 (there is not enough space here to explain it, not that I could).
- Arm has a simple, consistent instruction set (you could compile by hand, and learn the instruction set quickly).
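The "trivially simple" predictor rule in the list above (assume taken if unconditional or backwards, otherwise assume not taken) fits in a couple of lines. This Python sketch only restates that rule; the function names and addresses are hypothetical, and it makes no claim about how any particular Arm or x86 core actually predicts branches.

```python
def predict_taken(branch_pc, target_pc, unconditional=False):
    """Static rule from the list above: unconditional or backward means taken."""
    return unconditional or target_pc <= branch_pc


def count_mispredictions(branch_pc, target_pc, outcomes):
    """Count how often the static guess disagrees with the real outcomes.

    `outcomes` is a sequence of booleans (True = the branch was taken).
    """
    guess = predict_taken(branch_pc, target_pc)
    return sum(1 for taken in outcomes if taken != guess)
```

A loop-closing branch that jumps backwards and runs ten times is taken nine times and falls through once, so `count_mispredictions(0x120, 0x100, [True] * 9 + [False])` gives 1: the rule is wrong only on loop exit. That is the case the heuristic is built for; branches whose direction depends on data are where it does less well.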