并行编程和并行编程的区别是什么?我问了谷歌,但没有找到任何帮助我理解这种区别的东西。你能给我举个例子吗?
现在我找到了这个解释:http://www.linux-mag.com/id/7411 -但是“并发性是程序的属性”vs“并行执行是机器的属性”对我来说还不够-我仍然不能说什么是什么。
并行编程和并行编程的区别是什么?我问了谷歌,但没有找到任何帮助我理解这种区别的东西。你能给我举个例子吗?
现在我找到了这个解释:http://www.linux-mag.com/id/7411 -但是“并发性是程序的属性”vs“并行执行是机器的属性”对我来说还不够-我仍然不能说什么是什么。
当前回答
我认为并发编程指的是多线程编程,它是关于让你的程序运行多个线程,从硬件细节中抽象出来。
并行编程是指专门设计程序算法以利用可用的并行执行。例如,您可以并行执行某些算法的两个分支,期望它会比先检查第一个分支再检查第二个分支更快地到达结果(平均而言)。
其他回答
它们是从(非常轻微的)不同的角度描述同一件事情的两个短语。并行编程是从硬件的角度描述情况——至少有两个处理器(可能在一个物理包中)并行处理一个问题。并发编程更多地是从软件的角度描述事情——两个或多个操作可能同时(并发)发生。
这里的问题是,人们试图用这两个短语来做出明确的区分,但实际上这两个短语并不存在。现实情况是,几十年来,他们试图划定的分界线一直是模糊的,而且随着时间的推移越来越模糊。
What they're trying to discuss is the fact that once upon a time, most computers had only a single CPU. When you executed multiple processes (or threads) on that single CPU, the CPU was only really executing one instruction from one of those threads at a time. The appearance of concurrency was an illusion--the CPU switching between executing instructions from different threads quickly enough that to human perception (to which anything less than 100 ms or so looks instantaneous) it looked like it was doing many things at once.
与此形成鲜明对比的是具有多个CPU或多核CPU的计算机,因此机器正在同时执行来自多个线程和/或进程的指令;执行其中一个的代码不能/不会对执行另一个的代码产生任何影响。
Now the problem: such a clean distinction has almost never existed. Computer designers are actually fairly intelligent, so they noticed a long time ago that (for example) when you needed to read some data from an I/O device such as a disk, it took a long time (in terms of CPU cycles) to finish. Instead of leaving the CPU idle while that happened, they figured out various ways of letting one process/thread make an I/O request, and let code from some other process/thread execute on the CPU while the I/O request completed.
因此,早在多核cpu成为标准之前,我们就有多个线程并行进行操作。
That's only the tip of the iceberg though. Decades ago, computers started providing another level of parallelism as well. Again, being fairly intelligent people, computer designers noticed that in a lot of cases, they had instructions that didn't affect each other, so it was possible to execute more than one instruction from the same stream at the same time. One early example that became pretty well known was the Control Data 6600. This was (by a fairly wide margin) the fastest computer on earth when it was introduced in 1964--and much of the same basic architecture remains in use today. It tracked the resources used by each instruction, and had a set of execution units that executed instructions as soon as the resources on which they depended became available, very similar to the design of most recent Intel/AMD processors.
But (as the commercials used to say) wait--that's not all. There's yet another design element to add still further confusion. It's been given quite a few different names (e.g., "Hyperthreading", "SMT", "CMP"), but they all refer to the same basic idea: a CPU that can execute multiple threads simultaneously, using a combination of some resources that are independent for each thread, and some resources that are shared between the threads. In a typical case this is combined with the instruction-level parallelism outlined above. To do that, we have two (or more) sets of architectural registers. Then we have a set of execution units that can execute instructions as soon as the necessary resources become available. These often combine well because the instructions from the separate streams virtually never depend on the same resources.
然后,当然,我们会讲到具有多核的现代系统。这里的情况很明显,对吧?我们有N个(目前大约在2到256之间)独立的内核,它们都可以同时执行指令,所以我们有了真正的并行性的清晰案例——在一个进程/线程中执行指令不会影响在另一个进程/线程中执行指令。
嗯,算是吧。即使在这里,我们也有一些独立的资源(寄存器、执行单元、至少一个级别的缓存)和一些共享资源(通常至少是最低级别的缓存,当然还有内存控制器和内存带宽)。
To summarize: the simple scenarios people like to contrast between shared resources and independent resources virtually never happen in real life. With all resources shared, we end up with something like MS-DOS, where we can only run one program at a time, and we have to stop running one before we can run the other at all. With completely independent resources, we have N computers running MS-DOS (without even a network to connect them) with no ability to share anything between them at all (because if we can even share a file, well, that's a shared resource, a violation of the basic premise of nothing being shared).
每个有趣的案例都涉及到独立资源和共享资源的某种组合。每一台相当现代的计算机(以及许多根本不现代的计算机)都至少有一些能力同时执行至少几个独立的操作,而任何比MS-DOS更复杂的东西都至少在某种程度上利用了这一点。
人们喜欢在“并发”和“并行”之间画出的漂亮、清晰的分界线根本不存在,而且几乎从来都不存在。人们喜欢归类为“并发”的东西通常仍然包含至少一种或更多不同类型的并行执行。他们喜欢归类为“并行”的内容通常涉及共享资源,(例如)一个进程在使用两个进程之间共享的资源时阻塞另一个进程的执行。
试图在“并行”和“并发”之间划清界限的人,其实是生活在一个从未真正存在过的计算机幻想中。
传统的任务调度可以是串行、并行或并发的。
Serial: tasks must be executed one after the other in a known tricked order or it will not work. Easy enough. Parallel: tasks must be executed at the same time or it will not work. Any failure of any of the tasks - functionally or in time - will result in total system failure. All tasks must have a common reliable sense of time. Try to avoid this or we will have tears by tea time. Concurrent: we do not care. We are not careless, though: we have analysed it and it doesn't matter; we can therefore execute any task using any available facility at any time. Happy days.
通常,在已知事件发生时,可用的调度会发生变化,我们称之为状态变化。
人们通常认为这是关于软件的,但实际上这是一种早于计算机的系统设计概念;软件系统的吸收速度有点慢,甚至很少有软件语言试图解决这个问题。如果你感兴趣,你可以试着查一下transputer language occam。
简而言之,系统设计解决以下问题:
动词——你在做什么(操作或算法) 名词——对(数据或接口)进行操作的对象 启动时,进度、状态改变 是串行、并行还是并发 地点——一旦你知道事情发生的时间,你就能说出事情可能发生的地点,而不是之前。 为什么,这是正确的方法吗?还有别的办法吗,更重要的是,更好的办法?如果你不做会怎么样?
祝你好运。
不同的人在许多不同的具体情况下讨论不同类型的并发性和并行性,因此需要一些抽象来涵盖它们的共同性质。
The basic abstraction is done in computer science, where both concurrency and parallelism are attributed to the properties of programs. Here, programs are formalized descriptions of computing. Such programs need not to be in any particular language or encoding, which is implementation-specific. The existence of API/ABI/ISA/OS is irrelevant to such level of abstraction. Surely one will need more detailed implementation-specific knowledge (like threading model) to do concrete programming works, the spirit behind the basic abstraction is not changed.
第二个重要的事实是,作为一般属性,并发性和并行性可以在许多不同的抽象中共存。
关于一般的区别,请参阅并发和并行的基本观点的相关答案。(还有一些链接包含一些其他来源。)
并发编程和并行编程是用一些系统实现这些一般属性的技术,这些系统公开了可编程性。系统通常是编程语言及其实现。
A programming language may expose the intended properties by built-in semantic rules. In most cases, such rules specify the evaluations of specific language structures (e.g. expressions) making the computation involved effectively concurrent or parallel. (More specifically, the computational effects implied by the evaluations can perfectly reflect these properties.) However, concurrent/parallel language semantics are essentially complex and they are not necessary to practical works (to implement efficient concurrent/parallel algorithms as the solutions of realistic problems). So, most traditional languages take a more conservative and simpler approach: assuming the semantics of evaluation totally sequential and serial, then providing optional primitives to allow some of the computations being concurrent and parallel. These primitives can be keywords or procedural constructs ("functions") supported by the language. They are implemented based on the interaction with hosted environments (OS, or "bare metal" hardware interface), usually opaque (not able to be derived using the language portably) to the language. Thus, in this particular kind of high-level abstractions seen by the programmers, nothing is concurrent/parallel besides these "magic" primitives and programs relying on these primitives; the programmers can then enjoy less error-prone experience of programming when concurrency/parallelism properties are not so interested.
Although primitives abstract the complex away in the most high-level abstractions, the implementations still have the extra complexity not exposed by the language feature. So, some mid-level abstractions are needed. One typical example is threading. Threading allows one or more thread of execution (or simply thread; sometimes it is also called a process, which is not necessarily the concept of a task scheduled in an OS) supported by the language implementation (the runtime). Threads are usually preemptively scheduled by the runtime, so a thread needs to know nothing about other threads. Thus, threads are natural to implement parallelism as long as they share nothing (the critical resources): just decompose computations in different threads, once the underlying implementation allows the overlapping of the computation resources during the execution, it works. Threads are also subject to concurrent accesses of shared resources: just access resources in any order meets the minimal constraints required by the algorithm, and the implementation will eventually determine when to access. In such cases, some synchronization operations may be necessary. Some languages treat threading and synchronization operations as parts of the high-level abstraction and expose them as primitives, while some other languages encourage only relatively more high-level primitives (like futures/promises) instead.
Under the level of language-specific threads, there come multitasking of the underlying hosting environment (typically, an OS). OS-level preemptive multitasking are used to implement (preemptive) multithreading. In some environments like Windows NT, the basic scheduling units (the tasks) are also "threads". To differentiate them with userspace implementation of threads mentioned above, they are called kernel threads, where "kernel" means the kernel of the OS (however, strictly speaking, this is not quite true for Windows NT; the "real" kernel is the NT executive). Kernel threads are not always 1:1 mapped to the userspace threads, although 1:1 mapping often reduces most overhead of mapping. Since kernel threads are heavyweight (involving system calls) to create/destroy/communicate, there are non 1:1 green threads in the userspace to overcome the overhead problems at the cost of the mapping overhead. The choice of mapping depending on the programming paradigm expected in the high-level abstraction. For example, when a huge number of userspace threads expected being concurrently executed (like Erlang), 1:1 mapping is never feasible.
The underlying of OS multitasking is ISA-level multitasking provided by the logical core of the processor. This is usually the most low-level public interface for programmers. Beneath this level, there may exist SMT. This is a form of more low-level multithreading implemented by the hardware, but arguably, still somewhat programmable - though it is usually only accessible by the processor manufacturer. Note the hardware design is apparently reflecting parallelism, but there is also concurrent scheduling mechanism to make the internal hardware resources being efficiently used.
在上面提到的每一层“线程”中,都涉及并发性和并行性。尽管编程接口变化很大,但它们都服从于一开始基本抽象所揭示的属性。
将原始问题解释为并行/并发计算,而不是编程。
在并发计算中,两个计算都是彼此独立前进的。第二个计算不需要等到第一个计算完成后才能继续进行。但是,它并没有说明这是如何实现的机制。在单核设置中,线程之间需要挂起和交替(也称为抢占式多线程)。
在并行计算中,两个计算同时进行——字面上是同时进行。这对于单CPU来说是不可能的,而是需要多核设置。
图片来自文章:“Node.js中的并行vs并发”
与
我的理解是:
1)并发-使用共享资源串联运行 2)使用不同的资源并行运行
所以你可以让两件事情同时发生,即使它们在点(2)聚集在一起,或者两件事情在整个执行的操作中占用相同的储备(1)。