最近,我在一次采访中被问到一个问题:进程和线程有什么区别?真的,我不知道答案。我想了一会儿,给出了一个非常奇怪的答案。

线程共享相同的内存,而进程不共享。回答完这个问题后,面试官对我邪恶地笑了笑,然后接连问了我几个问题:

问:你知道节目分成哪些部分吗?

我的答案是:是的(认为这很简单)堆栈,数据,代码,堆

问:那么,告诉我:线程共享哪些片段?

我无法回答这个问题,最后只能把它们都说了出来。

请问,谁能就进程和线程之间的区别给出正确的、令人印象深刻的答案?


当前回答

线程共享堆(有一个关于线程特定堆的研究),但当前的实现共享堆。(当然还有代码)

其他回答

Besides global memory, threads also share a number of other attributes (i.e., these attributes are global to a process, rather than specific to a thread). These attributes include the following: process ID and parent process ID; process group ID and session ID; controlling terminal; process credentials (user and group IDs); open file descriptors; record locks created using fcntl(); signal dispositions; file system–related information: umask, current working directory, and root directory; interval timers (setitimer()) and POSIX timers (timer_create()); System V semaphore undo (semadj) values (Section 47.8); resource limits; CPU time consumed (as returned by times()); resources consumed (as returned by getrusage()); and nice value (set by setpriority() and nice()). Among the attributes that are distinct for each thread are the following: thread ID (Section 29.5); signal mask; thread-specific data (Section 31.3); alternate signal stack (sigaltstack()); the errno variable; floating-point environment (see fenv(3)); realtime scheduling policy and priority (Sections 35.2 and 35.3); CPU affinity (Linux-specific, described in Section 35.4); capabilities (Linux-specific, described in Chapter 39); and stack (local variables and function call linkage information).

摘自:《Linux编程接口:Linux和UNIX系统编程手册》,Michael Kerrisk,第619页

线程共享代码、数据段和堆,但不共享堆栈。

A process has code, data, heap and stack segments. Now, the Instruction Pointer (IP) of a thread OR threads points to the code segment of the process. The data and heap segments are shared by all the threads. Now what about the stack area? What is actually the stack area? Its an area created by the process just for its thread to use... because stacks can be used in a much faster way than heaps etc. The stack area of the process is divided among threads, i.e. if there are 3 threads, then the stack area of the process is divided into 3 parts and each is given to the 3 threads. In other words, when we say that each thread has its own stack, that stack is actually a part of the process stack area allocated to each thread. When a thread finishes its execution, the stack of the thread is reclaimed by the process. In fact, not only the stack of a process is divided among threads, but all the set of registers that a thread uses like SP, PC and state registers are the registers of the process. So when it comes to sharing, the code, data and heap areas are shared, while the stack area is just divided among threads.

告诉面试官这完全取决于操作系统的实现。

以Windows x86为例。只有两个段[1],代码和数据。它们都被映射到整个2GB(线性,用户)地址空间。基础= 0,限制= 2 gb。他们本来可以做一个,但x86不允许一个段同时读/写和执行。所以他们做了两个,并设置CS指向代码描述符,其余(DS, ES, SS等)指向另一个[2]。但两者都指向同样的东西!

面试你的人做了一个隐藏的假设,他/她没有说出来,这是一个愚蠢的伎俩。

所以关于

问:告诉我是哪个线段 分享吗?

细分市场与问题无关,至少在Windows上是这样。线程共享整个地址空间。只有一个堆栈段SS,它指向DS ES CS做的完全一样的东西[2]。比如整个用户空间。0-2GB。当然,这并不意味着线程只有一个堆栈。当然,每个都有自己的堆栈,但x86段并不用于此目的。

也许*nix做一些不同的事情。谁知道呢。这个问题的前提被打破了。


至少对于用户空间是这样的。 从ntsd记事本:cs=001b ss=0023 ds=0023 es=0023

线程共享数据和代码,而进程不共享。栈不是为两者共享的。

进程也可以共享内存,更精确的代码,例如Fork()之后,但这是一个实现细节和(操作系统)优化。由多个进程共享的代码将(希望)在第一次写入代码时复制—这被称为写时复制。我不确定线程代码的确切语义,但我假设是共享代码。

           Process   Thread

   Stack   private   private
   Data    private   shared
   Code    private1  shared2

代码在逻辑上是私有的,但出于性能原因可能会被共享。 我不能百分之百肯定。