Optimal number of threads per core

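I know this question is rather old, but things have evolved since 2009.

There are two things to take into account now: the number of cores, and the number of threads that can run within each core.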

With Intel processors, the number of threads per core is determined by Hyper-Threading, which gives just 2 (when available). But Hyper-Threading can cut your execution speed in half, even when you are not using 2 threads! (I.e. 1 pipeline is shared between two processes; this is good when you have more processes, not so good otherwise. More cores are definitely better!) Note that modern CPUs generally have more pipelines to divide the workload, so it is not really divided by two anymore. But Hyper-Threading still shares a lot of the CPU units between the two threads (some call those logical CPUs).

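On other processors you may have 2, 4, or even 8 threads per core. So if you have 8 cores that each support 8 threads, you could have 64 processes running in parallel without context switching.

"No context switching" is obviously not true if you run a standard operating system, which will context switch for all sorts of other things outside your control. But that is the main idea. Some OSes let you allocate processors so that only your application can access/use them!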
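As a rough illustration of the two ideas above, the sketch below reads the number of logical CPUs and restricts the current process to a chosen subset. It assumes Python on Linux, where `os.sched_getaffinity` and `os.sched_setaffinity` are available; the helper names are made up for this example. Note that pinning only limits where your own process runs. Keeping other processes off those CPUs needs OS-level configuration that is not shown here.

```python
from __future__ import annotations

import os


def logical_cpu_count() -> int:
    """Logical CPUs reported by the OS (hardware threads, not physical cores)."""
    return os.cpu_count() or 1


def usable_cpus() -> set[int]:
    """Logical CPUs this process is currently allowed to run on."""
    if hasattr(os, "sched_getaffinity"):  # not available on every platform
        return set(os.sched_getaffinity(0))
    return set(range(logical_cpu_count()))


def pin_to(cpus: set[int]) -> bool:
    """Restrict this process to the given CPUs; return False if unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, cpus)
    return True


if __name__ == "__main__":
    print("logical CPUs:", logical_cpu_count())
    print("usable CPUs: ", sorted(usable_cpus()))
```

On a machine with Hyper-Threading enabled, the logical count is typically twice the number of physical cores, which is the distinction the answer is drawing.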

From my own experience, if you have a lot of I/O, multiple threads are good. If you have very heavy memory-intensive work (read source 1, read source 2, fast computation, write), then having more threads doesn't help. Again, this depends on how much data you read/write simultaneously (i.e. if you use SIMD instructions and read wide values, e.g. 128 bits with SSE 4.2 or 256 bits with AVX, that stops all threads in their tracks; in other words, 1 thread is probably a lot easier to implement and probably nearly as fast, if not actually faster. This will depend on your processor and memory architecture: some advanced servers manage separate memory ranges for separate cores, so separate threads will be faster assuming your data is properly laid out, which is why, on some architectures, 4 processes will run faster than 1 process with 4 threads.)

2012-12-27 12:08:25

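4000 threads at one time is pretty high.

The answer is yes and no. If you are doing a lot of blocking I/O in each thread, then yes, you could show significant speedups with up to probably 3 or 4 threads per logical core.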

If you are not doing much blocking, however, the extra overhead of threading will just make things slower. So use a profiler and see where the bottlenecks are in each potentially parallel piece. If you are doing heavy computations, then more than 1 thread per CPU won't help. If you are doing a lot of memory transfer, it won't help either. If you are doing a lot of I/O, though, such as disk access or internet access, then yes, multiple threads will help up to a certain extent, or at the least make the application more responsive.

2009-11-11 22:32:32

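If your threads don't do I/O, synchronization, etc., and nothing else is running, 1 thread per core will get you the best performance. However, that is very likely not the case. Adding more threads usually helps, but past some point they cause performance to degrade.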

Not long ago, I was doing performance testing on a machine with 2 quad-core processors running an ASP.NET application on Mono under a pretty decent load. We played with the minimum and maximum number of threads, and in the end we found that for that particular application in that particular configuration the best throughput was somewhere between 36 and 40 threads. Anything outside those boundaries performed worse. Lesson learned? If I were you, I would test with different numbers of threads until you find the right number for your application.
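The "test with different numbers of threads" advice can be sketched as a small sweep. This is only an illustration: `fake_request` is a hypothetical stand-in that sleeps to imitate blocking I/O, so the timings say nothing about a real application, and the candidate counts are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(delay: float = 0.02) -> int:
    """Hypothetical blocking call; sleeps instead of doing real I/O."""
    time.sleep(delay)
    return 1


def run_once(num_threads: int, num_tasks: int = 32) -> float:
    """Wall-clock seconds to finish num_tasks using num_threads workers."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        done = sum(pool.map(lambda _: fake_request(), range(num_tasks)))
    assert done == num_tasks
    return time.perf_counter() - start


def sweep(candidates, num_tasks: int = 32) -> dict:
    """Time every candidate thread count and return {threads: seconds}."""
    return {n: run_once(n, num_tasks) for n in candidates}


if __name__ == "__main__":
    timings = sweep([1, 2, 4, 8, 16, 32])
    for n, secs in timings.items():
        print(f"{n:>3} threads: {secs:.3f}s")
    print("fastest:", min(timings, key=timings.get))
```

With a real workload you would replace `fake_request` with the actual operation and repeat each measurement several times, since a single run is noisy.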

One thing is for sure: 4k threads will take longer. That's a lot of context switches.

2009-11-11 22:28:40

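The actual performance will depend on how much voluntary yielding each thread does. For example, if the threads do no I/O at all and use no system services (i.e. they are 100% CPU-bound), then 1 thread per core is optimal. If the threads do anything that requires waiting, then you will have to experiment to determine the optimal number of threads. 4000 threads would incur significant scheduling overhead, so that is probably not optimal either.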
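A commonly cited rule of thumb, which is not part of this answer, gives a starting point for that experiment: threads ≈ cores × (1 + wait time / compute time). The sketch below assumes you have measured, per task, how long a thread spends waiting and how long it spends computing; the function name and parameters are made up for this example.

```python
import os
from typing import Optional


def suggested_threads(wait_time: float, compute_time: float,
                      cores: Optional[int] = None) -> int:
    """Starting estimate: cores * (1 + wait/compute). A heuristic, not a guarantee."""
    if compute_time <= 0:
        raise ValueError("compute_time must be positive")
    if wait_time < 0:
        raise ValueError("wait_time must not be negative")
    cores = cores or os.cpu_count() or 1
    return max(1, round(cores * (1 + wait_time / compute_time)))


if __name__ == "__main__":
    # Fully CPU-bound: no waiting, so one thread per core.
    print(suggested_threads(wait_time=0, compute_time=1, cores=8))
    # Mostly waiting: 90 ms blocked for every 10 ms of computation.
    print(suggested_threads(wait_time=90, compute_time=10, cores=8))
```

The CPU-bound case reduces to 1 thread per core, which matches the answer above. Anything with waiting should still be confirmed by measurement.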

2009-11-11 22:26:38

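The ideal is 1 thread per core, as long as none of the threads will block.

One case where this may not be true: there are other threads running on the core, in which case more threads may give your program a bigger slice of the execution time.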

2009-11-11 22:23:33
