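What is the difference between a thread and a fiber? I've heard of fibers from Ruby, and I've heard they are available in other languages too. Can someone explain to me, in simple terms, the difference between a thread and a fiber?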
Current answer
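Threads use preemptive scheduling, whereas fibers use cooperative scheduling.
With a thread, the control flow can be interrupted at any time, and another thread can take over. With multiple processors or cores, several threads can run at the same time (on separate cores, or on a single core through simultaneous multithreading, SMT). As a result, you have to be very careful about concurrent data access and protect your data with mutexes, semaphores, condition variables, and so on. This is often very tricky to get right.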
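As a rough illustration of the kind of protection threads need, here is a minimal Python sketch. The names `counter`, `add_many`, and `run_workers` are made up for this example; only the `threading` module calls are real. Each update to the shared counter is wrapped in a lock so that a thread switch in the middle of the read-modify-write cannot lose an increment.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def add_many(n):
    """Increment the shared counter n times, holding the lock for each update."""
    global counter
    for _ in range(n):
        # Without the lock, a thread switch between reading and writing
        # `counter` could lose an update.
        with counter_lock:
            counter += 1

def run_workers(num_threads=4, n=10_000):
    """Start the threads, wait for all of them, and return the final count."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(n,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

print(run_workers())
```

With the lock in place the result is always `num_threads * n`; the point of the sketch is that this guarantee comes from the lock, not from the threads themselves.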
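With a fiber, control only switches when you tell it to, typically with a function call named something like yield(). This makes concurrent data access easier, since you don't have to worry about the atomicity of data structures or about mutexes. As long as you don't yield, there is no danger of being preempted and of another fiber trying to read or modify the data you are working with. The flip side is that if your fiber gets into an infinite loop, no other fiber can run, because you are never yielding.
You can also mix threads and fibers, which gives you the problems of both. It is not recommended, but it can sometimes be the right thing to do if done carefully.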
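To make the cooperative model concrete, here is a small sketch that uses Python generators as stand-ins for fibers. This is an assumption made for illustration: generators are not full fibers (they have no separate stack), and `fiber` and `run_round_robin` are hypothetical names. What the sketch does show is that control changes hands only at `yield`.

```python
from collections import deque

def fiber(name, steps, log):
    """A generator used as a fiber: it runs until it reaches `yield`."""
    for i in range(steps):
        log.append(f"{name}{i}")   # no lock needed: nothing else runs here
        yield                      # the only point where control can switch

def run_round_robin(fibers):
    """Resume each fiber in turn until all of them have finished."""
    ready = deque(fibers)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run the fiber up to its next yield
        except StopIteration:
            continue               # fiber finished; drop it
        ready.append(current)

log = []
run_round_robin([fiber("a", 2, log), fiber("b", 3, log)])
print(log)  # ['a0', 'b0', 'a1', 'b1', 'b2']
```

If one of these generators looped forever without reaching `yield`, `run_round_robin` would never get control back, which is the starvation problem described above.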
Other answers
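The Win32 fiber definition is in fact the "green thread" definition established at Sun Microsystems. There is no need to waste the term fiber on some kind of thread, that is, a thread executing in user space under the control of user code or a thread library.
To clarify the argument, consider the following points:
With hyper-threading, a multi-core CPU can accept multiple threads and distribute them across its cores. A superscalar pipelined CPU accepts one thread for execution and uses instruction-level parallelism (ILP) to run that thread faster. We may assume that one thread is broken into parallel fibers running in parallel pipelines. An SMT CPU can accept multiple threads and break them into instruction fibers for parallel execution on multiple pipelines, using the pipelines more efficiently.
We should assume that processes are made of threads and that threads should be made of fibers. With that logic in mind, using the term fiber for other sorts of threads is wrong.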
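In Win32, a fiber is a sort of user-managed thread. A fiber has its own stack, its own instruction pointer, and so on, but fibers are not scheduled by the OS: you have to call SwitchToFiber explicitly. Threads, by contrast, are preemptively scheduled by the operating system. So, roughly speaking, a fiber is a thread that is managed at the application/runtime level rather than being a true OS thread.
The consequences are that fibers are cheaper and that the application has more control over scheduling. This can be important if the application creates a lot of concurrent tasks and/or wants to closely optimise when they run. For example, a database server might choose to use fibers rather than threads.
(There may be other usages of the same term; as noted, this is the Win32 definition.)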
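To illustrate what "more control over scheduling" can mean, here is a hedged Python sketch in which the application, not the OS, decides which task runs next. It does not use the Win32 fiber API; generators stand in for fibers, and `task` and `run_by_priority` are hypothetical names. The policy chosen here is strict priority, which is just one example of a policy an application might pick.

```python
import heapq
import itertools

def task(name, steps, trace):
    """A generator-based task that records its name each time it runs."""
    for _ in range(steps):
        trace.append(name)
        yield

def run_by_priority(tasks):
    """tasks: list of (priority, generator); a lower number runs first."""
    order = itertools.count()  # tie-breaker so generators are never compared
    heap = [(prio, next(order), gen) for prio, gen in tasks]
    heapq.heapify(heap)
    while heap:
        prio, _, gen = heapq.heappop(heap)
        try:
            next(gen)              # run the chosen task up to its next yield
        except StopIteration:
            continue               # task finished; do not reschedule it
        heapq.heappush(heap, (prio, next(order), gen))

trace = []
run_by_priority([(1, task("low", 2, trace)), (0, task("high", 2, trace))])
print(trace)  # ['high', 'high', 'low', 'low']
```

Swapping the heap for a plain queue would turn this into round-robin scheduling; the scheduling policy is ordinary application code.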
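In the simplest terms, threads are generally considered to be preemptive (although this may not always be true, depending on the operating system), while fibers are considered to be lightweight, cooperative threads. Both are separate execution paths for your application.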
With threads: the current execution path may be interrupted or preempted at any time (note: this statement is a generalization and may not always hold true depending on OS/threading package/etc.). This means that for threads, data integrity is a big issue because one thread may be stopped in the middle of updating a chunk of data, leaving the integrity of the data in a bad or incomplete state. This also means that the operating system can take advantage of multiple CPUs and CPU cores by running more than one thread at the same time and leaving it up to the developer to guard data access.
With fibers: the current execution path is only interrupted when the fiber yields execution (same note as above). This means that fibers always start and stop in well-defined places, so data integrity is much less of an issue. Also, because fibers are often managed in the user space, expensive context switches and CPU state changes need not be made, making changing from one fiber to the next extremely efficient. On the other hand, since no two fibers can run at exactly the same time, just using fibers alone will not take advantage of multiple CPUs or multiple CPU cores.
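The claim that fibers "start and stop in well-defined places" can be sketched as follows. This is an illustrative Python example using generators in place of fibers; the account names and the helpers `transfer` and `run_and_check` are invented for the sketch. Each transfer breaks the invariant (the total balance) for a moment, but restores it before yielding, so the total is correct at every point where a switch can happen, without any lock.

```python
def transfer(accounts, src, dst, amount, times):
    """Move money in two steps; the invariant is broken between them."""
    for _ in range(times):
        accounts[src] -= amount   # the total is temporarily wrong here...
        accounts[dst] += amount   # ...and restored before we yield
        yield

def run_and_check(accounts, fibers):
    """Round-robin the fibers and check the total at every switch point."""
    expected_total = sum(accounts.values())
    observed = []
    active = list(fibers)
    while active:
        for f in list(active):
            try:
                next(f)
            except StopIteration:
                active.remove(f)
            # Switch point: every fiber is parked at a yield or has finished.
            observed.append(sum(accounts.values()))
    return all(total == expected_total for total in observed)

accounts = {"x": 100, "y": 100}
ok = run_and_check(accounts, [transfer(accounts, "x", "y", 10, 3),
                              transfer(accounts, "y", "x", 5, 3)])
print(ok, accounts)  # True {'x': 85, 'y': 115}
```

With preemptive threads, the same two-step update could be interrupted between the two assignments, which is why it would need a lock there.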
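First, I would recommend reading an explanation of the difference between processes and threads as background material.
Once you've read that, it's pretty straightforward. Threads can be implemented in the kernel, in user space, or as a mix of the two. Fibers are basically threads implemented in user space.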
What is typically called a thread is a thread of execution implemented in the kernel: what's known as a kernel thread. The scheduling of a kernel thread is handled exclusively by the kernel, although a kernel thread can voluntarily release the CPU by sleeping if it wants. A kernel thread has the advantage that it can use blocking I/O and let the kernel worry about scheduling. Its main disadvantage is that thread switching is relatively slow since it requires trapping into the kernel.
Fibers are user space threads whose scheduling is handled in user space by one or more kernel threads under a single process. This makes fiber switching very fast. If you group all the fibers accessing a particular set of shared data under the context of a single kernel thread and have their scheduling handled by a single kernel thread, then you can eliminate synchronization issues since the fibers will effectively run in serial and you have complete control over their scheduling. Grouping related fibers under a single kernel thread is important, since the kernel thread they are running in can be pre-empted by the kernel. This point is not made clear in many of the other answers. Also, if you use blocking I/O in a fiber, the entire kernel thread it is a part of blocks, including all the fibers that are part of that kernel thread.
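The last point, that a blocking call in one fiber stalls every fiber on the same kernel thread, can be observed with a small experiment. The sketch below uses Python's `asyncio` coroutines as a stand-in for fibers (an assumption for illustration: they are cooperatively scheduled on one thread, but they are not Win32-style fibers). The worker names and `timed` are hypothetical. Three workers that block with `time.sleep` run one after another, while three workers that yield with `asyncio.sleep` wait concurrently.

```python
import asyncio
import time

async def blocking_worker(delay):
    time.sleep(delay)            # blocks the thread and every coroutine on it

async def cooperative_worker(delay):
    await asyncio.sleep(delay)   # yields to the event loop while waiting

async def timed(worker, count, delay):
    """Run `count` workers on one thread and return the elapsed wall time."""
    start = time.perf_counter()
    await asyncio.gather(*(worker(delay) for _ in range(count)))
    return time.perf_counter() - start

blocking_elapsed = asyncio.run(timed(blocking_worker, 3, 0.05))
cooperative_elapsed = asyncio.run(timed(cooperative_worker, 3, 0.05))
print(f"blocking: {blocking_elapsed:.2f}s, cooperative: {cooperative_elapsed:.2f}s")
```

The blocking run should take roughly the sum of the delays, and the cooperative run roughly a single delay; exact timings depend on the machine.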
In Section 11.4, "Processes and Threads in Windows Vista", of Modern Operating Systems, Tanenbaum comments:
Although fibers are cooperatively scheduled, if there are multiple threads scheduling the fibers, a lot of careful synchronization is required to make sure fibers do not interfere with each other. To simplify the interaction between threads and fibers, it is often useful to create only as many threads as there are processors to run them, and affinitize the threads to each run only on a distinct set of available processors, or even just one processor. Each thread can then run a particular subset of the fibers, establishing a one-to-many relationship between threads and fibers which simplifies synchronization. Even so there are still many difficulties with fibers. Most Win32 libraries are completely unaware of fibers, and applications that attempt to use fibers as if they were threads will encounter various failures. The kernel has no knowledge of fibers, and when a fiber enters the kernel, the thread it is executing on may block and the kernel will schedule an arbitrary thread on the processor, making it unavailable to run other fibers. For these reasons fibers are rarely used except when porting code from other systems that explicitly need the functionality provided by fibers.
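The one-to-many arrangement described in the quote can be sketched in Python as follows. This is only an approximation under stated assumptions: generators stand in for fibers, `fiber`, `worker`, and `run_partitioned` are hypothetical names, the sketch does not set CPU affinity, and CPython threads do not execute Python bytecode in parallel by default. What it does show is the partitioning: each fiber is assigned to exactly one thread and touches only that thread's data, so no locks are needed.

```python
import os
import threading

def fiber(fiber_id, results):
    """Appends to `results`, which belongs to exactly one worker thread."""
    for step in range(2):
        results.append((fiber_id, step))
        yield

def worker(fibers):
    """Round-robin over this thread's own fibers; nothing is shared."""
    active = list(fibers)
    while active:
        for f in list(active):
            try:
                next(f)
            except StopIteration:
                active.remove(f)

def run_partitioned(num_fibers, num_threads=None):
    """Default to one thread per CPU; fiber i stays on thread i % num_threads."""
    num_threads = num_threads or os.cpu_count() or 1
    per_thread_results = [[] for _ in range(num_threads)]
    threads = []
    for t in range(num_threads):
        mine = [fiber(i, per_thread_results[t])
                for i in range(t, num_fibers, num_threads)]
        threads.append(threading.Thread(target=worker, args=(mine,)))
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return per_thread_results

results = run_partitioned(num_fibers=4, num_threads=2)
print(results)
```

Because each result list is written by a single thread, the order inside each list is deterministic even though the two threads run independently.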