
Any call to a terminal operation closes the stream, rendering it unusable. This "feature" takes away a lot of power.



IEnumerable<int> QuickSort(IEnumerable<int> ints)
{
  if (!ints.Any()) {
    return Enumerable.Empty<int>();
  }

  int pivot = ints.First();

  // Each Where() is lazy and re-enumerates ints when it is traversed.
  IEnumerable<int> lt = ints.Where(i => i < pivot);
  IEnumerable<int> gt = ints.Where(i => i > pivot);

  return QuickSort(lt).Concat(new int[] { pivot }).Concat(QuickSort(gt));
}


This cannot be done in Java! I cannot even ask a stream whether it is empty without rendering it unusable.
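To make the complaint concrete, here is a minimal sketch of what happens when a stream is touched twice. The class and method names are made up for illustration. The Stream specification only says an implementation may throw IllegalStateException when it detects reuse, so the sketch assumes the OpenJDK reference implementation, which does.

```java
import java.util.stream.Stream;

class StreamReuseDemo {
    /** Returns true if a second terminal operation on the same stream is rejected. */
    static boolean secondTerminalOperationFails() {
        Stream<Integer> ints = Stream.of(3, 1, 2);
        ints.findAny(); // first terminal operation: the stream is now consumed
        try {
            ints.count(); // second terminal operation on the same instance
            return false;
        } catch (IllegalStateException e) {
            return true; // the reference implementation detects the reuse
        }
    }

    public static void main(String[] args) {
        System.out.println("second use rejected: " + secondTerminalOperationFails());
    }
}
```

The emptiness check in the C# QuickSort corresponds to the findAny() call here, which is why the check alone is enough to use up the stream.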


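Streams are built around Spliterators, which are stateful, mutable objects. They have no "reset" operation, and in fact, requiring support for such a rewind operation would "take away a lot of power". How would Random.ints() be supposed to handle such a request?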




It seems that a lot of the confusion stems from a misguided comparison of IEnumerable with Stream. An IEnumerable represents the ability to provide an actual IEnumerator, so it's like an Iterable in Java. In contrast, a Stream is a kind of iterator and comparable to an IEnumerator, so it's wrong to claim that this kind of data type can be used multiple times in .NET; support for IEnumerator.Reset is optional. The examples discussed here rather use the fact that an IEnumerable can be used to fetch new IEnumerators, and that works with Java's Collections as well; you can get a new Stream. If the Java developers had decided to add the Stream operations to Iterable directly, with intermediate operations returning another Iterable, it would be really comparable and it could work the same way.
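As a small sketch of that point, the following asks a collection for a fresh stream each time, much like calling GetEnumerator() again in .NET. The class and method names are illustrative only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FreshStreamDemo {
    /** Each call asks the collection for a brand-new stream. */
    static List<Integer> lessThan(List<Integer> source, int limit) {
        return source.stream().filter(i -> i < limit).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(5, 3, 8, 1);
        // One stream for the emptiness check ...
        boolean empty = !numbers.stream().findAny().isPresent();
        // ... and an independent one for the filter.
        List<Integer> small = lessThan(numbers, 5);
        System.out.println(empty + " " + small);
    }
}
```

The collection plays the role of the IEnumerable here; the individual streams are the throwaway enumerators.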

However, the developers decided against it, and the decision is discussed in this question. The biggest point is the confusion between eager Collection operations and lazy Stream operations. Looking at the .NET API, I (yes, personally) find the decision justified. While it looks reasonable when looking at IEnumerable alone, a particular Collection will have lots of methods manipulating the Collection directly and lots of methods returning a lazy IEnumerable, and the particular nature of a method isn't always intuitively recognizable. The worst example I found (within the few minutes I looked at it) is List.Reverse(), whose name exactly matches the name of the inherited (is this the right term for extension methods?) Enumerable.Reverse() while having entirely contradictory behavior.

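Of course, these are two distinct decisions. The first was to make Stream a type distinct from Iterable/Collection, and the second was to make Stream a kind of one-time iterator rather than another kind of iterable. But these decisions were made together, and it may be that separating the two was never considered. It wasn't created with comparability to .NET in mind.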


There is another implementation aspect you have to consider. Streams are not immutable data structures. Each intermediate operation may return a new Stream instance encapsulating the old one, but it may also manipulate its own instance instead and return itself (that doesn't preclude even doing both for the same operation). Commonly known examples are operations like parallel or unordered, which do not add another step but manipulate the entire pipeline. Having such a mutable data structure and attempting to reuse it (or even worse, using it multiple times at the same time) doesn't play well…
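The mutation can be observed directly. The sketch below assumes the OpenJDK reference implementation, where parallel() flips a flag on the existing pipeline instead of adding a stage. The specification only says the call "may return itself", so treat the observed values as implementation behaviour, not a guarantee. Class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.stream.Stream;

class MutablePipelineDemo {
    /** Returns {flag before, flag after, whether parallel() returned the same instance}. */
    static boolean[] observe() {
        Stream<String> mapped = Stream.of("a", "b", "c").map(String::toUpperCase);
        boolean before = mapped.isParallel();
        Stream<String> result = mapped.parallel(); // no new pipeline stage is added here
        boolean after = mapped.isParallel();       // queried through the OLD reference
        boolean same = (result == mapped);
        result.forEach(s -> { });                  // consume the pipeline exactly once
        return new boolean[] { before, after, same };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(observe()));
    }
}
```

If two pieces of code held the same stream reference, one of them calling parallel() would silently change how the other one's terminal operation executes.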


static Stream<Integer> quickSort(Supplier<Stream<Integer>> ints) {

  // Each ints.get() produces a fresh stream, so nothing is consumed twice.
  final Optional<Integer> optPivot = ints.get().findAny();
  if(!optPivot.isPresent()) return Stream.empty();

  final int pivot = optPivot.get();

  Supplier<Stream<Integer>> lt = ()->ints.get().filter(i -> i < pivot);
  Supplier<Stream<Integer>> gt = ()->ints.get().filter(i -> i > pivot);

  return Stream.of(quickSort(lt), Stream.of(pivot), quickSort(gt)).flatMap(s->s);
}


List<Integer> l=new Random().ints(100, 0, 1000).boxed().collect(Collectors.toList());
System.out.println(quickSort(l::stream)
    .map(Object::toString).collect(Collectors.joining(", ")));


static Stream<Integer> quickSort(Supplier<Stream<Integer>> ints) {
    return ints.get().findAny().map(pivot ->
         Stream.of(
                   quickSort(()->ints.get().filter(i -> i < pivot)),
                   Stream.of(pivot),
                   quickSort(()->ints.get().filter(i -> i > pivot)))
        .flatMap(s->s)).orElse(Stream.empty());
}




    Spliterator<String> split = Stream.of("hello","world")
                                      .spliterator();

    Stream<String> replayable1 = StreamSupport.stream(split,false);
    Stream<String> replayable2 = StreamSupport.stream(split,false);






We could make use of a stateless Stream creation method such as Stream#generate(). We would have to manage state externally in our own code and reset between Stream "replays":

    Spliterator<String> split = Stream.generate(this::nextValue)
                                      .map(s->"prefix-"+s)
                                      .spliterator();

    Stream<String> replayable1 = StreamSupport.stream(split,false);
    Stream<String> replayable2 = StreamSupport.stream(split,false);

    replayable1.forEach(System.out::println);
    this.resetCounter();
    replayable2.forEach(System.out::println);

Another (slightly better but not perfect) solution to this is to write our own ArraySpliterator (or similar Stream source) that includes some capacity to reset the current counter. If we were to use it to generate the Stream we could potentially replay them successfully.

    MyArraySpliterator<String> arraySplit = new MyArraySpliterator("hello","world");
    Spliterator<String> split = StreamSupport.stream(arraySplit,false)
                                             .map(s->"prefix-"+s)
                                             .spliterator();

    Stream<String> replayable1 = StreamSupport.stream(split,false);
    Stream<String> replayable2 = StreamSupport.stream(split,false);

    replayable1.forEach(System.out::println);
    arraySplit.reset();
    replayable2.forEach(System.out::println);

The best solution to this problem (in my opinion) is to make a new copy of any stateful Spliterators used in the Stream pipeline when new operators are invoked on the Stream. This is more complex and involved to implement, but if you don't mind using third party libraries, cyclops-react has a Stream implementation that does exactly this. (Disclosure: I am the lead developer for this project.)

    Stream<String> replayableStream = ReactiveSeq.of("hello","world")
                                                 .map(s->"prefix-"+s);

    replayableStream.forEach(System.out::println);
    replayableStream.forEach(System.out::println);




I have some recollections from the early design of the Streams API that might shed some light on the design rationale.



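Existing work had many influences on the design. Among the more influential were Google's Guava library and the Scala collections library. (If anybody is surprised about the influence from Guava, note that Kevin Bourrillion, Guava's lead developer, was on the JSR-335 Lambda expert group.) On Scala collections, we found this talk by Martin Odersky to be of particular interest: Future-Proofing Scala Collections: from Mutable to Persistent to Parallel. (Stanford EE380, 2011 June 1.)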




Now what if the source is one-shot, like reading lines from a file? Maybe the first Iterator should get all the values but the second and subsequent ones should be empty. Maybe the values should be interleaved among the Iterators. Or maybe each Iterator should get all the same values. Then, what if you have two iterators and one gets farther ahead of the other? Somebody will have to buffer up the values in the second Iterator until they're read. Worse, what if you get one Iterator and read all the values, and only then get a second Iterator? Where do the values come from now? Is there a requirement for them all to be buffered up just in case somebody wants a second Iterator?
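The first of these options can be sketched in a few lines. The helper below (oneShot is a hypothetical name, not a JDK method) wraps a single Iterator in an Iterable, so every call to iterator() hands back the same, possibly exhausted, iterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class OneShotIterableDemo {
    /** Hypothetical helper: an Iterable backed by one single-use Iterator. */
    static <T> Iterable<T> oneShot(Iterator<T> source) {
        return () -> source; // iterator() always returns the same instance
    }

    /** Copies everything the Iterable still yields into a list. */
    static <T> List<T> drain(Iterable<T> iterable) {
        List<T> out = new ArrayList<>();
        for (T t : iterable) {
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        Iterable<String> lines = oneShot(Arrays.asList("hello", "world").iterator());
        System.out.println(drain(lines)); // first traversal sees both values
        System.out.println(drain(lines)); // second traversal finds nothing left
    }
}
```

The second loop compiles and runs without complaint, which is exactly the kind of silent surprise the questions above are about.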


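We also observed that others had run into these issues. In the JDK, most Iterables are collections or collection-like objects, which allow multiple traversal. It isn't specified anywhere, but there seemed to be an unwritten expectation that Iterables allow multiple traversal. A notable exception is the NIO DirectoryStream interface. Its specification includes this interesting warning: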




At about this time, an article by Bruce Eckel appeared that described a spot of trouble he'd had with Scala. He'd written this code:

// Scala
val lines = fromString(data).getLines
val registrants = lines.map(Registrant)


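This experience convinced us that it was very important to have clearly predictable results if multiple traversal is attempted. It also highlighted the importance of distinguishing lazy pipeline-like structures from actual collections that store data. This in turn drove the separation of the lazy pipeline operations into the new Stream interface, keeping only eager, mutative operations directly on Collections. Brian Goetz has explained the rationale for that.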
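The split between lazy Stream operations and eager Collection operations can be seen in a short sketch. The class name and the call counter are illustrative; filter and removeIf are the actual Java 8 methods.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyVsEagerDemo {
    /** Returns {predicate calls after building, calls after collecting, list size after removeIf}. */
    static int[] run() {
        AtomicInteger calls = new AtomicInteger();
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));

        // Lazy: building the pipeline evaluates nothing.
        Stream<Integer> evens = list.stream().filter(i -> {
            calls.incrementAndGet();
            return i % 2 == 0;
        });
        int afterBuild = calls.get();

        // The terminal operation is what finally runs the predicate.
        evens.collect(Collectors.toList());
        int afterCollect = calls.get();

        // Eager and mutative: removeIf changes the list immediately.
        list.removeIf(i -> i % 2 == 0);
        return new int[] { afterBuild, afterCollect, list.size() };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run())); // prints [0, 4, 2]
    }
}
```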



Iterable<?> it = source.filter(...).map(...).filter(...).map(...);



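As I mentioned above, we had been talking to the Guava developers. One of the cool things they have is an Idea Graveyard, where they describe features that they decided not to implement, along with the reasons. The idea of lazy collections sounds pretty cool, but here's what they have to say about it. Consider a List.filter() operation that returns a List: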

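The biggest concern here is that too many operations become expensive, linear-time propositions. If you want to filter a list and get a list back, and not just a Collection or an Iterable, you can use ImmutableList.copyOf(Iterables.filter(list, predicate)), which "states up front" what it's doing and how expensive it is.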



In proposing to disallow non-linear or "no-reuse" streams, Paul Sandoz described the potential consequences of allowing them as giving rise to "unexpected or confusing results." He also mentioned that parallel execution would make things even trickier. Finally, I'd add that a pipeline operation with side effects would lead to difficult and obscure bugs if the operation were unexpectedly executed multiple times, or at least a different number of times than the programmer expected. (But Java programmers don't write lambda expressions with side effects, do they? DO THEY??)

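So that's the rationale for the Java 8 Streams API design that allows one-shot traversal and requires a strictly linear (no branching) pipeline. It provides consistent behavior across multiple different stream sources, it clearly separates lazy from eager operations, and it provides a straightforward execution model.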

With regard to IEnumerable, I am far from an expert on C# and .NET, so I would appreciate being corrected (gently) if I draw any incorrect conclusions. It does appear, however, that IEnumerable permits multiple traversal to behave differently with different sources; and it permits a branching structure of nested IEnumerable operations, which may result in some significant recomputation. While I appreciate that different systems make different tradeoffs, these are two characteristics that we sought to avoid in the design of the Java 8 Streams API.

The quicksort example given by the OP is interesting, puzzling, and I'm sorry to say, somewhat horrifying. Calling QuickSort takes an IEnumerable and returns an IEnumerable, so no sorting is actually done until the final IEnumerable is traversed. What the call seems to do, though, is build up a tree structure of IEnumerables that reflects the partitioning that quicksort would do, without actually doing it. (This is lazy computation, after all.) If the source has N elements, the tree will be N elements wide at its widest, and it will be lg(N) levels deep.

It seems to me -- and once again, I'm not a C# or .NET expert -- that this will cause certain innocuous-looking calls, such as pivot selection via ints.First(), to be more expensive than they look. At the first level, of course, it's O(1). But consider a partition deep in the tree, at the right-hand edge. To compute the first element of this partition, the entire source has to be traversed, an O(N) operation. But since the partitions above are lazy, they must be recomputed, requiring O(lg N) comparisons. So selecting the pivot would be an O(N lg N) operation, which is as expensive as an entire sort.


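With a tree of lazy IEnumerables, there are N partitions at the bottom of the tree. Computing each partition requires a traversal of N elements, each of which requires lg(N) comparisons up the tree. To compute all the partitions at the bottom of the tree, then, requires O(N^2 lg N) comparisons.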
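I can't measure the C# version, but the recomputation is easy to observe in a Java analogue. The sketch below takes the Supplier-based quickSort that appears elsewhere in this thread, swaps findAny for findFirst to keep it deterministic, and counts predicate evaluations in a static counter. The counter, the class name and the countFor helper are additions for illustration only. Unlike the C# code, this version recurses eagerly, so the cost shows up while selecting pivots and probing for empty partitions, not during the final traversal.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyQuickSortCost {
    static final AtomicLong comparisons = new AtomicLong();

    static Stream<Integer> quickSort(Supplier<Stream<Integer>> ints) {
        // ints.get() rebuilds, and findFirst re-runs, every filter above this node.
        Optional<Integer> optPivot = ints.get().findFirst();
        if (!optPivot.isPresent()) return Stream.empty();
        final int pivot = optPivot.get();
        Supplier<Stream<Integer>> lt = () -> ints.get().filter(i -> {
            comparisons.incrementAndGet();
            return i < pivot;
        });
        Supplier<Stream<Integer>> gt = () -> ints.get().filter(i -> {
            comparisons.incrementAndGet();
            return i > pivot;
        });
        return Stream.of(quickSort(lt), Stream.of(pivot), quickSort(gt)).flatMap(s -> s);
    }

    /** Sorts a shuffled list of 0..n-1 into sortedOut and returns the comparison count. */
    static long countFor(int n, List<Integer> sortedOut) {
        List<Integer> input = new ArrayList<>();
        for (int i = 0; i < n; i++) input.add(i);
        Collections.shuffle(input, new Random(42)); // fixed seed, arbitrary choice
        comparisons.set(0);
        sortedOut.addAll(quickSort(input::stream).collect(Collectors.toList()));
        return comparisons.get();
    }

    public static void main(String[] args) {
        for (int n : new int[] { 16, 32, 64 }) {
            long c = countFor(n, new ArrayList<>());
            System.out.println("n=" + n + " comparisons=" + c + " n^2=" + (n * n));
        }
    }
}
```

Every empty partition at the bottom of the tree forces a full pass over the source through at least one filter, and with N distinct elements there are N + 1 such partitions, so the count always exceeds N^2 in this sketch.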



The reason is that you can create streams from things that can only be used once by definition, such as an Iterator or a BufferedReader. You can think of a Stream as being consumed in the same way as a BufferedReader that has read a text file to its end. Once you reach the end of the file, the BufferedReader doesn't stop existing, but it just becomes useless, as you can't get anything out of it anymore. If you want to read the file again, you have to create a new reader. The same goes for streams. If you want to process the source of the stream twice, you have to create two separate streams.
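The analogy can be tried out with a reader over an in-memory string instead of a file. This is only a sketch; the class and method names are made up for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ReaderExhaustionDemo {
    /** Reads to the end of the reader and reports how many lines were still available. */
    static int remainingLines(BufferedReader reader) {
        try {
            int n = 0;
            while (reader.readLine() != null) {
                n++;
            }
            return n;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "first\nsecond\nthird";
        BufferedReader reader = new BufferedReader(new StringReader(text));
        System.out.println(remainingLines(reader)); // 3: the whole text
        System.out.println(remainingLines(reader)); // 0: same reader, already at the end
        // A new reader over the same source starts from the beginning again.
        System.out.println(remainingLines(new BufferedReader(new StringReader(text)))); // 3
    }
}
```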



IEnumerable<int> numbers = new int[] { 1, 2, 3, 4, 5 };

foreach (var n in numbers) {
    Console.WriteLine(n);
}


IEnumerable<int> numbers = new int[] { 1, 2, 3, 4, 5 };

IEnumerator<int> enumerator = numbers.GetEnumerator();
while (enumerator.MoveNext()) {
    Console.WriteLine(enumerator.Current);
}



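using System;
using System.Collections;
using System.Collections.Generic;

class Generator : IEnumerator<int> {
    Random _r;
    int _current;
    int _count = 0;

    public Generator(Random r) {
        _r = r;
    }

    public bool MoveNext() {
        _current = _r.Next();
        _count++;   // without this increment the sequence would never end
        return _count <= 5;
    }

    public int Current {
        get { return _current; }
    }

    // Remaining members required by IEnumerator<int>, added so the class compiles.
    object IEnumerator.Current { get { return _current; } }
    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { }
}

class RandomNumberStream : IEnumerable<int> {
    Random _r = new Random();
    public IEnumerator<int> GetEnumerator() {
        return new Generator(_r);
    }
    // Explicit interface implementations take no access modifier.
    IEnumerator IEnumerable.GetEnumerator() {
        return this.GetEnumerator();
    }
}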


IEnumerable<int> numbers = new RandomNumberStream();

foreach (var n in numbers) {
    Console.WriteLine(n);
}
foreach (var n in numbers) {
    Console.WriteLine(n);
}







