我理解流是字节序列的表示。每个流都提供了将字节读写到其给定的后备存储的方法。但溪流的意义何在?为什么后台存储本身不是我们交互的对象?

不知什么原因,我就是不喜欢这个概念。我读了很多文章,但我觉得我需要一个类比。


当前回答

为了增加回声室,流是一个抽象,所以您不关心底层存储。当您考虑有和没有流的场景时,这是最有意义的。

文件在很大程度上是无趣的,因为除了我熟悉的非基于流的方法之外,流并没有做太多事情。让我们从网络文件开始。

如果我想从互联网上下载一个文件,我必须打开一个TCP套接字,建立一个连接,并接收字节,直到没有更多的字节。我必须管理一个缓冲区,知道预期文件的大小,并编写代码来检测连接何时断开并适当地处理这个问题。

假设我有某种TcpDataStream对象。我用适当的连接信息创建它,然后从流中读取字节,直到它说没有任何字节。流处理缓冲区管理、数据结束条件和连接管理。

通过这种方式,流使I/O更容易。当然,您可以编写一个TcpFileDownloader类来完成流所做的工作,但是这样您就有了一个特定于TCP的类。大多数流接口只提供Read()和Write()方法,任何更复杂的概念都由内部实现处理。因此,您可以使用相同的基本代码来读写内存、磁盘文件、套接字和许多其他数据存储。

其他回答

我认为您需要考虑后备存储本身通常只是另一种抽象。内存流很容易理解,但是文件完全不同,这取决于您使用的文件系统,而不考虑您使用的是什么硬盘驱动器。事实上,并不是所有的流都位于备份存储的顶部:网络流基本上就是流。

流的意义在于我们将注意力限制在重要的事情上。通过标准抽象,我们可以执行公共操作。例如,即使您今天不想在文件或HTTP响应中搜索url,也不意味着您明天就不想这样做了。

Streams were originally conceived when memory was tiny compared to storage. Just reading a C file could be a significant load. Minimizing the memory footprint was extremely important. Hence, an abstraction in which very little needed to be loaded was very useful. Today, it is equally useful when performing network communication and, it turns out, rarely that restrictive when we deal with files. The ability to transparently add things like buffering in a general fashion makes it even more useful.

流已经是一个比喻,一个类比,所以真的没有必要再提供另一个。你可以把它想象成一个管道,里面有水流,水实际上是数据,管道是流。我认为这是一种双向管道如果流是双向的。它基本上是一种常见的抽象,用于在一个或两个方向上有数据流或数据序列的事物。

In languages such as C#, VB.Net, C++, Java etc., the stream metaphor is used for many things. There are file streams, in which you open a file and can read from the stream or write to it continuously; There are network streams where reading from and writing to the stream reads from and writes to an underlying established network connection. Streams for writing only are typically called output streams, as in this example, and similarly, streams that are for reading only are called input streams, as in this example.

流可以执行数据的转换或编码(例如,.Net中的SslStream将耗尽SSL协商数据并将其隐藏起来;TelnetStream可能对您隐藏Telnet协商,但提供对数据的访问;Java中的ZipOutputStream允许您写入zip归档中的文件,而不必担心zip文件格式的内部问题。

您可能会发现的另一个常见的东西是允许您编写字符串而不是字节的文本流,或者一些语言提供了允许您编写基本类型的二进制流。您将在文本流中发现一个常见的东西是字符编码,您应该知道这一点。

一些流还支持随机访问,如本例所示。另一方面,由于显而易见的原因,网络流不会。

MSDN很好地概述了。net中的流。 Sun还概述了他们的通用OutputStream类和InputStream类。 在c++中,这里有istream(输入流),ostream(输出流)和iostream(双向流)文档。

类似UNIX的操作系统也支持带有程序输入和输出的流模型,如下所述。

我使用的可视化是传送带,不是在真实的工厂里,因为我对此一无所知,而是在卡通工厂里,物品沿着线移动,被盖章、装箱、计数和检查,由一系列愚蠢的设备完成。

你有做一件事的简单组件,例如一个把樱桃放在蛋糕上的设备。这个设备有一个无樱桃蛋糕的输入流,和一个有樱桃蛋糕的输出流。用这种方式组织处理有三个优点值得一提。

首先,它简化了组件本身:如果你想把巧克力糖衣放在蛋糕上,你不需要一个复杂的设备,知道蛋糕的一切,你可以创造一个愚蠢的设备,把巧克力糖衣粘在任何东西上(在漫画中,这甚至不知道下一个东西不是蛋糕,而是怀尔E.大狼)。

其次,你可以通过将这些设备按不同的顺序排列来创造不同的产品:也许你想让你的蛋糕在樱桃上放糖衣,而不是樱桃在糖衣上,你可以简单地通过在生产线上交换设备来做到这一点。

Thirdly, the devices don't need to manage inventory, boxing, or unboxing. The most efficient way of aggregating and packaging things is changeable: maybe today you're putting your cakes into boxes of 48 and sending them out by the truckload, but tomorrow you want to send out boxes of six in response to custom orders. This kind of change can be accommodated by replacing or reconfiguring the machines at the start and end of the production line; the cherry machine in the middle of the line doesn't have to be changed to process a different number of items at a time, it always works with one item at a time and it doesn't have to know how its input or output is being grouped.

流表示可以按顺序访问的对象序列(通常是字节,但不一定是这样)。流的典型操作:

read one byte. Next time you read, you'll get the next byte, and so on. read several bytes from the stream into an array seek (move your current position in the stream, so that next time you read you get bytes from the new position) write one byte write several bytes from an array into the stream skip bytes from the stream (this is like read, but you ignore the data. Or if you prefer it's like seek but can only go forwards.) push back bytes into an input stream (this is like "undo" for read - you shove a few bytes back up the stream, so that next time you read that's what you'll see. It's occasionally useful for parsers, as is: peek (look at bytes without reading them, so that they're still there in the stream to be read later)

一个特定的流可能支持读(在这种情况下,它是一个“输入流”),写(“输出流”)或两者都支持。并不是所有的溪流都是可搜索的。

Push back is fairly rare, but you can always add it to a stream by wrapping the real input stream in another input stream that holds an internal buffer. Reads come from the buffer, and if you push back then data is placed in the buffer. If there's nothing in the buffer then the push back stream reads from the real stream. This is a simple example of a "stream adaptor": it sits on the "end" of an input stream, it is an input stream itself, and it does something extra that the original stream didn't.

Stream is a useful abstraction because it can describe files (which are really arrays, hence seek is straightforward) but also terminal input/output (which is not seekable unless buffered), sockets, serial ports, etc. So you can write code which says either "I want some data, and I don't care where it comes from or how it got here", or "I'll produce some data, and it's entirely up to my caller what happens to it". The former takes an input stream parameter, the latter takes an output stream parameter.

我能想到的最好的比喻是,溪流是一条传送带,向你走来或离开你(有时两者兼而有之)。你从输入流中取出东西,你把东西放到输出流中。有些传送带你可以认为是从墙上的一个洞里出来的——它们是不可寻找的,阅读或写作是一次性的交易。一些传送带就摆在你面前,你可以在溪流中选择你想读/写的位置——这就是寻找。

As IRBMe says, though, it's best to think of a stream in terms of the operations it offers (which vary from implementation to implementation, but have a lot in common) rather than by a physical analogy. Streams are "things you can read or write". When you start connecting up stream adaptors, you can think of them as a box with a conveyor in, and a conveyor out, that you connect to other streams and then the box performs some transformation on the data (zipping it, or changing UNIX linefeeds to DOS ones, or whatever). Pipes are another thorough test of the metaphor: that's where you create a pair of streams such that anything you write into one can be read out of the other. Think wormholes :-)

把流看作是抽象的数据源(字节、字符等)。它们抽象了具体数据源的实际读写机制,可以是网络套接字、磁盘上的文件或来自web服务器的响应。