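What does opening a file actually do?

In every programming language (at least the ones I have used), you must open a file before you can read from it or write to it.

But what does this open operation actually do?

The manual pages for typical functions don't really tell you anything beyond that it "opens a file for reading/writing":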

http://www.cplusplus.com/reference/cstdio/fopen/

https://docs.python.org/3/library/functions.html#open

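Obviously, from using the function you can tell that it involves creating some kind of object that facilitates access to the file.

Put another way: if I were to implement an open function, what would it need to do on Linux?

Current answer

At its core, nothing fancy needs to happen when a file is opened for reading. All it has to do is check that the file exists and that the application has sufficient privileges to read it, then create a handle on which you can issue read commands against the file.

It is on those commands that the actual reading gets dispatched.

The OS will often get a head start on reading by starting a read operation to fill the buffer associated with the handle. Then, when you actually perform the read, it can return the buffer's contents immediately instead of waiting on disk I/O.

To open a new file for writing, the OS needs to add an entry for the new (currently empty) file to the directory. Again, a handle is created on which you can issue write commands.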
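As a rough illustration of the write case, the sketch below uses Python's os.open, which is a thin wrapper over the open(2) system call on Linux. The helper name create_for_writing and the file name example.txt are made up for this example.

```python
import os
import tempfile

def create_for_writing(directory):
    """Open a new file for writing and report what changed in the directory."""
    name = "example.txt"  # hypothetical file name
    path = os.path.join(directory, name)

    existed_before = name in os.listdir(directory)

    # O_CREAT asks the OS to add a directory entry if none exists yet.
    # The return value is the handle: an integer file descriptor.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)

    exists_after = name in os.listdir(directory)
    size = os.fstat(fd).st_size  # nothing has been written yet
    os.close(fd)
    return existed_before, exists_after, size
```

Run against an empty temporary directory, the entry appears as soon as the open call returns, before any write is issued, and the new file has size 0.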

2015-11-03 10:39:58

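Other answers

Any file system or operating system you want to talk about is fine by me. Nice!

On a ZX Spectrum, initializing a LOAD command puts the system into a tight loop, reading the Audio In line.

Start-of-data is indicated by a constant tone, followed by a sequence of long and short pulses, where a short pulse stands for a binary 0 and a longer one for a binary 1 (https://en.wikipedia.org/wiki/ZX_Spectrum_software). The tight load loop gathers bits until it fills a byte (8 bits), stores the byte in memory, increments the memory pointer, then loops back to scan for more bits.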
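The bit-gathering step can be sketched in a few lines. This is only an illustration of the idea, not the Spectrum ROM routine: the function name, the unitless pulse lengths, and the threshold are hypothetical, and packing bits most significant bit first is an assumption of the sketch.

```python
def pulses_to_bytes(pulse_lengths, threshold):
    """Turn a sequence of pulse lengths into bytes, 8 bits at a time.

    A pulse shorter than `threshold` counts as 0, anything else as 1.
    Bits are packed most significant bit first (an assumption here).
    """
    data = bytearray()
    current = 0
    count = 0
    for length in pulse_lengths:
        bit = 1 if length >= threshold else 0
        current = (current << 1) | bit
        count += 1
        if count == 8:            # a full byte has been collected
            data.append(current)  # store it; the "pointer" advances
            current = 0
            count = 0
    return bytes(data)            # incomplete trailing bits are dropped
```

With a threshold of 2, the pulse lengths 1, 2, 1, 1, 1, 1, 1, 2 decode to the bits 01000001, which is the single byte 0x41.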

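Typically, the first thing a loader would read is a short, fixed-format header, indicating at least the number of bytes to expect, and possibly additional information such as file name, file type and loading address. After reading this short header, the program could decide whether to continue loading the main bulk of the data, or exit the loading routine and display an appropriate message for the user.

An end-of-file state could be recognized by having received as many bytes as expected (either a fixed number of bytes, hardwired in the software, or a variable number such as the one indicated in a header). An error was raised if the loading loop did not receive a pulse in the expected frequency range for a certain amount of time.

A little background on this answer

The procedure described loads data from a regular audio tape, hence the need to scan Audio In (it connected to tape recorders with a standard plug). A LOAD command is technically the same as opening a file, but it is physically tied to actually loading the file. This is because the tape recorder is not controlled by the computer, and you cannot (successfully) open a file without also loading it.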

The "tight loop" is mentioned because (1) the CPU, a Z80-A (if memory serves), was really slow: 3.5 MHz, and (2) the Spectrum had no internal clock! That means that it had to accurately keep count of the T-states (instruction times) for every. single. instruction. inside that loop, just to maintain the accurate beep timing. Fortunately, that low CPU speed had the distinct advantage that you could calculate the number of cycles on a piece of paper, and thus the real world time that they would take.
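The pencil-and-paper calculation is just a division by the clock rate. A minimal sketch, taking the 3.5 MHz figure from the paragraph above at face value (the function name and the T-state counts used in the examples are hypothetical, not taken from any real loader):

```python
CPU_HZ = 3_500_000  # clock speed quoted in the answer above

def tstates_to_microseconds(tstates, cpu_hz=CPU_HZ):
    """Real-world duration of a given number of T-states."""
    return tstates * 1_000_000 / cpu_hz
```

For instance, a loop body that costs 3500 T-states would take 1000 microseconds, so every instruction added to or removed from the loop shifts the pulse timing by a predictable amount.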

2015-11-03 10:49:07

Basically, a call to open needs to find the file, and then record whatever it needs to so that later I/O operations can find it again. That's quite vague, but it will be true on all the operating systems I can immediately think of. The specifics vary from platform to platform. Many answers already on here talk about modern-day desktop operating systems. I've done a little programming on CP/M, so I will offer my knowledge about how it works on CP/M (MS-DOS probably works in the same way, but for security reasons, it is not normally done like this today).

On CP/M you have a thing called the FCB (as you mentioned C, you could call it a struct; it really is a 35-byte contiguous area in RAM containing various fields). The FCB has fields to write the file-name and a (4-bit) integer identifying the disk drive. Then, when you call the kernel's Open File, you pass a pointer to this struct by placing it in one of the CPU's registers. Some time later, the operating system returns with the struct slightly changed. Whatever I/O you do to this file, you pass a pointer to this struct to the system call.

What does CP/M do with this FCB? It reserves certain fields for its own use, and uses these to keep track of the file, so you had better not ever touch them from inside your program. The Open File operation searches through the table at the start of the disk for a file with the same name as what's in the FCB (the '?' wildcard character matches any character). If it finds a file, it copies some information into the FCB, including the file's physical location(s) on the disk, so that subsequent I/O calls ultimately call the BIOS which may pass these locations to the disk driver. At this level, specifics vary.
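A toy model of that Open File search, written in Python rather than Z80 assembly. The dictionary keys ("name", "blocks") and both function names are invented for this sketch; only the space-padded 8+3 name layout and the '?' wildcard come from CP/M itself.

```python
def fcb_name_matches(pattern, entry):
    """Compare two 11-character names (8 name + 3 extension, space padded).

    A '?' in the pattern matches any character in that position.
    """
    if len(pattern) != 11 or len(entry) != 11:
        raise ValueError("expected 11-character, space-padded names")
    return all(p == "?" or p == e for p, e in zip(pattern, entry))

def open_file(fcb, directory):
    """Find the first directory entry matching fcb["name"].

    On success, copy the entry's block list into the FCB, standing in
    for the fields the OS reserves for itself, and return True.
    """
    for entry in directory:
        if fcb_name_matches(fcb["name"], entry["name"]):
            fcb["blocks"] = list(entry["blocks"])
            return True
    return False
```

After a successful call the caller keeps passing the same FCB to later I/O calls, which is how the OS finds the file's blocks again without a second directory search.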

2015-11-04 19:43:57

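Exactly what happens when you open a file depends on the operating system. Below I describe what happens in Linux, since it gives you an idea of what opening a file involves, and you can check the source code if you are interested in more detail. I am not covering permissions, as that would make this answer too long.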

In Linux, every file is represented by a structure called an inode. Each inode has a unique number within its filesystem, and every file has exactly one inode. The inode stores the file's metadata, for example its size, permissions, timestamps and pointers to disk blocks, but not the file name itself. File names live in directories: each directory entry pairs a name with an inode number for lookup. When you open a file, assuming you have the relevant permissions, the name is resolved to its inode and a file descriptor referring to that inode is created. Because several names and several processes can refer to the same file, the inode keeps track of its users. Its link count records how many directory entries (hard links) point to it: a file present in one directory has a link count of one, and adding a hard link makes it two. Opening the file does not change the link count; the kernel tracks open files with a separate in-memory reference count.
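The inode number and link count are visible from user space through stat. The sketch below assumes a filesystem that supports hard links; the function name and file names are hypothetical. It records the link count before and after adding a hard link, and again while the file is held open.

```python
import os
import tempfile

def inode_demo(directory):
    """Observe inode number and link count across link and open."""
    original = os.path.join(directory, "data.txt")  # hypothetical names
    alias = os.path.join(directory, "alias.txt")
    with open(original, "w") as f:
        f.write("hello")

    before = os.stat(original)
    os.link(original, alias)      # second directory entry, same inode
    after = os.stat(original)

    fd = os.open(original, os.O_RDONLY)
    while_open = os.fstat(fd)     # metadata reached through the descriptor
    os.close(fd)

    return {
        "same_inode": os.stat(alias).st_ino == before.st_ino,
        "links_before": before.st_nlink,
        "links_after": after.st_nlink,
        "links_while_open": while_open.st_nlink,
    }
```

On Linux the reported st_nlink goes from 1 to 2 when the hard link is added and stays at 2 while the file is open, since st_nlink counts directory entries rather than open descriptors.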

2015-11-03 09:38:20

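In almost every high-level language, the function that opens a file is a wrapper around the corresponding kernel system call. It may do other fancy things as well, but in contemporary operating systems, opening a file must always go through the kernel.

This is why the arguments of the fopen library function, or of Python's open, closely resemble the arguments of the open(2) system call.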
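To make the resemblance concrete, here is the correspondence between fopen mode strings and open(2) flags as listed in the Linux fopen(3) manual page, written as a Python table. The names FOPEN_MODE_TO_FLAGS and flags_for_mode are made up for the sketch; Python's own open follows a similar but not identical scheme.

```python
import os

# Mode string -> open(2) flags, as documented in the fopen(3) manual page.
FOPEN_MODE_TO_FLAGS = {
    "r":  os.O_RDONLY,
    "r+": os.O_RDWR,
    "w":  os.O_WRONLY | os.O_CREAT | os.O_TRUNC,
    "w+": os.O_RDWR | os.O_CREAT | os.O_TRUNC,
    "a":  os.O_WRONLY | os.O_CREAT | os.O_APPEND,
    "a+": os.O_RDWR | os.O_CREAT | os.O_APPEND,
}

def flags_for_mode(mode):
    """Look up the flags for a mode string, ignoring the 'b' modifier."""
    return FOPEN_MODE_TO_FLAGS[mode.replace("b", "")]
```

So a library call with mode "w" ends up, roughly, as a system call with the flags for write-only, create-if-missing and truncate.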

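In addition to opening the file, these functions usually set up a buffer that is used for read/write operations. The purpose of this buffer is to ensure that whenever you want to read N bytes, the corresponding library call returns N bytes, regardless of whether the calls to the underlying system call return fewer.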
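The "return N bytes even if the system call returns fewer" behaviour boils down to a loop. A minimal sketch over raw descriptors, using os.read as the stand-in for read(2); the function name read_exact is hypothetical, and a real library would also keep a buffer between calls, which is left out here.

```python
import os

def read_exact(fd, n):
    """Keep calling os.read until n bytes arrive or end of file is hit."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = os.read(fd, remaining)  # may return fewer bytes than asked
        if not chunk:                   # empty result means end of file
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

The only case in which the caller gets fewer than n bytes is when the file or pipe actually ends first.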

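I'm not interested in implementing my own function; I just want to understand what is actually going on... "beyond the language", if you like.

In Unix-like operating systems, a successful call to open returns a "file descriptor", which is merely an integer in the context of the user process. This descriptor is then passed to any call that interacts with the opened file, and after close is called on it, the descriptor becomes invalid.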
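Both properties are easy to observe from Python, whose os module exposes raw descriptors. In this sketch (the function name is hypothetical), a descriptor is opened, closed, and then used again; in a single-threaded program the second use fails with EBADF, "bad file descriptor".

```python
import errno
import os
import tempfile

def descriptor_lifecycle():
    """Open, close, then try to reuse a descriptor."""
    with tempfile.NamedTemporaryFile() as tmp:
        fd = os.open(tmp.name, os.O_RDONLY)
        is_int = isinstance(fd, int)   # the descriptor is a plain integer
        os.close(fd)
        try:
            os.read(fd, 1)             # fd no longer refers to anything
            error = None
        except OSError as exc:
            error = exc.errno
    return is_int, error
```

Calling descriptor_lifecycle() returns (True, errno.EBADF) under those assumptions.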

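It is important to note that the call to open acts as a validation point at which various checks are made. If not all of the conditions are met, the call fails by returning -1 instead of the descriptor, and the kind of error is indicated in errno. The essential checks are:

Whether the file exists; whether the calling process is privileged to open this file in the specified mode. This is determined by matching the file permissions, owner ID and group ID against the respective IDs of the calling process.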
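Python does not expose the -1 return value directly; it converts a failed open(2) into an OSError that carries the errno code. A small sketch of the existence check, with a hypothetical helper try_open:

```python
import errno
import os
import tempfile

def try_open(path):
    """Return (fd, None) on success or (None, errno_code) on failure."""
    try:
        return os.open(path, os.O_RDONLY), None
    except OSError as exc:
        return None, exc.errno
```

Opening a path that does not exist yields errno.ENOENT. A failed permission check would typically surface as errno.EACCES instead, although a process running as root usually bypasses that check.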

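In the context of the kernel, there has to be some kind of mapping between the process's file descriptors and the physically opened files. The internal data structure that the descriptor maps to may contain yet another buffer that deals with block-based devices, or an internal pointer to the current read/write position.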
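One way to see that the read/write position lives in the kernel-side structure rather than in the integer itself is to duplicate a descriptor. In this sketch (function name hypothetical), os.dup creates a second descriptor that maps to the same open file, so a read through one moves the position seen by the other.

```python
import os
import tempfile

def shared_offset_demo():
    """Read through two descriptors that share one open file."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"abcdef")
        tmp.flush()
        fd = os.open(tmp.name, os.O_RDONLY)
        dup_fd = os.dup(fd)          # second descriptor, same kernel entry
        first = os.read(fd, 3)       # advances the shared position to 3
        second = os.read(dup_fd, 3)  # continues from 3, not from 0
        os.close(fd)
        os.close(dup_fd)
    return first, second
```

The two reads return b"abc" and b"def". Opening the same path twice with os.open would instead give two independent positions.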

2015-11-03 09:30:42
