我想找到最快的方法来检查一个文件是否存在于标准c++ 11, 14, 17,或C。我有成千上万的文件,在对它们做一些事情之前,我需要检查它们是否都存在。在下面的函数中,我可以写什么来代替/* SOMETHING */ ?

inline bool exist(const std::string& name)
{
    /* SOMETHING */
}

当前回答

It depends on where the files reside. For instance, if they are all supposed to be in the same directory, you can read all the directory entries into a hash table and then check all the names against the hash table. This might be faster on some systems than checking each file individually. The fastest way to check each file individually depends on your system ... if you're writing ANSI C, the fastest way is fopen because it's the only way (a file might exist but not be openable, but you probably really want openable if you need to "do something on it"). C++, POSIX, Windows all offer additional options.

While I'm at it, let me point out some problems with your question. You say that you want the fastest way, and that you have thousands of files, but then you ask for the code for a function to test a single file (and that function is only valid in C++, not C). This contradicts your requirements by making an assumption about the solution ... a case of the XY problem. You also say "in standard c++11(or)c++(or)c" ... which are all different, and this also is inconsistent with your requirement for speed ... the fastest solution would involve tailoring the code to the target system. The inconsistency in the question is highlighted by the fact that you accepted an answer that gives solutions that are system-dependent and are not standard C or C++.

其他回答

我需要一个快速的函数,可以检查一个文件是否存在,PherricOxide的答案几乎是我所需要的,除了它没有比较boost::filesystem::exists和open函数的性能。从基准测试结果中,我们可以很容易地看到:

使用stat函数是检查文件是否存在的最快方法。注意,我的结果与PherricOxide的答案是一致的。 boost::filesystem::exists函数的性能与stat函数非常接近,并且具有可移植性。如果boost库可以从代码中访问,我会推荐这个解决方案。

在Linux内核4.17.0和gcc-7.3上获得的基准测试结果:

2018-05-05 00:35:35
Running ./filesystem
Run on (8 X 2661 MHz CPU s)
CPU Caches:
  L1 Data 32K (x4)
  L1 Instruction 32K (x4)
  L2 Unified 256K (x4)
  L3 Unified 8192K (x1)
--------------------------------------------------
Benchmark           Time           CPU Iterations
--------------------------------------------------
use_stat          815 ns        813 ns     861291
use_open         2007 ns       1919 ns     346273
use_access       1186 ns       1006 ns     683024
use_boost         831 ns        830 ns     831233

下面是我的基准代码:

#include <string.h>                                                                                                                                                                                                                                           
#include <stdlib.h>                                                                                                                                                                                                                                           
#include <sys/types.h>                                                                                                                                                                                                                                        
#include <sys/stat.h>                                                                                                                                                                                                                                         
#include <unistd.h>                                                                                                                                                                                                                                           
#include <dirent.h>                                                                                                                                                                                                                                           
#include <fcntl.h>                                                                                                                                                                                                                                            
#include <unistd.h>                                                                                                                                                                                                                                           

#include "boost/filesystem.hpp"                                                                                                                                                                                                                               

#include <benchmark/benchmark.h>                                                                                                                                                                                                                              

const std::string fname("filesystem.cpp");                                                                                                                                                                                                                    
struct stat buf;                                                                                                                                                                                                                                              

// Use stat function                                                                                                                                                                                                                                          
void use_stat(benchmark::State &state) {                                                                                                                                                                                                                      
    for (auto _ : state) {                                                                                                                                                                                                                                    
        benchmark::DoNotOptimize(stat(fname.data(), &buf));                                                                                                                                                                                                   
    }                                                                                                                                                                                                                                                         
}                                                                                                                                                                                                                                                             
BENCHMARK(use_stat);                                                                                                                                                                                                                                          

// Use open function                                                                                                                                                                                                                                          
void use_open(benchmark::State &state) {                                                                                                                                                                                                                      
    for (auto _ : state) {                                                                                                                                                                                                                                    
        int fd = open(fname.data(), O_RDONLY);                                                                                                                                                                                                                
        if (fd > -1) close(fd);                                                                                                                                                                                                                               
    }                                                                                                                                                                                                                                                         
}                                                                                                                                                                                                                                                             
BENCHMARK(use_open);                                  
// Use access function                                                                                                                                                                                                                                        
void use_access(benchmark::State &state) {                                                                                                                                                                                                                    
    for (auto _ : state) {                                                                                                                                                                                                                                    
        benchmark::DoNotOptimize(access(fname.data(), R_OK));                                                                                                                                                                                                 
    }                                                                                                                                                                                                                                                         
}                                                                                                                                                                                                                                                             
BENCHMARK(use_access);                                                                                                                                                                                                                                        

// Use boost                                                                                                                                                                                                                                                  
void use_boost(benchmark::State &state) {                                                                                                                                                                                                                     
    for (auto _ : state) {                                                                                                                                                                                                                                    
        boost::filesystem::path p(fname);                                                                                                                                                                                                                     
        benchmark::DoNotOptimize(boost::filesystem::exists(p));                                                                                                                                                                                               
    }                                                                                                                                                                                                                                                         
}                                                                                                                                                                                                                                                             
BENCHMARK(use_boost);                                                                                                                                                                                                                                         

BENCHMARK_MAIN();   

测试文件是否存在的最快和最安全的方法是根本不单独/显式地测试它。也就是说,看看你是否能找到一种方法来取代普通

if(exists(file)) {                           /* point A */
    /* handle existence condition */
    return;
}

do_something_with(file);                     /* point B */

随着

r = do_something_with_unless_exists(file);

if(r == 0)
    success;
else if(errno == EEXIST)
    /* handle existence condition */
else
    /* handle other error */

除了速度更快之外,这还消除了第一个解决方案中固有的竞争条件(特别是“TOC/TOU”),即文件在点A和点B之间存在的可能性。

显然,第二个解决方案假定存在一种原子方法来执行do_something_with_unless_exists操作。通常总会有办法的,但有时你得四处寻找。

创建文件:使用O_CREAT和O_EXCL调用open()。 创建一个纯C文件,如果你有C11:调用fopen()与"wx"。(我昨天才知道这个。) 创建目录:只需调用mkdir(),然后检查errno == EEXIST。 获取锁:任何称职的锁定系统都已经拥有一个原子性的“只要没有其他人拥有就获取锁”原语。

(还有其他的,但这些是我现在能想到的。)

[脚注:在Unix的早期,没有特定的、专用的工具可用于普通进程进行锁定,所以如果你想建立一个互斥锁,这通常是通过创建一个特定的空目录来实现的,因为mkdir系统调用总是能够根据先前的存在或不存在而原子地失败或成功。]

你可以使用std::ifstream,函数如is_open, fail,例如下面的代码(cout“open”表示文件是否存在):

引自以下答案

与PherricOxide的建议相同,但在C中

#include <sys/stat.h>
int exist(const char *name)
{
  struct stat   buffer;
  return (stat (name, &buffer) == 0);
}

windows下还有3个选项:

1

inline bool exist(const std::string& name)
{
    OFSTRUCT of_struct;
    return OpenFile(name.c_str(), &of_struct, OF_EXIST) != INVALID_HANDLE_VALUE && of_struct.nErrCode == 0;
}

2

inline bool exist(const std::string& name)
{
    HANDLE hFile = CreateFile(name.c_str(), GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile != NULL && hFile != INVALID_HANDLE)
    {
         CloseFile(hFile);
         return true;
    }
    return false;
}

3

inline bool exist(const std::string& name)
{
    return GetFileAttributes(name.c_str()) != INVALID_FILE_ATTRIBUTES;
}