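How do I read an entire file into a std::string in C++?

How do I read a file into a std::string, i.e., read the whole file at once?

Text or binary mode should be specified by the caller. The solution should be standard-compliant, portable, and efficient. It should not needlessly copy the string's data, and it should avoid reallocating memory while reading into the string.

One approach is to stat the file size, resize the std::string, and fread() into the std::string's const_cast<char*>()'ed data(). This requires the std::string's data to be contiguous, which the standard did not require before C++11, although it appears to be the case for all known implementations. What is worse, if the file is read in text mode, the std::string's size may not equal the file's size.

A fully correct, standard-compliant, and portable solution could be built by passing std::ifstream's rdbuf() into a std::ostringstream and from there into a std::string. However, this could copy the string data and/or needlessly reallocate memory.

Are all relevant standard library implementations smart enough to avoid all unnecessary overhead? Is there another way to do it? Did I miss some hidden Boost function that already provides the desired functionality?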

void slurp(std::string& data, bool is_binary)

Current answer

This solution adds error checking to the rdbuf()-based method.

#include <fstream>
#include <sstream>
#include <string>

std::string file_to_string(const std::string& file_name)
{
    std::ifstream file_stream{file_name};

    if (file_stream.fail())
    {
        // Error opening file.
    }

    std::ostringstream str_stream{};
    file_stream >> str_stream.rdbuf();  // NOT str_stream << file_stream.rdbuf()

    if (file_stream.fail() && !file_stream.eof())
    {
        // Error reading file.
    }

    return str_stream.str();
}

I'm adding this answer because adding error-checking to the original method is not as trivial as you'd expect. The original method uses stringstream's insertion operator (str_stream << file_stream.rdbuf()). The problem is that this sets the stringstream's failbit when no characters are inserted. That can be due to an error or it can be due to the file being empty. If you check for failures by inspecting the failbit, you'll encounter a false positive when you read an empty file. How do you disambiguate legitimate failure to insert any characters and "failure" to insert any characters because the file is empty?

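You might think of explicitly checking for an empty file, but that means more code and more associated error checking.

Checking for the failure condition str_stream.fail() && !str_stream.eof() doesn't work, because the insertion operation doesn't set the eofbit (on the ostringstream or on the ifstream).

So the solution is to change the operation. Instead of using ostringstream's insertion operator (<<), use ifstream's extraction operator (>>), which does set the eofbit. Then check for the failure condition file_stream.fail() && !file_stream.eof().

Importantly, when file_stream >> str_stream.rdbuf() encounters a legitimate failure, it shouldn't ever set the eofbit (according to my understanding of the specification). That means the above check is sufficient to detect legitimate failures.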
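To make the difference between the two operators concrete, here is a minimal sketch. It assumes a std::istringstream as a stand-in for the std::ifstream so that it needs no file on disk, and the names CopyFlags, copy_by_insertion, and copy_by_extraction are hypothetical helpers that do not appear in the answer.

```cpp
#include <sstream>
#include <string>

// Flags observed on the stream that performed the copy.
struct CopyFlags {
    bool fail;  // result of fail() (failbit or badbit set)
    bool eof;   // result of eof() (eofbit set)
};

// Insertion form: the ostringstream performs the copy, so its state is inspected.
CopyFlags copy_by_insertion(const std::string& contents)
{
    std::istringstream source{contents};  // stand-in for the ifstream (assumption)
    std::ostringstream sink;
    sink << source.rdbuf();
    return {sink.fail(), sink.eof()};
}

// Extraction form: the source stream performs the copy, so its state is inspected.
CopyFlags copy_by_extraction(const std::string& contents)
{
    std::istringstream source{contents};  // stand-in for the ifstream (assumption)
    std::ostringstream sink;
    source >> sink.rdbuf();
    return {source.fail(), source.eof()};
}
```

With empty contents, both forms should report fail, but only the extraction form is expected to report eof as well, which is what lets fail() && !eof() tell an empty input apart from a real error. The behavior on an actual file stream should be analogous, though that is not exercised here.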

2017-03-26 10:15:05

Other answers

The shortest variant (Live On Coliru):

std::string str(std::istreambuf_iterator<char>{ifs}, {});

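It requires the header <iterator>.

There were some reports that this method is slower than preallocating the string and using std::istream::read. However, on a modern compiler with optimizations enabled this no longer seems to be the case, though the relative performance of the various methods seems to be highly compiler-dependent.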
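For comparison, the preallocate-and-read approach mentioned above might look roughly like the following. This is a sketch under assumptions: the function name read_file_preallocated is hypothetical, the file is opened in binary mode so the reported size matches the byte count, and errors are reported with std::runtime_error.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: size the string up front, then fill it with one read().
std::string read_file_preallocated(const std::string& path)
{
    // Open at the end so tellg() reports the file size.
    std::ifstream file{path, std::ios::binary | std::ios::ate};
    if (!file)
        throw std::runtime_error("cannot open " + path);

    const std::streamoff size = file.tellg();
    if (size < 0)
        throw std::runtime_error("cannot determine size of " + path);

    std::string data(static_cast<std::string::size_type>(size), '\0');
    file.seekg(0, std::ios::beg);
    // &data[0] is a writable pointer to contiguous storage since C++11.
    file.read(&data[0], static_cast<std::streamsize>(size));
    // Keep only what was actually read, in case fewer bytes arrived.
    data.resize(static_cast<std::string::size_type>(file.gcount()));
    return data;
}
```

Whether this beats the iterator one-liner depends on the compiler and standard library, so it is worth measuring before choosing one over the other.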

2008-09-22 17:13:40

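Something like this shouldn't be too bad:

void slurp(std::string& data, const std::string& filename, bool is_binary)
{
    // Open at the end of the file so tellg() reports its size.
    std::ios_base::openmode openmode = std::ios::ate | std::ios::in;
    if (is_binary)
        openmode |= std::ios::binary;
    std::ifstream file(filename.c_str(), openmode);
    data.clear();
    data.reserve(file.tellg());
    file.seekg(0, std::ios::beg);
    data.append(std::istreambuf_iterator<char>(file.rdbuf()),
                std::istreambuf_iterator<char>());
}

The advantage here is that we do the reserve first, so we won't have to grow the string as we read things in. The disadvantage is that we do it char by char. A smarter version could grab the whole read buffer and then call underflow().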
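One way to avoid the char-by-char copy is to read in blocks. The sketch below does not touch the stream buffer's internals or call underflow() directly; it swaps in a plain fixed-size buffer filled by read(), which is a simpler technique with a similar goal. The name slurp_blocks, the 4096-byte block size, and the exception on open failure are all assumptions for illustration.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical variant of slurp(): reserve first, then append in blocks
// instead of one character at a time.
void slurp_blocks(std::string& data, const std::string& filename, bool is_binary)
{
    std::ios_base::openmode mode = std::ios::in | std::ios::ate;
    if (is_binary)
        mode |= std::ios::binary;
    std::ifstream file(filename.c_str(), mode);
    if (!file)
        throw std::runtime_error("cannot open " + filename);

    data.clear();
    const std::streamoff size = file.tellg();
    if (size > 0)
        data.reserve(static_cast<std::string::size_type>(size));
    file.seekg(0, std::ios::beg);

    char block[4096];  // block size is an arbitrary choice
    // read() sets failbit on a short final block, so gcount() drives the loop.
    while (file.read(block, sizeof block) || file.gcount() > 0)
        data.append(block, static_cast<std::string::size_type>(file.gcount()));
}
```

Because the loop appends only gcount() characters per block, it also copes with text mode, where the number of characters delivered may be smaller than the size reported by tellg().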

2008-09-22 17:14:24

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <system_error>

using namespace std;

string GetStreamAsString(const istream& in)
{
    stringstream out;
    out << in.rdbuf();
    return out.str();
}

string GetFileAsString(const string& filePath)
{
    ifstream stream;
    try
    {
        // Set to throw on failure
        stream.exceptions(fstream::failbit | fstream::badbit);
        stream.open(filePath);
    }
    catch (system_error& error)
    {
        cerr << "Failed to open '" << filePath << "'\n" << error.code().message() << endl;
        return "Open fail";
    }

    return GetStreamAsString(stream);
}

Usage:

const string logAsString = GetFileAsString(logFilePath);

2019-09-17 11:56:03

You can use the std::getline function and specify eof as the delimiter. The resulting code is a little obscure, though:

std::string data;
std::ifstream in( "test.txt" );
std::getline( in, data, std::string::traits_type::to_char_type( 
                  std::string::traits_type::eof() ) );
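One thing to keep in mind with this trick: the delimiter is eof() narrowed to a char, which on typical implementations (where EOF is -1) is the byte 0xFF. The sketch below wraps the same call in a hypothetical function, read_until_eof_char, and takes any std::istream so it can be tried on an in-memory stream.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical wrapper around the getline() call shown above.
std::string read_until_eof_char(std::istream& in)
{
    std::string data;
    // The delimiter is eof() narrowed to char, typically the byte 0xFF.
    std::getline(in, data, std::string::traits_type::to_char_type(
                               std::string::traits_type::eof()));
    return data;
}
```

For plain text this returns the whole input, newlines included. For input that contains a 0xFF byte, reading should stop at that byte, so the approach is probably best kept to text files.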

2008-09-22 17:16:23

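I know this is a positively ancient question with a plethora of answers, but not one of them mentions what I would have considered the most obvious way to do this. Yes, I know this is C++, and using libc is supposedly evil and wrong, but nuts to that. Using libc is fine, especially for something as simple as this.

Essentially: just open the file, get its size (not necessarily in that order), and read it.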

#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <string>
#include <sys/stat.h>

static constexpr char const filename[] = "foo.bar";

int main(void)
{
    FILE *fp = ::fopen(filename, "rb");
    if (!fp) {
        ::perror("fopen");
        ::exit(1);
    }

    struct stat st;
    if (::fstat(fileno(fp), &st) == (-1)) {
        ::perror("fstat");
        ::exit(1);
    }

    // You could simply allocate a buffer here and use std::string_view, or
    // even allocate a buffer and copy it to a std::string. Creating a
    // std::string and setting its size is simplest, but will pointlessly
    // initialize the buffer to 0. You can't win sometimes.
    std::string str;
    str.reserve(st.st_size + 1U);
    str.resize(st.st_size);
    ::fread(str.data(), 1, st.st_size, fp);
    str[st.st_size] = '\0';
    ::fclose(fp);
}

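This doesn't seem to me to be any worse than some of the other solutions, in addition to being (in practice) completely portable. One could also throw an exception instead of exiting immediately, of course. It seriously irritates me that resizing the std::string always zero-initializes it, but it can't be helped.

Please note that this only works as written for C++17 and later. Earlier versions (should) disallow writing through std::string::data(). If you are working with an earlier version, consider using std::string_view or simply copying from a raw buffer.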
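As a rough illustration of the two remarks above (throwing instead of exiting, and supporting compilers older than C++17), the same idea could be wrapped in a function like this. It is only a sketch: read_file_libc is a hypothetical name, fstat() and fileno() are POSIX rather than standard C++, and writing through &str[0] relies on the contiguous storage guaranteed since C++11.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>
#include <sys/stat.h>  // POSIX, as in the answer above

// Hypothetical helper: same idea as main() above, but reporting errors with
// exceptions and writing through &str[0], which is non-const since C++11.
std::string read_file_libc(const char* path)
{
    FILE* fp = std::fopen(path, "rb");
    if (!fp)
        throw std::runtime_error(std::string("fopen failed: ") + path);

    struct stat st;
    if (fstat(fileno(fp), &st) == -1) {
        std::fclose(fp);
        throw std::runtime_error(std::string("fstat failed: ") + path);
    }

    std::string str(static_cast<std::string::size_type>(st.st_size), '\0');
    const std::size_t got =
        str.empty() ? 0 : std::fread(&str[0], 1, str.size(), fp);
    std::fclose(fp);
    str.resize(got);  // keep only the bytes fread() actually delivered
    return str;
}
```

The string is still zero-initialized before fread() overwrites it, so this does not remove the cost complained about above; it only changes how errors are reported and which language versions compile it.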

2021-10-10 08:01:34
