How do I read a file into a std::string, i.e., read the whole file at once?

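Text or binary mode should be specified by the caller. The solution should be standard-compliant, portable and efficient. It should not needlessly copy the string's data, and it should avoid reallocating memory while reading the string.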

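One way to do this would be to stat the file size, resize the std::string, and fread() into the std::string's const_cast<char*>()'ed data(). This requires the std::string's data to be contiguous, which the standard does not require (at least prior to C++11), although it appears to be the case for all known implementations. Worse, if the file is read in text mode, the std::string's size may not equal the file's size.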
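A minimal sketch of that size-then-fread() idea, for illustration only. The helper name `slurp_fread`, its `path` parameter and its `bool` return value are assumptions, not part of the question. It relies on std::string storage being contiguous (guaranteed since C++11) and on fseek()/ftell() reporting a usable size, which the C standard does not strictly promise for every stream.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: reads the whole file at `path` into `out` via C stdio.
// Returns false if the file cannot be opened or its size cannot be determined.
bool slurp_fread(const char* path, std::string& out, bool is_binary)
{
    std::FILE* fp = std::fopen(path, is_binary ? "rb" : "r");
    if (!fp) return false;

    // Find the size by seeking to the end and asking for the position.
    if (std::fseek(fp, 0, SEEK_END) != 0) { std::fclose(fp); return false; }
    const long size = std::ftell(fp);
    if (size < 0) { std::fclose(fp); return false; }
    std::rewind(fp);

    // One allocation up front; &out[0] is writable, so no const_cast is needed.
    out.resize(static_cast<std::size_t>(size));
    const std::size_t got =
        size > 0 ? std::fread(&out[0], 1, out.size(), fp) : 0;

    // In text mode fewer characters than bytes may come back (for example
    // after CRLF translation), so shrink to what was actually read.
    out.resize(got);
    std::fclose(fp);
    return true;
}
```

The final resize() is what addresses the text-mode mismatch mentioned above: the string ends up holding exactly the characters fread() delivered.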

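A fully correct, standard-compliant and portable solution could be built by feeding std::ifstream's rdbuf() into a std::ostringstream and from there into a std::string. However, this could copy the string data and/or needlessly reallocate memory.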

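Are all relevant standard library implementations smart enough to avoid all unnecessary overhead? Is there another way to do it? Did I miss some hidden Boost function that already provides the desired functionality?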

void slurp(std::string& data, bool is_binary)

Current answer

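This is the function I use. When dealing with large files (1GB+), for some reason std::ifstream::read() is much faster than std::ifstream::rdbuf() when you know the file size, so the whole "check the file size first" thing is actually a speed optimization.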

#include <string>
#include <fstream>
#include <sstream>
std::string file_get_contents(const std::string &filename)
{
    std::ifstream file(filename, std::ifstream::binary);
    // Throw std::ios_base::failure on open, seek or read errors.
    file.exceptions(std::ifstream::failbit | std::ifstream::badbit);
    file.seekg(0, std::istream::end);
    const std::streampos ssize = file.tellg();
    if (ssize < 0)
    {
        // Can't get the size for some reason, so fall back to the slower
        // "just read everything" approach. Because I don't trust that we can
        // seek back and forth in the original stream, I create a new stream.
        std::ifstream fallback(filename, std::ifstream::binary);
        fallback.exceptions(std::ifstream::failbit | std::ifstream::badbit);
        std::ostringstream ss;
        ss << fallback.rdbuf();
        return ss.str();
    }
    file.seekg(0, std::istream::beg);
    // Allocate the whole result once, then read straight into it.
    std::string result(static_cast<std::size_t>(ssize), '\0');
    file.read(&result[0], std::streamsize(ssize));
    return result;
}

Other answers

#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main(){
    fstream file;
    //Open a file
    file.open("test.txt");
    string copy,temp;
    //While loop to store the whole document in the copy string
    //temp holds one complete line per iteration
    //The loop stops once getline() fails, i.e. after the last line
    while(getline(file,temp)){
        //append the line's text to copy
        copy+=temp;
        //getline() strips the line break, so add it back
        copy+="\n";
    }
    //Display the whole document
    cout<<copy;
    //Close the document
    file.close();
}

Use

#include <iostream>
#include <sstream>
#include <fstream>

int main()
{
  std::ifstream input("file.txt");
  std::stringstream sstr;

  while(input >> sstr.rdbuf());

  std::cout << sstr.str() << std::endl;
}

Or something very close. I don't have a stdlib reference open to double-check myself.

Yes, I understand I didn't write the slurp function as asked.
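For completeness, one way the snippet above might be wrapped into a function resembling the slurp() from the question. This is only a sketch: the `path` parameter and the `bool` return value are assumptions added here, since the signature in the question does not say where the file name comes from or how failure is reported.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical adaptation of the question's slurp(); `path` and the bool
// result are additions made for this sketch.
bool slurp(const std::string& path, std::string& data, bool is_binary)
{
    // Let the caller choose between text and binary mode.
    std::ios_base::openmode mode = std::ios_base::in;
    if (is_binary) mode |= std::ios_base::binary;

    std::ifstream input(path.c_str(), mode);
    if (!input) return false;

    // Stream the whole file buffer into an in-memory stream, then copy
    // the accumulated characters out into the caller's string.
    std::ostringstream sstr;
    sstr << input.rdbuf();
    data = sstr.str();
    return true;
}
```

A call such as `slurp("file.txt", contents, false)` would then read in text mode; an empty file simply yields an empty string.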

One way is to flush the stream buffer into a separate memory stream, and then convert that to a std::string (error handling omitted):

std::string slurp(std::ifstream& in) {
    std::ostringstream sstr;
    sstr << in.rdbuf();
    return sstr.str();
}

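This is nicely concise. However, as noted in the question, it performs a redundant copy, and unfortunately there is fundamentally no way of eliding this copy.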
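One caveat worth hedging: C++20 added an rvalue-qualified overload of str() that moves the buffer out of the stream, so on a C++20 standard library the copy can likely be avoided. The sketch below (the name `slurp_move` is made up for illustration) also compiles under earlier standards, where it simply falls back to the copying overload.

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical variant of slurp() that hands the buffer over instead of
// copying it, when the library supports that.
std::string slurp_move(std::ifstream& in)
{
    std::ostringstream sstr;
    sstr << in.rdbuf();
    // With a C++20 library this selects the rvalue overload of str(), which
    // moves the underlying string out. Older standards only have the const
    // overload, so the same line copies as before.
    return std::move(sstr).str();
}
```

Whether the move actually saves a reallocation still depends on the implementation, so it is worth measuring before relying on it.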

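Unfortunately, the only real solution that avoids redundant copies is to do the reading manually in a loop. Since C++ now guarantees contiguous strings, one could write the following (≥C++17, error handling included):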

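#include <cstddef>
#include <fstream>
#include <string>
#include <string_view>

auto read_file(std::string_view path) -> std::string {
    constexpr auto read_size = std::size_t(4096);
    // std::string_view is not guaranteed to be null-terminated, so copy it
    // into a std::string before handing it to the stream constructor.
    auto stream = std::ifstream(std::string(path));
    stream.exceptions(std::ios_base::badbit);

    auto out = std::string();
    auto buf = std::string(read_size, '\0');
    // read() reports failure on the final, partial chunk, so that chunk
    // is appended after the loop.
    while (stream.read(&buf[0], read_size)) {
        out.append(buf, 0, stream.gcount());
    }
    out.append(buf, 0, stream.gcount());
    return out;
}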

Pulling info from several places... this should be among the fastest approaches:

#include <filesystem>
#include <fstream>
#include <iostream>
#include <string>

//Returns true if successful.
bool readInFile(std::string pathString)
{
  //Make sure the file exists and is an actual file.
  if (!std::filesystem::is_regular_file(pathString))
  {
    return false;
  }
  //Convert relative path to absolute path.
  //(.string() is needed where std::filesystem::path is not char-based, e.g. Windows.)
  pathString = std::filesystem::weakly_canonical(pathString).string();
  //Open the file for reading (binary is fastest).
  std::wifstream in(pathString, std::ios::binary);
  //Make sure the file opened.
  if (!in)
  {
    return false;
  }
  //Wide string to store the file's contents.
  std::wstring fileContents;
  //Jump to the end of the file to determine the file size.
  in.seekg(0, std::ios::end);
  //Resize the wide string to be able to fit the entire file (Note: Do not use reserve()!).
  fileContents.resize(in.tellg());
  //Go back to the beginning of the file to start reading.
  in.seekg(0, std::ios::beg);
  //Read the entire file's contents into the wide string.
  in.read(fileContents.data(), fileContents.size());
  //Close the file.
  in.close();
  //Do whatever you want with the file contents.
  std::wcout << fileContents << L" " << fileContents.size();
  return true;
}

This reads wide characters into a std::wstring, but you can easily adapt it if you just want regular characters and a std::string.
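A rough sketch of what that narrow-character adaptation could look like, assuming C++17. The function name `readInFileNarrow` and the choice to return the contents through an out-parameter are assumptions made for illustration, not part of the answer above.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

// Hypothetical std::string variant of readInFile(). Returns true on success
// and stores the file's bytes in `fileContents`.
bool readInFileNarrow(const std::filesystem::path& path, std::string& fileContents)
{
    // Use the non-throwing overload so a missing file just returns false.
    std::error_code ec;
    if (!std::filesystem::is_regular_file(path, ec)) return false;

    std::ifstream in(path, std::ios::binary);
    if (!in) return false;

    // Seek to the end to learn the size, then return to the beginning.
    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    if (size < 0) return false;
    in.seekg(0, std::ios::beg);

    // resize() (not reserve()) so the characters are actually part of the string.
    fileContents.resize(static_cast<std::size_t>(size));
    in.read(fileContents.data(), static_cast<std::streamsize>(fileContents.size()));

    // Keep only what was really read, in case the file shrank in between.
    fileContents.resize(static_cast<std::size_t>(in.gcount()));
    return true;
}
```

Taking a std::filesystem::path rather than a std::string sidesteps the path-to-string conversion, at the cost of a slightly different signature from the original.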

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <system_error>

using namespace std;

string GetStreamAsString(const istream& in)
{
    stringstream out;
    out << in.rdbuf();
    return out.str();
}

string GetFileAsString(const string& filePath)
{
    ifstream stream;
    try
    {
        // Set to throw on failure
        stream.exceptions(fstream::failbit | fstream::badbit);
        stream.open(filePath);
    }
    catch (system_error& error)
    {
        cerr << "Failed to open '" << filePath << "'\n" << error.code().message() << endl;
        return "Open fail";
    }

    return GetStreamAsString(stream);
}

Usage:

const string logAsString = GetFileAsString(logFilePath);