I need to read a whole file into memory and place it in a C++ std::string.
If I were to read it into a char[], the answer would be very simple:
std::ifstream t;
int length;
t.open("file.txt"); // open input file
t.seekg(0, std::ios::end); // go to the end
length = t.tellg(); // report location (this is the length)
t.seekg(0, std::ios::beg); // go back to the beginning
char* buffer = new char[length]; // allocate memory for a buffer of appropriate dimension
t.read(buffer, length); // read the whole file into the buffer
t.close(); // close file handle
// ... Do stuff with buffer here ...
Now I want to do the exact same thing, but using a std::string instead of a char[]. I want to avoid loops, i.e. I don't want to:
std::ifstream t;
t.open("file.txt");
std::string buffer;
std::string line;
while (std::getline(t, line)) { // stops cleanly once no more lines can be read
// ... Append line to buffer and go on
}
t.close();
Any ideas?
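Update: It turns out that this method, while following STL idioms well, is actually surprisingly inefficient! Don't do this with large files. (See: http://insanecoding.blogspot.com/2011/11/how-to-read-in-file-in-c.html)
You can make a streambuf iterator out of the file and initialize the string with it: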
#include <string>
#include <fstream>
#include <iterator> // std::istreambuf_iterator
std::ifstream t("file.txt");
std::string str((std::istreambuf_iterator<char>(t)),
std::istreambuf_iterator<char>());
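I'm not sure where you're getting the t.open("file.txt", "r") syntax from. As far as I know, that's not a method that std::ifstream has. It looks like you've confused it with C's fopen.
Edit: Also note the extra parentheses around the first argument to the string constructor. These are essential. They prevent the problem known as the "most vexing parse", which in this case won't actually give you a compile error like it usually does, but will give you interesting (read: wrong) results.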
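To make the parsing issue concrete, here is a small sketch. The helper name read_all is hypothetical (it is not part of the answer above), and it takes an already-open stream so the focus stays on the constructor call.

```cpp
#include <fstream>
#include <istream>
#include <iterator>
#include <string>

// Hypothetical helper: reads everything that remains in an open stream.
std::string read_all(std::istream& in) {
    // Without the extra parentheses, the statement
    //     std::string str(std::istreambuf_iterator<char>(in),
    //                     std::istreambuf_iterator<char>());
    // is parsed as the declaration of a function named str, not as the
    // definition of a string object.

    // Wrapping the first argument in parentheses forces it to be read as
    // an expression, so this defines a std::string.
    std::string str((std::istreambuf_iterator<char>(in)),
                    std::istreambuf_iterator<char>());
    return str;
}
```

Returning the string from a helper like this also keeps the unusual-looking parentheses in one place.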
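Following KeithB's point in the comments, here's a way to do it that allocates all the memory up front (rather than relying on the string class's automatic reallocation):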
#include <string>
#include <fstream>
#include <iterator> // std::istreambuf_iterator
std::ifstream t("file.txt");
std::string str;
t.seekg(0, std::ios::end);
str.reserve(t.tellg());
t.seekg(0, std::ios::beg);
str.assign((std::istreambuf_iterator<char>(t)),
std::istreambuf_iterator<char>());
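One caveat with the pre-allocating version: tellg() returns -1 when the position is unavailable, and passing that to reserve() converts it to a huge unsigned value, which makes reserve() throw std::length_error. A minimal sketch with a guard follows. The helper name read_file_reserved, the exception on open failure, and the choice of binary mode are assumptions of this sketch, not part of the answer.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper: pre-allocates, then copies via streambuf iterators.
std::string read_file_reserved(const std::string& path) {
    std::ifstream t(path, std::ios::binary);
    if (!t) {
        throw std::runtime_error("cannot open " + path);
    }

    std::string str;
    t.seekg(0, std::ios::end);
    const std::streamoff end = t.tellg();  // -1 if the position is unavailable
    t.seekg(0, std::ios::beg);
    if (end > 0) {
        // Only reserve when the stream reported a usable size.
        str.reserve(static_cast<std::string::size_type>(end));
    }
    str.assign(std::istreambuf_iterator<char>(t),
               std::istreambuf_iterator<char>());
    return str;
}
```

Because reserve() is only a capacity hint, skipping it when the size is unknown still gives a correct result; the string simply grows as needed.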
There are a couple of possibilities. One I like uses a stringstream as a go-between:
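#include <fstream>
#include <sstream>
std::ifstream t("file.txt");
std::stringstream buffer;
buffer << t.rdbuf(); // copy the file's entire stream buffer into the stringstream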
Now the contents of "file.txt" are available in a string as buffer.str().
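If this is needed in more than one place, the stringstream approach wraps naturally into a function. This is only a sketch: the name slurp is hypothetical, binary mode is an assumption made here to avoid line-end translation, and there is no error handling.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper wrapping the stringstream approach.
std::string slurp(const std::string& path) {
    std::ifstream t(path, std::ios::binary);
    std::stringstream buffer;
    buffer << t.rdbuf();  // copies the whole stream buffer in one call
    return buffer.str();  // returns a copy of the accumulated characters
}
```

A call such as slurp("file.txt") then yields the file contents as a std::string.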
Another possibility (though I certainly don't like it as well) is much more like your original:
std::ifstream t("file.txt");
t.seekg(0, std::ios::end);
size_t size = t.tellg();
std::string buffer(size, ' ');
t.seekg(0);
t.read(&buffer[0], size);
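Officially, this isn't required to work under the C++98 or 03 standard (string isn't required to store data contiguously), but in fact it works with all known implementations, and C++11 and later do require contiguous storage, so it's guaranteed to work with them.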
As to why I don't like the latter as well: first, because it's longer and harder to read. Second, because it requires that you initialize the contents of the string with data you don't care about, then immediately write over that data (yes, the time to initialize is usually trivial compared to the reading, so it probably doesn't matter, but to me it still feels kind of wrong). Third, in a text file, position X in the file doesn't necessarily mean you'll have read X characters to reach that point -- it's not required to take into account things like line-end translations. On real systems that do such translations (e.g., Windows) the translated form is shorter than what's in the file (i.e., "\r\n" in the file becomes "\n" in the translated string) so all you've done is reserved a little extra space you never use. Again, doesn't really cause a major problem but feels a little wrong anyway.
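For what it's worth, the unused trailing space from the third point can be trimmed after the read. The sketch below is one way to do that, assuming a hypothetical helper named read_sized; it uses gcount(), which reports how many characters the last read() actually stored.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: size the string from the file position, read once,
// then shrink it to the number of characters actually read.
std::string read_sized(const std::string& path) {
    std::ifstream t(path);  // text mode, as in the snippet above
    t.seekg(0, std::ios::end);
    const std::streamoff end = t.tellg();
    if (end <= 0) {
        return std::string();  // empty file, or position unavailable
    }
    std::string buffer(static_cast<std::size_t>(end), '\0');
    t.seekg(0);
    t.read(&buffer[0], static_cast<std::streamsize>(buffer.size()));
    // With line-end translation, fewer characters may arrive than tellg()
    // suggested, so drop whatever was not filled in.
    buffer.resize(static_cast<std::size_t>(t.gcount()));
    return buffer;
}
```

This doesn't address the first two objections (it is still longer, and the string is still initialized before being overwritten), but the returned string no longer carries padding at the end.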