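How do I iterate over the words of a string composed of words separated by whitespace?
Note that I'm not interested in C string functions or that kind of character manipulation/access. I also prefer elegance over efficiency. My current solution: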
#include <iostream>
#include <sstream>
#include <string>
using namespace std;
int main() {
    string s = "Somewhere down the road";
    istringstream iss(s);
    string subs;
    // Test the extraction itself, so that a failed read at the end of the
    // stream does not print an extra empty substring.
    while (iss >> subs) {
        cout << "Substring: " << subs << endl;
    }
}
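For the whitespace-only case, one option is to let the stream do the splitting and feed the words straight into a container. The following is a minimal sketch of that idea, not necessarily what the asker has in mind; the function name `split_words` is made up for illustration. It relies on `operator>>` skipping any run of whitespace, so it never yields empty words.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for this sketch): collect the
// whitespace-separated words of `text` into a vector.
std::vector<std::string> split_words(const std::string& text)
{
    std::istringstream iss(text);
    // istream_iterator<std::string> extracts one word per increment using
    // operator>>; the default-constructed iterator marks end of stream.
    std::istream_iterator<std::string> first(iss), last;
    return std::vector<std::string>(first, last);
}
```

Calling `split_words("Somewhere down the road")` should give the four words `Somewhere`, `down`, `the` and `road`, which can then be visited with a range-based `for` loop.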
A minimal solution is a function that takes a std::string and a set of delimiter characters (also as a std::string) as input, and returns a std::vector of std::string.
#include <string>
#include <vector>
std::vector<std::string>
tokenize(const std::string& str, const std::string& delimiters)
{
    // unsigned index type of std::string (named size_type to avoid
    // confusion with the signed POSIX ssize_t)
    using size_type = std::string::size_type;
    const size_type str_ln = str.length();
    size_type last_pos = 0;
    // container for the extracted tokens
    std::vector<std::string> tokens;
    while (last_pos < str_ln) {
        // find the position of the next delimiter
        size_type pos = str.find_first_of(delimiters, last_pos);
        // if no delimiters found, set the position to the length of string
        if (pos == std::string::npos)
            pos = str_ln;
        // if the substring is nonempty, store it in the container
        if (pos != last_pos)
            tokens.emplace_back(str.substr(last_pos, pos - last_pos));
        // scan past the previous substring
        last_pos = pos + 1;
    }
    return tokens;
}
Usage example:
#include <iostream>
int main()
{
    std::string input_str = "one + two * (three - four)!!---! ";
    const char* delimiters = "! +- (*)";
    std::vector<std::string> tokens = tokenize(input_str, delimiters);
    std::cout << "input = '" << input_str << "'\n"
              << "delimiters = '" << delimiters << "'\n"
              << "nr of tokens found = " << tokens.size() << std::endl;
    for (const std::string& tk : tokens) {
        std::cout << "token = '" << tk << "'\n";
    }
    return 0;
}
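I like the following because it puts the results into a vector, supports a string as the delimiter, and gives control over whether empty values are kept. The trade-off is that it doesn't look as pretty.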
#include <iostream>   // std::cout (the original only included <ostream>)
#include <string>
#include <vector>
#include <algorithm>
#include <iterator>
using namespace std;
vector<string> split(const string& s, const string& delim, const bool keep_empty = true) {
    vector<string> result;
    // an empty delimiter cannot split anything: return the input as one piece
    if (delim.empty()) {
        result.push_back(s);
        return result;
    }
    string::const_iterator substart = s.begin(), subend;
    while (true) {
        // find the next occurrence of the whole delimiter string
        subend = search(substart, s.end(), delim.begin(), delim.end());
        string temp(substart, subend);
        if (keep_empty || !temp.empty()) {
            result.push_back(temp);
        }
        if (subend == s.end()) {
            break;
        }
        // continue right after the delimiter
        substart = subend + delim.size();
    }
    return result;
}
int main() {
    const vector<string> words = split("So close no matter how far", " ");
    copy(words.begin(), words.end(), ostream_iterator<string>(cout, "\n"));
}
Of course, Boost has a split() that partly works like this. And if by "whitespace" you mean any kind of whitespace, Boost's split combined with is_any_of() works very well.
My code is:
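#include <cstddef>
#include <list>
#include <string>
template<class StringType = std::string, class ContainerType = std::list<StringType> >
class DSplitString : public ContainerType
{
public:
    // Split on a single character.
    explicit DSplitString(const StringType& strString, char cChar, bool bSkipEmptyParts = true)
    {
        std::size_t iPos = 0;
        std::size_t iPos_char = 0;
        while (StringType::npos != (iPos_char = strString.find(cChar, iPos)))
        {
            StringType strTemp = strString.substr(iPos, iPos_char - iPos);
            // this-> is required: push_back lives in a dependent base class
            if (!bSkipEmptyParts || !strTemp.empty())
                this->push_back(strTemp);
            iPos = iPos_char + 1;
        }
        // keep whatever follows the last delimiter
        StringType strLast = strString.substr(iPos);
        if (!bSkipEmptyParts || !strLast.empty())
            this->push_back(strLast);
    }
    // Split on a substring.
    explicit DSplitString(const StringType& strString, const StringType& strSub, bool bSkipEmptyParts = true)
    {
        // an empty separator would never advance; return the input unsplit
        if (strSub.empty())
        {
            this->push_back(strString);
            return;
        }
        std::size_t iPos = 0;
        std::size_t iPos_char = 0;
        while (StringType::npos != (iPos_char = strString.find(strSub, iPos)))
        {
            StringType strTemp = strString.substr(iPos, iPos_char - iPos);
            if (!bSkipEmptyParts || !strTemp.empty())
                this->push_back(strTemp);
            iPos = iPos_char + strSub.length();
        }
        // keep whatever follows the last separator
        StringType strLast = strString.substr(iPos);
        if (!bSkipEmptyParts || !strLast.empty())
            this->push_back(strLast);
    }
};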
Example:
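#include <iostream>
#include <string>
// Portable entry point and loop (the original used the MSVC-only _tmain
// and "for each ... in" extensions)
int main()
{
    DSplitString<> aa("doicanhden1;doicanhden2;doicanhden3;", ';');
    for (const std::string& var : aa)
    {
        std::cout << var << std::endl;
    }
    std::cin.get();   // wait for a key press before exiting
    return 0;
}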
A quick version that uses vector as the base class, giving full access to all of its operators:
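#include <cstddef>
#include <string>
#include <vector>
// Split string into parts.
class Split : public std::vector<std::string>
{
public:
    // delimList is const char* so that string literals can be passed
    Split(const std::string& str, const char* delimList)
    {
        std::size_t lastPos = 0;
        std::size_t pos = str.find_first_of(delimList);
        while (pos != std::string::npos)
        {
            // skip empty tokens between adjacent delimiters
            if (pos != lastPos)
                push_back(str.substr(lastPos, pos-lastPos));
            lastPos = pos + 1;
            pos = str.find_first_of(delimList, lastPos);
        }
        // whatever follows the last delimiter is the final token
        if (lastPos < str.length())
            push_back(str.substr(lastPos));
    }
};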
Example of populating an STL set:
std::set<std::string> words;
Split split("Hello,World", ",");
words.insert(split.begin(), split.end());
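If deriving from `std::vector` feels too heavy (standard containers have no virtual destructor, so they are not designed to be used as polymorphic base classes), a free function that writes tokens to an output iterator is one possible alternative. The sketch below is a variation on the same idea, not the original author's code; `split_to` and `unique_words` are hypothetical names. It fills a `std::set` directly through `std::inserter`, without an intermediate vector.

```cpp
#include <iterator>
#include <set>
#include <string>

// Hypothetical helper: write each non-empty token of `str`, split on any
// character in `delims`, to the output iterator `out`.
template <class OutputIt>
OutputIt split_to(const std::string& str, const std::string& delims, OutputIt out)
{
    std::string::size_type lastPos = 0;
    while (lastPos < str.length()) {
        std::string::size_type pos = str.find_first_of(delims, lastPos);
        if (pos == std::string::npos)
            pos = str.length();      // no more delimiters: take the rest
        if (pos != lastPos)          // skip empty tokens
            *out++ = str.substr(lastPos, pos - lastPos);
        lastPos = pos + 1;
    }
    return out;
}

// Example use: collect the unique words of a comma-separated string.
std::set<std::string> unique_words(const std::string& text)
{
    std::set<std::string> words;
    split_to(text, ",", std::inserter(words, words.end()));
    return words;
}
```

Because the destination is just an iterator, the same function could also append to a `std::vector` via `std::back_inserter` or print tokens via `std::ostream_iterator`.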