如何迭代由空格分隔的单词组成的字符串中的单词?
注意,我对C字符串函数或那种字符操作/访问不感兴趣。比起效率,我更喜欢优雅。我当前的解决方案:
#include <iostream>
#include <sstream>
#include <string>
using namespace std;
int main() {
string s = "Somewhere down the road";
istringstream iss(s);
do {
string subs;
iss >> subs;
cout << "Substring: " << subs << endl;
} while (iss);
}
另一种灵活快速的方式
template<typename Operator>
void tokenize(Operator& op, const char* input, const char* delimiters) {
const char* s = input;
const char* e = s;
while (*e != 0) {
e = s;
while (*e != 0 && strchr(delimiters, *e) == 0) ++e;
if (e - s > 0) {
op(s, e - s);
}
s = e + 1;
}
}
要将其与字符串向量一起使用(编辑:由于有人指出不继承STL类…hrmf;):
template<class ContainerType>
class Appender {
public:
Appender(ContainerType& container) : container_(container) {;}
void operator() (const char* s, unsigned length) {
container_.push_back(std::string(s,length));
}
private:
ContainerType& container_;
};
std::vector<std::string> strVector;
Appender v(strVector);
tokenize(v, "A number of words to be tokenized", " \t");
就是这样!这只是使用tokenizer的一种方式,比如如何计数单词:
class WordCounter {
public:
WordCounter() : noOfWords(0) {}
void operator() (const char*, unsigned) {
++noOfWords;
}
unsigned noOfWords;
};
WordCounter wc;
tokenize(wc, "A number of words to be counted", " \t");
ASSERT( wc.noOfWords == 7 );
受限于想象力;)
我使用以下方法
void split(string in, vector<string>& parts, char separator) {
string::iterator ts, curr;
ts = curr = in.begin();
for(; curr <= in.end(); curr++ ) {
if( (curr == in.end() || *curr == separator) && curr > ts )
parts.push_back( string( ts, curr ));
if( curr == in.end() )
break;
if( *curr == separator ) ts = curr + 1;
}
}
PlasmaHH,我忘记包含删除带有空格的标记的额外检查(curr>ts)。
我一直在寻找用任意长度的分隔符分割字符串的方法,所以我开始从头开始编写,因为现有的解决方案不适合我。
这是我的小算法,仅使用STL:
//use like this
//std::vector<std::wstring> vec = Split<std::wstring> (L"Hello##world##!", L"##");
template <typename valueType>
static std::vector <valueType> Split (valueType text, const valueType& delimiter)
{
std::vector <valueType> tokens;
size_t pos = 0;
valueType token;
while ((pos = text.find(delimiter)) != valueType::npos)
{
token = text.substr(0, pos);
tokens.push_back (token);
text.erase(0, pos + delimiter.length());
}
tokens.push_back (text);
return tokens;
}
根据我的测试,它可以与任何长度和形式的分隔符一起使用。使用string或wstring类型实例化。
该算法所做的就是搜索分隔符,获取字符串中与分隔符相邻的部分,删除分隔符,然后再次搜索,直到不再找到为止。
当然,可以使用任意数量的空格作为分隔符。
我希望这有帮助。
作为一个业余爱好者,这是我想到的第一个解决方案。我有点好奇,为什么我还没有在这里看到类似的解决方案,是不是我的做法有根本问题?
#include <iostream>
#include <string>
#include <vector>
std::vector<std::string> split(const std::string &s, const std::string &delims)
{
std::vector<std::string> result;
std::string::size_type pos = 0;
while (std::string::npos != (pos = s.find_first_not_of(delims, pos))) {
auto pos2 = s.find_first_of(delims, pos);
result.emplace_back(s.substr(pos, std::string::npos == pos2 ? pos2 : pos2 - pos));
pos = pos2;
}
return result;
}
int main()
{
std::string text{"And then I said: \"I don't get it, why would you even do that!?\""};
std::string delims{" :;\".,?!"};
auto words = split(text, delims);
std::cout << "\nSentence:\n " << text << "\n\nWords:";
for (const auto &w : words) {
std::cout << "\n " << w;
}
return 0;
}
http://cpp.sh/7wmzy
这是我的版本获取了Kev的来源:
#include <string>
#include <vector>
void split(vector<string> &result, string str, char delim ) {
string tmp;
string::iterator i;
result.clear();
for(i = str.begin(); i <= str.end(); ++i) {
if((const char)*i != delim && i != str.end()) {
tmp += *i;
} else {
result.push_back(tmp);
tmp = "";
}
}
}
之后,调用函数并执行以下操作:
vector<string> hosts;
split(hosts, "192.168.1.2,192.168.1.3", ',');
for( size_t i = 0; i < hosts.size(); i++){
cout << "Connecting host : " << hosts.at(i) << "..." << endl;
}