如何迭代由空格分隔的单词组成的字符串中的单词?
注意,我对C字符串函数或那种字符操作/访问不感兴趣。比起效率,我更喜欢优雅。我当前的解决方案:
#include <iostream>
#include <sstream>
#include <string>
using namespace std;
int main() {
string s = "Somewhere down the road";
istringstream iss(s);
do {
string subs;
iss >> subs;
cout << "Substring: " << subs << endl;
} while (iss);
}
我刚刚写了一个很好的例子,说明如何按符号拆分一个字符,然后将每个字符数组(由符号分隔的单词)放入一个向量中。为了简单起见,我创建了std字符串的向量类型。
我希望这对你有帮助,并且对你可读。
#include <vector>
#include <string>
#include <iostream>
void push(std::vector<std::string> &WORDS, std::string &TMP){
WORDS.push_back(TMP);
TMP = "";
}
std::vector<std::string> mySplit(char STRING[]){
std::vector<std::string> words;
std::string s;
for(unsigned short i = 0; i < strlen(STRING); i++){
if(STRING[i] != ' '){
s += STRING[i];
}else{
push(words, s);
}
}
push(words, s);//Used to get last split
return words;
}
int main(){
char string[] = "My awesome string.";
std::cout << mySplit(string)[2];
std::cin.get();
return 0;
}
我相信还没有人发布这个解决方案。与其直接使用分隔符,它基本上与boost::split()相同,即它允许您传递一个谓词,如果字符是分隔符,则返回true,否则返回false。我认为这给了程序员更多的控制,最棒的是你不需要提升。
template <class Container, class String, class Predicate>
void split(Container& output, const String& input,
const Predicate& pred, bool trimEmpty = false) {
auto it = begin(input);
auto itLast = it;
while (it = find_if(it, end(input), pred), it != end(input)) {
if (not (trimEmpty and it == itLast)) {
output.emplace_back(itLast, it);
}
++it;
itLast = it;
}
}
然后可以这样使用:
struct Delim {
bool operator()(char c) {
return not isalpha(c);
}
};
int main() {
string s("#include<iostream>\n"
"int main() { std::cout << \"Hello world!\" << std::endl; }");
vector<string> v;
split(v, s, Delim(), true);
/* Which is also the same as */
split(v, s, [](char c) { return not isalpha(c); }, true);
for (const auto& i : v) {
cout << i << endl;
}
}
这里有一个仅使用标准正则表达式库的正则表达式解决方案。(我有点生疏,所以可能会有一些语法错误,但这至少是一般的想法)
#include <regex.h>
#include <string.h>
#include <vector.h>
using namespace std;
vector<string> split(string s){
regex r ("\\w+"); //regex matches whole words, (greedy, so no fragment words)
regex_iterator<string::iterator> rit ( s.begin(), s.end(), r );
regex_iterator<string::iterator> rend; //iterators to iterate thru words
vector<string> result<regex_iterator>(rit, rend);
return result; //iterates through the matches to fill the vector
}
另一种灵活快速的方式
template<typename Operator>
void tokenize(Operator& op, const char* input, const char* delimiters) {
const char* s = input;
const char* e = s;
while (*e != 0) {
e = s;
while (*e != 0 && strchr(delimiters, *e) == 0) ++e;
if (e - s > 0) {
op(s, e - s);
}
s = e + 1;
}
}
要将其与字符串向量一起使用(编辑:由于有人指出不继承STL类…hrmf;):
template<class ContainerType>
class Appender {
public:
Appender(ContainerType& container) : container_(container) {;}
void operator() (const char* s, unsigned length) {
container_.push_back(std::string(s,length));
}
private:
ContainerType& container_;
};
std::vector<std::string> strVector;
Appender v(strVector);
tokenize(v, "A number of words to be tokenized", " \t");
就是这样!这只是使用tokenizer的一种方式,比如如何计数单词:
class WordCounter {
public:
WordCounter() : noOfWords(0) {}
void operator() (const char*, unsigned) {
++noOfWords;
}
unsigned noOfWords;
};
WordCounter wc;
tokenize(wc, "A number of words to be counted", " \t");
ASSERT( wc.noOfWords == 7 );
受限于想象力;)
如果您需要通过非空格符号解析字符串,则字符串流可能很方便:
string s = "Name:JAck; Spouse:Susan; ...";
string dummy, name, spouse;
istringstream iss(s);
getline(iss, dummy, ':');
getline(iss, name, ';');
getline(iss, dummy, ':');
getline(iss, spouse, ';')