如何迭代由空格分隔的单词组成的字符串中的单词?

注意,我对C字符串函数或那种字符操作/访问不感兴趣。比起效率,我更喜欢优雅。我当前的解决方案:

#include <iostream>
#include <sstream>
#include <string>

using namespace std;

int main() {
    string s = "Somewhere down the road";
    istringstream iss(s);

    do {
        string subs;
        iss >> subs;
        cout << "Substring: " << subs << endl;
    } while (iss);
}

当前回答

作为一个业余爱好者,这是我想到的第一个解决方案。我有点好奇,为什么我还没有在这里看到类似的解决方案,是不是我的做法有根本问题?

#include <iostream>
#include <string>
#include <vector>

std::vector<std::string> split(const std::string &s, const std::string &delims)
{
    std::vector<std::string> result;
    std::string::size_type pos = 0;
    while (std::string::npos != (pos = s.find_first_not_of(delims, pos))) {
        auto pos2 = s.find_first_of(delims, pos);
        result.emplace_back(s.substr(pos, std::string::npos == pos2 ? pos2 : pos2 - pos));
        pos = pos2;
    }
    return result;
}

int main()
{
    std::string text{"And then I said: \"I don't get it, why would you even do that!?\""};
    std::string delims{" :;\".,?!"};
    auto words = split(text, delims);
    std::cout << "\nSentence:\n  " << text << "\n\nWords:";
    for (const auto &w : words) {
        std::cout << "\n  " << w;
    }
    return 0;
}

http://cpp.sh/7wmzy

其他回答

我喜欢下面的代码,因为它将结果放入一个向量中,支持字符串作为delim,并控制保持空值。但是,那时候看起来不太好。

#include <ostream>
#include <string>
#include <vector>
#include <algorithm>
#include <iterator>
using namespace std;

vector<string> split(const string& s, const string& delim, const bool keep_empty = true) {
    vector<string> result;
    if (delim.empty()) {
        result.push_back(s);
        return result;
    }
    string::const_iterator substart = s.begin(), subend;
    while (true) {
        subend = search(substart, s.end(), delim.begin(), delim.end());
        string temp(substart, subend);
        if (keep_empty || !temp.empty()) {
            result.push_back(temp);
        }
        if (subend == s.end()) {
            break;
        }
        substart = subend + delim.size();
    }
    return result;
}

int main() {
    const vector<string> words = split("So close no matter how far", " ");
    copy(words.begin(), words.end(), ostream_iterator<string>(cout, "\n"));
}

当然,Boost有一个split(),它的部分功能与此类似。而且,如果“空白”是指任何类型的空白,那么使用Boost的split和is_any_of()都非常有用。

LazyString拆分器:

#include <string>
#include <algorithm>
#include <unordered_set>

using namespace std;

class LazyStringSplitter
{
    string::const_iterator start, finish;
    unordered_set<char> chop;

public:

    // Empty Constructor
    explicit LazyStringSplitter()
    {}

    explicit LazyStringSplitter (const string cstr, const string delims)
        : start(cstr.begin())
        , finish(cstr.end())
        , chop(delims.begin(), delims.end())
    {}

    void operator () (const string cstr, const string delims)
    {
        chop.insert(delims.begin(), delims.end());
        start = cstr.begin();
        finish = cstr.end();
    }

    bool empty() const { return (start >= finish); }

    string next()
    {
        // return empty string
        // if ran out of characters
        if (empty())
            return string("");

        auto runner = find_if(start, finish, [&](char c) {
            return chop.count(c) == 1;
        });

        // construct next string
        string ret(start, runner);
        start = runner + 1;

        // Never return empty string
        // + tail recursion makes this method efficient
        return !ret.empty() ? ret : next();
    }
};

我将此方法称为LazyStringSplitter是因为一个原因——它不会一次性拆分字符串。本质上,它的行为类似于python生成器它公开了一个名为next的方法,该方法返回从原始字符串拆分的下一个字符串我使用了c++11STL中的无序集,因此查找分隔符的速度要快得多下面是它的工作原理

测试程序

#include <iostream>
using namespace std;

int main()
{
    LazyStringSplitter splitter;

    // split at the characters ' ', '!', '.', ','
    splitter("This, is a string. And here is another string! Let's test and see how well this does.", " !.,");

    while (!splitter.empty())
        cout << splitter.next() << endl;
    return 0;
}

输出,输出

This
is
a
string
And
here
is
another
string
Let's
test
and
see
how
well
this
does

改进这一点的下一个计划是实施开始和结束方法,以便可以执行以下操作:

vector<string> split_string(splitter.begin(), splitter.end());

到目前为止,我在Boost中使用了这个,但我需要一些不依赖它的东西,所以我得出了这个结论:

static void Split(std::vector<std::string>& lst, const std::string& input, const std::string& separators, bool remove_empty = true)
{
    std::ostringstream word;
    for (size_t n = 0; n < input.size(); ++n)
    {
        if (std::string::npos == separators.find(input[n]))
            word << input[n];
        else
        {
            if (!word.str().empty() || !remove_empty)
                lst.push_back(word.str());
            word.str("");
        }
    }
    if (!word.str().empty() || !remove_empty)
        lst.push_back(word.str());
}

好的一点是,在分隔符中可以传递多个字符。

我刚刚写了一个很好的例子,说明如何按符号拆分一个字符,然后将每个字符数组(由符号分隔的单词)放入一个向量中。为了简单起见,我创建了std字符串的向量类型。

我希望这对你有帮助,并且对你可读。

#include <vector>
#include <string>
#include <iostream>

void push(std::vector<std::string> &WORDS, std::string &TMP){
    WORDS.push_back(TMP);
    TMP = "";
}
std::vector<std::string> mySplit(char STRING[]){
        std::vector<std::string> words;
        std::string s;
        for(unsigned short i = 0; i < strlen(STRING); i++){
            if(STRING[i] != ' '){
                s += STRING[i];
            }else{
                push(words, s);
            }
        }
        push(words, s);//Used to get last split
        return words;
}

int main(){
    char string[] = "My awesome string.";
    std::cout << mySplit(string)[2];
    std::cin.get();
    return 0;
}

我编写了以下代码。您可以指定分隔符,它可以是字符串。结果类似于Java的String.split,结果中包含空字符串。

例如,如果我们调用split(“ABCPICKABCANYABCTWO:ABC”,“ABC”),结果如下:

0  <len:0>
1 PICK <len:4>
2 ANY <len:3>
3 TWO: <len:4>
4  <len:0>

代码:

vector <string> split(const string& str, const string& delimiter = " ") {
    vector <string> tokens;

    string::size_type lastPos = 0;
    string::size_type pos = str.find(delimiter, lastPos);

    while (string::npos != pos) {
        // Found a token, add it to the vector.
        cout << str.substr(lastPos, pos - lastPos) << endl;
        tokens.push_back(str.substr(lastPos, pos - lastPos));
        lastPos = pos + delimiter.size();
        pos = str.find(delimiter, lastPos);
    }

    tokens.push_back(str.substr(lastPos, str.size() - lastPos));
    return tokens;
}