如何迭代由空格分隔的单词组成的字符串中的单词?

注意,我对C字符串函数或那种字符操作/访问不感兴趣。比起效率,我更喜欢优雅。我当前的解决方案:

#include <iostream>
#include <sstream>
#include <string>

using namespace std;

int main() {
    string s = "Somewhere down the road";
    istringstream iss(s);

    do {
        string subs;
        iss >> subs;
        cout << "Substring: " << subs << endl;
    } while (iss);
}

当前回答

LazyString拆分器:

#include <string>
#include <algorithm>
#include <unordered_set>

using namespace std;

class LazyStringSplitter
{
    string::const_iterator start, finish;
    unordered_set<char> chop;

public:

    // Empty Constructor
    explicit LazyStringSplitter()
    {}

    explicit LazyStringSplitter (const string cstr, const string delims)
        : start(cstr.begin())
        , finish(cstr.end())
        , chop(delims.begin(), delims.end())
    {}

    void operator () (const string cstr, const string delims)
    {
        chop.insert(delims.begin(), delims.end());
        start = cstr.begin();
        finish = cstr.end();
    }

    bool empty() const { return (start >= finish); }

    string next()
    {
        // return empty string
        // if ran out of characters
        if (empty())
            return string("");

        auto runner = find_if(start, finish, [&](char c) {
            return chop.count(c) == 1;
        });

        // construct next string
        string ret(start, runner);
        start = runner + 1;

        // Never return empty string
        // + tail recursion makes this method efficient
        return !ret.empty() ? ret : next();
    }
};

我将此方法称为LazyStringSplitter是因为一个原因——它不会一次性拆分字符串。本质上,它的行为类似于python生成器它公开了一个名为next的方法,该方法返回从原始字符串拆分的下一个字符串我使用了c++11STL中的无序集,因此查找分隔符的速度要快得多下面是它的工作原理

测试程序

#include <iostream>
using namespace std;

int main()
{
    LazyStringSplitter splitter;

    // split at the characters ' ', '!', '.', ','
    splitter("This, is a string. And here is another string! Let's test and see how well this does.", " !.,");

    while (!splitter.empty())
        cout << splitter.next() << endl;
    return 0;
}

输出,输出

This
is
a
string
And
here
is
another
string
Let's
test
and
see
how
well
this
does

改进这一点的下一个计划是实施开始和结束方法,以便可以执行以下操作:

vector<string> split_string(splitter.begin(), splitter.end());

其他回答

并不是说我们需要更多的答案,但这是我受到埃文·特兰启发后想到的。

std::vector <std::string> split(const string &input, auto delimiter, bool skipEmpty=true) {
  /*
  Splits a string at each delimiter and returns these strings as a string vector.
  If the delimiter is not found then nothing is returned.
  If skipEmpty is true then strings between delimiters that are 0 in length will be skipped.
  */
  bool delimiterFound = false;
  int pos=0, pPos=0;
  std::vector <std::string> result;
  while (true) {
    pos = input.find(delimiter,pPos);
    if (pos != std::string::npos) {
      if (skipEmpty==false or pos-pPos > 0) // if empty values are to be kept or not
        result.push_back(input.substr(pPos,pos-pPos));
      delimiterFound = true;
    } else {
      if (pPos < input.length() and delimiterFound) {
        if (skipEmpty==false or input.length()-pPos > 0) // if empty values are to be kept or not
          result.push_back(input.substr(pPos,input.length()-pPos));
      }
      break;
    }
    pPos = pos+1;
  }
  return result;
}

使用std::stringstream非常好,并且完全符合您的要求。如果您只是在寻找不同的方法,那么可以使用std::find()/std::find_first_of()和std::string::substr()。

下面是一个示例:

#include <iostream>
#include <string>

int main()
{
    std::string s("Somewhere down the road");
    std::string::size_type prev_pos = 0, pos = 0;

    while( (pos = s.find(' ', pos)) != std::string::npos )
    {
        std::string substring( s.substr(prev_pos, pos-prev_pos) );

        std::cout << substring << '\n';

        prev_pos = ++pos;
    }

    std::string substring( s.substr(prev_pos, pos-prev_pos) ); // Last word
    std::cout << substring << '\n';

    return 0;
}

这是另一种方法。。

void split_string(string text,vector<string>& words)
{
  int i=0;
  char ch;
  string word;

  while(ch=text[i++])
  {
    if (isspace(ch))
    {
      if (!word.empty())
      {
        words.push_back(word);
      }
      word = "";
    }
    else
    {
      word += ch;
    }
  }
  if (!word.empty())
  {
    words.push_back(word);
  }
}

我有两条线来解决这个问题:

char sep = ' ';
std::string s="1 This is an example";

for(size_t p=0, q=0; p!=s.npos; p=q)
  std::cout << s.substr(p+(p!=0), (q=s.find(sep, p+1))-p-(p!=0)) << std::endl;

然后你可以把它放到一个向量中,而不是打印。

我编写了以下代码。您可以指定分隔符,它可以是字符串。结果类似于Java的String.split,结果中包含空字符串。

例如,如果我们调用split(“ABCPICKABCANYABCTWO:ABC”,“ABC”),结果如下:

0  <len:0>
1 PICK <len:4>
2 ANY <len:3>
3 TWO: <len:4>
4  <len:0>

代码:

vector <string> split(const string& str, const string& delimiter = " ") {
    vector <string> tokens;

    string::size_type lastPos = 0;
    string::size_type pos = str.find(delimiter, lastPos);

    while (string::npos != pos) {
        // Found a token, add it to the vector.
        cout << str.substr(lastPos, pos - lastPos) << endl;
        tokens.push_back(str.substr(lastPos, pos - lastPos));
        lastPos = pos + delimiter.size();
        pos = str.find(delimiter, lastPos);
    }

    tokens.push_back(str.substr(lastPos, str.size() - lastPos));
    return tokens;
}