How do I tokenize a string in C++?

Java has a convenient split method:

String str = "The quick brown fox";
String[] results = str.split(" ");

Is there an easy way to do this in C++?

Current answer

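Use strtok. In my opinion, there isn't a need to build a class around tokenizing unless strtok doesn't provide you with what you need. It might not, but in 15+ years of writing various parsing code in C and C++, I've always used strtok. Here is an example: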

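#include <stdio.h>
#include <string.h>

int main() {
    char myString[] = "The quick brown fox";  // must be writable: strtok modifies it
    char *p = strtok(myString, " ");
    while (p) {
        printf("Token: %s\n", p);
        p = strtok(NULL, " ");  // NULL continues with the same string
    }
}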

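A few caveats (which might not suit your needs). The string is "destroyed" in the process: strtok overwrites each delimiter it finds with an end-of-string character ('\0'). Correct usage may therefore require you to make a non-const copy of the string first. You can also change the list of delimiters in the middle of parsing.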
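To make the "non-const copy" caveat concrete, here is a minimal sketch that tokenizes a copy of a std::string so the caller's string survives. The helper name tokenize_copy is hypothetical and not part of the answer; it only wraps the same strtok loop shown above.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not from the answer): tokenizes a copy of `input`
// with strtok so that the caller's string is left untouched.
std::vector<std::string> tokenize_copy(const std::string& input, const char* delimiters)
{
    assert(delimiters != nullptr);

    // strtok writes '\0' over delimiters, so work on a mutable,
    // null-terminated copy rather than on input.c_str().
    std::vector<char> buffer(input.begin(), input.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;
    for (char* p = std::strtok(buffer.data(), delimiters); p != nullptr;
         p = std::strtok(nullptr, delimiters)) {
        tokens.emplace_back(p);  // copy the token out of the scratch buffer
    }
    return tokens;
}
```

Calling tokenize_copy("The quick brown fox", " ") should yield four tokens and leave the argument unchanged. Two things to keep in mind: strtok treats a run of adjacent delimiters as a single separator, so empty tokens are never reported, and it keeps its position in hidden static state, so two tokenizations cannot be interleaved.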

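In my opinion, the code above is far simpler and easier to use than writing a separate class for it. To me, this is one of those functions that the language provides, and it does the job well and cleanly. It is simply a "C-based" solution. It's appropriate, it's easy, and you don't have to write a lot of extra code :-)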

2008-09-10 13:37:33

Other answers

pystring is a small library that implements a number of Python's string functions, including the split method:

#include <string>
#include <vector>
#include "pystring.h"

std::vector<std::string> chunks;
pystring::split("this string", chunks);

// also can specify a separator
pystring::split("this-string", chunks, "-");

2011-12-29 15:17:58

The Boost tokenizer class can make this sort of thing quite simple:

#include <iostream>
#include <string>
#include <boost/foreach.hpp>
#include <boost/tokenizer.hpp>

using namespace std;
using namespace boost;

int main(int, char**)
{
    string text = "token, test   string";

    char_separator<char> sep(", ");
    tokenizer< char_separator<char> > tokens(text, sep);
    BOOST_FOREACH (const string& t, tokens) {
        cout << t << "." << endl;
    }
}

Updated for C++11:

#include <iostream>
#include <string>
#include <boost/tokenizer.hpp>

using namespace std;
using namespace boost;

int main(int, char**)
{
    string text = "token, test   string";

    char_separator<char> sep(", ");
    tokenizer<char_separator<char>> tokens(text, sep);
    for (const auto& t : tokens) {
        cout << t << "." << endl;
    }
}

2008-09-11 02:10:33

I know this question has already been answered, but I want to contribute. Maybe my solution is a bit simple, but this is what I came up with:

vector<string> get_words(string const& text, string const& separator)
{
    vector<string> result;
    string tmp = text;

    size_t first_pos = 0;
    size_t second_pos = tmp.find(separator);

    while (second_pos != string::npos)
    {
        if (first_pos != second_pos)
        {
            string word = tmp.substr(first_pos, second_pos - first_pos);
            result.push_back(word);
        }
        tmp = tmp.substr(second_pos + separator.length());
        second_pos = tmp.find(separator);
    }

    result.push_back(tmp);

    return result;
}

If there is a better approach to something in my code, or if something is wrong, please comment.

UPDATE: added a generic separator
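As one possible refinement, the same idea can be written without copying the remaining text on every iteration, by passing a start offset to find. This is only a sketch: the name split_words is hypothetical, and it is intended to return the same tokens as get_words for a non-empty separator.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical variant of get_words: same idea, but it advances a start
// offset instead of copying the remaining text on every iteration.
std::vector<std::string> split_words(const std::string& text, const std::string& separator)
{
    assert(!separator.empty());  // an empty separator would never advance

    std::vector<std::string> result;
    std::size_t start = 0;
    std::size_t end = text.find(separator, start);

    while (end != std::string::npos) {
        if (end != start) {  // skip empty tokens between adjacent separators
            result.push_back(text.substr(start, end - start));
        }
        start = end + separator.length();
        end = text.find(separator, start);
    }

    // Like get_words, always append whatever follows the last separator,
    // even when that remainder is empty.
    result.push_back(text.substr(start));
    return result;
}
```

For example, split_words("The quick brown fox", " ") should return the four words. Note that, as in get_words, empty tokens in the middle are dropped, but a trailing separator still produces a final empty string.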

2018-05-09 07:12:21

I wrote a simplified (and perhaps slightly more efficient) version of https://stackoverflow.com/a/50247503/3976739 for myself. I hope it helps.

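// Requires <cstring>, <string> and <vector>.
void StrTokenizer(const string& source, const char* delimiter, vector<string>& Tokens)
{
   const size_t delimiter_length = std::strlen(delimiter);
   if (delimiter_length == 0)   // nothing to split on: return the input as one token
   {
      Tokens.emplace_back(source);
      return;
   }

   size_t new_index = 0;
   size_t old_index = 0;

   while (new_index != std::string::npos)
   {
      new_index = source.find(delimiter, old_index);
      // When no delimiter remains, new_index is npos and substr takes the rest.
      Tokens.emplace_back(source.substr(old_index, new_index - old_index));

      if (new_index != std::string::npos)
          old_index = new_index + delimiter_length;   // skip the whole delimiter, not just one character
   }
}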

2022-03-21 21:12:56

I know you asked for a C++ solution, but you might find this Qt approach helpful:

#include <QString>

...

QString str = "The quick brown fox"; 
QStringList results = str.split(" ");

The advantage over Boost in this example is that it maps directly, one to one, onto the code in your post.

See the Qt documentation for more.

2010-08-04 17:34:03
