How do I tokenize a string in C++?

Java has a convenient split method:

String str = "The quick brown fox";
String[] results = str.split(" ");

Is there an easy way to do this in C++?

Current answer

Another quick way is to use getline. Something like:

stringstream ss("bla bla");
string s;

while (getline(ss, s, ' ')) {
 cout << s << endl;
}

If you want, you can write a simple split() function that returns a vector<string>, which is really useful.

2008-11-28 04:17:39

Other answers

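Adam Pierce's answer provides a hand-written tokenizer that takes a const char*. Doing the same with iterators is a bit more problematic, because incrementing a string's end iterator is undefined. That said, given string str{ "The quick brown fox" }, we can certainly accomplish this: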

auto start = find(cbegin(str), cend(str), ' ');
vector<string> tokens{ string(cbegin(str), start) };

while (start != cend(str)) {
    const auto finish = find(++start, cend(str), ' ');

    tokens.push_back(string(start, finish));
    start = finish;
}

Live Example

If you're looking to abstract away the complexity by using standard functionality, strtok is a simple option, as On Freund suggests:

vector<string> tokens;

for (auto i = strtok(data(str), " "); i != nullptr; i = strtok(nullptr, " ")) tokens.push_back(i);

If you don't have access to C++17, you'll need to substitute data(str) as in this example: http://ideone.com/8kAGoa

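Though it is not demonstrated in the example, strtok need not use the same delimiter for each token. Along with this advantage, though, there are several drawbacks: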

1. strtok cannot be used on multiple strings at the same time: either a nullptr must be passed to continue tokenizing the current string, or a new char* to tokenize must be passed (there are, however, some non-standard implementations that do support this, such as strtok_s).
2. For the same reason, strtok cannot be used on multiple threads simultaneously (this may, however, be implementation-defined; for example, Visual Studio's implementation is thread-safe).
3. Calling strtok modifies the string it is operating on, so it cannot be used on const strings, const char*s, or literal strings. To tokenize any of these with strtok, or to operate on a string whose contents need to be preserved, str would have to be copied, and then the copy could be operated on.

C++20 provides us with split_view to tokenize strings in a non-destructive manner: https://topanswers.xyz/cplusplus?q=749#a874

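None of the previous methods can generate a tokenized vector in place, meaning that without abstracting them into a helper function they cannot initialize a const vector<string> tokens. That functionality, and the ability to accept any whitespace delimiter, can be harnessed using an istream_iterator. For example, given const string str{ "The quick \tbrown \nfox" }, we can do this: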

istringstream is{ str };
const vector<string> tokens{ istream_iterator<string>(is), istream_iterator<string>() };

Live Example

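Constructing the istringstream that this option requires costs far more than the previous two options, but that cost is typically hidden in the expense of string allocation.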

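If none of the above options is flexible enough for your tokenization needs, the most flexible option is a regex_token_iterator. Of course, this flexibility comes with greater expense, but again this is likely hidden in the string allocation cost. Say, for example, that we want to tokenize on non-escaped commas while also eating whitespace. Given the input const string str{ " the,qu\\,ick,\tbrown, fox" }, we can do this: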

const regex re{ "\\s*((?:[^\\\\,]|\\\\.)*?)\\s*(?:,|$)" };
const vector<string> tokens{ sregex_token_iterator(cbegin(str), cend(str), re, 1), sregex_token_iterator() };

Live Example

2016-07-26 16:51:20

pystring is a small library that implements a number of Python's string functions, including the split method:

#include <string>
#include <vector>
#include "pystring.h"

std::vector<std::string> chunks;
pystring::split("this string", chunks);

// also can specify a separator
pystring::split("this-string", chunks, "-");

2011-12-29 15:17:58

I know you asked for a C++ solution, but you might find this helpful:

#include <QString>

...

QString str = "The quick brown fox"; 
QStringList results = str.split(" ");

The advantage over Boost in this example is that it maps directly, one to one, onto the code in your post.

See the Qt documentation for more.

2010-08-04 17:34:03

See this example. It might be helpful for you.

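#include <iostream>
#include <sstream>
#include <string>

using namespace std;

int main ()
{
    string tmps;
    istringstream is ("the delimiter is the space");
    // Test the extraction itself: checking is.good() before reading would
    // print the last token twice if the input ended with whitespace.
    while (is >> tmps) {
        cout << tmps << "\n";
    }
    return 0;
}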

2010-12-20 12:25:35

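I posted this answer to a similar question. Don't reinvent the wheel. I've used a number of libraries, and the fastest and most flexible one I have come across is the C++ String Toolkit Library.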

Here is an example of how to use it, which I've posted elsewhere on Stack Overflow.

#include <iostream>
#include <vector>
#include <string>
#include <strtk.hpp>

const char *whitespace  = " \t\r\n\f";
const char *whitespace_and_punctuation  = " \t\r\n\f;,=";

int main()
{
    {   // normal parsing of a string into a vector of strings
       std::string s("Somewhere down the road");
       std::vector<std::string> result;
       if( strtk::parse( s, whitespace, result ) )
       {
           for(size_t i = 0; i < result.size(); ++i )
            std::cout << result[i] << std::endl;
       }
    }

    {  // parsing a string into a vector of floats with other separators
       // besides spaces

       std::string s("3.0, 3.14; 4.0");
       std::vector<float> values;
       if( strtk::parse( s, whitespace_and_punctuation, values ) )
       {
           for(size_t i = 0; i < values.size(); ++i )
            std::cout << values[i] << std::endl;
       }
    }

    {  // parsing a string into specific variables

       std::string s("angle = 45; radius = 9.9");
       std::string w1, w2;
       float v1, v2;
       if( strtk::parse( s, whitespace_and_punctuation, w1, v1, w2, v2) )
       {
           std::cout << "word " << w1 << ", value " << v1 << std::endl;
           std::cout << "word " << w2 << ", value " << v2 << std::endl;
       }
    }

    return 0;
}

2014-01-07 20:33:06
