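How do I iterate over the words of a string composed of words separated by whitespace?
Note that I'm not interested in C string functions or that kind of character manipulation/access. I also prefer elegance over efficiency. My current solution: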
#include <iostream>
#include <sstream>
#include <string>
using namespace std;
int main() {
    string s = "Somewhere down the road";
    istringstream iss(s);
    string subs;
    // Test the extraction itself, so no empty substring is printed at end of input
    while (iss >> subs) {
        cout << "Substring: " << subs << endl;
    }
}
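Although some answers already provide C++20 solutions, changes have been made since they were posted and applied to C++20 as defect reports. Because of that, the solution becomes shorter and nicer: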
#include <iostream>
#include <ranges>
#include <string_view>
namespace views = std::views;
using str = std::string_view;
constexpr str text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit.";
auto splitByWords(str input) {
    return input
        | views::split(' ')
        | views::transform([](auto &&r) -> str {
            return {r.begin(), r.end()};
        });
}
auto main() -> int {
    for (str &&word : splitByWords(text)) {
        std::cout << word << '\n';
    }
}
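As of this writing, it is still only available on the trunk branch of GCC (Godbolt link). It is based on two changes: P1391, the iterator constructor for std::string_view, and the P2210 DR, which fixes std::views::split to preserve the range type.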
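In C++23, the transform boilerplate is no longer needed, because P1989 adds a range constructor to std::string_view. That constructor was later made explicit (P2499), so the conversion is written out in the loop body:
#include <iostream>
#include <ranges>
#include <string_view>
namespace views = std::views;
constexpr std::string_view text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit.";
auto main() -> int {
    for (auto&& word : text | views::split(' ')) {
        // The range constructor is explicit, so convert each subrange by hand
        std::cout << std::string_view(word) << '\n';
    }
}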
(Godbolt link)
If you want to split a string by several characters, you can use:
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>
using namespace std;
// Replaces every occurrence of the secondary dividers with the first divider.
void replaceOtherChars(string &input, vector<char> &dividers)
{
    const char divider = dividers.at(0);
    vector<char>::iterator it_begin = dividers.begin() + 1,
                           it_end = dividers.end();
    for (; it_begin != it_end; ++it_begin)
    {
        // find_first_of returns an unsigned position, so compare against npos
        string::size_type replaceIndex = 0;
        while (true)
        {
            replaceIndex = input.find_first_of(*it_begin, replaceIndex);
            if (replaceIndex == string::npos)
                break;
            input.at(replaceIndex) = divider;
        }
    }
}
vector<string> split(string str, vector<char> chars, bool missEmptySpace = true)
{
    vector<string> result;
    const char divider = chars.at(0);
    replaceOtherChars(str, chars);
    stringstream stream;
    stream << str;
    string temp;
    while (getline(stream, temp, divider))
    {
        if (missEmptySpace && temp.empty())
            continue;
        result.push_back(temp);
    }
    return result;
}
int main()
{
    string str = "milk, pigs.... hot-dogs ";
    vector<char> arr;
    arr.push_back(' '); arr.push_back(','); arr.push_back('.');
    vector<string> result = split(str, arr);
    vector<string>::iterator it_begin = result.begin(),
                             it_end = result.end();
    for (; it_begin != it_end; ++it_begin)
    {
        cout << *it_begin << endl;
    }
    return 0;
}
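The same kind of split can also be done without rewriting the input first. What follows is only a minimal sketch, not part of the answer above: the helper name split_any is made up for illustration, and it always skips empty tokens (the behavior of missEmptySpace = true). It walks the string with std::string::find_first_not_of and std::string::find_first_of.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (name chosen for this sketch): splits `input` at any
// character contained in `delims` and skips empty tokens.
std::vector<std::string> split_any(const std::string &input,
                                   const std::string &delims)
{
    std::vector<std::string> result;
    // Start of the next token: the first character that is not a delimiter.
    std::string::size_type start = input.find_first_not_of(delims);
    while (start != std::string::npos) {
        // End of the token: the next delimiter, or npos at the end of input.
        std::string::size_type end = input.find_first_of(delims, start);
        // substr clamps the count, so end == npos takes the rest of the string.
        result.push_back(input.substr(start, end - start));
        start = input.find_first_not_of(delims, end);
    }
    return result;
}
```

Calling split_any("milk, pigs.... hot-dogs ", " ,.") should give the same three words as the example above: milk, pigs and hot-dogs.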
I recently had to split a camel-cased word into subwords. There are no delimiters, just upper-case characters.
#include <string>
#include <list>
#include <locale> // std::isupper
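template<class String>
const std::list<String> split_camel_case_string(const String &s)
{
    std::list<String> R;
    String w;
    // typename is required because const_iterator is a dependent type
    for (typename String::const_iterator i = s.begin(); i < s.end(); ++i) {
        // The <locale> overload of std::isupper takes the locale explicitly
        if (std::isupper(*i, std::locale())) {
            if (w.length()) {
                R.push_back(w);
                w.clear();
            }
        }
        w += *i;
    }
    if (w.length())
        R.push_back(w);
    return R;
}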
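For example, this splits "AQueryTrades" into "A", "Query" and "Trades". The function works with narrow and wide strings. Because it respects the current locale, it splits "RaumfahrtÜberwachungsVerordnung" into "Raumfahrt", "Überwachungs" and "Verordnung".
Note that std::isupper should really be passed as a function template argument. The more generalized form of this function could then also split at delimiters such as ",", ";" or " ".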
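One way that generalization might look is sketched below. This is an assumption about what the author had in mind, not their code: the names split_at, is_upper and is_separator, and the keep_boundary flag, are invented for this example. The predicate decides where a new word starts, and the flag decides whether the matching character is kept (upper-case letters) or dropped (delimiters).

```cpp
#include <list>
#include <locale>
#include <string>

// Hypothetical generalization: `is_boundary(c)` marks where a word starts.
// With keep_boundary == true the matching character begins the next word;
// with keep_boundary == false it is dropped, as for a delimiter.
template<class String, class Pred>
std::list<String> split_at(const String &s, Pred is_boundary, bool keep_boundary)
{
    std::list<String> words;
    String w;
    for (typename String::const_iterator i = s.begin(); i != s.end(); ++i) {
        if (is_boundary(*i)) {
            if (!w.empty()) {          // never emit empty words
                words.push_back(w);
                w.clear();
            }
            if (!keep_boundary)
                continue;              // drop the delimiter itself
        }
        w += *i;
    }
    if (!w.empty())
        words.push_back(w);
    return words;
}

// Example predicates for narrow strings (illustrative only).
inline bool is_upper(char c) { return std::isupper(c, std::locale()); }
inline bool is_separator(char c) { return c == ',' || c == ';' || c == ' '; }
```

With these, split_at(std::string("AQueryTrades"), is_upper, true) reproduces the camel-case split, while split_at(std::string("a,b;c d"), is_separator, false) splits at the delimiters and discards them.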
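Here's a better way of doing it. It can take any character and doesn't split lines unless you want it to. No special libraries are needed (well, besides std, but who really considers that an extra library), no pointers, no references, and it's static. Just plain, simple C++.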
#pragma once
#include <sstream>
#include <string>
#include <vector>
using namespace std;
class Helpers
{
public:
    static vector<string> split(string s, char delim)
    {
        stringstream temp (stringstream::in | stringstream::out);
        vector<string> elems(0);
        if (s.size() == 0 || delim == 0)
            return elems;
        for(char c : s)
        {
            if(c == delim)
            {
                elems.push_back(temp.str());
                temp = stringstream(stringstream::in | stringstream::out);
            }
            else
                temp << c;
        }
        if (temp.str().size() > 0)
            elems.push_back(temp.str());
        return elems;
    }
    // Splits string s at any of the delimiters in delims. delims is just a list of
    // characters: to split at the letters a, b and c, pass delims = "abc".
    static vector<string> split(string s, string delims)
    {
        stringstream temp (stringstream::in | stringstream::out);
        vector<string> elems(0);
        bool found;
        if(s.size() == 0 || delims.size() == 0)
            return elems;
        for(char c : s)
        {
            found = false;
            for(char d : delims)
            {
                if (c == d)
                {
                    elems.push_back(temp.str());
                    temp = stringstream(stringstream::in | stringstream::out);
                    found = true;
                    break;
                }
            }
            if(!found)
                temp << c;
        }
        if(temp.str().size() > 0)
            elems.push_back(temp.str());
        return elems;
    }
};
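Here's a regex solution that only uses the standard regex library. (I'm a little rusty, so there may be a few syntax errors, but this is at least the general idea.)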
#include <regex>
#include <string>
#include <vector>
using namespace std;
vector<string> split(string s) {
    regex r("\\w+"); // matches whole words (greedy, so no word fragments)
    regex_iterator<string::iterator> rit(s.begin(), s.end(), r);
    regex_iterator<string::iterator> rend; // default-constructed end iterator
    vector<string> result;
    // Iterate through the matches to fill the vector
    for (; rit != rend; ++rit)
        result.push_back(rit->str());
    return result;
}
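A related variant matches the separators instead of the words. The sketch below is an addition, not the answer's code: the function name split_on_whitespace is made up, and it uses std::sregex_token_iterator with submatch index -1, which yields the text between matches.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for this sketch): splits on runs of
// whitespace rather than matching words.
std::vector<std::string> split_on_whitespace(const std::string &s)
{
    static const std::regex separator("\\s+");
    // Index -1 selects the pieces between separator matches.
    std::sregex_token_iterator first(s.begin(), s.end(), separator, -1), last;
    std::vector<std::string> result;
    for (; first != last; ++first) {
        // Leading whitespace produces an empty first piece, so skip empties.
        if (first->length() > 0)
            result.push_back(first->str());
    }
    return result;
}
```

The two approaches differ on punctuation: \w+ treats the hyphen in "hot-dogs" as a break and yields "hot" and "dogs", whereas splitting on whitespace keeps "hot-dogs" as one token.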