How to convert an instance of std::string to lowercase

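I want to convert a std::string to lowercase. I am aware of the function tolower(). However, I have had issues with this function in the past, and it is hardly ideal anyway, since using it with a std::string requires iterating over each character.

Is there an alternative that works 100% of the time?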

Current answer

My own template functions, which perform upper/lower case conversion.

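#include <algorithm>
#include <locale>
#include <string>

//
//  Returns a lowercased copy of s, using the global locale
//
template <typename T>
std::basic_string<T> lowercase(const std::basic_string<T>& s)
{
    std::basic_string<T> s2 = s;
    const std::locale loc;
    // The <locale> overload of std::tolower is a template on the character
    // type, so it works for char and wchar_t; the lambda avoids passing an
    // overloaded function name to std::transform.
    std::transform(s2.begin(), s2.end(), s2.begin(),
                   [&loc](T c) { return std::tolower(c, loc); });
    return s2;
}

//
//  Returns an uppercased copy of s, using the global locale
//
template <typename T>
std::basic_string<T> uppercase(const std::basic_string<T>& s)
{
    std::basic_string<T> s2 = s;
    const std::locale loc;
    std::transform(s2.begin(), s2.end(), s2.begin(),
                   [&loc](T c) { return std::toupper(c, loc); });
    return s2;
}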

2019-05-18 14:40:40

Other answers

Have a look at the excellent C++17 cpp-unicodelib (GitHub). It is single-file and header-only.


#include <codecvt>
#include <iostream>
#include <locale>
#include <string>

// cpp-unicodelib, downloaded from GitHub
#include "unicodelib.h"
#include "unicodelib_encodings.h"

using namespace std;
using namespace unicode;

int main()
{
    // converter that allows displaying a UTF-32 string
    // (std::wstring_convert and <codecvt> are deprecated since C++17)
    wstring_convert<codecvt_utf8<char32_t>, char32_t> converter;

    std::u32string in = U"Je suis là!";
    cout << converter.to_bytes(in) << endl;

    std::u32string lc = to_lowercase(in);
    cout << converter.to_bytes(lc) << endl;
}

Output

Je suis là!
je suis là!

2022-04-25 13:18:34

tl;dr

Use the ICU library. If you don't, your conversion routine will break silently on cases you are probably not even aware exist.

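First you have to answer a question: what is the encoding of your std::string? Is it ISO-8859-1? Or perhaps ISO-8859-8? Or Windows Codepage 1252? Does whatever you are using to convert upper to lower case know that? (Or does it fail miserably for characters above 0x7f?)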

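If you are using UTF-8 (the only sane choice among the 8-bit encodings) with std::string as the container, you are already deceiving yourself if you believe you are still in control of things. You are storing a multibyte character sequence in a container that is not aware of the multibyte concept, and neither are most of the operations you can perform on it! Even something as simple as .substr() can produce invalid (sub)strings, because you split in the middle of a multibyte sequence.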

As soon as you try something like std::toupper( 'ß' ), or std::tolower( 'Σ' ) in any encoding, you are in trouble. Because 1), the standard only ever operates on one character at a time, so it simply cannot turn ß into SS as would be correct. And 2), the standard only ever operates on one character at a time, so it cannot decide whether Σ is in the middle of a word (where σ would be correct), or at the end (ς). Another example would be std::tolower( 'I' ), which should yield different results depending on the locale -- virtually everywhere you would expect i, but in Turkey ı (LATIN SMALL LETTER DOTLESS I) is the correct answer (which, again, is more than one byte in UTF-8 encoding).

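So any case conversion that works on one character at a time, or worse, one byte at a time, is broken by design. This includes all the std:: variants in existence at this time.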

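Then there is the point that what the standard library is capable of doing depends on which locales are supported on the machine your software runs on... and what do you do if your target locale is among those not supported on your client's machine?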

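So what you are really looking for is a string class that is capable of dealing with all this correctly, and that is not any of the std::basic_string<> variants.

(C++11 note: std::u16string and std::u32string are better, but still not perfect. C++20 brought std::u8string, but all these do is specify the encoding. In many other respects they remain ignorant of Unicode mechanics such as normalization, collation, ...)

While Boost looks nice API-wise, Boost.Locale is basically a wrapper around ICU, and only if Boost was compiled with ICU support. If it wasn't, Boost.Locale is limited to the locale support compiled for the standard library.

And believe me, getting Boost to compile with ICU can be a real pain sometimes. (There are no pre-compiled binaries for Windows that include ICU, so you would have to supply them together with your application, and that opens a whole new can of worms...)

So personally I would recommend getting full Unicode support straight from the horse's mouth and using the ICU library directly: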

#include <unicode/unistr.h>
#include <unicode/ustream.h>
#include <unicode/locid.h>

#include <iostream>

int main()
{
    /*                          "Odysseus" */
    char const * someString = u8"ΟΔΥΣΣΕΥΣ";
    icu::UnicodeString someUString( someString, "UTF-8" );
    // Setting the locale explicitly here for completeness.
    // Usually you would use the user-specified system locale,
    // which *does* make a difference (see ı vs. i above).
    std::cout << someUString.toLower( "el_GR" ) << "\n";
    std::cout << someUString.toUpper( "el_GR" ) << "\n";
    return 0;
}

Compile (with g++ in this example):

g++ -Wall example.cpp -licuuc -licuio

This gives:

οδυσσευς
ΟΔΥΣΣΕΥΣ

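Note the Σ<->σ conversion in the middle of the word, and the Σ<->ς conversion at the end of the word. No <algorithm>-based solution can give you that.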

2014-06-05 15:06:39

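I tried std::transform, and all I got were abominable, cryptic STL compile errors that only druids from 200 years ago could understand (cannot convert from flibidi to flabidi flu).

This works fine and can be easily tweaked: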

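#include <cstddef>
#include <string>

// ASCII only: bytes outside 'A'..'Z' are left untouched.
std::string LowerCase(std::string s)
{
    const int dif = 'a' - 'A';
    for (std::size_t i = 0; i < s.length(); i++)
    {
        if ((s[i] >= 'A') && (s[i] <= 'Z'))
            s[i] += dif;
    }
    return s;
}

// ASCII only: bytes outside 'a'..'z' are left untouched.
std::string UpperCase(std::string s)
{
    const int dif = 'a' - 'A';
    for (std::size_t i = 0; i < s.length(); i++)
    {
        if ((s[i] >= 'a') && (s[i] <= 'z'))
            s[i] -= dif;
    }
    return s;
}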

2014-07-10 14:20:34

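This is a follow-up to Stefan Mai's response: if you want to place the result of the conversion in another string, you need to pre-allocate its storage before calling std::transform. The STL stores the transformed characters through the destination iterator (incrementing it on each iteration of the loop), so the destination string is not resized automatically, and you risk writing past the end of its memory.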

#include <algorithm>
#include <cctype>
#include <iostream>
#include <string>

int main (int argc, char* argv[])
{
  std::string sourceString = "Abc";
  std::string destinationString;

  // Allocate the destination space
  destinationString.resize(sourceString.size());

  // Convert the source string to lower case
  // storing the result in destination string.
  // The cast to unsigned char avoids undefined behavior in std::tolower
  // for negative char values.
  std::transform(sourceString.begin(),
                 sourceString.end(),
                 destinationString.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

  // Output the result of the conversion
  std::cout << sourceString
            << " -> "
            << destinationString
            << std::endl;
}

2013-03-28 06:25:54

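C++ does not provide tolower or toupper methods for std::string, but they are available for char. You can easily read each character of the string, convert it to the required case, and put it back into the string. Sample code that uses no third-party library: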

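#include <cctype>
#include <iostream>
#include <string>

int main(){
    std::string str = std::string("How ARe You");
    for(char &ch : str){
        // cast to unsigned char: std::tolower is undefined for negative values
        ch = static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
    }
    std::cout<<str<<std::endl;
    return 0;
}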

For character-based operations on a string, see: For every character in string

2019-03-17 14:35:38
