在c++中进行不区分大小写字符串比较而不将字符串转换为全大写或全小写的最佳方法是什么?

请指出这些方法是否对unicode友好,以及它们的可移植性如何。


当前回答

Boost包含一个方便的算法:

#include <boost/algorithm/string.hpp>
// Or, for fewer header dependencies:
//#include <boost/algorithm/string/predicate.hpp>

std::string str1 = "hello, world!";
std::string str2 = "HELLO, WORLD!";

if (boost::iequals(str1, str2))
{
    // Strings are identical
}

其他回答

boost的问题在于,你必须与boost相关联并依赖于boost。在某些情况下并不容易(例如android)。

使用char_traits意味着所有的比较都是不区分大小写的,这通常不是你想要的。

这就足够了。它应该是相当有效的。它不处理unicode或任何东西。

bool iequals(const string& a, const string& b)
{
    unsigned int sz = a.size();
    if (b.size() != sz)
        return false;
    for (unsigned int i = 0; i < sz; ++i)
        if (tolower(a[i]) != tolower(b[i]))
            return false;
    return true;
}

更新:c++ 14版本(#include <算法>):

bool iequals(const string& a, const string& b)
{
    return std::equal(a.begin(), a.end(),
                      b.begin(), b.end(),
                      [](char a, char b) {
                          return tolower(a) == tolower(b);
                      });
}

c++ 20版本使用std::ranges:

#include <ranges>
#include <algorithm>
#include <string>

bool iequals(const std::string_view& lhs, const std::string_view& rhs) {
    auto to_lower{ std::ranges::views::transform(std::tolower) };
    return std::ranges::equal(lhs | to_lower, rhs | to_lower);
}

仅供参考,strcmp()和stricmp()容易受到缓冲区溢出的影响,因为它们只处理到遇到空结束符为止。使用_strncmp()和_strnicmp()更安全。

看到std:: lexicographical_compare:

// lexicographical_compare example
#include <iostream>  // std::cout, std::boolalpha
#include <algorithm>  // std::lexicographical_compare
#include <cctype>  // std::tolower

// a case-insensitive comparison function:
bool mycomp (char c1, char c2) {
    return std::tolower(c1) < std::tolower(c2);
}

int main () {
    char foo[] = "Apple";
    char bar[] = "apartment";

    std::cout << std::boolalpha;

    std::cout << "Comparing foo and bar lexicographically (foo < bar):\n";

    std::cout << "Using default comparison (operator<): ";
    std::cout << std::lexicographical_compare(foo, foo + 5, bar, bar + 9);
    std::cout << '\n';

    std::cout << "Using mycomp as comparison object: ";
    std::cout << std::lexicographical_compare(foo, foo + 5, bar, bar + 9, mycomp);
    std::cout << '\n';

    return 0;
}

Demo

我写了一个不区分大小写的char_traits版本,用于std::basic_string,以便在使用内置的std::basic_string成员函数进行比较、搜索等时生成一个不区分大小写的std::string。

换句话说,我想这样做。

std::string a = "Hello, World!";
std::string b = "hello, world!";

assert( a == b );

...这是std::string不能处理的。下面是我的新char_traits的用法:

std::istring a = "Hello, World!";
std::istring b = "hello, world!";

assert( a == b );

...这是它的实现:

/*  ---

        Case-Insensitive char_traits for std::string's

        Use:

            To declare a std::string which preserves case but ignores case in comparisons & search,
            use the following syntax:

                std::basic_string<char, char_traits_nocase<char> > noCaseString;

            A typedef is declared below which simplifies this use for chars:

                typedef std::basic_string<char, char_traits_nocase<char> > istring;

    --- */

    template<class C>
    struct char_traits_nocase : public std::char_traits<C>
    {
        static bool eq( const C& c1, const C& c2 )
        { 
            return ::toupper(c1) == ::toupper(c2); 
        }

        static bool lt( const C& c1, const C& c2 )
        { 
            return ::toupper(c1) < ::toupper(c2);
        }

        static int compare( const C* s1, const C* s2, size_t N )
        {
            return _strnicmp(s1, s2, N);
        }

        static const char* find( const C* s, size_t N, const C& a )
        {
            for( size_t i=0 ; i<N ; ++i )
            {
                if( ::toupper(s[i]) == ::toupper(a) ) 
                    return s+i ;
            }
            return 0 ;
        }

        static bool eq_int_type( const int_type& c1, const int_type& c2 )
        { 
            return ::toupper(c1) == ::toupper(c2) ; 
        }       
    };

    template<>
    struct char_traits_nocase<wchar_t> : public std::char_traits<wchar_t>
    {
        static bool eq( const wchar_t& c1, const wchar_t& c2 )
        { 
            return ::towupper(c1) == ::towupper(c2); 
        }

        static bool lt( const wchar_t& c1, const wchar_t& c2 )
        { 
            return ::towupper(c1) < ::towupper(c2);
        }

        static int compare( const wchar_t* s1, const wchar_t* s2, size_t N )
        {
            return _wcsnicmp(s1, s2, N);
        }

        static const wchar_t* find( const wchar_t* s, size_t N, const wchar_t& a )
        {
            for( size_t i=0 ; i<N ; ++i )
            {
                if( ::towupper(s[i]) == ::towupper(a) ) 
                    return s+i ;
            }
            return 0 ;
        }

        static bool eq_int_type( const int_type& c1, const int_type& c2 )
        { 
            return ::towupper(c1) == ::towupper(c2) ; 
        }       
    };

    typedef std::basic_string<char, char_traits_nocase<char> > istring;
    typedef std::basic_string<wchar_t, char_traits_nocase<wchar_t> > iwstring;

支持unicode的Visual c++字符串函数:http://msdn.microsoft.com/en-us/library/cc194799.aspx

您可能正在寻找的是_wcsnicmp