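What is the best way to do a case-insensitive string comparison in C++ without converting the strings to all uppercase or all lowercase?
Please indicate whether the methods are Unicode-friendly and how portable they are.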
Current answer
Unicode-aware Visual C++ string functions: http://msdn.microsoft.com/en-us/library/cc194799.aspx
You are probably looking for _wcsnicmp, which is specific to Microsoft's C runtime.
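On the portability side, here is a rough sketch: POSIX.1-2008 systems provide wcsncasecmp with the same argument order as _wcsnicmp, so a thin wrapper can pick one at compile time. The name wide_compare_nocase is made up for illustration. Both functions map characters one at a time using the current locale's case tables, so treat this as an approximation under those assumptions, not as full Unicode case folding.

```cpp
#include <cassert>
#include <cstddef>
#include <wchar.h>  // _wcsnicmp (Microsoft CRT) or wcsncasecmp (POSIX.1-2008)

// Hypothetical helper name, introduced only for this sketch.
// Compares at most n wide characters, ignoring case.
// Returns <0, 0 or >0 like wcsncmp.
inline int wide_compare_nocase(const wchar_t* a, const wchar_t* b, std::size_t n)
{
    assert(a != nullptr && b != nullptr);
#ifdef _WIN32
    return _wcsnicmp(a, b, n);    // Microsoft-specific
#else
    return wcsncasecmp(a, b, n);  // POSIX.1-2008
#endif
}
```

A call such as wide_compare_nocase(L"Hello", L"hELLO", 5) is expected to return 0 for plain ASCII letters; behavior for other characters depends on the active locale and on how wide wchar_t is on the platform.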
Other answers
See std::lexicographical_compare:
// lexicographical_compare example
#include <iostream>   // std::cout, std::boolalpha
#include <algorithm>  // std::lexicographical_compare
#include <cctype>     // std::tolower

// a case-insensitive comparison function:
bool mycomp (char c1, char c2) {
    // Cast to unsigned char first: std::tolower has undefined behavior for negative values.
    return std::tolower(static_cast<unsigned char>(c1)) < std::tolower(static_cast<unsigned char>(c2));
}

int main () {
    char foo[] = "Apple";
    char bar[] = "apartment";
    std::cout << std::boolalpha;
    std::cout << "Comparing foo and bar lexicographically (foo < bar):\n";
    std::cout << "Using default comparison (operator<): ";
    std::cout << std::lexicographical_compare(foo, foo + 5, bar, bar + 9);
    std::cout << '\n';
    std::cout << "Using mycomp as comparison object: ";
    std::cout << std::lexicographical_compare(foo, foo + 5, bar, bar + 9, mycomp);
    std::cout << '\n';
    return 0;
}
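The same comparison can be wrapped for std::string and reused as an ordering for containers. This is only a sketch: the names CaseInsensitiveLess and CaseInsensitiveSet are hypothetical, and it inherits the limits of std::tolower (single-byte characters, current C locale), so it is not Unicode-aware.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <set>
#include <string>

// Hypothetical functor name, introduced only for this sketch.
// Orders two strings lexicographically, ignoring case.
struct CaseInsensitiveLess
{
    bool operator()(const std::string& a, const std::string& b) const
    {
        return std::lexicographical_compare(
            a.begin(), a.end(), b.begin(), b.end(),
            [](char x, char y) {
                // Cast through unsigned char to avoid undefined behavior.
                return std::tolower(static_cast<unsigned char>(x))
                     < std::tolower(static_cast<unsigned char>(y));
            });
    }
};

// A set that treats "Apple" and "APPLE" as the same key.
typedef std::set<std::string, CaseInsensitiveLess> CaseInsensitiveSet;
```

Two strings are equal under this ordering when neither compares less than the other, so an equality test costs two passes; the original strings are never modified or copied.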
Late to the party, but here is a variant that uses std::locale and can therefore handle Turkish correctly when a Turkish locale is active:
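// std::bind1st and std::mem_fun were removed in C++17; a lambda does the same job.
// Requires <locale>.
const std::locale loc;  // copy of the current global locale
auto tolower = [loc](char c) {
    return std::use_facet<std::ctype<char> >(loc).tolower(c);
};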
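This gives you a functor that converts a character to lowercase using the active locale, which you can then pass to std::transform to produce a lowercase string: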
std::string left = "fOo";
std::transform(left.begin(), left.end(), left.begin(), tolower);
This also works for wchar_t-based strings.
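Since the question asks for a comparison that avoids converting the strings, here is a minimal sketch of the same locale-based idea applied to std::wstring without building lowercase copies. The function name equals_nocase is hypothetical, and the default argument assumes the global locale is the one you want. It still compares one wchar_t at a time, so multi-character case mappings are out of reach.

```cpp
#include <algorithm>
#include <cassert>
#include <locale>
#include <string>

// Hypothetical helper name, introduced only for this sketch.
// Returns true when a and b are equal ignoring case under the given locale.
inline bool equals_nocase(const std::wstring& a, const std::wstring& b,
                          const std::locale& loc = std::locale())
{
    if (a.size() != b.size()) return false;
    // The facet stays valid while loc is alive, which covers this call.
    const std::ctype<wchar_t>& facet = std::use_facet<std::ctype<wchar_t> >(loc);
    return std::equal(a.begin(), a.end(), b.begin(),
                      [&facet](wchar_t x, wchar_t y) {
                          return facet.tolower(x) == facet.tolower(y);
                      });
}
```

To compare under a specific locale instead of the global one, pass it as the third argument, for example a std::locale constructed from a locale name that your system actually provides.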
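A simple way to compare strings that differ only in letter case is to compare ASCII codes. Every uppercase letter is exactly 32 below its lowercase counterpart in the ASCII table; using that, we have the following. Note that this works for plain ASCII letters only.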
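// Assumes string1 and string2 are std::string objects holding ASCII text.
bool match = string1.length() == string2.length();
for (std::size_t i = 0; match && i < string2.length(); i++)
{
    unsigned char a = string1[i];
    unsigned char b = string2[i];
    // Fold ASCII uppercase letters to lowercase by adding 32; leave everything else alone.
    if (a >= 'A' && a <= 'Z') a += 32;
    if (b >= 'A' && b <= 'Z') b += 32;
    if (a != b)
    {
        match = false;
    }
}
if (match)
{
    // then we have a match
}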
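I wrote a case-insensitive version of char_traits for use with std::basic_string, so that the built-in std::basic_string member functions for comparing, searching and so on operate on a case-insensitive string type.
In other words, I wanted to do something like this: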
std::string a = "Hello, World!";
std::string b = "hello, world!";
assert( a == b );
...which std::string can't handle. Here is how my new char_traits is used:
// istring is the typedef declared at global scope below; it is not a member of namespace std.
istring a = "Hello, World!";
istring b = "hello, world!";
assert( a == b );
...and here is the implementation:
/* ---
Case-Insensitive char_traits for std::string's
Use:
To declare a std::string which preserves case but ignores case in comparisons & search,
use the following syntax:
std::basic_string<char, char_traits_nocase<char> > noCaseString;
A typedef is declared below which simplifies this use for chars:
typedef std::basic_string<char, char_traits_nocase<char> > istring;
--- */
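#include <ctype.h>   // ::toupper
#include <wctype.h>  // ::towupper
#include <string>    // std::char_traits, std::basic_string
#include <string.h>  // _strnicmp (Microsoft-specific)
#include <wchar.h>   // _wcsnicmp (Microsoft-specific)

template<class C>
struct char_traits_nocase : public std::char_traits<C>
{
    static bool eq( const C& c1, const C& c2 )
    {
        return ::toupper(c1) == ::toupper(c2);
    }
    static bool lt( const C& c1, const C& c2 )
    {
        return ::toupper(c1) < ::toupper(c2);
    }
    static int compare( const C* s1, const C* s2, size_t N )
    {
        // _strnicmp is Microsoft-specific and takes char strings only; POSIX offers strncasecmp instead.
        return _strnicmp(s1, s2, N);
    }
    static const C* find( const C* s, size_t N, const C& a )
    {
        for( size_t i=0 ; i<N ; ++i )
        {
            if( ::toupper(s[i]) == ::toupper(a) )
                return s+i ;
        }
        return 0 ;
    }
    // int_type is a dependent name here, so it has to be qualified through the base class.
    static bool eq_int_type( const typename std::char_traits<C>::int_type& c1, const typename std::char_traits<C>::int_type& c2 )
    {
        return ::toupper(c1) == ::toupper(c2) ;
    }
};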
template<>
struct char_traits_nocase<wchar_t> : public std::char_traits<wchar_t>
{
    static bool eq( const wchar_t& c1, const wchar_t& c2 )
    {
        return ::towupper(c1) == ::towupper(c2);
    }
    static bool lt( const wchar_t& c1, const wchar_t& c2 )
    {
        return ::towupper(c1) < ::towupper(c2);
    }
    static int compare( const wchar_t* s1, const wchar_t* s2, size_t N )
    {
        return _wcsnicmp(s1, s2, N);
    }
    static const wchar_t* find( const wchar_t* s, size_t N, const wchar_t& a )
    {
        for( size_t i=0 ; i<N ; ++i )
        {
            if( ::towupper(s[i]) == ::towupper(a) )
                return s+i ;
        }
        return 0 ;
    }
    static bool eq_int_type( const int_type& c1, const int_type& c2 )
    {
        return ::towupper(c1) == ::towupper(c2) ;
    }
};
typedef std::basic_string<char, char_traits_nocase<char> > istring;
typedef std::basic_string<wchar_t, char_traits_nocase<wchar_t> > iwstring;
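Because _strnicmp and _wcsnicmp exist only in Microsoft's runtime, the traits above do not compile elsewhere as written. Below is a minimal portable sketch of the same idea for char, with compare written as a plain loop. The names portable_nocase_traits and portable_istring are hypothetical and not part of the answer; the case mapping is still std::toupper in the current C locale, so this is not Unicode-aware.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical name, introduced only for this sketch.
struct portable_nocase_traits : public std::char_traits<char>
{
    static int fold(char c)
    {
        // Cast through unsigned char: passing a negative char to toupper is undefined.
        return std::toupper(static_cast<unsigned char>(c));
    }
    static bool eq(char a, char b) { return fold(a) == fold(b); }
    static bool lt(char a, char b) { return fold(a) < fold(b); }
    // Three-way comparison of the first n characters, ignoring case.
    static int compare(const char* s1, const char* s2, std::size_t n)
    {
        for (std::size_t i = 0; i < n; ++i)
        {
            if (lt(s1[i], s2[i])) return -1;
            if (lt(s2[i], s1[i])) return 1;
        }
        return 0;
    }
    // Pointer to the first character matching a (ignoring case), or null.
    static const char* find(const char* s, std::size_t n, char a)
    {
        for (std::size_t i = 0; i < n; ++i)
            if (eq(s[i], a)) return s + i;
        return nullptr;
    }
};

typedef std::basic_string<char, portable_nocase_traits> portable_istring;
```

One thing to keep in mind with any custom traits type: the resulting string is a different type from std::string, so it can't be streamed to std::cout or assigned to a std::string directly. Going through c_str() is the usual workaround.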
For a non-Unicode version, my first idea was something like this:
// Requires <string> and <cctype>. Returns true when the two strings are equal ignoring case.
bool caseInsensitiveStringCompare(const std::string& str1, const std::string& str2) {
    if (str1.size() != str2.size()) {
        return false;
    }
    for (std::string::const_iterator c1 = str1.begin(), c2 = str2.begin(); c1 != str1.end(); ++c1, ++c2) {
        if (std::tolower(static_cast<unsigned char>(*c1)) != std::tolower(static_cast<unsigned char>(*c2))) {
            return false;
        }
    }
    return true;
}