How do I compare strings in a case-insensitive way in Python?

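I would like to encapsulate the comparison of a regular string to a repository string, using simple and Pythonic code. I would also like to be able to look up values in a dict keyed by strings, using regular Python strings.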


Current answer

def search_specificword(key, stng):
    key = key.lower()
    stng = stng.lower()
    flag_present = False
    if stng.startswith(key+" "):
        flag_present = True
    symb = [',','.']
    for i in symb:
        if stng.find(" "+key+i) != -1:
            flag_present = True
    if key == stng:
        flag_present = True
    if stng.endswith(" "+key):
        flag_present = True
    if stng.find(" "+key+" ") != -1:
        flag_present = True
    print(flag_present)
    return flag_present

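Output:

search_specificword("Affordable housing", "to the core of affordable outHousing in europe")
False

search_specificword("Affordable housing", "to the core of affordable Housing, in europe")
True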

Other answers

Consider using FoldedCase from jaraco.text:

>>> from jaraco.text import FoldedCase
>>> FoldedCase('Hello World') in ['hello world']
True

If you want a dictionary whose keys ignore case, use FoldedCaseKeyedDict from jaraco.collections:

>>> from jaraco.collections import FoldedCaseKeyedDict
>>> d = FoldedCaseKeyedDict()
>>> d['heLlo'] = 'world'
>>> list(d.keys()) == ['heLlo']
True
>>> d['hello'] == 'world'
True
>>> 'hello' in d
True
>>> 'HELLO' in d
True
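
If adding a dependency is not an option, the same idea can be sketched with the standard library alone. The class below is a hypothetical illustration, not how jaraco.collections implements it: the name CaseInsensitiveDict and its internals are assumptions. It stores each entry under the case-folded key while remembering the spelling that was last used to set it.

```python
from collections.abc import MutableMapping


class CaseInsensitiveDict(MutableMapping):
    """Hypothetical mapping whose string keys compare case-insensitively."""

    def __init__(self, *args, **kwargs):
        # Maps folded key -> (original key, value) so the original spelling survives.
        self._store = {}
        self.update(dict(*args, **kwargs))

    @staticmethod
    def _fold(key):
        # Only strings are folded; other hashable keys are used unchanged.
        return key.casefold() if isinstance(key, str) else key

    def __setitem__(self, key, value):
        self._store[self._fold(key)] = (key, value)

    def __getitem__(self, key):
        return self._store[self._fold(key)][1]

    def __delitem__(self, key):
        del self._store[self._fold(key)]

    def __iter__(self):
        # Iterate over the keys as they were originally spelled.
        return (original for original, _ in self._store.values())

    def __len__(self):
        return len(self._store)


d = CaseInsensitiveDict()
d["heLlo"] = "world"
print(d["HELLO"])    # world
print("hello" in d)  # True
print(list(d))       # ['heLlo']
```

Note that in this sketch, assigning to an existing key with a different spelling replaces the remembered spelling as well as the value.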

Assuming ASCII strings:

string1 = 'Hello'
string2 = 'hello'

if string1.lower() == string2.lower():
    print("The strings are the same (case insensitive)")
else:
    print("The strings are NOT the same (case insensitive)")

As of Python 3.3, casefold() is a better alternative:

string1 = 'Hello'
string2 = 'hello'

if string1.casefold() == string2.casefold():
    print("The strings are the same (case insensitive)")
else:
    print("The strings are NOT the same (case insensitive)")

If you want a more comprehensive solution that handles more complex Unicode comparisons, see the other answers.
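
To make the difference concrete, here is a small sketch of where lower() and casefold() diverge, and where casefold() alone is still not enough. The helpers normalize_caseless and caseless_equal are hypothetical names; they apply NFD normalization before and after case folding, which is one reasonable approach rather than the only one.

```python
import unicodedata


def normalize_caseless(text):
    """Hypothetical helper: NFD-normalize, case-fold, then NFD-normalize again."""
    folded = unicodedata.normalize("NFD", text).casefold()
    return unicodedata.normalize("NFD", folded)


def caseless_equal(left, right):
    """Hypothetical helper: compare two strings ignoring case and composition."""
    return normalize_caseless(left) == normalize_caseless(right)


# lower() leaves the German sharp s alone, casefold() maps it to "ss".
print("ß".lower() == "SS".lower())        # False
print("ß".casefold() == "SS".casefold())  # True

# The same visible character can be one code point or two.
composed = "\u00ea"     # ê as a single code point
decomposed = "e\u0302"  # e followed by a combining circumflex
print(composed.casefold() == decomposed.casefold())  # False
print(caseless_equal(composed, decomposed))          # True
```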

I found a clean solution for my case, where I am working with a fixed set of file extensions.

from pathlib import Path


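class CaseInsitiveString(str):
   def __eq__(self, other: object) -> bool:
      # Only compare against other strings; defer to Python otherwise.
      if not isinstance(other, str):
         return NotImplemented
      return self.casefold() == other.casefold()

   # Overriding __eq__ drops the inherited __hash__, so define a consistent one.
   def __hash__(self) -> int:
      return hash(self.casefold())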

GZ = CaseInsitiveString(".gz")
ZIP = CaseInsitiveString(".zip")
TAR = CaseInsitiveString(".tar")

path = Path("/tmp/ALL_CAPS.TAR.GZ")

GZ in path.suffixes, ZIP in path.suffixes, TAR in path.suffixes, TAR == ".tAr"

# (True, False, True, True)
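
An alternative that avoids subclassing str is to case-fold the suffixes themselves before testing membership. This is only a sketch; has_suffix is a hypothetical helper name, and PurePosixPath is used so the example behaves the same on every platform.

```python
from pathlib import PurePosixPath


def has_suffix(path, suffix):
    """Hypothetical helper: True if any suffix of path equals suffix, ignoring case."""
    folded = {s.casefold() for s in PurePosixPath(path).suffixes}
    return suffix.casefold() in folded


path = "/tmp/ALL_CAPS.TAR.GZ"
print(has_suffix(path, ".gz"), has_suffix(path, ".zip"), has_suffix(path, ".tar"))
# True False True
```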

I have seen this solution using a regular expression.

import re
if re.search('mandy', 'Mandy Pande', re.IGNORECASE):
    print("match")  # the search succeeds, so this line runs

It works well with accents:

In [42]: if re.search("ê","ê", re.IGNORECASE):
....:        print(1)
....:
1

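However, it does not handle Unicode characters whose case mapping is not one-to-one. Thanks to @Rhymoid for pointing this out; my understanding is that the pattern needs the exact symbol for the match to succeed. The output is as follows: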

In [36]: "ß".lower()
Out[36]: 'ß'
In [37]: "ß".upper()
Out[37]: 'SS'
In [38]: "ß".upper().lower()
Out[38]: 'ss'
In [39]: if re.search("ß","ßß", re.IGNORECASE):
....:        print(1)
....:
1
In [40]: if re.search("SS","ßß", re.IGNORECASE):
....:        print(1)
....:
In [41]: if re.search("ß","SS", re.IGNORECASE):
....:        print(1)
....:
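
For comparison, case-folding both sides before a plain substring test does treat "ß" and "SS" as equivalent. This is a minimal sketch; contains_caseless is a hypothetical helper name, and it only does a substring test, not regex matching.

```python
import re


def contains_caseless(needle, haystack):
    """Hypothetical helper: substring test after case folding both sides."""
    return needle.casefold() in haystack.casefold()


# re.IGNORECASE does not expand "ß" to "ss", so this search finds nothing.
print(bool(re.search("SS", "ßß", re.IGNORECASE)))  # False

# casefold() maps "ß" to "ss", so both directions match.
print(contains_caseless("SS", "ßß"))  # True
print(contains_caseless("ß", "SS"))   # True
```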

from re import search, escape, IGNORECASE

def is_string_match(word1, word2):
    # Case-insensitively checks whether word1 occurs as a whole word in word2
    # word1: string
    # word2: string | list

    # Escape word1 so that regex metacharacters in it are matched literally
    pattern = rf'\b{escape(word1)}\b'

    # if word2 is a list, look for word1 in each of its items
    if isinstance(word2, list):
        for word in word2:
            if search(pattern, word, IGNORECASE):
                return True
        return False

    # otherwise look for word1 in the single string word2
    if search(pattern, word2, IGNORECASE):
        return True
    return False

is_match_word = is_string_match("Hello", "hELLO") 
True

is_match_word = is_string_match("Hello", ["Bye", "hELLO", "@vagavela"])
True

is_match_word = is_string_match("Hello", "Bye")
False