I want to use the .replace function to replace multiple strings.

I currently have

string.replace("condition1", "")

but would like something like

string.replace("condition1", "").replace("condition2", "text")

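although chaining calls like this does not feel like good syntax.

What is the proper way to do this? Something like how, in grep/regex, you can use \1 and \2 to substitute fields into certain search strings.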


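Current answer

Starting from Andrew's valuable answer, I developed a script that loads the dictionary from a file and applies the replacements to every file in the current folder that matches a search mask. The mappings come from an external file, and you choose the column separator it uses. I am a beginner, but I found this script very useful for doing multiple replacements across multiple files. It loaded a dictionary with more than 1000 entries in seconds. It is not elegant, but it works for me.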

import glob
import re

mapfile = input("Enter map file name with extension eg. codifica.txt: ")
sep = input("Enter map file column separator eg. |: ")
mask = input("Enter search mask with extension eg. 2010*txt for all files to be processed: ")
suff = input("Enter suffix with extension eg. _NEW.txt for newly generated files: ")

rep = {}  # create an empty dictionary

with open(mapfile) as temprep:  # load the definitions into the dictionary from the map file, using the prompted separator
    for line in temprep:
        key, val = line.strip('\n').split(sep)
        rep[key] = val

# rep = dict((re.escape(k), v) for k, v in rep.items())  # left commented out so the mapping can use re reserved characters
# the mapping does not change between files, so the pattern is compiled once
pattern = re.compile("|".join(rep.keys()))

for filename in glob.iglob(mask):  # iterate over all files matching the prompted mask
    with open(filename, "r") as textfile:  # load each file into the variable text
        text = textfile.read()

    # do the replacement in a single pass
    text = pattern.sub(lambda m: rep[m.group(0)], text)

    # write the output file: drop the 4-character extension and append the prompted suffix
    with open(filename[:-4] + suff, "w") as target:
        target.write(text)
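The script above leaves re.escape commented out so that the map file can contain regex syntax. A small sketch of what that trade-off can look like, using a made-up two-entry mapping (the keys and sample strings are assumptions for illustration only): an unescaped key still works when the text contains it literally, but a regex match that is not itself a key makes the dictionary lookup fail.

```python
import re

# Hypothetical mapping: the first key contains ".", a regex metacharacter.
rep = {"a.c": "X", "dog": "cat"}

# Unescaped keys, as in the script above: "." matches any character.
unescaped = re.compile("|".join(rep.keys()))
literal_result = unescaped.sub(lambda m: rep[m.group(0)], "a.c dog")
print(literal_result)  # X cat

failed = False
try:
    unescaped.sub(lambda m: rep[m.group(0)], "abc dog")
except KeyError as err:
    # "abc" matched the pattern "a.c" but is not a key of rep
    failed = True
    print("KeyError:", err)

# Escaped keys match only literally, so the lookup always succeeds.
escaped = re.compile("|".join(re.escape(k) for k in rep))
escaped_result = escaped.sub(lambda m: rep[m.group(0)], "abc a.c dog")
print(escaped_result)  # abc X cat
```

If the map file only ever holds literal strings, escaping the keys is probably the safer choice.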

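Other answers

Here is a version that supports basic regex replacement. The main restriction is that the expressions must not contain sub-groups, and there may be some edge cases:

Code based on @bgusach's and others'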

import re

class StringReplacer:

    def __init__(self, replacements, ignore_case=False):
        patterns = sorted(replacements, key=len, reverse=True)
        self.replacements = [replacements[k] for k in patterns]
        re_mode = re.IGNORECASE if ignore_case else 0
        self.pattern = re.compile('|'.join(("({})".format(p) for p in patterns)), re_mode)
        def tr(matcher):
            index = next((index for index,value in enumerate(matcher.groups()) if value), None)
            return self.replacements[index]
        self.tr = tr

    def __call__(self, string):
        return self.pattern.sub(self.tr, string)

Test

table = {
    "aaa"    : "[This is three a]",
    "b+"     : "[This is one or more b]",
    r"<\w+>" : "[This is a tag]"
}

replacer = StringReplacer(table, True)

sample1 = "whatever bb, aaa, <star> BBB <end>"

print(replacer(sample1))

# output: 
# whatever [This is one or more b], [This is three a], [This is a tag] [This is one or more b] [This is a tag]

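The trick is to identify the matched group by its position. It is not super efficient (O(n) in the number of patterns for each match), but it works.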

index = next((index for index,value in enumerate(matcher.groups()) if value), None)

The replacement is done in a single pass.
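As a possible alternative to scanning matcher.groups(), the match object's lastindex attribute gives the number of the last capturing group that matched. A minimal sketch, assuming (as the answer already requires) that the patterns contain no sub-groups; the shortened table and sample string here are made up for illustration:

```python
import re

# Hypothetical table in the same shape as the test above; no sub-groups allowed.
table = {"aaa": "[three a]", "b+": "[one or more b]"}

patterns = list(table)
replacements = [table[p] for p in patterns]
# One top-level capturing group per alternative.
pattern = re.compile("|".join("({})".format(p) for p in patterns), re.IGNORECASE)

def tr(matcher):
    # lastindex is the 1-based number of the last group that matched;
    # with exactly one group per alternative, it identifies the alternative.
    return replacements[matcher.lastindex - 1]

result = pattern.sub(tr, "bb, aaa, BBB")
print(result)  # [one or more b], [three a], [one or more b]
```

This avoids the linear scan per match, but it relies even more strictly on the no-sub-groups restriction, since a nested group would shift the numbering.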

For replacing single characters only, str.translate together with str.maketrans is my favorite method.

Tl;dr > result_string = your_string.translate(str.maketrans(dict_mapping))


demo

my_string = 'This is a test string.'
dict_mapping = {'i': 's', 's': 'S'}
result_good = my_string.translate(str.maketrans(dict_mapping))
result_bad = my_string
for x, y in dict_mapping.items():
    result_bad = result_bad.replace(x, y)
print(result_good)  # ThsS sS a teSt Strsng.
print(result_bad)   # ThSS SS a teSt StrSng.
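The keys passed to str.maketrans must be single characters, but the values do not have to be: each may be a string of any length, or None to delete the character. A small sketch with a made-up mapping (the characters chosen are only an example):

```python
# Hypothetical mapping: single-character keys, values of any length,
# and None to delete a character.
mapping = str.maketrans({"&": "&amp;", "<": "&lt;", ">": "&gt;", "!": None})

result = "a < b && c > d!".translate(mapping)
print(result)  # a &lt; b &amp;&amp; c &gt; d
```

Because translate works in one pass, the "&" inside an inserted "&lt;" is not replaced again.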

In my case, I needed a simple replacement of unique keys with names, so I came up with this:

>>> a = 'This is a test string.'
>>> b = {'i': 'I', 's': 'S'}
>>> for x, y in b.items():
...     a = a.replace(x, y)
...
>>> a
'ThIS IS a teSt StrIng.'

Here is my solution to the problem. I used it in a chatbot to replace several different words at once.

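def mass_replace(text, dct):
    new_string = ""
    old_string = text
    while len(old_string) > 0:
        s = ""
        sk = ""
        # look for a key that the remaining text starts with (the last matching key wins)
        for k in dct.keys():
            if old_string.startswith(k):
                s = dct[k]
                sk = k
        if sk:
            # a key matched: emit its replacement (which may be empty) and skip past the key
            new_string += s
            old_string = old_string[len(sk):]
        else:
            # no key matched: copy one character and move on
            new_string += old_string[0]
            old_string = old_string[1:]
    return new_string

print(mass_replace("The dog hunts the cat", {"dog": "cat", "cat": "dog"}))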

This becomes: The cat hunts the dog
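For contrast, here is a short sketch of what chained .replace calls do with the same swap. The second call cannot tell the original "cat" from the one the first call just produced, which is why a single left-to-right pass is needed here:

```python
text = "The dog hunts the cat"

# Chained calls: the second replace also rewrites the "cat" created by the first.
chained = text.replace("dog", "cat").replace("cat", "dog")
print(chained)  # The dog hunts the dog
```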

Here is a short example that should do the trick with regular expressions:

import re

rep = {"condition1": "", "condition2": "text"}  # define desired replacements here
text = "(condition1) and --condition2--"  # the string to process

# use these three lines to do the replacement
# (Python 3 renamed dict.iteritems to dict.items; on Python 2 use rep.iteritems())
rep = dict((re.escape(k), v) for k, v in rep.items())
pattern = re.compile("|".join(rep.keys()))
text = pattern.sub(lambda m: rep[re.escape(m.group(0))], text)

For example:

>>> pattern.sub(lambda m: rep[re.escape(m.group(0))], "(condition1) and --condition2--")
'() and --text--'
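The same idea can be wrapped in a small reusable function. This is only a sketch: the name multi_replace and the extra "condition10" entry are made up for illustration. It escapes the keys only when building the pattern, so the lookup can use the matched text directly, and it sorts keys longest-first so that a key that is a prefix of another does not win by accident.

```python
import re

def multi_replace(text, mapping):
    """Replace every key of mapping found in text with its value, in one pass."""
    if not mapping:
        # An empty alternation would match the empty string everywhere.
        return text
    # Longest keys first, so "condition10" is tried before "condition1".
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # The dict keeps its unescaped keys, so the matched text is a valid key.
    return pattern.sub(lambda m: mapping[m.group(0)], text)

result = multi_replace(
    "(condition1) and --condition2-- or condition10",
    {"condition1": "", "condition2": "text", "condition10": "ten"},
)
print(result)  # () and --text-- or ten
```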