How do I search and replace text in a file using Python 3?
Here is my code:
import os
import sys
import fileinput
print ("Text to search for:")
textToSearch = input( "> " )
print ("Text to replace it with:")
textToReplace = input( "> " )
print ("File to perform Search-Replace on:")
fileToSearch = input( "> " )
#fileToSearch = 'D:\dummy1.txt'
tempFile = open( fileToSearch, 'r+' )
for line in fileinput.input( fileToSearch ):
    if textToSearch in line :
        print('Match Found')
    else:
        print('Match Not Found!!')
    tempFile.write( line.replace( textToSearch, textToReplace ) )
tempFile.close()
input( '\n\n Press Enter to exit...' )
Input file:
hi this is abcd hi this is abcd
This is dummy text file.
This is how search and replace works abcd
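When I search the input file above and replace 'ram' with 'abcd', it works like a charm. But when I do it the other way round, i.e. replace 'abcd' with 'ram', some junk characters are left at the end.
Replacing 'abcd' with 'ram':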
hi this is ram hi this is ram
This is dummy text file.
This is how search and replace works rambcd
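Using re.subn gives you more control over the replacement, for example matching a word that is split across two lines or matching case-insensitively. It also returns the number of matches, which you can use to skip rewriting the file when the string is not found.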
import re
file = 'file.txt'  # placeholder: path to the file
# they can be also raw string and regex
textToSearch = r'Ha.*O' # here an example with a regex
textToReplace = 'hallo'
# read and replace
with open(file, 'r') as fd:
    # sample case-insensitive find-and-replace
    # (flags must be passed by keyword: the fourth positional argument of re.subn is count)
    text, counter = re.subn(textToSearch, textToReplace, fd.read(), flags=re.I)
# check if there is at least one match
if counter > 0:
    # edit the file
    with open(file, 'w') as fd:
        fd.write(text)
# summary result
print(f'{counter} occurrence(s) of "{textToSearch}" replaced with "{textToReplace}".')
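Some regex notes:
Add the re.I flag (short for re.IGNORECASE), passed as flags=re.I, for case-insensitive matching.
For multi-line replacement, use re.subn(r'\n*'.join(textToSearch), textToReplace, fd.read()); depending on the data, '\n{,1}' may also work as the separator. Note that in this case textToSearch must be a plain string, not a regex!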
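The multi-line note above can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name build_multiline_pattern and the sample text are hypothetical, and re.escape is an addition not found in the answer, used so that punctuation in the search string is matched literally.

```python
import re

def build_multiline_pattern(text_to_search):
    # Hypothetical helper: allow any number of newlines between the
    # characters of a plain search string. re.escape (an addition here)
    # keeps characters such as '.' or '*' from acting as regex syntax.
    return r'\n*'.join(re.escape(ch) for ch in text_to_search)

# Made-up sample text: one match is split across two lines,
# the other differs only in case.
sample = "search and replace works ab\ncd, and ABCD too"

pattern = build_multiline_pattern("abcd")
new_text, count = re.subn(pattern, "ram", sample, flags=re.I)

print(new_text)
print(count)
```

Both occurrences are replaced, and the line break inside the first match is removed along with it, so this may not be what you want if the line structure of the file matters.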
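As pointed out by michaelb958, you cannot replace data in place with data of a different length, because that shifts everything after it out of position. I disagree with the other posters who suggest reading from one file and writing to another. Instead, I would read the file into memory, fix up the data, and then write it back to the same file in a separate step.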
# Read in the file
with open('file.txt', 'r') as file:
    filedata = file.read()
# Replace the target string
filedata = filedata.replace('abcd', 'ram')
# Write the file out again
with open('file.txt', 'w') as file:
    file.write(filedata)
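This works well unless the file is too large to load into memory in one go, or you are worried about losing data if the process is interrupted during the second step, while the data is being written back to the file.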
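If either caveat applies, one possible workaround is to stream the file line by line into a temporary file and swap it into place only at the end. The sketch below is not part of the answer above: the function name replace_in_file is hypothetical, and it assumes the search text never spans a line break.

```python
import os
import tempfile

def replace_in_file(path, old, new):
    """Hypothetical helper: stream `path` line by line, replacing `old` with `new`."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so that os.replace
    # does not have to move data across filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, text=True)
    try:
        with os.fdopen(fd, 'w') as dst, open(path, 'r') as src:
            for line in src:
                # Only one line is held in memory at a time.
                dst.write(line.replace(old, new))
        # Swap the finished copy into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure, remove the partial copy and leave the original untouched.
        os.unlink(tmp_path)
        raise
```

Calling replace_in_file('file.txt', 'abcd', 'ram') would then leave the original untouched if the process dies midway. Note that the temporary file is created with restrictive permissions, so the replaced file may not keep the original's mode unless you copy it over yourself.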