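I want to loop over the contents of a text file, do a search and replace on some lines, and write the result back to the file. I could first load the whole file into memory and then write it back, but that is probably not the best way to do it.
What is the best way to do this, within the following code?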
f = open(file)
for line in f:
    if 'foo' in line:
        newline = line.replace('foo', 'bar')
        # how to write this newline back to the file
If you remove the indentation as shown below, it will search and replace across multiple lines.
See the example below.
from os import close, remove
from shutil import move
from tempfile import mkstemp

def replace(file, pattern, subst):
    # Create temp file
    fh, abs_path = mkstemp()
    print(fh, abs_path)
    new_file = open(abs_path, 'w')
    old_file = open(file)
    for line in old_file:
        new_file.write(line.replace(pattern, subst))
    # Close temp file
    new_file.close()
    close(fh)
    old_file.close()
    # Remove original file
    remove(file)
    # Move new file
    move(abs_path, file)
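The temp-file approach above can also be written with context managers, so both files are closed even if an error occurs partway through. This is only a minimal sketch, assuming Python 3: the name `replace_via_tempfile` is hypothetical, and creating the temp file in the target's directory and swapping it in with `os.replace` are my own choices rather than part of the answer above.

```python
import os
from tempfile import mkstemp


def replace_via_tempfile(file_path, pattern, subst):
    # Assumption: put the temp file next to the original so the final
    # rename stays on the same filesystem.
    target_dir = os.path.dirname(os.path.abspath(file_path))
    fd, tmp_path = mkstemp(dir=target_dir)
    try:
        # os.fdopen wraps the descriptor; closing the file object closes fd.
        with os.fdopen(fd, 'w') as new_file, open(file_path) as old_file:
            for line in old_file:
                new_file.write(line.replace(pattern, subst))
        # Swap the rewritten copy into place, overwriting the original.
        os.replace(tmp_path, file_path)
    except BaseException:
        # Remove the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Note that `mkstemp` creates the file readable and writable only by the current user, so the original file's permissions are not carried over; `shutil.copymode` could be called before the swap if that matters.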
fileinput is quite straightforward, as mentioned in previous answers:
import fileinput

def replace_in_file(file_path, search_text, new_text):
    with fileinput.input(file_path, inplace=True) as file:
        for line in file:
            new_line = line.replace(search_text, new_text)
            print(new_line, end='')
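Explanation:
fileinput can accept multiple files, but I prefer to close each file as soon as it has been processed, so a single file_path is placed in the with statement.
When inplace=True, print does not write anything to the console, because STDOUT is redirected to the original file.
end='' in the print call prevents extra blank lines between lines, since each line already ends with its own newline.
You can use it as follows: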
file_path = '/path/to/my/file'
replace_in_file(file_path, 'old-text', 'new-text')
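Because in-place editing overwrites the original, it may be worth keeping a copy. The sketch below relies on the `backup` parameter of `fileinput.input`; the function name `replace_in_file_with_backup` and the default `'.bak'` extension are hypothetical choices for illustration.

```python
import fileinput


def replace_in_file_with_backup(file_path, search_text, new_text, backup_ext='.bak'):
    # With a non-empty backup extension, the original content is kept
    # as file_path + backup_ext after the in-place rewrite.
    with fileinput.input(file_path, inplace=True, backup=backup_ext) as file:
        for line in file:
            # stdout is redirected into the file while inplace=True
            print(line.replace(search_text, new_text), end='')
```

If the replacement turns out to be wrong, the untouched original can then be restored from the backup file.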
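If you want a generic function that replaces any text with some other text, this is probably the best way to go, especially if you are a fan of regexes: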
import re

def replace(filePath, text, subs, flags=0):
    with open(filePath, "r+") as file:
        fileContents = file.read()
        textPattern = re.compile(re.escape(text), flags)
        fileContents = textPattern.sub(subs, fileContents)
        file.seek(0)
        file.truncate()
        file.write(fileContents)
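Note that because of `re.escape`, the function above matches `text` literally; only `flags` (for example `re.IGNORECASE`) changes how it matches. If real regular-expression patterns are wanted, a variant without the escaping could look like the sketch below. The name `replace_regex` is hypothetical, and like the original it reads the whole file into memory.

```python
import re


def replace_regex(file_path, pattern, subs, flags=0):
    # Unlike the function above, `pattern` is used as a regular
    # expression as-is (no re.escape).
    with open(file_path, "r+") as file:
        contents = file.read()
        contents = re.sub(pattern, subs, contents, flags=flags)
        # Rewind and clear the file before writing the new contents.
        file.seek(0)
        file.truncate()
        file.write(contents)
```

For instance, `replace_regex(path, r'foo\d+', 'X', re.IGNORECASE)` would replace `Foo1` and `foo22` alike.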
Using hamishmcn's answer as a template, I was able to search for a line in a file that matches my regex and replace it with an empty string.
import re

fin = open("in.txt", 'r')    # in file
fout = open("out.txt", 'w')  # out file
# pattern, compiled once outside the loop
p = re.compile(r'[-][0-9]*[.][0-9]*[,]|[-][0-9]*[,]')
for line in fin:
    newline = p.sub('', line)  # replace matching strings with empty string
    print(newline)
    fout.write(newline)
fin.close()
fout.close()
Expanding on @Kiran's answer, which I agree is more succinct and Pythonic, this adds codecs to support reading and writing UTF-8:
import codecs
from os import close, remove
from shutil import move
from tempfile import mkstemp

def replace(source_file_path, pattern, substring):
    fh, target_file_path = mkstemp()
    close(fh)  # release the descriptor from mkstemp; the file is reopened by path below
    with codecs.open(target_file_path, 'w', 'utf-8') as target_file:
        with codecs.open(source_file_path, 'r', 'utf-8') as source_file:
            for line in source_file:
                target_file.write(line.replace(pattern, substring))
    remove(source_file_path)
    move(target_file_path, source_file_path)
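In Python 3 the built-in `open` accepts an `encoding` argument, so the same idea can be written without `codecs`. A rough sketch under that assumption; the name `replace_utf8` is hypothetical and not taken from the answers above.

```python
import os
from shutil import move
from tempfile import mkstemp


def replace_utf8(source_file_path, pattern, substring):
    fd, target_file_path = mkstemp()
    # os.fdopen reuses the descriptor from mkstemp and closes it on exit.
    with os.fdopen(fd, 'w', encoding='utf-8') as target_file:
        with open(source_file_path, 'r', encoding='utf-8') as source_file:
            for line in source_file:
                target_file.write(line.replace(pattern, substring))
    # Replace the original with the rewritten copy.
    os.remove(source_file_path)
    move(target_file_path, source_file_path)
```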