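I need to iterate through the subdirectories of a given directory and search for files. When I find a file, I have to open it and replace its contents with my own lines.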

I tried this:

import os

rootdir ='C:/Users/sid/Desktop/test'

for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        f=open(file,'r')
        lines=f.readlines()
        f.close()
        f=open(file,'w')
        for line in lines:
            newline = "No you are not"
            f.write(newline)
        f.close()

But I am getting an error. What am I doing wrong?


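The actual walk through the directories works as you have coded it. If you replace the contents of the inner loop with a simple print statement, you can see that each file is found: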

import os
rootdir = 'C:/Users/sid/Desktop/test'

for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        print(os.path.join(subdir, file))

If you still get errors when running the above, please provide the error message.
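One likely cause, assuming the error is a file-not-found error raised by open(file): os.walk yields bare file names, so open(file) looks in the current working directory instead of in subdir. Below is a minimal sketch of the question's loop with the path joined first. The function name replace_lines and its replacement parameter are made up for illustration, and the trailing "\n" is my addition so that each replacement lands on its own line. It also assumes every file is a text file readable with the default encoding.

```python
import os


def replace_lines(rootdir, replacement="No you are not\n"):
    """Overwrite every line of every file under rootdir with `replacement`.

    Hypothetical helper for illustration; returns the paths it rewrote.
    """
    changed = []
    for subdir, dirs, files in os.walk(rootdir):
        for name in files:
            # os.walk yields bare file names; join with subdir to get a usable path
            path = os.path.join(subdir, name)
            with open(path, "r") as f:
                lines = f.readlines()
            with open(path, "w") as f:
                # write one replacement line per original line
                f.writelines(replacement for _ in lines)
            changed.append(path)
    return changed
```

Calling replace_lines('C:/Users/sid/Desktop/test') would then rewrite the files in place, so try it on a copy of the directory first.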


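Another way to return all files in subdirectories is to use the pathlib module, introduced in Python 3.4, which provides an object-oriented approach to handling filesystem paths (pathlib is also available on Python 2.7 via the pathlib2 module on PyPI):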

from pathlib import Path

rootdir = Path('C:/Users/sid/Desktop/test')
# Return a list of regular files only, not directories
file_list = [f for f in rootdir.glob('**/*') if f.is_file()]

# For absolute paths instead of paths relative to the current dir
file_list = [f for f in rootdir.resolve().glob('**/*') if f.is_file()]

Since Python 3.5, the glob module also supports recursive file finding:

import os
from glob import iglob

rootdir_glob = 'C:/Users/sid/Desktop/test/**/*' # Note the added asterisks
# This will return absolute paths
file_list = [f for f in iglob(rootdir_glob, recursive=True) if os.path.isfile(f)]

The file_list from either of the above approaches can be iterated over without the need for a nested loop:

for f in file_list:
    print(f) # Replace with desired operations
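As a rough sketch of what the "desired operations" could look like for the original question, the pathlib variant can read and rewrite each file directly. The name overwrite_files and the replacement parameter are hypothetical, and the sketch assumes all files are text files in the default encoding.

```python
from pathlib import Path


def overwrite_files(rootdir, replacement="No you are not\n"):
    """Replace each line of every regular file under rootdir (illustrative helper)."""
    root = Path(rootdir)
    # Build the full list first, then modify the files
    file_list = [f for f in root.glob("**/*") if f.is_file()]
    for f in file_list:
        line_count = len(f.read_text().splitlines())
        # one replacement line per original line
        f.write_text(replacement * line_count)
    return file_list
```

Path.read_text and Path.write_text are available from Python 3.5; on 3.4 you would open the file explicitly instead.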

Since Python 3.5 you can use ** with glob.iglob('path/**', recursive=True), which seems to be the most Pythonic solution, i.e.:

import glob, os

for filename in glob.iglob('/pardadox-music/**', recursive=True):
    if os.path.isfile(filename): # filter dirs
        print(filename)

Output:

/pardadox-music/modules/her1.mod
/pardadox-music/modules/her2.mod
...

Notes:

glob.iglob(pathname, recursive=False)

Return an iterator which yields the same values as glob() without actually storing them all simultaneously.

If recursive is True, the pattern '**' will match any files and zero or more directories and subdirectories.

If the directory contains files starting with . they won't be matched by default. For example, consider a directory containing card.gif and .card.gif:

>>> import glob
>>> glob.glob('*.gif')
['card.gif']
>>> glob.glob('.c*')
['.card.gif']

With pathlib, you can also use Path.rglob(pattern), which is the same as calling Path.glob() with **/ added in front of the given relative pattern.
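To illustrate the rglob note, here is a small sketch that wraps Path.rglob in a helper. The name find_files and its pattern parameter are hypothetical and only there for illustration.

```python
from pathlib import Path


def find_files(rootdir, pattern="*"):
    """Return the regular files under rootdir whose names match pattern, recursively."""
    # rglob(pattern) behaves like glob("**/" + pattern)
    return sorted(p for p in Path(rootdir).rglob(pattern) if p.is_file())
```

For example, find_files('/pardadox-music', '*.mod') would list only the .mod files, at any depth.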