我有c++ /Obj-C背景,我刚刚发现Python(写了大约一个小时)。 我正在写一个脚本递归地读取文件夹结构中的文本文件的内容。

我的问题是,我写的代码将只工作于一个文件夹深度。我可以在代码中看到为什么(见#hardcoded path),我只是不知道如何使用Python,因为我对它的经验只是全新的。

Python代码:

import os
import sys

rootdir = sys.argv[1]

for root, subFolders, files in os.walk(rootdir):

    for folder in subFolders:
        outfileName = rootdir + "/" + folder + "/py-outfile.txt" # hardcoded path
        folderOut = open( outfileName, 'w' )
        print "outfileName is " + outfileName

        for file in files:
            filePath = rootdir + '/' + file
            f = open( filePath, 'r' )
            toWrite = f.read()
            print "Writing '" + toWrite + "' to" + filePath
            folderOut.write( toWrite )
            f.close()

        folderOut.close()

当前回答

如果你更喜欢(几乎)联机:

from pathlib import Path

lookuppath = '.' #use your path
filelist = [str(item) for item in Path(lookuppath).glob("**/*") if Path(item).is_file()]

在这种情况下,您将得到一个列表,其中仅包含递归位于lookppath下的所有文件的路径。 如果没有str(),你将得到PosixPath()添加到每个路径。

其他回答

如果你更喜欢(几乎)联机:

from pathlib import Path

lookuppath = '.' #use your path
filelist = [str(item) for item in Path(lookuppath).glob("**/*") if Path(item).is_file()]

在这种情况下,您将得到一个列表,其中仅包含递归位于lookppath下的所有文件的路径。 如果没有str(),你将得到PosixPath()添加到每个路径。

这招对我很管用:

import glob

root_dir = "C:\\Users\\Scott\\" # Don't forget trailing (last) slashes    
for filename in glob.iglob(root_dir + '**/*.jpg', recursive=True):
     print(filename)
     # do stuff

确保你理解os.walk的三个返回值:

for root, subdirs, files in os.walk(rootdir):

具有以下含义:

root:被“遍历”的当前路径 subdirs:目录类型根目录下的文件 files:根目录下(不是subdirs目录下)的非directory类型的文件

请使用os.path.join而不是用斜杠连接!您的问题是filePath = rootdir + '/' + file -您必须连接当前“行走”的文件夹,而不是最上面的文件夹。filePath = os。path。加入(根、文件)。顺便说一句,“文件”是内置的,所以你通常不使用它作为变量名。

另一个问题是你的循环,应该是这样的,例如:

import os
import sys

walk_dir = sys.argv[1]

print('walk_dir = ' + walk_dir)

# If your current working directory may change during script execution, it's recommended to
# immediately convert program arguments to an absolute path. Then the variable root below will
# be an absolute path as well. Example:
# walk_dir = os.path.abspath(walk_dir)
print('walk_dir (absolute) = ' + os.path.abspath(walk_dir))

for root, subdirs, files in os.walk(walk_dir):
    print('--\nroot = ' + root)
    list_file_path = os.path.join(root, 'my-directory-list.txt')
    print('list_file_path = ' + list_file_path)

    with open(list_file_path, 'wb') as list_file:
        for subdir in subdirs:
            print('\t- subdirectory ' + subdir)

        for filename in files:
            file_path = os.path.join(root, filename)

            print('\t- file %s (full path: %s)' % (filename, file_path))

            with open(file_path, 'rb') as f:
                f_content = f.read()
                list_file.write(('The file %s contains:\n' % filename).encode('utf-8'))
                list_file.write(f_content)
                list_file.write(b'\n')

如果你不知道,文件的with语句是一种简写:

with open('filename', 'rb') as f:
    dosomething()

# is effectively the same as

f = open('filename', 'rb')
try:
    dosomething()
finally:
    f.close()

使用os.path.join()来构造你的路径-这样更整洁:

import os
import sys
rootdir = sys.argv[1]
for root, subFolders, files in os.walk(rootdir):
    for folder in subFolders:
        outfileName = os.path.join(root,folder,"py-outfile.txt")
        folderOut = open( outfileName, 'w' )
        print "outfileName is " + outfileName
        for file in files:
            filePath = os.path.join(root,file)
            toWrite = open( filePath).read()
            print "Writing '" + toWrite + "' to" + filePath
            folderOut.write( toWrite )
        folderOut.close()

试试这个:

import os
import sys

for root, subdirs, files in os.walk(path):

    for file in os.listdir(root):

        filePath = os.path.join(root, file)

        if os.path.isdir(filePath):
            pass

        else:
            f = open (filePath, 'r')
            # Do Stuff