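Is there a way to return a list of all the subdirectories in the current directory in Python?
I know you can do this with files, but I need to get the list of directories instead.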
Do you mean immediate subdirectories, or every directory right down the tree?
Either way, you could use os.walk to do this:
os.walk(directory)
will yield a 3-tuple for each subdirectory. The first entry in the 3-tuple is a directory name, so
[x[0] for x in os.walk(directory)]
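should give you all of the subdirectories, recursively.
Note that the second entry in the tuple is the list of child directories of the entry in the first position, so you could use this instead, but it's not likely to save you much.
However, you could use it just to give you the immediate child directories: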
next(os.walk('.'))[1]
Or see the other solutions already posted, using os.listdir and os.path.isdir, including those at "How to get all of the immediate subdirectories in Python".
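To make the difference between the two os.walk idioms concrete, here is a minimal sketch that builds a throwaway tree with tempfile and lists it both ways. The helper names all_subdirs and immediate_subdirs, and the folder names a, a/b and c, are made up for illustration.

```python
import os
import tempfile


def all_subdirs(directory):
    """Return every directory below `directory`, recursively (excluding itself)."""
    # os.walk yields (dirpath, dirnames, filenames); the first tuple is the
    # starting directory itself, so it is sliced off.
    return [dirpath for dirpath, _dirnames, _filenames in os.walk(directory)][1:]


def immediate_subdirs(directory):
    """Return the names of the direct child directories only."""
    # next() takes just the first tuple from the generator; the default
    # guards against a missing or unreadable directory.
    _dirpath, dirnames, _filenames = next(os.walk(directory), (directory, [], []))
    return dirnames


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    os.makedirs(os.path.join(root, "c"))
    recursive = sorted(os.path.relpath(p, root) for p in all_subdirs(root))
    direct = sorted(immediate_subdirs(root))
    print(recursive)
    print(direct)
```

The next(..., default) form is optional; without the default, a missing directory raises StopIteration.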
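If you need a recursive solution that will find all the subdirectories in the subdirectories, use walk as proposed before.
If you only need the current directory's child directories, combine os.listdir with os.path.isdir: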
import os
d = '.'
[os.path.join(d, o) for o in os.listdir(d)
if os.path.isdir(os.path.join(d,o))]
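Thanks for the tips, guys. I ran into an issue with softlinks (infinite recursion) being returned as dirs. Softlinks? We don't want no stinkin' softlinks! So...
This rendered just the dirs, not softlinks: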
>>> import os
>>> inf = os.walk('.')
>>> [x[0] for x in inf]
['.', './iamadir']
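For context: os.walk does not descend into symlinked directories by default (followlinks=False), but it still reports them in the dirnames list, and os.path.isdir follows links too. If you want a listing of immediate subdirectories that leaves symlinks out entirely, a sketch along these lines may help. The helper name real_subdirs and the link name loop are hypothetical.

```python
import os
import tempfile


def real_subdirs(directory):
    """Return direct child directories of `directory`, skipping symlinks."""
    result = []
    for name in os.listdir(directory):
        full = os.path.join(directory, name)
        # isdir() follows symlinks, so islink() has to be tested separately.
        if os.path.isdir(full) and not os.path.islink(full):
            result.append(name)
    return sorted(result)


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "iamadir"))
    try:
        # A link pointing back at its parent: the classic recursion trap.
        os.symlink(root, os.path.join(root, "loop"), target_is_directory=True)
    except (OSError, NotImplementedError):
        pass  # symlinks may be unavailable (e.g. unprivileged Windows)
    found = real_subdirs(root)
    print(found)
```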
Implemented this using python-os-walk. (http://www.pythonforbeginners.com/code-snippets-source-code/python-os-walk/)
import os
print("root prints out directories only from what you specified")
print("dirs prints out sub-directories from root")
print("files prints out all files from root and directories")
print("*" * 20)
for root, dirs, files in os.walk("/var/log"):
    print(root)
    print(dirs)
    print(files)
Full path, and accounting for path being ., .., \\, ..\\..\\folder, etc.:
import os, pprint
pprint.pprint([os.path.join(os.path.abspath(path), x[0]) \
for x in os.walk(os.path.abspath(path))])
In Python 2.7, you can use os.listdir(path) to get a list of subdirectories (and files):
import os
os.listdir(path) # list of subdirectories and files
Since I stumbled upon this problem using Python 3.4 and Windows UNC paths, here's a variant for this environment:
from pathlib import WindowsPath
def SubDirPath (d):
    return [f for f in d.iterdir() if f.is_dir()]
subdirs = SubDirPath(WindowsPath(r'\\file01.acme.local\home$'))
print(subdirs)
Pathlib is new in Python 3.4 and makes working with paths under different OSes much easier: https://docs.python.org/3.4/library/pathlib.html
I prefer using filter (https://docs.python.org/2/library/functions.html#filter), but this is just a matter of taste.
d='.'
filter(lambda x: os.path.isdir(os.path.join(d, x)), os.listdir(d))
Building on Eli Bendersky's solution, use the following example:
import os
test_directory = "<your_directory>"
for child in os.listdir(test_directory):
    test_path = os.path.join(test_directory, child)
    if os.path.isdir(test_path):
        print(test_path)
        # Do stuff to the directory "test_path"
where <your_directory> is the path to the directory you want to traverse.
Use the filter function with os.path.isdir over os.listdir(), something like this: filter(os.path.isdir, [os.path.join(os.path.abspath('PATH'), p) for p in os.listdir('PATH/')])
You could just use glob.glob
from glob import glob
glob("/path/to/directory/*/", recursive = True)
Don't forget the trailing / after the *.
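One detail worth knowing: recursive=True only changes anything when the pattern contains **. A small sketch, assuming a throwaway tree (the names a, a/b and file.txt are made up), that contrasts the one-level pattern with the recursive one:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    open(os.path.join(root, "file.txt"), "w").close()

    # glob.escape protects any special characters in the base path.
    base = glob.escape(root)

    # "*/" matches only directories, one level deep.
    direct = glob.glob(os.path.join(base, "*", ""))
    # "**/" with recursive=True matches directories at any depth.
    nested = glob.glob(os.path.join(base, "**", ""), recursive=True)

    direct_names = sorted(os.path.relpath(p, root) for p in direct)
    nested_names = sorted(os.path.relpath(p, root) for p in nested)
    print(direct_names)
    print(nested_names)
```

In both cases the trailing separator is what keeps plain files out of the result.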
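Although this question was answered a long time ago, I want to recommend the pathlib module, since it is a robust way to work on both Windows and Unix.
To get all paths in a specific directory, including those in subdirectories: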
from pathlib import Path
paths = list(Path('myhomefolder', 'folder').glob('**/*.txt'))
# all sorts of operations
file = paths[0]
file.name
file.stem
file.parent
file.suffix
etc.
Listing out only directories
print("\nWe are listing out only the directories in current directory -")
directories_in_curdir = list(filter(os.path.isdir, os.listdir(os.curdir)))
print(directories_in_curdir)
Listing out only files in the current directory
files = list(filter(os.path.isfile, os.listdir(os.curdir)))
print("\nThe following are the list of all files in the current directory -")
print(files)
Much nicer than the above, because you don't need several os.path.join() calls and you get the full path directly (if you wish). You can do this in Python 3.5 and above.
subfolders = [ f.path for f in os.scandir(folder) if f.is_dir() ]
This will give the complete path to each subdirectory. If you only want the name of the subdirectory, use f.name instead of f.path.
https://docs.python.org/3/library/os.html#os.scandir
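A minimal sketch of the f.name versus f.path distinction, run against a temporary folder (the entries sub and note.txt are invented for the demo). It also uses os.scandir as a context manager, which is supported from Python 3.6 on and closes the iterator promptly:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    os.mkdir(os.path.join(folder, "sub"))
    open(os.path.join(folder, "note.txt"), "w").close()

    # The with-block releases the directory handle as soon as we are done.
    with os.scandir(folder) as entries:
        dirs = [entry for entry in entries if entry.is_dir()]

    names = [entry.name for entry in dirs]  # bare folder names
    paths = [entry.path for entry in dirs]  # folder joined with the name
    expected_path = os.path.join(folder, "sub")
    print(names)
    print(paths)
```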
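A bit OT: in case you need all subfolders recursively and/or all files recursively, have a look at this function, which is faster than os.walk & glob and returns a list of all subfolders as well as all files inside those (sub-)subfolders: https://stackoverflow.com/a/59803793/2441026
In case you want only all subfolders recursively: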
def fast_scandir(dirname):
    subfolders= [f.path for f in os.scandir(dirname) if f.is_dir()]
    for dirname in list(subfolders):
        subfolders.extend(fast_scandir(dirname))
    return subfolders
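This returns a list of all subfolders with their full paths. This again is faster than os.walk and a lot faster than glob.
An analysis of all functions
TL;DR:
- If you want to get all immediate subdirectories of a folder, use os.scandir.
- If you want to get all subdirectories, even nested ones, use os.walk or, slightly faster, the fast_scandir function above.
- Never use os.walk for only the top-level subdirectories, as it can be hundreds(!) of times slower than os.scandir.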
- If you run the code below, make sure to run it once first so that your OS has accessed the folder, discard the results, and then run the test; otherwise the results will be skewed.
- You might want to mix up the function calls, but I tested it, and it did not really matter.
- All examples give the full path to the folder. The pathlib example returns (Windows)Path objects.
- The first element of os.walk is the base folder, so you will not get only subdirectories. You can use fu.pop(0) to remove it.
- None of the results use natural sorting, so they are sorted like this: 1, 10, 2. To get natural sorting (1, 2, 10), please have a look at https://stackoverflow.com/a/48030307/2441026
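On the natural-sorting point, one common approach is a sort key that splits each name into digit and non-digit chunks. This is only a sketch, not necessarily what the linked answer does; natural_key and the sample folder names are made up.

```python
import re


def natural_key(text):
    """Split text into digit and non-digit chunks so numbers compare by value."""
    # re.split with a capturing group keeps the digit runs in the result,
    # e.g. "dir10" becomes ["dir", "10", ""].
    return [int(chunk) if chunk.isdecimal() else chunk.lower()
            for chunk in re.split(r"(\d+)", text)]


folders = ["dir10", "dir2", "dir1"]
plain_order = sorted(folders)
natural_order = sorted(folders, key=natural_key)
print(plain_order)
print(natural_order)
```

You can pass the same key to sorted() on any of the lists produced by the functions being compared.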
Results:
os.scandir took 1 ms. Found dirs: 439
os.walk took 463 ms. Found dirs: 441 -> it found the nested one + base folder.
glob.glob took 20 ms. Found dirs: 439
pathlib.iterdir took 18 ms. Found dirs: 439
os.listdir took 18 ms. Found dirs: 439
Tested with W7x64, Python 3.8.1.
# -*- coding: utf-8 -*-
# Python 3
import time
import os
from glob import glob
from pathlib import Path
directory = r"<insert_folder>"
RUNS = 1
def run_os_walk():
    a = time.time_ns()
    for i in range(RUNS):
        fu = [x[0] for x in os.walk(directory)]
    print(f"os.walk\t\t\ttook {(time.time_ns() - a) / 1000 / 1000 / RUNS:.0f} ms. Found dirs: {len(fu)}")
def run_glob():
    a = time.time_ns()
    for i in range(RUNS):
        fu = glob(directory + "/*/")
    print(f"glob.glob\t\ttook {(time.time_ns() - a) / 1000 / 1000 / RUNS:.0f} ms. Found dirs: {len(fu)}")
def run_pathlib_iterdir():
    a = time.time_ns()
    for i in range(RUNS):
        dirname = Path(directory)
        fu = [f for f in dirname.iterdir() if f.is_dir()]
    print(f"pathlib.iterdir\ttook {(time.time_ns() - a) / 1000 / 1000 / RUNS:.0f} ms. Found dirs: {len(fu)}")
def run_os_listdir():
    a = time.time_ns()
    for i in range(RUNS):
        dirname = Path(directory)
        fu = [os.path.join(directory, o) for o in os.listdir(directory) if os.path.isdir(os.path.join(directory, o))]
    print(f"os.listdir\t\ttook {(time.time_ns() - a) / 1000 / 1000 / RUNS:.0f} ms. Found dirs: {len(fu)}")
def run_os_scandir():
    a = time.time_ns()
    for i in range(RUNS):
        fu = [f.path for f in os.scandir(directory) if f.is_dir()]
    print(f"os.scandir\t\ttook {(time.time_ns() - a) / 1000 / 1000 / RUNS:.0f} ms.\tFound dirs: {len(fu)}")
if __name__ == '__main__':
    run_os_scandir()
    run_os_walk()
    run_glob()
    run_pathlib_iterdir()
    run_os_listdir()
Python 3.4 introduced the pathlib module into the standard library, which provides an object-oriented approach to handling filesystem paths:
from pathlib import Path
p = Path('./')
# All subdirectories in the current directory, not recursive.
[f for f in p.iterdir() if f.is_dir()]
To recursively list all subdirectories, path globbing can be used with the ** pattern.
# This will also include the current directory '.'
list(p.glob('**'))
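Note that a single * as the glob pattern would include both files and directories non-recursively. To get only directories, a trailing / can be appended, but this only works when using the glob library directly, not when using glob via pathlib: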
import glob
# These three lines return both files and directories
list(p.glob('*'))
list(p.glob('*/'))
glob.glob('*')
# Whereas this returns only directories
glob.glob('*/')
So Path('./').glob('**') matches the same paths as glob.glob('**/', recursive=True).
Pathlib is also available on Python 2.7 via the pathlib2 module on PyPI.
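If you would rather not get the starting directory back, one alternative (a sketch, not part of the answer above) is Path.rglob('*') combined with is_dir(), which returns only the directories below the root. The tree a, a/b and a/file.txt is invented for the demo.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a" / "b").mkdir(parents=True)
    (root / "a" / "file.txt").touch()

    # rglob("*") walks the whole tree; is_dir() keeps only directories.
    # as_posix() is used just to get the same separators on every platform.
    subdirs = sorted(p.relative_to(root).as_posix()
                     for p in root.rglob("*") if p.is_dir())
    print(subdirs)
```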
Here are a couple of simple functions based on @Blair Conrad's example:
import os
def get_subdirs(dir):
    "Get a list of immediate subdirectories"
    return next(os.walk(dir))[1]
def get_subfiles(dir):
    "Get a list of immediate subfiles"
    return next(os.walk(dir))[2]
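I had a similar question recently, and I found out that the best answer for Python 3.6 (as user havlock added) is to use os.scandir. Since there seems to be no solution using it, I'll add my own. First, a non-recursive solution that lists only the subdirectories directly under the root directory.
import os
def get_dirlist(rootdir):
    dirlist = []
    with os.scandir(rootdir) as rit:
        for entry in rit:
            if not entry.name.startswith('.') and entry.is_dir():
                dirlist.append(entry.path)
    dirlist.sort() # Optional, in case you want sorted directory names
    return dirlist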
The recursive version would look like this:
def get_dirlist(rootdir):
    dirlist = []
    with os.scandir(rootdir) as rit:
        for entry in rit:
            if not entry.name.startswith('.') and entry.is_dir():
                dirlist.append(entry.path)
                dirlist += get_dirlist(entry.path)
    dirlist.sort() # Optional, in case you want sorted directory names
    return dirlist
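Keep in mind that entry.path holds the full path to the subdirectory, built from the rootdir you passed in (so it is absolute only if rootdir is absolute). In case you only need the folder name, you can use entry.name instead. Refer to os.DirEntry for additional details about the entry object.
Copy-paste friendly in ipython: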
import os
d='.'
folders = list(filter(lambda x: os.path.isdir(os.path.join(d, x)), os.listdir(d)))
Output from print(folders):
['folderA', 'folderB']
This will list all subdirectories right down the file tree.
import pathlib
def list_dir(dir):
    path = pathlib.Path(dir)
    dir = []
    try:
        for item in path.iterdir():
            if item.is_dir():
                dir.append(item)
                dir = dir + list_dir(item)
        return dir
    except FileNotFoundError:
        print('Invalid directory')
        return []  # keep the return type consistent so callers can concatenate
pathlib is new in version 3.4.
This function, for a given parent directory, iterates over all its directories recursively and prints all the filenames it finds inside. Useful too.
import os
def printDirectoryFiles(directory):
    for filename in os.listdir(directory):
        full_path=os.path.join(directory, filename)
        if not os.path.isdir(full_path):
            print( full_path + "\n")
def checkFolders(directory):
    dir_list = next(os.walk(directory))[1]
    #print(dir_list)
    for dir in dir_list:
        print(dir)
        checkFolders(directory +"/"+ dir)
    printDirectoryFiles(directory)
main_dir="C:/Users/S0082448/Desktop/carpeta1"
checkFolders(main_dir)
input("Press enter to exit ;")
Function to return a list of all subdirectories within a given file path. It will search through the entire file tree.
import os
def get_sub_directory_paths(start_directory, sub_directories):
    """
    This method iterates through all subdirectory paths of a given
    directory to collect all directory paths.
    :param start_directory: The starting directory path.
    :param sub_directories: A List that all subdirectory paths will be
        stored to.
    :return: A List of all sub-directory paths.
    """
    for item in os.listdir(start_directory):
        full_path = os.path.join(start_directory, item)
        if os.path.isdir(full_path):
            sub_directories.append(full_path)
            # Recursive call to search through all subdirectories.
            get_sub_directory_paths(full_path, sub_directories)
    return sub_directories
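If passing an accumulator list around feels awkward, the same traversal can be written as a generator. This is an alternative sketch rather than the function above; iter_sub_directories and the demo folders are hypothetical names.

```python
import os
import tempfile


def iter_sub_directories(start_directory):
    """Yield the path of every directory below start_directory."""
    for entry in sorted(os.listdir(start_directory)):
        full_path = os.path.join(start_directory, entry)
        if os.path.isdir(full_path):
            yield full_path
            # Descend into the child and re-yield everything found there.
            yield from iter_sub_directories(full_path)


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "x", "y"))
    os.makedirs(os.path.join(root, "z"))
    collected = [os.path.relpath(p, root) for p in iter_sub_directories(root)]
    print(collected)
```

Wrap the call in list() when you need all paths at once, or iterate lazily for large trees.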
This is how I do it.
import os
for x in os.listdir(os.getcwd()):
    if os.path.isdir(x):
        print(x)
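A caveat with this pattern: os.path.isdir(x) resolves the bare name against the current working directory, so it only works when you are listing the current directory itself. For any other directory, join the name to it first. A minimal sketch, with subdir_names and only_here as made-up names:

```python
import os
import tempfile


def subdir_names(directory):
    """Names of the direct child directories of `directory`."""
    return sorted(
        name for name in os.listdir(directory)
        # Join first: a bare name would be resolved against the current
        # working directory, not against `directory`.
        if os.path.isdir(os.path.join(directory, name))
    )


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "only_here"))
    names = subdir_names(root)
    print(names)
```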
We can get a list of all the folders by using os.walk()
import os
path = os.getcwd()
pathObject = os.walk(path)
This pathObject is a generator object, and we can get our array by
arr = [x for x in pathObject]
arr is of type [('current directory', [array of folder in current directory], [files in current directory]),('subdirectory', [array of folder in subdirectory], [files in subdirectory]) ....]
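We can get a list of all the subdirectories by iterating through arr and printing the middle array
for i in arr:
    for j in i[1]:
        print(j)
This will print all the subdirectories.
To get all the files:
for i in arr:
    for j in i[2]:
        print(i[0] + "/" + j)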
By joining multiple solutions from here, this is what I ended up using:
import os
import glob
def list_dirs(path):
    return [os.path.basename(x) for x in filter(
        os.path.isdir, glob.glob(os.path.join(path, '*')))]
There are lots of nice answers here, but if you came looking for a simple way to get a list of all files or folders at once, you can take advantage of the find utility offered by the OS on Linux and macOS, which is much faster than os.walk
import os
all_files_list = os.popen("find path/to/my_base_folder -type f").read().splitlines()
all_sub_directories_list = os.popen("find path/to/my_base_folder -type d").read().splitlines()
OR
import os
def get_files(path):
    all_files_list = os.popen(f"find {path} -type f").read().splitlines()
    return all_files_list
def get_sub_folders(path):
    all_sub_directories_list = os.popen(f"find {path} -type d").read().splitlines()
    return all_sub_directories_list
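One thing to watch with the f-string version: the path is pasted into a shell command, so spaces or shell metacharacters in it will break the call (or worse). A variant sketch using subprocess.run with an argument list avoids the shell. It assumes a Unix-style find is on PATH (on Windows, find is a different program), and find_dirs is a hypothetical name.

```python
import shutil
import subprocess


def find_dirs(path):
    """List directories under `path` using the external `find` command."""
    if shutil.which("find") is None:
        raise RuntimeError("the find utility is not available on this system")
    # Passing a list (no shell) keeps spaces and quotes in `path` intact.
    completed = subprocess.run(
        ["find", path, "-type", "d"],
        capture_output=True, text=True, check=True,
    )
    return completed.stdout.splitlines()
```

Like the os.popen version, the result includes the starting directory itself, and directory names containing newlines would be split incorrectly.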
The easiest way:
from pathlib import Path
from glob import glob
current_dir = Path.cwd()
all_sub_dir_paths = glob(str(current_dir) + '/*/') # returns list of sub directory paths
all_sub_dir_names = [Path(sub_dir).name for sub_dir in all_sub_dir_paths]
This should work, as it also creates a directory tree:
import os
import pathlib
def tree(directory):
    print(f'+ {directory}')
    print("There are " + str(len(os.listdir(os.getcwd()))) + \
    " folders in this directory;")
    for path in sorted(directory.glob('*')):
        depth = len(path.relative_to(directory).parts)
        spacer = ' ' * depth
        print(f'{spacer}+ {path.name}')
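This should list all the directories in a folder using the pathlib library. path.relative_to(directory).parts gets the elements relative to the current working directory.
The class below can get a list of the files, folders and all subfolders inside a given directory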
import os
import json
class GetDirectoryList():
    def __init__(self, path):
        self.main_path = path
        self.absolute_path = []
        self.relative_path = []
    def get_files_and_folders(self, resp, path):
        all = os.listdir(path)
        resp["files"] = []
        for file_folder in all:
            if file_folder != "." and file_folder != "..":
                if os.path.isdir(path + "/" + file_folder):
                    resp[file_folder] = {}
                    self.get_files_and_folders(resp=resp[file_folder], path= path + "/" + file_folder)
                else:
                    resp["files"].append(file_folder)
                    self.absolute_path.append(path.replace(self.main_path + "/", "") + "/" + file_folder)
                    self.relative_path.append(path + "/" + file_folder)
        return resp, self.relative_path, self.absolute_path
    @property
    def get_all_files_folder(self):
        self.resp = {self.main_path: {}}
        all = self.get_files_and_folders(self.resp[self.main_path], self.main_path)
        return all
if __name__ == '__main__':
    mylib = GetDirectoryList(path="sample_folder")
    file_list = mylib.get_all_files_folder
    print (json.dumps(file_list))
The sample directory looks like this:
sample_folder/
    lib_a/
        lib_c/
            lib_e/
                __init__.py
                a.txt
            __init__.py
            b.txt
            c.txt
        lib_d/
            __init__.py
        __init__.py
        d.txt
    lib_b/
        __init__.py
        e.txt
    __init__.py
Result obtained:
[
{
"files": [
"__init__.py"
],
"lib_b": {
"files": [
"__init__.py",
"e.txt"
]
},
"lib_a": {
"files": [
"__init__.py",
"d.txt"
],
"lib_c": {
"files": [
"__init__.py",
"c.txt",
"b.txt"
],
"lib_e": {
"files": [
"__init__.py",
"a.txt"
]
}
},
"lib_d": {
"files": [
"__init__.py"
]
}
}
},
[
"sample_folder/lib_b/__init__.py",
"sample_folder/lib_b/e.txt",
"sample_folder/__init__.py",
"sample_folder/lib_a/lib_c/lib_e/__init__.py",
"sample_folder/lib_a/lib_c/lib_e/a.txt",
"sample_folder/lib_a/lib_c/__init__.py",
"sample_folder/lib_a/lib_c/c.txt",
"sample_folder/lib_a/lib_c/b.txt",
"sample_folder/lib_a/lib_d/__init__.py",
"sample_folder/lib_a/__init__.py",
"sample_folder/lib_a/d.txt"
],
[
"lib_b/__init__.py",
"lib_b/e.txt",
"sample_folder/__init__.py",
"lib_a/lib_c/lib_e/__init__.py",
"lib_a/lib_c/lib_e/a.txt",
"lib_a/lib_c/__init__.py",
"lib_a/lib_c/c.txt",
"lib_a/lib_c/b.txt",
"lib_a/lib_d/__init__.py",
"lib_a/__init__.py",
"lib_a/d.txt"
]
]
Using os.walk
import os
# test_folder is the directory you want to scan
sub_folders = []
for dir, sub_dirs, files in os.walk(test_folder):
    sub_folders.extend(sub_dirs)
import os
path = "test/"
files = [x[0] + "/" + y for x in os.walk(path) if len(x[-1]) > 0 for y in x[-1]]
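For anyone like me who just needed the names of the immediate folders within a directory, this worked on Windows.
import os
for f in os.scandir(mypath):
    if f.is_dir():  # os.scandir also returns files, so keep only the folders
        print(f.name)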
Here is a simple recursive solution
import os
def fn(dir=r"C:\Users\aryan\Downloads\opendatakit"):  # 1.Get file names from directory
    file_list = os.listdir(dir)
    res = []
    # print(file_list)
    for file in file_list:
        if os.path.isfile(os.path.join(dir, file)):
            res.append(file)
        else:
            result = fn(os.path.join(dir, file))
            if result:
                res.extend(result)  # reuse the result instead of walking the subtree a second time
    return res
res = fn()
print(res)
print(len(res))