有没有更好的方法来使用glob。Glob在python中获取多个文件类型的列表,如.txt, .mdown和.markdown?现在我有这样的东西:

projectFiles1 = glob.glob( os.path.join(projectDir, '*.txt') )
projectFiles2 = glob.glob( os.path.join(projectDir, '*.mdown') )
projectFiles3 = glob.glob( os.path.join(projectDir, '*.markdown') )

当前回答

与@BPL相同的答案(计算效率高),但它可以处理任何glob模式,而不是扩展:

import os
from fnmatch import fnmatch

folder = "path/to/folder/"
patterns = ("*.txt", "*.md", "*.markdown")

files = [f.path for f in os.scandir(folder) if any(fnmatch(f, p) for p in patterns)]

这种解决方案既高效又方便。它还与glob的行为紧密匹配(请参阅文档)。

注意,使用内置包pathlib会更简单:

from pathlib import Path

folder = Path("/path/to/folder")
patterns = ("*.txt", "*.md", "*.markdown")

files = [f for f in folder.iterdir() if any(f.match(p) for p in patterns)]

其他回答

下面的函数_glob用于多个文件扩展名。

import glob
import os
def _glob(path, *exts):
    """Glob for multiple file extensions

    Parameters
    ----------
    path : str
        A file name without extension, or directory name
    exts : tuple
        File extensions to glob for

    Returns
    -------
    files : list
        list of files matching extensions in exts in path

    """
    path = os.path.join(path, "*") if os.path.isdir(path) else path + "*"
    return [f for files in [glob.glob(path + ext) for ext in exts] for f in files]

files = _glob(projectDir, ".txt", ".mdown", ".markdown")

来这里寻求帮助后,我有了自己的解决方案,想和大家分享。它基于user2363986的答案,但我认为这更具可伸缩性。这意味着,即使您有1000个扩展,代码仍然看起来很优雅。

from glob import glob

directoryPath  = "C:\\temp\\*." 
fileExtensions = [ "jpg", "jpeg", "png", "bmp", "gif" ]
listOfFiles    = []

for extension in fileExtensions:
    listOfFiles.extend( glob( directoryPath + extension ))

for file in listOfFiles:
    print(file)   # Or do other stuff

例如:

import glob
lst_img = []
base_dir = '/home/xy/img/'

# get all the jpg file in base_dir 
lst_img += glob.glob(base_dir + '*.jpg')
print lst_img
# ['/home/xy/img/2.jpg', '/home/xy/img/1.jpg']

# append all the png file in base_dir to lst_img
lst_img += glob.glob(base_dir + '*.png')
print lst_img
# ['/home/xy/img/2.jpg', '/home/xy/img/1.jpg', '/home/xy/img/3.png']

一个函数:

import glob
def get_files(base_dir='/home/xy/img/', lst_extension=['*.jpg', '*.png']):
    """
    :param base_dir:base directory
    :param lst_extension:lst_extension: list like ['*.jpg', '*.png', ...]
    :return:file lists like ['/home/xy/img/2.jpg','/home/xy/img/3.png']
    """
    lst_files = []
    for ext in lst_extension:
        lst_files += glob.glob(base_dir+ext)
    return lst_files

从前面的答案

glob('*.jpg') + glob('*.png')

这是一个较短的问题,

from glob import glob
extensions = ['jpg', 'png'] # to find these filename extensions

# Method 1: loop one by one and extend to the output list
output = []
[output.extend(glob(f'*.{name}')) for name in extensions]
print(output)

# Method 2: even shorter
# loop filename extension to glob() it and flatten it to a list
output = [p for p2 in [glob(f'*.{name}') for name in extensions] for p in p2]
print(output)

我也有同样的问题,这是我想到的

import os, sys, re

#without glob

src_dir = '/mnt/mypics/'
src_pics = []
ext = re.compile('.*\.(|{}|)$'.format('|'.join(['png', 'jpeg', 'jpg']).encode('utf-8')))
for root, dirnames, filenames in os.walk(src_dir):
  for filename in filter(lambda name:ext.search(name),filenames):
    src_pics.append(os.path.join(root, filename))