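Which Python library can I use to extract the file name from a path, regardless of the operating system or path format?
For example, I would like all of these paths to return c: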
a/b/c/
a/b/c
\a\b\c
\a\b\c\
a\b\c
a/b/../../a/b/c/
a/b/../../a/b/c
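Current answer
If you want to get the file names automatically, you can do this:
import glob
import os

# Print the last path component of every entry under the directory.
for f in glob.glob('/your/path/*'):
    print(os.path.split(f)[-1])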
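Other answers
I use this method on Windows and on Ubuntu (WSL), and it works (for me) with nothing more than "import os". Basically, replace() swaps in the correct path separator for the current operating system.
If the path ends with a slash '/', it is a directory rather than a file, so an empty string is returned.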
import os
my_fullpath = r"D:\MY_FOLDER\TEST\20201108\20201108_073751.DNG"
os.path.basename(my_fullpath.replace('\\',os.sep))
my_fullpath = r"/MY_FOLDER/TEST/20201108/20201108_073751.DNG"
os.path.basename(my_fullpath.replace('\\',os.sep))
my_fullpath = r"/MY_FOLDER/TEST/20201108/"
os.path.basename(my_fullpath.replace('\\',os.sep))
my_fullpath = r"/MY_FOLDER/TEST/20201108"
os.path.basename(my_fullpath.replace('\\',os.sep))
[Screenshot: output on Windows (left) and on Ubuntu via WSL (right)]
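As a rough sketch of what the four calls above return, the snippet below wraps the same replace-then-basename step in a helper and collects the results. The name `leaf` is purely illustrative and not part of the answer; the values in the final comment are what `os.path.basename` yields for these inputs on both Windows and Linux.

```python
import os

def leaf(full_path):
    # Hypothetical helper: turn backslashes into the current separator,
    # then take the basename, exactly as in the calls above.
    return os.path.basename(full_path.replace('\\', os.sep))

samples = [
    r"D:\MY_FOLDER\TEST\20201108\20201108_073751.DNG",
    r"/MY_FOLDER/TEST/20201108/20201108_073751.DNG",
    r"/MY_FOLDER/TEST/20201108/",  # trailing slash: treated as a directory
    r"/MY_FOLDER/TEST/20201108",
]

results = [leaf(p) for p in samples]
print(results)
# ['20201108_073751.DNG', '20201108_073751.DNG', '', '20201108']
```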
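Using os.path.split or os.path.basename as others suggest won't work in all cases: if you run the script on Linux and try to process a classic Windows-style path, it will fail.
Windows paths can use either a backslash or a forward slash as the path separator. Therefore the ntpath module (which is what os.path is when running on Windows) will work for all(1) paths on all platforms.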
import ntpath
ntpath.basename("a/b/c")
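To make the difference concrete, here is a minimal sketch comparing the two standard-library flavours directly. `posixpath` is what `os.path` resolves to on Linux and `ntpath` is what it resolves to on Windows, and both can be imported on any platform. The variable names are only for illustration.

```python
import ntpath
import posixpath

windows_style = r'a\b\c'
unix_style = 'a/b/c'

# posixpath only knows '/', so the whole Windows-style string is one component.
posix_on_windows_style = posixpath.basename(windows_style)  # 'a\\b\\c'

# ntpath accepts both '\\' and '/' as separators.
nt_on_windows_style = ntpath.basename(windows_style)  # 'c'
nt_on_unix_style = ntpath.basename(unix_style)        # 'c'

print(posix_on_windows_style, nt_on_windows_style, nt_on_unix_style)
```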
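Of course, if the path ends with a slash, the basename will be empty, so write your own function to deal with that:
def path_leaf(path):
    # If the tail is empty (trailing separator), fall back to the basename of the head.
    head, tail = ntpath.split(path)
    return tail or ntpath.basename(head)
Verification: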
>>> paths = ['a/b/c/', 'a/b/c', '\\a\\b\\c', '\\a\\b\\c\\', 'a\\b\\c',
... 'a/b/../../a/b/c/', 'a/b/../../a/b/c']
>>> [path_leaf(path) for path in paths]
['c', 'c', 'c', 'c', 'c', 'c', 'c']
(1) There's one caveat: Linux filenames may contain backslashes. So on linux, r'a/b\c' always refers to the file b\c in the a folder, while on Windows, it always refers to the c file in the b subfolder of the a folder. So when both forward and backward slashes are used in a path, you need to know the associated platform to be able to interpret it correctly. In practice it's usually safe to assume it's a windows path since backslashes are seldom used in Linux filenames, but keep this in mind when you code so you don't create accidental security holes.
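The caveat can be illustrated with a short sketch that splits the same mixed-separator string under both interpretations. Nothing here goes beyond the standard `posixpath` and `ntpath` modules; the variable names are only for illustration.

```python
import ntpath
import posixpath

ambiguous = r'a/b\c'

# Linux interpretation: a file literally named 'b\c' inside folder 'a'.
as_posix = posixpath.split(ambiguous)   # ('a', 'b\\c')

# Windows interpretation: file 'c' inside subfolder 'b' of folder 'a'.
as_windows = ntpath.split(ambiguous)    # ('a/b', 'c')

print(as_posix)
print(as_windows)
```

Which result is correct depends on where the path came from, so code that accepts paths from untrusted sources should decide on one interpretation explicitly.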
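This works on both Linux and Windows and needs nothing outside the standard library:
paths = ['a/b/c/', 'a/b/c', '\\a\\b\\c', '\\a\\b\\c\\', 'a\\b\\c',
         'a/b/../../a/b/c/', 'a/b/../../a/b/c']
def path_leaf(path):
    # Trim leading/trailing separators, then keep the last component for either separator.
    return path.strip('/').strip('\\').split('/')[-1].split('\\')[-1]
[path_leaf(path) for path in paths]
Result: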
['c', 'c', 'c', 'c', 'c', 'c', 'c']
import os
file_location = '/srv/volume1/data/eds/eds_report.csv'
file_name = os.path.basename(file_location)  # eds_report.csv
location = os.path.dirname(file_location)  # /srv/volume1/data/eds
In both Python 2 and 3, using the pathlib2 module:
import posixpath  # to generate unix paths
from pathlib2 import PurePath, PureWindowsPath, PurePosixPath
def path2unix(path, nojoin=True, fromwinpath=False):
    """From a path given in any format, converts to posix path format
    fromwinpath=True forces the input path to be recognized as a Windows path (useful on Unix machines to unit test Windows paths)"""
    if not path:
        return path
    if fromwinpath:
        pathparts = list(PureWindowsPath(path).parts)
    else:
        pathparts = list(PurePath(path).parts)
    if nojoin:
        return pathparts
    else:
        return posixpath.join(*pathparts)
Usage:
In [9]: path2unix('lala/lolo/haha.dat')
Out[9]: ['lala', 'lolo', 'haha.dat']
In [10]: path2unix(r'C:\lala/lolo/haha.dat')
Out[10]: ['C:\\', 'lala', 'lolo', 'haha.dat']
In [11]: path2unix(r'C:\lala/lolo/haha.dat') # works even with malformatted cases mixing both Windows and Linux path separators
Out[11]: ['C:\\', 'lala', 'lolo', 'haha.dat']
With your test cases:
In [12]: testcase = paths = ['a/b/c/', 'a/b/c', '\\a\\b\\c', '\\a\\b\\c\\', 'a\\b\\c',
    ...:                     'a/b/../../a/b/c/', 'a/b/../../a/b/c']
In [14]: for t in testcase:
    ...:     print(path2unix(t)[-1])
    ...:
    ...:
c
c
c
c
c
c
c
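The idea here is to convert all paths into pathlib2's unified internal representation, using a different decoder depending on the platform. Fortunately, pathlib2 includes a generic decoder called PurePath that should work on any path. Note that PurePath resolves to PurePosixPath on POSIX systems, where backslashes are not treated as separators, so if this does not work you can force the input to be recognized as a Windows path with fromwinpath=True. Either way, the input string is split into parts, the last of which is the leaf you are looking for, hence path2unix(t)[-1].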
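On Python 3 the same idea can be sketched with the standard-library `pathlib`, of which pathlib2 is a backport. This is an alternative to the answer's function, not a restatement of it: it always parses the input as a Windows path so that both separators are recognized on any platform. The helper name `leaf_name` is hypothetical.

```python
from pathlib import PureWindowsPath

def leaf_name(path):
    # Hypothetical helper: parse with Windows rules ('/' and '\\' both split),
    # then return the final component. Trailing separators are ignored.
    return PureWindowsPath(path).name

paths = ['a/b/c/', 'a/b/c', '\\a\\b\\c', '\\a\\b\\c\\', 'a\\b\\c',
         'a/b/../../a/b/c/', 'a/b/../../a/b/c']

leaves = [leaf_name(p) for p in paths]
print(leaves)
# ['c', 'c', 'c', 'c', 'c', 'c', 'c']
```

The earlier caveat about backslashes in Linux file names applies here as well, since the input is always interpreted with Windows rules.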
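If the argument nojoin=False, the parts are joined back together, so the output is simply the input string converted to Unix format, which is useful for comparing subpaths across platforms.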
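A minimal sketch of that kind of cross-platform comparison, again assuming Python 3's standard `pathlib` rather than pathlib2: `PureWindowsPath.as_posix()` returns the path with forward slashes, so two spellings of the same subpath compare equal as strings. The helper name `to_posix` is hypothetical.

```python
from pathlib import PureWindowsPath

def to_posix(path):
    # Hypothetical helper: parse with Windows rules, emit forward slashes.
    return PureWindowsPath(path).as_posix()

windows_spelling = to_posix(r'lala\lolo\haha.dat')
unix_spelling = to_posix('lala/lolo/haha.dat')
mixed_spelling = to_posix(r'C:\lala/lolo/haha.dat')

print(windows_spelling)                   # lala/lolo/haha.dat
print(windows_spelling == unix_spelling)  # True
print(mixed_spelling)                     # C:/lala/lolo/haha.dat
```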