我可以使用哪个Python库从路径中提取文件名,而不管操作系统或路径格式是什么?
例如,我希望所有这些路径都返回c:
a/b/c/
a/b/c
\a\b\c
\a\b\c\
a\b\c
a/b/../../a/b/c/
a/b/../../a/b/c
我可以使用哪个Python库从路径中提取文件名,而不管操作系统或路径格式是什么?
例如,我希望所有这些路径都返回c:
a/b/c/
a/b/c
\a\b\c
\a\b\c\
a\b\c
a/b/../../a/b/c/
a/b/../../a/b/c
当前回答
为了完整起见,这里是python 3.2+的pathlib解决方案:
>>> from pathlib import PureWindowsPath
>>> paths = ['a/b/c/', 'a/b/c', '\\a\\b\\c', '\\a\\b\\c\\', 'a\\b\\c',
... 'a/b/../../a/b/c/', 'a/b/../../a/b/c']
>>> [PureWindowsPath(path).name for path in paths]
['c', 'c', 'c', 'c', 'c', 'c', 'c']
这在Windows和Linux上都适用。
其他回答
我从来没有见过双开的路,它们存在吗?python模块os的内置特性在这些方面失败了。所有其他工作,还有你用os.path.normpath()给出的警告:
paths = ['a/b/c/', 'a/b/c', '\\a\\b\\c', '\\a\\b\\c\\', 'a\\b\\c',
... 'a/b/../../a/b/c/', 'a/b/../../a/b/c', 'a/./b/c', 'a\b/c']
for path in paths:
os.path.basename(os.path.normpath(path))
这是适用于linux和windows以及标准库
paths = ['a/b/c/', 'a/b/c', '\\a\\b\\c', '\\a\\b\\c\\', 'a\\b\\c',
'a/b/../../a/b/c/', 'a/b/../../a/b/c']
def path_leaf(path):
return path.strip('/').strip('\\').split('/')[-1].split('\\')[-1]
[path_leaf(path) for path in paths]
结果:
['c', 'c', 'c', 'c', 'c', 'c', 'c']
为了完整起见,这里是python 3.2+的pathlib解决方案:
>>> from pathlib import PureWindowsPath
>>> paths = ['a/b/c/', 'a/b/c', '\\a\\b\\c', '\\a\\b\\c\\', 'a\\b\\c',
... 'a/b/../../a/b/c/', 'a/b/../../a/b/c']
>>> [PureWindowsPath(path).name for path in paths]
['c', 'c', 'c', 'c', 'c', 'c', 'c']
这在Windows和Linux上都适用。
这是一个仅适用于regex的解决方案,它似乎适用于任何OS上的任何OS路径。
不需要其他模块,也不需要预处理:
import re
def extract_basename(path):
"""Extracts basename of a given path. Should Work with any OS Path on any OS"""
basename = re.search(r'[^\\/]+(?=[\\/]?$)', path)
if basename:
return basename.group(0)
paths = ['a/b/c/', 'a/b/c', '\\a\\b\\c', '\\a\\b\\c\\', 'a\\b\\c',
'a/b/../../a/b/c/', 'a/b/../../a/b/c']
print([extract_basename(path) for path in paths])
# ['c', 'c', 'c', 'c', 'c', 'c', 'c']
extra_paths = ['C:\\', 'alone', '/a/space in filename', 'C:\\multi\nline']
print([extract_basename(path) for path in extra_paths])
# ['C:', 'alone', 'space in filename', 'multi\nline']
更新:
If you only want a potential filename, if present (i.e., /a/b/ is a dir and so is c:\windows\), change the regex to: r'[^\\/]+(?![\\/])$' . For the "regex challenged," this changes the positive forward lookahead for some sort of slash to a negative forward lookahead, causing pathnames that end with said slash to return nothing instead of the last sub-directory in the pathname. Of course there is no guarantee that the potential filename actually refers to a file and for that os.path.is_dir() or os.path.is_file() would need to be employed.
这将匹配如下:
/a/b/c/ # nothing, pathname ends with the dir 'c'
c:\windows\ # nothing, pathname ends with the dir 'windows'
c:hello.txt # matches potential filename 'hello.txt'
~it_s_me/.bashrc # matches potential filename '.bashrc'
c:\windows\system32 # matches potential filename 'system32', except
# that is obviously a dir. os.path.is_dir()
# should be used to tell us for sure
正则表达式可以在这里测试。
import os
head, tail = os.path.split('path/to/file.exe')
尾巴是你想要的,文件名。
详见python os模块文档