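I have a string variable which represents a DOS path, e.g.:
var = "d:\stuff\morestuff\Furtherdown\THEFILE.txt"
I want to split this string into:
[ "d", "stuff", "morestuff", "Furtherdown", "THEFILE.txt" ]
I have tried using split() and replace(), but they either process only the first backslash or insert hex numbers into the string.
I need to convert this string variable into a raw string somehow so that I can parse it.
What's the best way to do this?
I should also add that the contents of var, i.e. the path I am trying to parse, are actually the return value of a command-line query. It is not path data that I generate myself. It is stored in a file, and the command-line tool does not escape the backslashes.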
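I have been bitten many times by people writing their own path-fiddling functions and getting them wrong. Spaces, slashes, backslashes, colons: the possibilities for confusion are not endless, but mistakes are easily made anyway. So I am a stickler for using os.path, and I recommend it on that basis.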
(However, the path to virtue is not the one most easily taken, and many people when finding this are tempted to take a slippery path straight to damnation. They won't realise until one day everything falls to pieces, and they -- or, more likely, somebody else -- has to work out why everything has gone wrong, and it turns out somebody made a filename that mixes slashes and backslashes -- and some person suggests that the answer is "not to do that". Don't be any of these people. Except for the one who mixed up slashes and backslashes -- you could be them if you like.)
You can get the drive and path+file like this:
import os

drive, path_and_file = os.path.splitdrive(path)
Get the path and the file:
path, file = os.path.split(path_and_file)
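As a rough illustration of those two calls on the path from the question, here is a small sketch. It uses ntpath (the Windows implementation behind os.path) so that it behaves the same on any OS; the variable names are just placeholders of mine.

```python
import ntpath  # Windows flavour of os.path, importable on every platform

# A raw literal keeps every backslash as a real backslash.
full_path = r"d:\stuff\morestuff\Furtherdown\THEFILE.txt"

# First peel off the drive, then separate the directory part from the file name.
drive, path_and_file = ntpath.splitdrive(full_path)
folder_path, file_name = ntpath.split(path_and_file)

print(drive)        # d:
print(folder_path)  # \stuff\morestuff\Furtherdown
print(file_name)    # THEFILE.txt
```

On a real Windows machine plain os.path gives the same result; ntpath only matters if you parse Windows paths elsewhere.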
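Getting the individual folder names is not especially convenient, but it is the sort of honest, middling discomfort that heightens the pleasure of later finding something that actually works well:
folders = []
while True:
    path, folder = os.path.split(path)
    if folder != "":
        folders.append(folder)
    else:
        # Nothing left to split off: keep the root, if any, and stop.
        if path != "":
            folders.append(path)
        break
folders.reverse()
(If the path was originally absolute, this leaves a "\" at the start of folders. If you don't want that, you can drop a bit of the code.)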
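If you want that loop as something reusable, one possible shape is sketched below. The name split_all and the pathmod parameter are my own inventions, and the stop condition (head == path) is a slight variation on the loop above so that a trailing separator does not end the loop early. Treat it as a sketch, not a polished utility.

```python
import ntpath  # Windows rules on any OS; pass os.path instead for native paths


def split_all(path, pathmod=ntpath):
    """Return the components of path, outermost first (hypothetical helper)."""
    parts = []
    while True:
        head, tail = pathmod.split(path)
        if head == path:
            # split() cannot shorten this any further: it is a root such as
            # "d:\\" or "\\", or an empty string.
            if head:
                parts.append(head)
            break
        if tail:
            parts.append(tail)
        if not head:
            break
        path = head
    parts.reverse()
    return parts


print(split_all(r"d:\stuff\morestuff\Furtherdown\THEFILE.txt"))
# ['d:\\', 'stuff', 'morestuff', 'Furtherdown', 'THEFILE.txt']
```

The first element is the whole root ("d:\\"), not just the drive letter, so strip it yourself if you only want the letter.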
In Python >= 3.4 this has become much simpler. You can now use pathlib.Path.parts to get all the parts of a path.
Example:
>>> from pathlib import Path
>>> Path('C:/path/to/file.txt').parts
('C:\\', 'path', 'to', 'file.txt')
>>> Path(r'C:\path\to\file.txt').parts
('C:\\', 'path', 'to', 'file.txt')
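On a Windows installation of Python 3, this will assume that you are working with Windows paths, and on *nix it will assume that you are working with POSIX paths. This is usually what you want, but if it isn't, you can use the classes pathlib.PurePosixPath or pathlib.PureWindowsPath as needed: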
>>> from pathlib import PurePosixPath, PureWindowsPath
>>> PurePosixPath('/path/to/file.txt').parts
('/', 'path', 'to', 'file.txt')
>>> PureWindowsPath(r'C:\path\to\file.txt').parts
('C:\\', 'path', 'to', 'file.txt')
>>> PureWindowsPath(r'\\host\share\path\to\file.txt').parts
('\\\\host\\share\\', 'path', 'to', 'file.txt')
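To get from .parts to exactly the list the question asks for (a bare "d" with no colon or backslash), something along these lines should work. The variable names are mine, and stripping the colon is just one way to read the question's expected output.

```python
from pathlib import PureWindowsPath

p = PureWindowsPath(r"d:\stuff\morestuff\Furtherdown\THEFILE.txt")

# p.parts[0] is the anchor "d:\\" (drive plus root), so skip it and build the
# first element from p.drive ("d:") with the colon removed.
wanted = [p.drive.rstrip(":")] + list(p.parts[1:])

print(wanted)  # ['d', 'stuff', 'morestuff', 'Furtherdown', 'THEFILE.txt']
```

Because PureWindowsPath never touches the filesystem, this runs the same way on Linux or macOS.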
Edit:
There is also a backport for Python 2: pathlib2
from os import path as os_path
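Then
def split_path_iter(string, lst):
    head, tail = os_path.split(string)
    # Stop on a bare name, or on a root such as '/' that split() cannot
    # shorten (without this check an absolute path recurses forever).
    if head == '' or head == string:
        return [string] + lst
    else:
        return split_path_iter(head, [tail] + lst)

def split_path(string):
    return split_path_iter(string, [])
Or, inspired by the answers above (more elegant):
def split_path(string):
    head, tail = os_path.split(string)
    # Same stop condition: a bare name or an unsplittable root.
    if head == '' or head == string:
        return [string]
    else:
        return split_path(head) + [tail]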
re.split() is more helpful here than str.split():
import re
var = r"d:\stuff\morestuff\furtherdown\THEFILE.txt"  # raw string, so "\f" is not read as a form feed
re.split( r'[\\/]', var )
['d:', 'stuff', 'morestuff', 'furtherdown', 'THEFILE.txt']
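A note on typing such a path as a literal, since this is what bites people in the question: in an ordinary string literal "\f" is the form-feed escape, so that backslash never reaches re.split(). The sketch below uses a shortened, made-up path to show the difference. Text that you read from a file or from a command's output is not escape-processed, so it already holds real backslashes and needs no "raw" conversion.

```python
import re

# Ordinary literal: "\\" is one backslash, but "\f" becomes a form feed (0x0c).
plain = "d:\\stuff\furtherdown"
# Raw literal: every backslash stays a backslash.
raw = r"d:\stuff\furtherdown"

plain_parts = re.split(r"[\\/]", plain)
raw_parts = re.split(r"[\\/]", raw)

print(plain_parts)  # ['d:', 'stuff\x0curtherdown']
print(raw_parts)    # ['d:', 'stuff', 'furtherdown']
```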
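If you also want to support Linux and Mac paths, just add filter(None, result), which removes the unwanted '' entries that split() produces because those paths start with '/' or '//', for example '//mount/…' or '/var/tmp/'.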
import re
var = "/var/stuff/morestuff/furtherdown/THEFILE.txt"
result = re.split( r'[\\/]', var )
list( filter( None, result ) )  # list() is needed on Python 3, where filter() returns an iterator
['var', 'stuff', 'morestuff', 'furtherdown', 'THEFILE.txt']
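Putting the two pieces together, a small helper might look like the sketch below. split_any is a name I made up, and "//mount/var/tmp/" is only an illustrative path.

```python
import re


def split_any(path):
    """Split on '/' or '\\' and drop the empty strings (hypothetical helper)."""
    return [part for part in re.split(r"[\\/]", path) if part]


print(split_any(r"d:\stuff\morestuff\furtherdown\THEFILE.txt"))
# ['d:', 'stuff', 'morestuff', 'furtherdown', 'THEFILE.txt']
print(split_any("//mount/var/tmp/"))
# ['mount', 'var', 'tmp']
```

Two caveats with this approach: the result no longer tells you whether the path was absolute, and on POSIX systems a backslash is a legal character inside a file name, so splitting on it can be wrong there.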