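I'd like to know the simplest way to convert a string representation of a list, like the following, into a list:
x = '[ "A","B","C" , " D"]'
Even if the user puts spaces between the commas or inside the quotes, I need to handle that too and convert it to:
x = ["A", "B", "C", "D"]
I know I can strip out the spaces with strip() and split() and check for non-letter characters, but the code gets very clunky. Is there a quick function that I'm not aware of?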
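Current answer
There is no need to import anything or to call eval. For most basic use cases, including the one given in the original question, you can do this in one line.
One-liner
l_x = [i.strip() for i in x[1:-1].replace('"',"").split(',')]
Explanation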
x = '[ "A","B","C" , " D"]'
# String indexing to eliminate the brackets.
# Replace, as split will otherwise retain the quotes in the returned list
# Split to convert to a list
l_x = x[1:-1].replace('"',"").split(',')
Output:
for i in range(0, len(l_x)):
    print(l_x[i])
# vvvv output vvvvv (items still carry their surrounding spaces at this point)
'''
 A
B
C
  D
'''
print(type(l_x)) # out: <class 'list'>
print(len(l_x)) # out: 4
You can then parse and clean up this list as needed using a list comprehension.
l_x = [i.strip() for i in l_x] # list comprehension to clean up
for i in range(0, len(l_x)):
    print(l_x[i])
# vvvvv output vvvvv
'''
A
B
C
D
'''
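The steps above can be folded into a small reusable function. This is only a sketch: the name `parse_flat_list` is made up for illustration, and it assumes a flat list whose items contain no commas. Unlike the one-liner, it strips quotes per item, so it accepts single or double quotes.

```python
def parse_flat_list(text):
    """Parse a flat, bracketed list of quoted strings without any imports."""
    # Drop surrounding whitespace, then the outer brackets.
    inner = text.strip()[1:-1]
    if not inner.strip():
        return []  # "[]" or "[   ]" is an empty list
    items = []
    for raw in inner.split(','):
        # Strip whitespace around the quotes, the quotes themselves,
        # and finally any whitespace that was inside the quotes.
        items.append(raw.strip().strip('"\'').strip())
    return items


print(parse_flat_list('[ "A","B","C" , " D"]'))
```

Because it only trims each item at its ends, spaces in the middle of an item such as `'y z'` are kept.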
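Nested lists
If you have nested lists, it does get a bit more annoying. Without using regex (which would simplify the replace), and assuming you want to return a flattened list (the Zen of Python says flat is better than nested):
x = '[ "A","B","C" , " D", ["E","F","G"]]'
l_x = x[1:-1].split(',')
l_x = [i
    .replace(']', '')
    .replace('[', '')
    .replace('"', '')
    .strip() for i in l_x
]
# returns ['A', 'B', 'C', 'D', 'E', 'F', 'G']
If you need to retain the nested lists it gets a bit uglier, but it can still be done with just a regular expression and list comprehensions: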
import re
x = '[ "A","B","C" , " D", "["E","F","G"]","Z", "Y", "["H","I","J"]", "K", "L"]'
# Clean it up so the regular expression is simpler
x = x.replace('"', '').replace(' ', '')
# Look ahead for the bracketed text that signifies nested list
l_x = re.split(r',(?=\[[A-Za-z0-9\',]+\])|(?<=\]),', x[1:-1])
print(l_x)
# Flatten and split the non nested list items
l_x0 = [item for items in l_x for item in items.split(',') if '[' not in items]
# Convert the nested lists to lists
l_x1 = [
    i[1:-1].split(',') for i in l_x if '[' in i
]
# Add the two lists
l_x = l_x0 + l_x1
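This last solution works on lists stored as strings whether or not they are nested, provided the nesting is only one level deep and the items are plain alphanumeric tokens with no commas, quotes, or spaces inside them. Note that the nested lists end up after the flat items rather than in their original positions.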
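If the original order and arbitrary nesting depth matter, a character-by-character scan with a stack is one import-free alternative. This is a sketch rather than part of the answer above: `parse_nested` is a hypothetical name, and it assumes well-formed input in the style of the first nested example (every item quoted, balanced brackets, no escaped quotes inside items).

```python
def parse_nested(text):
    """Parse a bracketed list of quoted strings, keeping nesting and order."""
    stack = [[]]   # stack[-1] is the list currently being filled
    token = None   # characters of the quoted string being read, or None
    quote = None   # the quote character that opened the current string
    for ch in text:
        if token is not None:
            # Inside a quoted string: collect characters until it closes.
            if ch == quote:
                stack[-1].append(''.join(token).strip())
                token = None
            else:
                token.append(ch)
        elif ch in '"\'':
            token, quote = [], ch
        elif ch == '[':
            stack.append([])
        elif ch == ']':
            finished = stack.pop()
            stack[-1].append(finished)
        # Commas and whitespace outside quotes are simply skipped.
    return stack[0][0]


x = '[ "A","B","C" , " D", ["E","F","G"]]'
print(parse_nested(x))
```

Since commas are only skipped outside quotes, an item such as `'a, b'` survives intact, which the split-based versions cannot handle.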
Other answers
If it's only a one-dimensional list, this can be done without importing anything:
>>> x = u'[ "A","B","C" , " D"]'
>>> ls = x.strip('[]').replace('"', '').replace(' ', '').split(',')
>>> ls
['A', 'B', 'C', 'D']
You may run into this problem when dealing with scraped data stored as a Pandas DataFrame.
This solution works well if the list of values is present as text.
def textToList(hashtags):
    return hashtags.strip('[]').replace('\'', '').replace(' ', '').split(',')
hashtags = "[ 'A','B','C' , ' D']"
hashtags = textToList(hashtags)
Output: ['A', 'B', 'C', 'D']
No external library is required.
import ast
l = ast.literal_eval('[ "A","B","C" , " D"]')
l = [i.strip() for i in l]
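When the input comes from users, it may help to wrap the `ast.literal_eval` call so that malformed text and non-list values fail with a clear error. The following is a minimal sketch; the helper name `to_list` and its error messages are assumptions, not part of the answer above.

```python
import ast


def to_list(text):
    """Parse a list literal safely and strip whitespace from string items."""
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        # literal_eval raises these for text that is not a plain literal.
        raise ValueError(f"not a valid list literal: {text!r}") from exc
    if not isinstance(value, list):
        raise ValueError(f"expected a list, got {type(value).__name__}")
    # Only string items are stripped; numbers and other literals pass through.
    return [item.strip() if isinstance(item, str) else item for item in value]


print(to_list('[ "A","B","C" , " D"]'))
```

Because `ast.literal_eval` understands Python literal syntax, the quotes and the whitespace between items are handled for free; only the whitespace inside the quotes needs the extra `strip()`.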
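This is one way you can do it:
x = '[ "A","B","C" , " D"]'
print(list(eval(x)))
This is not a safe approach, though; the best answer is the accepted one. I wasn't aware of the danger of eval when this answer was posted.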
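To illustrate the risk, the sketch below feeds both functions a string that contains a function call instead of plain literals. The harmless `len` call stands in for whatever code an untrusted string might carry; the variable names are illustrative only.

```python
import ast

untrusted = '[len("abc"), "B"]'  # contains a function call, not just literals

# eval executes the call embedded in the string.
evaluated = eval(untrusted)
print(evaluated)

# ast.literal_eval accepts only literals, so it refuses the same input.
rejected = False
try:
    ast.literal_eval(untrusted)
except ValueError as exc:
    rejected = True
    print("rejected:", exc)
```

Here `eval` runs the call and returns a list, while `ast.literal_eval` raises `ValueError` without executing anything.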