How to strip all whitespace from a string

How do I strip all the spaces in a Python string? For example, I want a string like strip my spaces to be turned into stripmyspaces, but I cannot seem to accomplish that with strip():

>>> 'strip my spaces'.strip()
'strip my spaces'

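Current answer

If top performance is not required and you just want something dead simple, you can define a basic function that tests each character using the string class's built-in "isspace" method: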

def remove_space(input_string):
    no_white_space = ''
    for c in input_string:
        if not c.isspace():
            no_white_space += c
    return no_white_space

Building the no_white_space string this way will not have ideal performance, but the solution is easy to understand.

>>> remove_space('strip my spaces')
'stripmyspaces'

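If you do not want to define a function, you can convert this into something similar using a list comprehension. Borrowing from the top answer's join solution: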

>>> "".join([c for c in "strip my spaces" if not c.isspace()])
'stripmyspaces'

2019-11-08 22:15:22

Other answers

For Python 3:

>>> import re
>>> re.sub(r'\s+', '', 'strip my \n\t\r ASCII and \u00A0 \u2003 Unicode spaces')
'stripmyASCIIandUnicodespaces'
>>> # Or, depending on the situation:
>>> re.sub(r'(\s|\u180B|\u200B|\u200C|\u200D|\u2060|\uFEFF)+', '', \
... '\uFEFF\t\t\t strip all \u000A kinds of \u200B whitespace \n')
'stripallkindsofwhitespace'

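...handles any whitespace characters that you are not thinking of, and believe us, there are plenty.

\s on its own always covers the ASCII whitespace:

(regular) space, tab, new line (\n), carriage return (\r), form feed, vertical tab

Additionally, for Python 2 with re.UNICODE enabled, and for Python 3 without any extra actions, \s also covers the Unicode whitespace characters, for example:

non-breaking space, em space, ideographic space,

...and so on. The full list is given under "Unicode characters with White_Space property".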
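To make the two cases above concrete, here is a minimal Python 3 sketch. The helper name matches_whitespace is made up for illustration; re.ASCII is the standard flag that restricts \s to ASCII whitespace, which approximates the Python 2 behavior without re.UNICODE.

```python
import re

def matches_whitespace(ch, ascii_only=False):
    """Return True if the single character ch is matched by \\s."""
    flags = re.ASCII if ascii_only else 0
    return re.fullmatch(r"\s", ch, flags) is not None

# ASCII whitespace is matched even in ASCII-only mode.
ascii_results = [matches_whitespace(c, ascii_only=True) for c in " \t\n\r\f\v"]

# Non-breaking space, em space, ideographic space:
# matched in the default (Unicode) mode, not in ASCII-only mode.
unicode_results = [matches_whitespace(c) for c in "\u00A0\u2003\u3000"]
ascii_mode_results = [matches_whitespace(c, ascii_only=True) for c in "\u00A0\u2003\u3000"]
```

Every entry in ascii_results and unicode_results should be True, and every entry in ascii_mode_results should be False.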

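However, \s does NOT cover characters that are not classified as whitespace but are de facto whitespace, such as:

zero-width joiner, Mongolian vowel separator, zero-width non-breaking space (a.k.a. byte order mark),

...and so on. The full list is given under "Related Unicode characters without White_Space property".

So these 6 characters are covered by the list in the second regex, \u180B|\u200B|\u200C|\u200D|\u2060|\uFEFF.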
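As a rough sketch, the second regex could be wrapped in a reusable helper like this, assuming Python 3. The names EXTRA_INVISIBLE and strip_all_whitespace are hypothetical; the six code points are simply the ones listed in the regex above, and you may need a different set for your data.

```python
import re

# The six extra code points from the second regex (not matched by \s alone).
EXTRA_INVISIBLE = "\u180B\u200B\u200C\u200D\u2060\uFEFF"

# One character class: anything \s matches, plus the extra code points.
_WHITESPACE_LIKE = re.compile("[\\s" + EXTRA_INVISIBLE + "]+")

def strip_all_whitespace(text):
    """Remove \\s whitespace and the extra invisible characters from text."""
    return _WHITESPACE_LIKE.sub("", text)

# Plain \s leaves a zero-width space (U+200B) in place...
plain = re.sub(r"\s+", "", "a \u200B b")
# ...while the extended pattern removes it.
extended = strip_all_whitespace("\uFEFF\t strip \u200B all \u00A0 kinds\n")
```

Here plain still contains the U+200B character, while extended is 'stripallkinds'.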

Sources:

https://docs.python.org/2/library/re.html
https://docs.python.org/3/library/re.html
https://en.wikipedia.org/wiki/Unicode_character_property

2010-09-18 00:48:21

Taking advantage of str.split's behavior with no sep parameter:

>>> s = " \t foo \n bar "
>>> "".join(s.split())
'foobar'

If you just want to remove spaces instead of all whitespace:

>>> s.replace(" ", "")
'\tfoo\nbar'
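The split-based approach works because str.split with no sep argument treats any run of whitespace as a single separator and drops empty strings at the ends. A small sketch of that difference follows; the variable names are only illustrative.

```python
sample = " \t foo \n bar "

# No sep: split on runs of any whitespace, no empty strings in the result.
parts_default = sample.split()

# Explicit sep=" ": split on each single space, keeping empty strings
# and leaving tabs and newlines inside the pieces.
parts_space = " a  b ".split(" ")

# Joining the default split removes all whitespace.
joined = "".join(parts_default)
```

With these inputs, parts_default is ['foo', 'bar'], parts_space is ['', 'a', '', 'b', ''], and joined is 'foobar'.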

Premature optimization

Even though efficiency is not the primary goal (writing clear code is), here are some initial timings:

$ python -m timeit '"".join(" \t foo \n bar ".split())'
1000000 loops, best of 3: 1.38 usec per loop
$ python -m timeit -s 'import re' 're.sub(r"\s+", "", " \t foo \n bar ")'
100000 loops, best of 3: 15.6 usec per loop

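Note that the regex is cached, so it is not as slow as you might imagine. Compiling it beforehand helps somewhat, but it only matters in practice if you call it many times: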

$ python -m timeit -s 'import re; e = re.compile(r"\s+")' 'e.sub("", " \t foo \n bar ")'
100000 loops, best of 3: 7.76 usec per loop

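Even though re.sub is 11.3 times slower than the split approach here (15.6 vs. 1.38 usec per loop), remember that your bottlenecks are assuredly elsewhere. Most programs would not notice the difference between any of these 3 choices.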
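If you would rather repeat the comparison from inside Python than from the shell, a sketch along these lines should work. The function names and the measure helper are hypothetical, and the numbers will differ by machine and Python version, so no expected timings are shown.

```python
import re
import timeit

SAMPLE = " \t foo \n bar "
COMPILED = re.compile(r"\s+")

def via_split():
    return "".join(SAMPLE.split())

def via_re_sub():
    return re.sub(r"\s+", "", SAMPLE)

def via_compiled():
    return COMPILED.sub("", SAMPLE)

def measure(number=100000):
    """Return total seconds for `number` calls of each candidate."""
    candidates = {
        "split": via_split,
        "re.sub": via_re_sub,
        "compiled": via_compiled,
    }
    return {name: timeit.timeit(func, number=number)
            for name, func in candidates.items()}

if __name__ == "__main__":
    for name, seconds in measure().items():
        print(f"{name}: {seconds:.4f} s")
```

All three candidates return the same string, so only the elapsed time differs.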

2010-09-18 00:54:26

import re
# Removes only literal space characters; tabs and newlines are left in place.
re.sub(' ', '', 'strip my spaces')

2016-10-24 13:14:42

Alternatively, in Python 2 only:

import string
"strip my spaces".translate(None, string.whitespace)

Here is the Python 3 version:

import string
"strip my spaces".translate(str.maketrans('', '', string.whitespace))
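One thing to keep in mind with the translate approach is that string.whitespace contains only the ASCII whitespace characters. The sketch below, assuming Python 3, builds the table once and shows that a non-breaking space survives; the names ASCII_WS_TABLE and remove_ascii_whitespace are illustrative only.

```python
import string

# Translation table that deletes every character in string.whitespace
# (space, tab, newline, carriage return, vertical tab, form feed).
ASCII_WS_TABLE = str.maketrans("", "", string.whitespace)

def remove_ascii_whitespace(text):
    """Delete ASCII whitespace from text; other characters are untouched."""
    return text.translate(ASCII_WS_TABLE)

ascii_case = remove_ascii_whitespace("strip my\tspaces\n")
# U+00A0 (non-breaking space) is not in string.whitespace, so it remains.
unicode_case = remove_ascii_whitespace("a\u00A0b")
```

Here ascii_case is 'stripmyspaces', while unicode_case still contains the non-breaking space. If Unicode whitespace matters, the isspace-based or regex-based approaches may be a better fit.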

2013-05-20 16:16:31
