从字符串中删除标点符号的最佳方法

似乎应该有一种比以下更简单的方法：

import string
s = "string. With. Punctuation?" # Sample string 
out = s.translate(string.maketrans("",""), string.punctuation)

有？

当前回答

>>> s = "string. With. Punctuation?"
>>> s = re.sub(r'[^\w\s]','',s)
>>> re.split(r'\s*', s)


['string', 'With', 'Punctuation']

2016-08-24 05:43:58

其他回答

# FIRST METHOD
# Storing all punctuations in a variable    
punctuation='!?,.:;"\')(_-'
newstring ='' # Creating empty string
word = raw_input("Enter string: ")
for i in word:
     if(i not in punctuation):
                  newstring += i
print ("The string without punctuation is", newstring)

# SECOND METHOD
word = raw_input("Enter string: ")
punctuation = '!?,.:;"\')(_-'
newstring = word.translate(None, punctuation)
print ("The string without punctuation is",newstring)


# Output for both methods
Enter string: hello! welcome -to_python(programming.language)??,
The string without punctuation is: hello welcome topythonprogramminglanguage

2017-01-02 08:56:57

这里有一个使用RegEx的简单方法

import re

punct = re.compile(r'(\w+)')

sentence = 'This ! is : a # sample $ sentence.' # Text with punctuation
tokenized = [m.group() for m in punct.finditer(sentence)]
sentence = ' '.join(tokenized)
print(sentence) 
'This is a sample sentence'

2020-08-20 08:05:39

不一定更简单，但如果你更熟悉re家族的话，就另辟蹊径。

import re, string
s = "string. With. Punctuation?" # Sample string 
out = re.sub('[%s]' % re.escape(string.punctuation), '', s)

2008-11-05 17:39:55

对于Python 3 str或Python 2 unicode值，str.translate（）只使用字典；在该映射中查找代码点（整数），并删除映射到None的任何内容。

要删除（某些？）标点符号，请使用：

import string

remove_punct_map = dict.fromkeys(map(ord, string.punctuation))
s.translate(remove_punct_map)

dict.fromkeys（）类方法使创建映射变得简单，根据键的顺序将所有值设置为None。

要删除所有标点符号，而不仅仅是ASCII标点符号，您的表需要稍微大一点；参见J.F.Sebastian的答案（Python 3版本）：

import unicodedata
import sys

remove_punct_map = dict.fromkeys(i for i in range(sys.maxunicode)
                                 if unicodedata.category(chr(i)).startswith('P'))

2013-09-02 09:57:54

我通常用这样的词：

>>> s = "string. With. Punctuation?" # Sample string
>>> import string
>>> for c in string.punctuation:
...     s= s.replace(c,"")
...
>>> s
'string With Punctuation'

2008-11-05 17:41:27

从字符串中删除标点符号的最佳方法

推荐文章

最新文章

标签