It seems like there should be a simpler way than:

import string
s = "string. With. Punctuation?" # Sample string 
out = s.translate(string.maketrans("",""), string.punctuation)

Is there?


Current answer

Here is a one-liner for Python 3.5:

import string
"l*ots! o(f. p@u)n[c}t]u[a'ti\"on#$^?/".translate(str.maketrans({a:None for a in string.punctuation}))

Other answers

Here is a solution that does not use regular expressions.

import string

input_text = "!where??and!!or$$then:)"
# Python 3: maketrans lives on str, and print is a function
punctuation_replacer = str.maketrans(string.punctuation, ' '*len(string.punctuation))
print(' '.join(input_text.translate(punctuation_replacer).split()).strip())

Output>> where and or then

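This replaces each punctuation character with a space, collapses multiple spaces between words into a single space, and removes trailing spaces, if any, with strip().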

myString.translate(None, string.punctuation)

From an efficiency perspective, in Python 2 you're not going to beat

s.translate(None, string.punctuation)

In Python 3, use the following instead:

s.translate(str.maketrans('', '', string.punctuation))

It performs raw string operations in C with a lookup table; short of writing your own C code, there is not much that will beat it.

If speed isn't a concern, another option is:

exclude = set(string.punctuation)
s = ''.join(ch for ch in s if ch not in exclude)

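This is faster than calling s.replace for each character, but it won't perform as well as non-pure-Python approaches such as regexes or string.translate, as you can see from the timings below. For this type of problem, doing it at as low a level as possible pays off.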

Timing code (Python 2):

import re, string, timeit

s = "string. With. Punctuation"
exclude = set(string.punctuation)
table = string.maketrans("","")
regex = re.compile('[%s]' % re.escape(string.punctuation))

def test_set(s):
    return ''.join(ch for ch in s if ch not in exclude)

def test_re(s):  # From Vinko's solution, with fix.
    return regex.sub('', s)

def test_trans(s):
    return s.translate(table, string.punctuation)

def test_repl(s):  # From S.Lott's solution
    for c in string.punctuation:
        s=s.replace(c,"")
    return s

print "sets      :",timeit.Timer('f(s)', 'from __main__ import s,test_set as f').timeit(1000000)
print "regex     :",timeit.Timer('f(s)', 'from __main__ import s,test_re as f').timeit(1000000)
print "translate :",timeit.Timer('f(s)', 'from __main__ import s,test_trans as f').timeit(1000000)
print "replace   :",timeit.Timer('f(s)', 'from __main__ import s,test_repl as f').timeit(1000000)

This gives the following results:

sets      : 19.8566138744
regex     : 6.86155414581
translate : 2.12455511093
replace   : 28.4436721802

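I haven't seen this answer yet. Just use a regex; it removes every character that is not a word character (\w), a digit (\d), or a whitespace character (\s). Note that \w already covers digits and the underscore, so underscores are kept: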

import re
s = "string. With. Punctuation?" # Sample string 
out = re.sub(r'[^\w\d\s]+', '', s)  # r'' works in Python 2 and 3; the ur'' prefix is a syntax error in Python 3
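To illustrate how the character class behaves, here is a small sketch, assuming Python 3 and a made-up sample string. It shows that the underscore survives because it counts as a word character, and one possible way to remove it as well.

```python
import re

s = "snake_case, done! 100%"  # sample input, chosen for illustration

# \w already matches digits and the underscore, so \d can be omitted
keep_underscore = re.sub(r'[^\w\s]+', '', s)

# Treat the underscore as punctuation too by adding it as an alternative
drop_underscore = re.sub(r'[^\w\s]|_', '', s)

print(keep_underscore)  # snake_case done 100
print(drop_underscore)  # snakecase done 100
```

In Python 3, \w is Unicode-aware for str patterns, so accented letters are kept as well.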

Here's a function I wrote. It's not very efficient, but it is simple, and you can add or remove any punctuation marks you want:

def stripPunc(wordList):
    """Strips punctuation from list of words"""
    puncList = [".",";",":","!","?","/","\\",",","#","@","$","&",")","(","\""]
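    for punc in puncList:
        # Rebuild the list once per punctuation mark; the extra inner
        # loop over wordList repeated the same work for every word
        wordList = [word.replace(punc, '') for word in wordList]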
    return wordList
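The same idea of a customizable punctuation list can also be sketched with a translation table, which makes a single pass over each word. This is a Python 3 variant, not the function above; the name `strip_punc_words` and its `punctuation` parameter are hypothetical.

```python
def strip_punc_words(words, punctuation=".;:!?/\\,#@$&)(\""):
    """Return a new list with the given punctuation removed from each word."""
    # Map every listed character to None so translate() deletes it
    table = str.maketrans('', '', punctuation)
    return [word.translate(table) for word in words]

print(strip_punc_words(["Hello,", "world!", "(ok)"]))  # ['Hello', 'world', 'ok']
```

Passing a different string as `punctuation` changes which characters are stripped, for example `strip_punc_words(words, punctuation=".,")`.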