似乎应该有一种比以下更简单的方法:
import string
s = "string. With. Punctuation?" # Sample string
out = s.translate(string.maketrans("",""), string.punctuation)
有?
似乎应该有一种比以下更简单的方法:
import string
s = "string. With. Punctuation?" # Sample string
out = s.translate(string.maketrans("",""), string.punctuation)
有?
当前回答
# FIRST METHOD
# Storing all punctuations in a variable
punctuation='!?,.:;"\')(_-'
newstring ='' # Creating empty string
word = raw_input("Enter string: ")
for i in word:
if(i not in punctuation):
newstring += i
print ("The string without punctuation is", newstring)
# SECOND METHOD
word = raw_input("Enter string: ")
punctuation = '!?,.:;"\')(_-'
newstring = word.translate(None, punctuation)
print ("The string without punctuation is",newstring)
# Output for both methods
Enter string: hello! welcome -to_python(programming.language)??,
The string without punctuation is: hello welcome topythonprogramminglanguage
其他回答
在不太严格的情况下,单行线可能会有所帮助:
''.join([c for c in s if c.isalnum() or c.isspace()])
这里有一个使用RegEx的简单方法
import re
punct = re.compile(r'(\w+)')
sentence = 'This ! is : a # sample $ sentence.' # Text with punctuation
tokenized = [m.group() for m in punct.finditer(sentence)]
sentence = ' '.join(tokenized)
print(sentence)
'This is a sample sentence'
myString.translate(None, string.punctuation)
显然,我无法对所选答案进行编辑,所以这里有一个适用于Python3的更新。在进行非平凡转换时,转换方法仍然是最有效的选择。
上面的@Brian为最初的繁重工作做出了贡献。感谢@ddejohn对原始测试的改进建议。
#!/usr/bin/env python3
"""Determination of most efficient way to remove punctuation in Python 3.
Results in Python 3.8.10 on my system using the default arguments:
set : 51.897
regex : 17.901
translate : 2.059
replace : 13.209
"""
import argparse
import re
import string
import timeit
parser = argparse.ArgumentParser()
parser.add_argument("--filename", "-f", default=argparse.__file__)
parser.add_argument("--iterations", "-i", type=int, default=10000)
opts = parser.parse_args()
with open(opts.filename) as fp:
s = fp.read()
exclude = set(string.punctuation)
table = str.maketrans("", "", string.punctuation)
regex = re.compile(f"[{re.escape(string.punctuation)}]")
def test_set(s):
return "".join(ch for ch in s if ch not in exclude)
def test_regex(s): # From Vinko's solution, with fix.
return regex.sub("", s)
def test_translate(s):
return s.translate(table)
def test_replace(s): # From S.Lott's solution
for c in string.punctuation:
s = s.replace(c, "")
return s
opts = dict(globals=globals(), number=opts.iterations)
solutions = "set", "regex", "translate", "replace"
for solution in solutions:
elapsed = timeit.timeit(f"test_{solution}(s)", **opts)
print(f"{solution:<10}: {elapsed:6.3f}")
这是我写的一个函数。它不是很有效,但很简单,您可以添加或删除任何您想要的标点符号:
def stripPunc(wordList):
"""Strips punctuation from list of words"""
puncList = [".",";",":","!","?","/","\\",",","#","@","$","&",")","(","\""]
for punc in puncList:
for word in wordList:
wordList=[word.replace(punc,'') for word in wordList]
return wordList