似乎应该有一种比以下更简单的方法:

import string
s = "string. With. Punctuation?" # Sample string 
out = s.translate(string.maketrans("",""), string.punctuation)

有?


当前回答

您也可以这样做:

import string
' '.join(word.strip(string.punctuation) for word in 'text'.split())

其他回答

我喜欢使用这样的函数:

def scrub(abc):
    while abc[-1] is in list(string.punctuation):
        abc=abc[:-1]
    while abc[0] is in list(string.punctuation):
        abc=abc[1:]
    return abc

我还没有看到这个答案。只需使用正则表达式;它删除了除单词字符(\w)和数字字符(\d)之外的所有字符,后跟一个空白字符(\s):

import re
s = "string. With. Punctuation?" # Sample string 
out = re.sub(ur'[^\w\d\s]+', '', s)

考虑unicode。代码已在python3中检查。

from unicodedata import category
text = 'hi, how are you?'
text_without_punc = ''.join(ch for ch in text if not category(ch).startswith('P'))

不一定更简单,但如果你更熟悉re家族的话,就另辟蹊径。

import re, string
s = "string. With. Punctuation?" # Sample string 
out = re.sub('[%s]' % re.escape(string.punctuation), '', s)

使用Python从文本文件中删除停止词

print('====THIS IS HOW TO REMOVE STOP WORS====')

with open('one.txt','r')as myFile:

    str1=myFile.read()

    stop_words ="not", "is", "it", "By","between","This","By","A","when","And","up","Then","was","by","It","If","can","an","he","This","or","And","a","i","it","am","at","on","in","of","to","is","so","too","my","the","and","but","are","very","here","even","from","them","then","than","this","that","though","be","But","these"

    myList=[]

    myList.extend(str1.split(" "))

    for i in myList:

        if i not in stop_words:

            print ("____________")

            print(i,end='\n')