How do I find all occurrences of a substring?

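Python has string.find() and string.rfind() to get the index of a substring in a string.

I'm wondering whether there is something like string.find_all() that returns all the indexes found (not only the first one from the beginning or the first one from the end).

For example: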

string = "test test test test"

print(string.find('test'))   # 0
print(string.rfind('test'))  # 15

# this is the goal (find_all is hypothetical; str has no such method)
print(string.find_all('test'))  # [0, 5, 10, 15]

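To count occurrences, see "Count number of occurrences of a substring in a string".

Current answer

This is not exactly what the OP asked for, but you can also use split to get a list of the pieces where the substring does not occur. The OP didn't specify the end goal of the code, but if your goal is to remove the substring anyway, this can be a simple one-liner. There are probably more efficient ways to do this for larger strings; in that case regular expressions would be preferable.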

# Extract all non-substrings
s = "an-example-string"
s_no_dash = s.split('-')
# >>> s_no_dash
# ['an', 'example', 'string']

# Or extract and join them into a sentence
s_no_dash2 = ' '.join(s.split('-'))
# >>> s_no_dash2
# 'an example string'
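
For the regular-expression route mentioned above, here is a minimal sketch that returns the indexes the question asks for. The helper name find_all_regex is made up for illustration; re.finditer and re.escape are standard-library calls. Note that re.finditer only reports non-overlapping matches.

```python
import re

def find_all_regex(text, sub):
    """Return start indexes of non-overlapping occurrences of sub in text."""
    # re.escape makes characters such as "." or "*" match literally.
    pattern = re.escape(sub)
    return [match.start() for match in re.finditer(pattern, text)]

print(find_all_regex("test test test test", "test"))  # [0, 5, 10, 15]
print(find_all_regex("a.b axb a.b", "a.b"))            # [0, 8]
```

Without re.escape, the "." in the second call would match any character and the result would also include index 4.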

I only skimmed the other answers, so apologies if this is already up there.

2021-05-19 13:43:55

Other answers

You can try:

>>> string = "test test test test"
>>> for index, value in enumerate(string):
...     if string[index:index + len("test")] == "test":
...         print(index)
...
0
5
10
15
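
The same position-by-position check can probably be written more compactly with str.startswith, which accepts a start offset. This is only a sketch; the function name find_all_startswith is an assumption, not something from the answer. Like the loop above, it tests every index, so it also reports overlapping matches.

```python
def find_all_startswith(text, sub):
    """Return every index at which sub begins, including overlapping matches."""
    # str.startswith(sub, i) tests position i without building a slice.
    return [i for i in range(len(text)) if text.startswith(sub, i)]

print(find_all_startswith("test test test test", "test"))  # [0, 5, 10, 15]
print(find_all_startswith("aaaa", "aa"))                    # [0, 1, 2]
```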

2018-02-27 06:44:02

I think the cleanest solution uses no libraries and no yield:

def find_all_occurrences(string, sub):
    index_of_occurrences = []
    current_index = 0
    while True:
        current_index = string.find(sub, current_index)
        if current_index == -1:
            return index_of_occurrences
        else:
            index_of_occurrences.append(current_index)
            current_index += len(sub)

print(find_all_occurrences("test test test test", "test"))  # [0, 5, 10, 15]

Note: find() returns -1 when it doesn't find anything.
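
Two details of the loop above are worth spelling out: advancing by len(sub) skips overlapping matches, and an empty sub would never advance at all. The variant below is a hedged sketch of one way to handle both. The name find_all and the overlapping parameter are assumptions added for illustration.

```python
def find_all(text, sub, overlapping=False):
    """Return start indexes of sub in text, still without libraries or yield."""
    if not sub:
        # With an empty substring the search position would never move forward.
        raise ValueError("sub must be a non-empty string")
    # Step by one character to allow overlaps, else jump past the whole match.
    step = 1 if overlapping else len(sub)
    indexes = []
    start = text.find(sub)
    while start != -1:
        indexes.append(start)
        start = text.find(sub, start + step)
    return indexes

print(find_all("aaaa", "aa"))                    # [0, 2]
print(find_all("aaaa", "aa", overlapping=True))  # [0, 1, 2]
```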

2022-10-13 20:06:12

If you just want to use numpy, here is a solution:

import numpy as np

S= "test test test test"
S2 = 'test'
inds = np.cumsum([len(k)+len(S2) for k in S.split(S2)[:-1]])- len(S2)
print(inds)
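
To make the running-sum trick easier to follow, here is the same idea broken into steps. This sketch swaps np.cumsum for itertools.accumulate from the standard library so it runs without numpy; the function name split_offsets is made up for illustration.

```python
from itertools import accumulate

def split_offsets(text, sub):
    """Return start indexes of non-overlapping occurrences of sub via split."""
    # Pieces of text before each occurrence; the last piece follows the final match.
    pieces = text.split(sub)[:-1]
    # Each piece plus one occurrence of sub advances the position by this much.
    steps = [len(piece) + len(sub) for piece in pieces]
    # The running total is the END of each occurrence; subtract len(sub) for the start.
    return [end - len(sub) for end in accumulate(steps)]

print(split_offsets("test test test test", "test"))  # [0, 5, 10, 15]
```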

2021-06-10 16:46:44

This thread is a bit old, but this worked for me:

numberString = "onetwothreefourfivesixseveneightninefiveten"
testString = "five"

marker = 0
while marker < len(numberString):
    try:
        # Search for testString once and reuse the position
        position = numberString.index(testString, marker)
        print(position)
        marker = position + 1
    except ValueError:
        print("String not found")
        marker = len(numberString)

2014-09-01 12:48:11

When looking for a large number of keywords in a document, use flashtext:

from flashtext import KeywordProcessor
words = ['test', 'exam', 'quiz']
txt = 'this is a test'
kwp = KeywordProcessor()
kwp.add_keywords_from_list(words)
result = kwp.extract_keywords(txt, span_info=True)

flashtext runs faster than regular expressions on large lists of search terms.

2018-09-28 17:29:11
