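What is the difference between re.search and re.match?

What is the difference between the search() and match() functions in the Python re module?

I've read the Python 2 documentation (Python 3 documentation), but I never seem to remember it. I keep having to look it up and re-learn it. I'm hoping someone will answer this clearly with examples so that (perhaps) it will stick in my head. Or at least I'll have a better place to come back to with my question, and it will take less time to re-learn it.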

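Current answer

re.search scans the whole string for the pattern, whereas re.match does not scan at all: it only tries to match the pattern at the start of the string.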

2008-10-08 01:07:26

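Other answers

re.match is anchored at the beginning of the string. That has nothing to do with newlines, so it is not the same as using ^ in the pattern.

As the re.match documentation says:

If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding MatchObject instance. Return None if the string does not match the pattern; note that this is different from a zero-length match. Note: If you want to locate a match anywhere in string, use search() instead.

re.search searches the entire string, as the documentation says:

Scan through string looking for a location where the regular expression pattern produces a match, and return a corresponding MatchObject instance. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.

So if you need to match at the beginning of the string, or to match the entire string, use match. It is faster. Otherwise use search.

The documentation has a specific section on match vs. search that also covers multiline strings: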
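One caveat on matching the entire string: match only anchors the start, so a pattern can succeed on a mere prefix. The following is a minimal sketch, assuming Python 3.4 or later for re.fullmatch; the sample text and variable names are made up for illustration.

```python
import re

text = "something else"

# match() anchors only the start, so matching a prefix is enough.
prefix_only = re.match(r"some", text)
print(prefix_only.group())   # some

# To require the whole string, anchor the end explicitly with \Z ...
anchored_end = re.match(r"some\Z", text)
print(anchored_end)          # None

# ... or use re.fullmatch(), which anchors both ends.
partial = re.fullmatch(r"some", text)
whole = re.fullmatch(r"\w+ \w+", text)
print(partial)               # None
print(whole.group())         # something else
```

If the code has to run on older Python versions without fullmatch, the \Z form is the portable option.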

Python offers two different primitive operations based on regular expressions: match checks for a match only at the beginning of the string, while search checks for a match anywhere in the string (this is what Perl does by default). Note that match may differ from search even when using a regular expression beginning with '^': '^' matches only at the start of the string, or in MULTILINE mode also immediately following a newline. The “match” operation succeeds only if the pattern matches at the start of the string regardless of mode, or at the starting position given by the optional pos argument regardless of whether a newline precedes it.

OK, enough talk. Time to see some example code:

# example code (Python 3):
import re

string_with_newlines = """something
someotherthing"""

print(re.match('some', string_with_newlines))  # matches
print(re.match('someother',
               string_with_newlines))  # won't match
print(re.match('^someother', string_with_newlines,
               re.MULTILINE))  # also won't match
print(re.search('someother',
                string_with_newlines))  # finds something
print(re.search('^someother', string_with_newlines,
                re.MULTILINE))  # also finds something

m = re.compile('thing$', re.MULTILINE)

print(m.match(string_with_newlines))  # no match
print(m.match(string_with_newlines, pos=4))  # matches
# Flags are given to re.compile(); the second positional argument
# of a compiled pattern's search() is pos, not flags.
print(m.search(string_with_newlines))  # also matches
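The documentation quoted above says that match with a pos argument succeeds at that position "regardless of whether a newline precedes it", while '^' keeps referring to the real start of the string. A small sketch of that difference, using a made-up sample string and pattern names:

```python
import re

text = "something"

plain = re.compile(r"thing")
caret = re.compile(r"^thing")

# match() with pos=4 anchors at index 4, so the plain pattern succeeds.
at_pos = plain.match(text, 4)
print(at_pos.span())          # (4, 9)

# '^' still means the real start of the string (or, in MULTILINE mode,
# a position right after a newline), not the pos index.
caret_at_pos = caret.match(text, 4)
print(caret_at_pos)           # None

# Slicing builds a new string, so '^' matches at its start.
caret_on_slice = caret.match(text[4:])
print(caret_on_slice.span())  # (0, 5)
```

So passing pos is not quite the same as slicing the string first, which matters mainly when the pattern contains '^'.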

2008-10-08 00:53:12

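match is much faster than search, so instead of doing regex.search("word") you can do regex.match((.*?)word(.*?)) and gain tons of performance if you are working with millions of samples.

This comment from @ivan_bilan under the accepted answer above got me wondering whether such a hack actually speeds anything up, so let's find out how many tons of performance you really gain.

I prepared the following test suite: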

import random
import re
import string
import time

LENGTH = 10
LIST_SIZE = 1000000

def generate_word():
    word = [random.choice(string.ascii_lowercase) for _ in range(LENGTH)]
    word = ''.join(word)
    return word

wordlist = [generate_word() for _ in range(LIST_SIZE)]

start = time.time()
[re.search('python', word) for word in wordlist]
print('search:', time.time() - start)

start = time.time()
[re.match('(.*?)python(.*?)', word) for word in wordlist]
print('match:', time.time() - start)

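I made 10 measurements (1M, 2M, ..., 10M words) and plotted the results.

The measurements show that searching for the pattern 'python' is faster than matching the pattern '(.*?)python(.*?)'.

Python is smart. Avoid trying to be smarter.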
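For a quicker check on a smaller scale, here is a sketch using timeit. The word list is a small, hypothetical sample rather than the million random words used above, and the absolute timings will vary by machine and Python version.

```python
import re
import timeit

# Hypothetical sample data: two of the four words contain 'python'.
words = ["abcdefghij", "xxpythonxx", "pythonabcd", "qwertyuiop"] * 1000

# Both approaches agree on which words contain 'python'.
search_hits = [bool(re.search("python", w)) for w in words]
match_hits = [bool(re.match("(.*?)python(.*?)", w)) for w in words]
print(search_hits == match_hits)  # True

# Time each approach over the same list.
search_time = timeit.timeit(
    lambda: [re.search("python", w) for w in words], number=20)
match_time = timeit.timeit(
    lambda: [re.match("(.*?)python(.*?)", w) for w in words], number=20)

print("search:", search_time)
print("match: ", match_time)
```

The two calls answer the same question, so the comparison is only about speed; the match variant has to do extra work for the leading (.*?) group.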

2018-04-07 19:03:18

Much shorter:

search scans through the whole string. match checks only the beginning of the string.

The following example shows it:

>>> import re
>>> a = "123abc"
>>> print(re.match("[a-z]+", a))
None
>>> re.search("[a-z]+", a).group()
'abc'

2018-10-31 00:22:53

Quick answer

re.search('test', ' test')      # returns a truthy match object (the search can start at any index)

re.match('test', ' test')       # returns None (the match must start at index 0)
re.match('test', 'test')        # returns a truthy match object (match at index 0)

2022-06-20 07:32:02

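re.match attempts to match a pattern at the beginning of the string. re.search attempts to match the pattern throughout the string until it finds a match.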
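A short sketch of "until it finds a match": search stops at the leftmost match, and anchoring the pattern with \A makes it behave like match. The sample string and variable names are illustrative only.

```python
import re

text = "banana"

# search() returns the leftmost match and stops there.
first = re.search(r"an", text)
print(first.span())          # (1, 3)

# match() fails because 'an' does not start at index 0.
at_start = re.match(r"an", text)
print(at_start)              # None

# Anchoring search() with \A restricts it to the start of the string.
anchored_miss = re.search(r"\Aan", text)
anchored_hit = re.search(r"\Aba", text)
print(anchored_miss)         # None
print(anchored_hit.span())   # (0, 2)
```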

2008-10-08 00:54:57
