What do 'lazy' and 'greedy' mean in the context of regular expressions?

What are these two terms?

Current answer

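Greedy matching. The default behavior of regular expressions is greedy: a quantifier tries to consume as much of the input as it can while still letting the overall pattern match, even when a smaller portion would have been enough.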

Example:

import re
text = "<body>Regex Greedy Matching Example </body>"
re.findall('<.*>', text)
#> ['<body>Regex Greedy Matching Example </body>']

It matched the entire string instead of stopping at the first occurrence of '>'. This is the default greedy, or "take it all", behavior of regex.

Lazy matching, on the other hand, takes as little as possible. It is enabled by adding a ? directly after the quantifier.

Example:

re.findall('<.*?>', text)
#> ['<body>', '</body>']

If you only want to retrieve the first match, use the search method instead.

re.search('<.*?>', text).group()
#> '<body>'
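
The ? suffix is not specific to *; it can follow any quantifier. A minimal sketch with Python's re module, where the input "aaaaa" is made up for illustration and is not part of the answer above:

```python
import re

sample = "aaaaa"  # made-up input, chosen only for illustration

# Each pair is (greedy form, lazy form) of the same quantifier.
pairs = [
    (r"a*", r"a*?"),
    (r"a+", r"a+?"),
    (r"a?", r"a??"),
    (r"a{2,4}", r"a{2,4}?"),
]

results = {}
for greedy, lazy in pairs:
    # search() returns the leftmost match; group() is the matched text.
    results[greedy] = re.search(greedy, sample).group()
    results[lazy] = re.search(lazy, sample).group()
    print(repr(greedy), "->", repr(results[greedy]), "|",
          repr(lazy), "->", repr(results[lazy]))
```

In each pair the greedy form takes the maximum the quantifier allows and the lazy form takes the minimum, which for * and ? is the empty string.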

Source: Python Regex Examples

2018-01-21 05:35:59

Other answers

As far as I know, most regex engines are greedy by default. Adding a question mark after a quantifier enables lazy matching.

As @Andre S mentioned in the comments:

Greedy: keep searching until the condition is no longer satisfied. Lazy: stop searching as soon as the condition is satisfied.

See the example below for what is greedy and what is lazy.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String args[]){
        String money = "100000000999";
        String greedyRegex = "100(0*)";
        Pattern pattern = Pattern.compile(greedyRegex);
        Matcher matcher = pattern.matcher(money);
        while(matcher.find()){
            System.out.println("I'm greedy and I want " + matcher.group() + " dollars. This is the most I can get.");
        }
        
        String lazyRegex = "100(0*?)";
        pattern = Pattern.compile(lazyRegex);
        matcher = pattern.matcher(money);
        while(matcher.find()){
            System.out.println("I'm too lazy to get so much money, only " + matcher.group() + " dollars is enough for me");
        }
    }
}

The result is:

I'm greedy and I want 100000000 dollars. This is the most I can get.

I'm too lazy to get so much money, only 100 dollars is enough for me
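
For comparison, here is a rough Python sketch of the same experiment. It reuses the input and patterns from the Java code and additionally prints the capture group, to show exactly what the quantified part consumed in each case:

```python
import re

money = "100000000999"  # same input as the Java example

greedy = re.search(r"100(0*)", money)
lazy = re.search(r"100(0*?)", money)

# group(0) is the whole match; group(1) is what the quantified part consumed.
print(greedy.group(0), repr(greedy.group(1)))  # 100000000 '000000'
print(lazy.group(0), repr(lazy.group(1)))      # 100 ''
```

The greedy group swallows every remaining zero, while the lazy group is satisfied with the empty string because nothing after it forces it to take more.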

2016-11-09 16:39:06

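'Greedy' means matching the longest possible string.

'Lazy' means matching the shortest possible string.

For example, the greedy h.+l matches 'hell' in 'hello', but the lazy h.+?l matches 'hel'.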
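
The 'hello' example can be checked with a short Python sketch. The last part is an added caveat with a made-up input: "shortest" is relative to where the match starts, because the engine still prefers the leftmost starting position.

```python
import re

greedy = re.search(r"h.+l", "hello").group()
lazy = re.search(r"h.+?l", "hello").group()
print(greedy)  # hell
print(lazy)    # hel

# Caveat: lazy does not mean "shortest match anywhere in the string".
# The match still begins at the leftmost position where one is possible.
leftmost = re.search(r"a.*?b", "aab").group()
print(leftmost)  # aab, not ab
```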

2010-02-20 06:19:41

Try to understand the following behavior:

var input = "0014.2";

Regex r1 = new Regex("\\d+.{0,1}\\d+");
Regex r2 = new Regex("\\d*.{0,1}\\d*");

Console.WriteLine(r1.Match(input).Value); // "0014.2"
Console.WriteLine(r2.Match(input).Value); // "0014.2"

input = " 0014.2";

Console.WriteLine(r1.Match(input).Value); // "0014.2"
Console.WriteLine(r2.Match(input).Value); // " 0014"

input = "  0014.2";

Console.WriteLine(r1.Match(input).Value); // "0014.2"
Console.WriteLine(r2.Match(input).Value); // " "
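
One way to see what is going on is to look at where each match starts. The sketch below is in Python rather than C#, on the assumption that both engines treat these simple patterns the same way, and it uses only the second input from the snippet above:

```python
import re

text = " 0014.2"  # the second input from the snippet above

r1 = re.search(r"\d+.{0,1}\d+", text)
r2 = re.search(r"\d*.{0,1}\d*", text)

# span() gives the (start, end) offsets of the match in the input.
print(r1.span(), repr(r1.group()))  # (1, 7) '0014.2'
print(r2.span(), repr(r2.group()))  # (0, 5) ' 0014'
```

Because \d* may match zero digits, the second pattern can already succeed at offset 0: the dot takes the leading space and the final \d* takes the digits. The first pattern needs at least one digit up front, so it cannot start until offset 1.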

2016-10-30 06:31:14

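Greedy means it will keep consuming input that fits your pattern until there is none left and it can look no further.

Lazy will stop as soon as it has the first match for the pattern you requested.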

A common example I often run into is \s*-\s*? in the regex ([0-9]{2}\s*-\s*?[0-9]{7})

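The first \s* is greedy because of the *: after the digits it consumes as many whitespace characters as possible and then looks for the dash "-". The second \s*?, by contrast, is lazy because of the *?: it consumes as few whitespace characters as it can, taking more only when the rest of the pattern cannot match otherwise.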
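
A small Python sketch of this pattern follows. The input string is made up, and capture groups have been added around the two whitespace quantifiers (they are not in the original regex) so that what each one consumed can be inspected:

```python
import re

text = "12  -   3456789"  # made-up input: 2 spaces before the dash, 3 after

# Same pattern as above, with added groups around \s* and \s*?.
m = re.search(r"[0-9]{2}(\s*)-(\s*?)[0-9]{7}", text)
print(repr(m.group(1)))  # '  '
print(repr(m.group(2)))  # '   '

# With nothing required after it, the lazy quantifier consumes nothing.
m2 = re.search(r"-(\s*?)", text)
print(repr(m2.group(1)))  # ''
```

Here the lazy \s*? still ends up taking all three spaces, because the seven digits that must follow cannot match until the whitespace has been consumed. Laziness only changes the order in which the engine tries the options.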

2018-02-06 15:41:32
