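How do I match any character across multiple lines in a regular expression?

For example, this regex

(.*)<FooBar>

will match:

abcde<FooBar>

But how do I get it to match across multiple lines?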

abcde
fghij<FooBar>

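When used from within a programming language, regular expressions act on strings, not lines. So you should be able to use the regex as usual, assuming the input string contains multiple lines.

In this case, the given regex will match the entire string, since "<FooBar>" is present. Depending on the specifics of the regex implementation, the $1 value (obtained from the "(.*)") will be either "fghij" or "abcde\nfghij". As others have said, some implementations let you control whether "." matches the newline, giving you the choice.

Line-based regular expression use is typical of command-line tools such as egrep.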
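As a minimal sketch of the two possible $1 values described in this answer, assuming Python's re module as the implementation (the variable names are illustrative only):

```python
import re

text = "abcde\nfghij<FooBar>"

# Default: "." stops at newlines, so the group only covers the last line.
default_match = re.search(r"(.*)<FooBar>", text)

# With re.DOTALL, "." also matches "\n", so the group spans both lines.
dotall_match = re.search(r"(.*)<FooBar>", text, flags=re.DOTALL)

print(repr(default_match.group(1)))  # 'fghij'
print(repr(dotall_match.group(1)))   # 'abcde\nfghij'
```

Other engines expose the same switch under different names, so the flag used here is specific to Python.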

2008-10-01 18:49:42

It depends on the language, but there should be a modifier that you can add to the regex pattern. In PHP it is:

/(.*)<FooBar>/s

The s at the end makes the dot match all characters, including newlines.

2008-10-01 18:52:18

Try this:

((.|\n)*)<FooBar>

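It basically says "any character or a newline", repeated zero or more times.

2008-10-01 18:52:27

"." normally does not match line breaks. Most regex engines let you add the s flag (also called DOTALL and SINGLELINE) to make "." match newlines as well. If that fails, you could use something like [\S\s].

2008-10-01 18:52:28

In general, . does not match newlines, so try ((.|\n)*)<FooBar>.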

2008-10-01 18:52:56

Use:

/(.*)<FooBar>/s

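The s modifier makes the dot (.) match line breaks as well.

2008-10-01 18:54:07

Note that (.|\n)* can be less efficient than, for example, [\s\S]* (if your language's regexes support such escapes), and less efficient than finding out how to specify the modifier that makes . match newlines too. Or you can go with POSIXy alternatives such as [[:space:][:^space:]]*.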
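To illustrate that the three spellings mentioned here are interchangeable in what they match, here is a small sketch assuming Python's re module. The dictionary keys are labels made up for this example, and no timing claims are made:

```python
import re

text = "abcde\nfghij<FooBar>"

patterns = {
    "alternation": r"((?:.|\n)*)<FooBar>",  # tries "." then "\n" at every position
    "char class": r"([\s\S]*)<FooBar>",     # one class that covers every character
    "inline flag": r"(?s)(.*)<FooBar>",     # "." itself matches newlines
}

# Collect what the capture group holds for each spelling.
results = {name: re.search(p, text).group(1) for name, p in patterns.items()}

for name, captured in results.items():
    print(name, repr(captured))
```

All three capture the same text; they differ only in how much work the engine may do to get there.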

2008-10-02 03:31:26

I had the same problem and solved it in what is probably not the best way, but it works. I replaced all line breaks before doing the real match:

mystring = Regex.Replace(mystring, "\r\n", "")

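I am manipulating HTML, so line breaks don't really matter to me in this case.

I tried all of the suggestions above with no luck. I am using .NET 3.5, FYI.

2009-03-26 14:57:08

Use RegexOptions.Singleline. It changes the meaning of . to include newlines.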

Regex.Replace(content, searchText, replaceText, RegexOptions.Singleline);

2010-04-13 00:42:03

I wanted to match a particular if block in Java:

   ...
   ...
   if(isTrue){
       doAction();

   }
...
...
}

If I use the regExp

if \(isTrue(.|\n)*}

it includes the closing brace of the method block, so I used

if \(isTrue([^}.]|\n)*}

to exclude the closing brace from the wildcard match.
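The difference between the two patterns can be sketched as follows, assuming Python's re module. The surrounding method, run(), is hypothetical and only there to provide the second closing brace, and \s* is used so the optional space after "if" does not matter:

```python
import re

source = """\
void run() {
   if(isTrue){
       doAction();

   }
}
"""

# Greedy "any character" runs to the LAST closing brace (the method's).
greedy = re.search(r"if\s*\(isTrue(?:.|\n)*\}", source).group(0)

# A negated class [^}] matches newlines too, but stops at the FIRST "}".
bounded = re.search(r"if\s*\(isTrue[^}]*\}", source).group(0)

print(greedy.count("}"))   # 2
print(bounded.count("}"))  # 1
```

This only works while the if block contains no nested braces; a block with an inner { ... } would be cut off at the inner closing brace.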

2011-01-18 09:31:21

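In many regex dialects, /[\S\s]*<Foobar>/ will do just what you want.

2011-07-30 13:03:56

If you are using Eclipse search, you can enable the "DOTALL" option to make '.' match any character, including line delimiters: just add "(?s)" at the beginning of your search string. Example: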

(?s).*<FooBar>

2011-11-25 13:16:55

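Solution:

Use the pattern modifiers sU to get the desired match in PHP.

Example:

preg_match('/(.*)/sU', $content, $match);

Source:

Pattern Modifiers

2012-04-04 11:00:26

Often we have to modify a substring that is preceded, several lines earlier, by a few identifying keywords. Consider an XML element: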

<TASK>
  <UID>21</UID>
  <Name>Architectural design</Name>
  <PercentComplete>81</PercentComplete>
</TASK>

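Suppose we want to change the 81 to some other value, say 40. First identify <UID>21</UID>, then skip all characters, including \n, up to <PercentComplete>. The regular expression pattern and the replacement specification are: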

String hw = new String("<TASK>\n  <UID>21</UID>\n  <Name>Architectural design</Name>\n  <PercentComplete>81</PercentComplete>\n</TASK>");
String pattern = new String ("(<UID>21</UID>)((.|\n)*?)(<PercentComplete>)(\\d+)(</PercentComplete>)");
String replaceSpec = new String ("$1$2$440$6");
// Note that the group (<PercentComplete>) is $4 and the group ((.|\n)*?) is $2.

String iw = hw.replaceFirst(pattern, replaceSpec);
System.out.println(iw);

<TASK>
  <UID>21</UID>
  <Name>Architectural design</Name>
  <PercentComplete>40</PercentComplete>
</TASK>

The subgroup (.|\n) is the seemingly missing group $3. If we make it non-capturing with (?:.|\n), then $3 is (<PercentComplete>). So the pattern and replaceSpec can also be:

pattern = new String("(<UID>21</UID>)((?:.|\n)*?)(<PercentComplete>)(\\d+)(</PercentComplete>)");
replaceSpec = new String("$1$2$340$5");

and the replacement works correctly, just as before.
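For comparison, the same replacement can be sketched in Python, assuming the re module with the DOTALL flag instead of the (.|\n) alternation. The grouping differs from the Java version above: only two groups are kept here, which is a simplification chosen for this example:

```python
import re

xml = (
    "<TASK>\n"
    "  <UID>21</UID>\n"
    "  <Name>Architectural design</Name>\n"
    "  <PercentComplete>81</PercentComplete>\n"
    "</TASK>"
)

# Group 1: everything from the UID up to the opening tag (lazy, across lines).
# Group 2: the closing tag. The digits in between are what gets replaced.
pattern = re.compile(
    r"(<UID>21</UID>.*?<PercentComplete>)\d+(</PercentComplete>)",
    re.DOTALL,
)

# \g<1> avoids the ambiguity of writing \1 directly before the digits "40".
updated = pattern.sub(r"\g<1>40\g<2>", xml, count=1)
print(updated)
```

The explicit \g<1> form plays the same role as Java's handling of $440 and $340: it makes clear where the group number ends and the literal 40 begins.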

2012-04-21 20:05:32

([\s\S]*)<FooBar>

The dot matches everything except newlines (\r\n). So use [\s\S], which will match all characters.

2012-07-19 17:59:45

In Ruby, you can use the 'm' option (multiline):

/YOUR_REGEXP/m

For more information, see the Regexp documentation on ruby-doc.org.

2012-08-03 07:52:16

In Eclipse, the following expression worked:

Foo
jadajada Bar"

Regular expression:

Foo[\S\s]{1,10}.*Bar*

2013-01-03 11:32:13

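In Java-based regular expressions, you can use [\s\S].

2013-06-03 06:22:19

The question is: can the . pattern match any character? The answer varies from engine to engine. The main difference is whether the pattern is used by a POSIX or a non-POSIX regex library.

A special note about lua-patterns: they are not considered regular expressions, but . matches any character there, the same as in POSIX-based engines.

Another note on matlab and octave: the . matches any character by default (demo): str = "abcde\n fghij<Foobar>"; expression = '(.*)<Foobar>*'; [tokens,matches] = regexp(str,expression,'tokens','match'); (tokens contains an abcde\n fghij item).

Also, in all of boost's regex grammars the dot matches line breaks by default. Boost's ECMAScript grammar allows you to turn this off with regex_constants::no_mod_m (source).

As for oracle (it is POSIX-based), use the n option (demo): select regexp_substr('abcde' || chr(10) ||' fghij<Foobar>', '(.*)<Foobar>', 1, 1, 'n', 1) as results from dual

POSIX-based engines:

A mere . already matches line breaks, so there is no need to use any modifiers, see bash (demo).

tcl (demo), postgresql (demo), r (TRE, the base R default engine without perl=TRUE; for base R with perl=TRUE or for stringr/stringi patterns, use the (?s) inline modifier) (demo) also treat . the same way.

However, most POSIX-based tools process input line by line. Hence, . does not match the line breaks simply because they are not in scope. Here are some examples of how to override this: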

sed - There are multiple workarounds. The most precise, but not very safe, is sed 'H;1h;$!d;x; s/\(.*\)><Foobar>/\1/' (H;1h;$!d;x; slurps the file into memory). If whole lines must be included, sed '/start_pattern/,/end_pattern/d' file (removing from start will end with matched lines included) or sed '/start_pattern/,/end_pattern/{{//!d;};}' file (with matching lines excluded) can be considered.

perl - perl -0pe 's/(.*)<FooBar>/$1/gs' <<< "$str" (-0 sets the record separator to the NUL character, which in practice reads a text file into memory as a single record; -p prints the file after applying the script given by -e). Note that -000pe reads the file in 'paragraph mode', where Perl uses consecutive newlines (\n\n) as the record separator.

gnu-grep - grep -Poz '(?si)abc\K.*?(?=<Foobar>)' file. Here, z enables file slurping, (?s) enables the DOTALL mode for the . pattern, (?i) enables case insensitive mode, \K omits the text matched so far, *? is a lazy quantifier, (?=<Foobar>) matches the location before <Foobar>.

pcregrep - pcregrep -Mi "(?si)abc\K.*?(?=<Foobar>)" file (M lets a pattern match more than one line here). Note pcregrep is a good solution for macOS grep users.

See demos.

Non-POSIX-based engines:

php - Use the s (PCRE_DOTALL) modifier: preg_match('~(.*)<Foobar>~s', $s, $m) (demo)

c# - Use the RegexOptions.Singleline flag (demo), in either of these forms:
var result = Regex.Match(s, @"(.*)<Foobar>", RegexOptions.Singleline).Groups[1].Value;
var result = Regex.Match(s, @"(?s)(.*)<Foobar>").Groups[1].Value;

powershell - Use the (?s) inline option: $s = "abcde`nfghij<FooBar>"; $s -match "(?s)(.*)<Foobar>"; $matches[1]

perl - Use the s modifier (or the (?s) inline version at the start) (demo): /(.*)<FooBar>/s

python - Use the re.DOTALL (or re.S) flag or the (?s) inline modifier (demo): m = re.search(r"(.*)<FooBar>", s, flags=re.S) (and then if m:, print(m.group(1)))

java - Use the Pattern.DOTALL modifier (or the inline (?s) flag) (demo): Pattern.compile("(.*)<FooBar>", Pattern.DOTALL)

kotlin - Use RegexOption.DOT_MATCHES_ALL: "(.*)<FooBar>".toRegex(RegexOption.DOT_MATCHES_ALL)

groovy - Use the (?s) in-pattern modifier (demo): regex = /(?s)(.*)<FooBar>/

scala - Use the (?s) modifier (demo): "(?s)(.*)<Foobar>".r.findAllIn("abcde\n fghij<Foobar>").matchData foreach { m => println(m.group(1)) }

javascript - Use [^] or the workarounds [\d\D] / [\w\W] / [\s\S] (demo): s.match(/([\s\S]*)<FooBar>/)[1]

c++ (std::regex) - Use [\s\S] or the JavaScript workarounds (demo): regex rex(R"(([\s\S]*)<FooBar>)");

vba, vbscript - Use the same approach as in JavaScript, ([\s\S]*)<Foobar>. (NOTE: The MultiLine property of the RegExp object is sometimes erroneously thought to be the option that lets . match across line breaks, while, in fact, it only changes the ^ and $ behavior to match the start/end of lines rather than strings, the same as in JavaScript regex.)

ruby - Use the /m MULTILINE modifier (demo): s[/(.*)<Foobar>/m, 1]

r, tre, base-r - Base R PCRE regexps, use (?s): regmatches(x, regexec("(?s)(.*)<FooBar>",x, perl=TRUE))[[1]][2] (demo)

r, icu, stringr, stringi - In stringr/stringi regex functions, which are powered by the ICU regex engine, also use (?s): stringr::str_match(x, "(?s)(.*)<FooBar>")[,2] (demo)

go - Use the inline modifier (?s) at the start (demo): re := regexp.MustCompile(`(?s)(.*)<FooBar>`)

swift - Use dotMatchesLineSeparators or (easier) pass the (?s) inline modifier to the pattern: let rx = "(?s)(.*)<Foobar>"

objective-c - The same as Swift. (?s) works the easiest, but here is how the option can be used: NSRegularExpression* regex = [NSRegularExpression regularExpressionWithPattern:pattern options:NSRegularExpressionDotMatchesLineSeparators error:&regexError];

re2, google-apps-script - Use the (?s) modifier (demo): "(?s)(.*)<Foobar>" (in Google Spreadsheets, =REGEXEXTRACT(A2,"(?s)(.*)<Foobar>"))
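The confusion between "multiline" and "dot matches newline" noted for VBA/VBScript can be demonstrated with a short sketch, assuming Python's re module, where the two behaviours are separate flags (re.MULTILINE and re.DOTALL):

```python
import re

text = "abcde\nfghij<FooBar>"

# MULTILINE only changes the anchors: ^ and $ now match at every line.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)

# It does not let "." cross a newline ...
multiline_group = re.search(r"(.*)<FooBar>", text, flags=re.MULTILINE).group(1)

# ... whereas DOTALL does.
dotall_group = re.search(r"(.*)<FooBar>", text, flags=re.DOTALL).group(1)

print(line_starts)            # ['abcde', 'fghij']
print(repr(multiline_group))  # 'fghij'
print(repr(dotall_group))     # 'abcde\nfghij'
```

Ruby is the exception in the list above: its /m modifier is the one that makes . match newlines.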

Notes on (?s):

In most non-POSIX engines, the (?s) inline modifier (or embedded flag option) can be used to force . to match line breaks.

If placed at the start of the pattern, (?s) changes the behavior of every . in the pattern. If (?s) is placed somewhere after the beginning, only the .s located to the right of it are affected, unless the pattern is passed to Python's re. In Python re, the whole pattern is affected regardless of where (?s) appears (and since Python 3.11 a global inline flag that is not at the start of the pattern raises an error). The (?s) effect is stopped with (?-s). A modifier group can be used to affect only a specified range of a regex pattern (e.g., Delim1(?s:.*?)\nDelim2.* will make the first .*? match across newlines, while the second .* will only match the rest of the line).
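The modifier-group example from this note can be tried out as follows. This is a sketch assuming Python 3.6 or later, where the scoped (?s:...) form is available in re; the sample text is made up for the illustration:

```python
import re

text = "Delim1 first\nsecond\nDelim2 rest of line\nnext line"

# (?s:...) turns on dot-matches-newline only inside the group;
# the trailing .* keeps the default behaviour and stops at the line end.
match = re.search(r"Delim1(?s:.*?)\nDelim2.*", text)

print(repr(match.group(0)))
# 'Delim1 first\nsecond\nDelim2 rest of line'
```

The match crosses the newline between "first" and "second" but does not run on into "next line".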

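A note on POSIX:

In non-POSIX regex engines, the [\s\S] / [\d\D] / [\w\W] constructs can be used to match any character.

In POSIX, [\s\S] does not match any character (as it does in JavaScript or any non-POSIX engine), because regex escape sequences are not supported inside bracket expressions. [\s\S] is parsed as a bracket expression that matches a single character: \, s, or S.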

2017-08-31 12:47:20

We can also use

(.*?\n)*?

to match everything, including newlines, without being greedy.

This makes the newline optional:

(.*?|\n)*?

2018-08-06 07:48:29

In JavaScript you can use [^]* to search for zero to infinite characters, including line breaks.

$("#find_and_replace").click(function() {
  var text = $("#textarea").val();
  search_term = new RegExp("[^]*<Foobar>", "gi");
  replace_term = "Replacement term";
  var new_text = text.replace(search_term, replace_term);
  $("#textarea").val(new_text);
});
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<button id="find_and_replace">Find and replace</button>
<br>
<textarea ID="textarea">abcde
fghij&lt;Foobar&gt;</textarea>

2019-02-27 12:50:43

Typically, searching for three consecutive lines in PowerShell looks like this:

$file = Get-Content file.txt -raw

$pattern = 'lineone\r\nlinetwo\r\nlinethree\r\n'     # "Windows" text
$pattern = 'lineone\nlinetwo\nlinethree\n'           # "Unix" text
$pattern = 'lineone\r?\nlinetwo\r?\nlinethree\r?\n'  # Both

$file -match $pattern

# output
True

Oddly, the following would be Unix text at the prompt but Windows text in a file:

$pattern = 'lineone
linetwo
linethree
'

Here is a way to print out the line endings:

'lineone
linetwo
linethree
' -replace "`r",'\r' -replace "`n",'\n'

# Output
lineone\nlinetwo\nlinethree\n
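The \r?\n idea from this answer is not specific to PowerShell. As a small sketch, assuming Python's re module and sample strings invented for the example, one pattern covers both line-ending conventions:

```python
import re

# \r?\n accepts both "\r\n" (Windows) and "\n" (Unix) line endings.
pattern = re.compile(r"lineone\r?\nlinetwo\r?\nlinethree\r?\n")

windows_text = "lineone\r\nlinetwo\r\nlinethree\r\n"
unix_text = "lineone\nlinetwo\nlinethree\n"

print(bool(pattern.search(windows_text)))  # True
print(bool(pattern.search(unix_text)))     # True
```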

2019-07-05 14:12:02

Option 1

One way is to use the s flag (just like the accepted answer):

/(.*)<FooBar>/s

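Demo 1

Option 2

A second way is to use the m (multiline) flag with any of the following patterns (the character classes are what match the newlines; in PCRE-style engines m itself only affects ^ and $):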

/([\s\S]*)<FooBar>/m

/([\d\D]*)<FooBar>/m

/([\w\W]*)<FooBar>/m

Demo 2

RegEx Circuit

jex.im visualizes regular expressions.

2019-10-06 19:41:28

Try: .*\n*.*<FooBar>, assuming you also allow blank newlines, since you are allowing any characters, including none, before <FooBar>.

2020-08-28 16:21:39

In Notepad++ you can use this

<table (.|\r\n)*</table>

It will match the entire table, from the opening <table tag to the closing </table>, including its rows and columns.

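You can make it non-greedy (lazy) with the following, so that it matches the first table, the second table, and so on, rather than everything at once: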

<table (.|\r\n)*?</table>
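The effect of the lazy quantifier can be sketched outside Notepad++ as well. The following assumes Python's re module with re.DOTALL in place of the (.|\r\n) alternation, and a made-up HTML snippet with two tables:

```python
import re

html = (
    "<table id='a'>\n<tr><td>1</td></tr>\n</table>\n"
    "<p>between</p>\n"
    "<table id='b'>\n<tr><td>2</td></tr>\n</table>"
)

# Greedy: runs from the first "<table " to the LAST "</table>".
greedy = re.findall(r"<table .*</table>", html, flags=re.DOTALL)

# Lazy: stops at the first "</table>" after each "<table ".
lazy = re.findall(r"<table .*?</table>", html, flags=re.DOTALL)

print(len(greedy))  # 1
print(len(lazy))    # 2
```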

2022-01-29 02:28:34

This is the simplest way for me:

(\X*)<FooBar>

2022-08-25 18:28:38
