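Can grep show only the words that match the search pattern?

Is there a way to make grep output only the "words" from files that match the search expression?

If I want to find all the instances of, say, "th" in a number of files, I can do: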

grep "th" *

But the output will be something like this (bold is mine):

some-text-file : the cat sat on the mat  
some-other-text-file : the quick brown fox  
yet-another-text-file : i hope this explains it thoroughly

What I want it to output, using the same search, is:

the
the
the
this
thoroughly

Is this possible using grep? Or using another combination of tools?

Just awk is enough; no combination of tools is needed.

# awk '{for(i=1;i<=NF;i++){if($i~/^th/){print $i}}}' file
the
the
the
this
thoroughly
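Note that the awk pattern above is anchored with ^, so it prints only fields that begin with "th". The sketch below contrasts the anchored and unanchored forms. The sample line is made up for illustration and is not from the question; awk splits fields on whitespace, so any punctuation would stay attached to the words.

```shell
#!/bin/sh
# Hypothetical sample line, chosen so that "th" appears at the start,
# in the middle, and at the end of different words.
line='the bath with thorns'

# Anchored: keep only fields that START with "th".
starts=$(printf '%s\n' "$line" |
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^th/) print $i }')

# Unanchored: keep every field that CONTAINS "th".
contains=$(printf '%s\n' "$line" |
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /th/) print $i }')

printf 'starts with th:\n%s\n' "$starts"
printf 'contains th:\n%s\n' "$contains"
```

Which form is wanted depends on whether "instances of th" means words starting with "th" or words containing it anywhere.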

2009-10-10 00:54:12

Try grep -o:

grep -oh "\w*th\w*" *

Edit: updated to match Phil's comment.

From the documentation:

-h, --no-filename
    Suppress the prefixing of file names on output. This is the default
    when there is only one file (or only standard input) to search.
-o, --only-matching
    Print only the matched (non-empty) parts of a matching line,
    with each such part on a separate output line.
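To see the effect of the two flags side by side, here is a small sketch that recreates the three files from the question in a temporary directory. The directory and the variable names are assumptions made for this demo, and \w is an extension that not every grep supports.

```shell
#!/bin/sh
# Recreate the three files from the question in a throwaway directory.
dir=$(mktemp -d)
printf 'the cat sat on the mat\n' > "$dir/some-text-file"
printf 'the quick brown fox\n' > "$dir/some-other-text-file"
printf 'i hope this explains it thoroughly\n' > "$dir/yet-another-text-file"

# -o prints each match on its own line; -h drops the file name prefix.
words=$(cd "$dir" && grep -oh '\w*th\w*' *)

# Without -h, each match is prefixed with the file it came from.
with_names=$(cd "$dir" && grep -o '\w*th\w*' *)

printf '%s\n' "$words"
printf '%s\n' "$with_names"
rm -rf "$dir"
```

The first printf lists the five bare words; the second lists the same matches in file:match form, which is handy when you still need to know where each word came from.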

2009-10-10 01:01:36

You could pipe your grep output into Perl like this:

grep "th" * | perl -n -e'while(/(\w*th\w*)/g) {print "$1\n"}'

2009-10-10 01:06:09

You could translate spaces to newlines and then grep, e.g.:

cat * | tr ' ' '\n' | grep th
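Splitting on the space character alone leaves tabs and punctuation attached to the words. A possible refinement, sketched below on made-up input, is to turn every run of non-letters into a single newline with tr -cs. This is a variant of the command above rather than the answerer's own, and it assumes that only alphabetic characters count as part of a word.

```shell
#!/bin/sh
# Hypothetical input with punctuation and a tab between words.
text=$(printf 'the cat sat; this,that\tthoroughly.')

# Splitting on spaces only leaves punctuation and tabs glued to the words.
by_space=$(printf '%s\n' "$text" | tr ' ' '\n' | grep th)

# -c complements the set and -s squeezes repeats: every run of
# non-letters becomes a single newline, leaving one word per line.
by_word=$(printf '%s\n' "$text" | tr -cs '[:alpha:]' '\n' | grep th)

printf 'split on spaces:\n%s\n' "$by_space"
printf 'split on non-letters:\n%s\n' "$by_word"
```

With the first form, "this,that" and "thoroughly." come out as one token because they are separated by a comma and a tab, not a space; the second form yields four clean words.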

2009-10-10 01:43:06

You could also try pcregrep. There is also a -w option in grep, but in some cases it doesn't work as expected.

From Wikipedia:

cat fruitlist.txt
apple
apples
pineapple
apple-
apple-fruit
fruit-apple

grep -w apple fruitlist.txt
apple
apple-
apple-fruit
fruit-apple

2009-11-14 12:15:02

cat *-text-file | grep -Eio "th[a-z]+"

2010-09-14 15:30:51

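I was unsatisfied with awk's hard-to-remember syntax, but I liked the idea of using a single utility to do this.

It seems that ack (or ack-grep if you use Ubuntu) can do this easily: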

# ack-grep -ho "\bth.*?\b" *

the
the
the
this
thoroughly

If you omit the -h flag, you get:

# ack-grep -o "\bth.*?\b" *

some-other-text-file
1:the

some-text-file
1:the
the

yet-another-text-file
1:this
thoroughly

As a bonus, you can use the --output flag to do this for more complex searches, with just about the easiest syntax I've found:

# echo "bug: 1, id: 5, time: 12/27/2010" > test-file
# ack-grep -ho "bug: (\d*), id: (\d*), time: (.*)" --output '$1, $2, $3' test-file

1, 5, 12/27/2010

2011-01-11 21:25:48

$ grep -w

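From the grep man page:

-w: Select only those lines containing matches that form whole words. The test is that the matching substring must either be at the beginning of the line, or preceded by a non-word constituent character. Similarly, it must be either at the end of the line or followed by a non-word constituent character.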

2012-05-29 06:32:31

grep with only-matching output (-o) and Perl-compatible regex (-P):

grep -o -P 'th.*? ' filename
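One thing to watch with the pattern above: it requires a space after the word, so the match includes that trailing space, and a word at the end of a line is skipped. The sketch below compares it with a word-boundary variant. The variant is an assumption of this example, not the answerer's command, and both commands assume a GNU grep built with PCRE support, since -P is not available in every grep.

```shell
#!/bin/sh
# Two lines from the question's sample data.
sample=$(printf 'the cat sat on the mat\ni hope this explains it thoroughly')

# Original pattern: lazy match up to and including the next space.
# A word at the end of a line has no trailing space, so it is missed.
with_space=$(printf '%s\n' "$sample" | grep -o -P 'th.*? ')

# Variant (an assumption, not the answerer's command): start at a word
# boundary and consume word characters only.
with_boundary=$(printf '%s\n' "$sample" | grep -o -P '\bth\w*')

printf 'th.*? followed by a space:\n%s\n' "$with_space"
printf 'word-boundary variant:\n%s\n' "$with_boundary"
```

The first command finds three matches, each with a trailing space, and misses "thoroughly"; the second finds all four words without the extra space.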

2012-11-29 09:11:26

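I had a similar problem: I was looking for a grep/pattern regex that gives the "matched pattern found" as output.

In the end I used egrep with the -o option (the same regex with grep -e or -G did not give me the same result as egrep).

So I think it could be something similar to this (I'm not a regex master):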

egrep -o "the*|this{1}|thoroughly{1}" filename

2013-02-14 16:39:19

Cross-distribution safe answer (including Windows MinGW?)

grep -h "[[:alpha:]]*th[[:alpha:]]*" 'filename' | tr ' ' '\n' | grep -h "[[:alpha:]]*th[[:alpha:]]*"

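If you're using an older version of grep (like 2.4.2) that does not include the -o option, use the command above. Otherwise use the simpler, easier-to-maintain version below.

Linux cross-distribution safe answer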

grep -oh "[[:alpha:]]*th[[:alpha:]]*" 'filename'

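To summarize: -oh outputs the regular expression matches from the file content (and not the file name), just as you would expect a regular expression to work in vim and similar tools. Which word or regular expression you search for is then up to you, as long as you stay with POSIX syntax rather than Perl syntax (see below).

More from the grep manual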

-o      Print each match, but only the match, not the entire line.
-h      Never print filename headers (i.e. filenames) with output lines.
-w      The expression is searched for as a word (as if surrounded by
         `[[:<:]]' and `[[:>:]]';

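Why the original answer does not work for everyone

Support for \w varies from platform to platform, because it is an extended "perl" syntax. grep installations that are limited to POSIX character classes therefore need [[:alpha:]] rather than its perl counterpart \w. See the Wikipedia page on regular expressions for more.

Ultimately, the POSIX answer above will be more reliable regardless of the platform grep runs on (POSIX being the original syntax).

As for supporting grep without the -o option: the first grep outputs the relevant lines, tr splits them on spaces into new lines, and the final grep keeps only the lines (now single words) that match.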
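The equivalence of the two approaches can be checked with a short sketch. It feeds sample text on standard input instead of reading 'filename', so the -h flag is not needed; the sample lines and variable names are assumptions for illustration.

```shell
#!/bin/sh
# Two lines from the question plus one line without any "th".
sample=$(printf 'the cat sat on the mat\na quick brown fox\ni hope this explains it thoroughly')
pattern='[[:alpha:]]*th[[:alpha:]]*'

# Without -o: keep matching lines, put one word per line, filter again.
pipeline=$(printf '%s\n' "$sample" | grep "$pattern" | tr ' ' '\n' | grep "$pattern")

# With -o: grep prints the matched text directly.
direct=$(printf '%s\n' "$sample" | grep -o "$pattern")

printf 'three-stage pipeline:\n%s\n' "$pipeline"
printf 'grep -o:\n%s\n' "$direct"
```

On space-separated text like this, both forms print the same four words. They can differ when words carry punctuation, because the pipeline prints whole space-separated tokens while -o prints only the characters the pattern matched.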

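(PS: I know most platforms have been patched to support \w by now, but there are always some that lag behind.)

Credit for the "-o" idea goes to @AdamRosenfield's answer.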

2013-04-14 08:17:27

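To search for all the words that start with "icon-", the following command works perfectly. I am using Ack here, which is similar to grep but has better options and nicer formatting.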

ack -oh --type=html "\w*icon-\w*" | sort | uniq

2014-01-16 15:46:55

It's simpler than you think. Try this:

egrep -wo 'th.[a-z]*' filename.txt #### (Case Sensitive)

egrep -iwo 'th.[a-z]*' filename.txt  ### (Case Insensitive)

Where:

 egrep: grep with extended regular expressions.
 w    : Match only whole words instead of substrings.
 o    : Display only the matched pattern instead of the whole line.
 i    : Ignore case.
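A quick way to see what each option contributes is to run the same pattern with and without it. The sketch below uses grep -E, the equivalent spelling of egrep, on a made-up line with a capitalised "The"; the sample text and variable names are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical one-line input with a capitalised "The".
sample='The cat sat on the mat'

# -w alone still prints the whole matching line.
whole_line=$(printf '%s\n' "$sample" | grep -E -w 'th.[a-z]*')

# Adding -o prints only the matched words (case sensitive).
sensitive=$(printf '%s\n' "$sample" | grep -E -wo 'th.[a-z]*')

# Adding -i also accepts "The".
insensitive=$(printf '%s\n' "$sample" | grep -E -iwo 'th.[a-z]*')

printf '%s\n' "$whole_line" "$sensitive" "$insensitive"
```

Note that the dot in 'th.[a-z]*' demands a third character of any kind, so a bare "th" would not match; 'th[a-z]*' drops that requirement if it is not wanted.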

2017-03-28 09:25:08

ripgrep

Here is an example using ripgrep:

rg -o "(\w+)?th(\w+)?"

It will match all the words that contain "th".

2018-11-07 12:38:46

grep --color -o -E "Begin.{0,}?End" file.txt

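? - match as few characters as possible until End

Tested in the macOS terminal. The lazy ? is not part of POSIX extended regular expressions, so other grep implementations (GNU grep -E, for example) may match greedily instead.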

2022-08-30 23:46:45
