How do I grep for a tab (\t) in files on the Unix platform?
Current answer
There are basically two ways to address this:
1. (Recommended) Use the regular expression syntax supported by grep(1). Modern grep(1) supports two forms of POSIX 1003.2 regex syntax: basic (obsolete) REs and modern (extended) REs. The syntax is described in detail in the re_format(7) and regex(7) man pages, which are part of BSD and Linux systems respectively. GNU grep(1) also supports Perl-compatible REs as provided by the pcre(3) library.

In regex language the tab character is usually written as the \t atom. That atom is supported by BSD extended regular expressions (egrep, grep -E on a BSD-compatible system) as well as by Perl-compatible REs (pcregrep, GNU grep -P). Basic regular expressions and Linux extended REs apparently do not support \t. Consult the man page of each UNIX utility to find out which regex language it supports (hence the differences between sed(1), awk(1), and pcregrep(1) regular expressions).

Therefore, on Linux:
$ grep -P '\t' FILE ...
On a BSD-like system:
$ egrep '\t' FILE ...
$ grep -E '\t' FILE ...

2. Pass the tab character itself in the pattern. This is straightforward when you edit a script file (here <TAB> stands for a literal tab character typed between the quotes):
# no tabs for Python please!
grep -q '<TAB>' *.py && exit 1

However, when working in an interactive shell you may need to rely on shell and terminal capabilities to type the proper character into the line. On most terminals this can be done with the Ctrl+V key combination, which instructs the terminal to treat the next input character literally (the V is for "verbatim"):
$ grep '<Ctrl>+<V><TAB>' FILE ...

Some shells offer more advanced support for composing commands. For example, in bash(1), words of the form $'string' are treated specially:
bash$ grep $'\t' FILE ...

Note, however, that while this is convenient on the command line, it may cause compatibility issues when the script is moved to another platform. Also, be careful with quoting when using these special forms; consult bash(1) for details.

For the Bourne shell (and others) the same behaviour can be emulated using command substitution together with printf(1) to construct the proper regex:
$ grep "`printf '\t'`" FILE ...
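As a rough illustration of the printf(1) approach, here is a minimal POSIX sh sketch. The sample input is invented for the demonstration and is piped in on standard input rather than read from a real file; the variable names `tab` and `matches` are my own.

```shell
#!/bin/sh
# Build a literal tab with printf(1). Command substitution only strips
# trailing newlines, so the tab character survives intact.
tab=$(printf '\t')

# Hypothetical two-line input: only the second line contains a tab.
# -n prefixes each matching line with its line number.
matches=$(printf 'no tabs here\nhas\ttab\n' | grep -n "$tab")

printf '%s\n' "$matches"
```

Because the pattern is the tab character itself, this does not depend on which regex flavour the local grep implements.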
Other answers
These alternative methods of identifying binary characters are all fully functional, and I really like the ones using awk, as I couldn't quite remember the syntax for single binary characters. However, it should also be possible to assign the value to a shell variable in a POSIX-portable fashion (i.e. TAB=`echo "@" | tr "\100" "\011"`) and then use it everywhere, again portably (i.e. grep "$TAB" filename). While this solution works well with TAB, it works equally well with other binary characters when a different value is passed to tr in the assignment in place of the value for TAB.
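The idea above could be sketched roughly as follows. The form-feed variable `FF` and the sample input are my own additions for illustration, and `$(...)` is used in place of backticks, which is also POSIX.

```shell
#!/bin/sh
# Translate "@" (octal 100) into the desired character with tr(1).
TAB=$(echo "@" | tr "\100" "\011")   # octal 011 is a horizontal tab
FF=$(echo "@" | tr "\100" "\014")    # octal 014 is a form feed (assumed example)

# Count matching lines in hypothetical input supplied on stdin.
tab_lines=$(printf 'a\tb\nc d\ne\tf\n' | grep -c "$TAB")
ff_lines=$(printf 'x\fy\nplain\n' | grep -c "$FF")

echo "lines with a tab: $tab_lines"
echo "lines with a form feed: $ff_lines"
```

One limitation to keep in mind: command substitution strips trailing newlines, so this trick cannot be used to capture a newline character.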
Use gawk: set the field separator to a tab (\t) and check the number of fields. If there is more than one field, the line contains a tab.
awk -F"\t" 'NF>1' file
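A small variation on the same idea, as a sketch: since each tab adds one field, NF - 1 gives the number of tabs on a line. The input below is made up, and the report format is just one possible choice.

```shell
#!/bin/sh
# For every line with at least one tab, print its line number and tab count.
report=$(printf 'one\ttwo\tthree\nplain line\nx\ty\n' |
  awk -F'\t' 'NF > 1 { print NR ": " (NF - 1) " tab(s)" }')

printf '%s\n' "$report"
```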
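The $'\t' notation given in other answers is shell-specific. It seems to work in bash and zsh, but it is not universal.
NOTE: The following is for the fish shell and does not work in bash:
In the fish shell, you can use an unquoted \t, for example: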
grep \t foo.txt
Or you can use the hex or Unicode notations, for example:
grep \X09 foo.txt
grep \U0009 foo.txt
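(These notations are useful for more esoteric characters.)
Since these values must be unquoted, you can combine quoted and unquoted values by concatenation: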
grep "foo"\t"bar"
One way is (this is with Bash):
grep -P '\t'
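-P turns on Perl regular expressions, so \t will work.
As user unwind says, this may be specific to GNU grep. The alternative is to insert a literal tab character, if the shell, editor, or terminal allows it.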
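To combine the two options, one could probe for -P support and fall back to a literal tab. This is only a sketch: the function name `grep_tab` is hypothetical, and it assumes that a grep without -P support exits with a non-zero status when given that option.

```shell
#!/bin/sh
# grep_tab: print lines containing a tab, from the given files or from stdin.
grep_tab() {
  # Probe: does this grep accept -P and match a tab via \t?
  if printf '\t\n' | grep -qP '\t' 2>/dev/null; then
    grep -P '\t' "$@"
  else
    # Fallback: pass a literal tab built with printf(1).
    grep "$(printf '\t')" "$@"
  fi
}

# Hypothetical input on stdin; only the first line contains a tab.
result=$(printf 'a\tb\nno tab\n' | grep_tab)
printf '%s\n' "$result"
```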
A good option is to use sed.
sed -n '/\t/p' file
Example (works in bash, sh, ksh, csh, ...):
[~]$ cat testfile
12 3
1 4 abc
xa c
a c\2
1 23
[~]$ sed -n '/\t/p' testfile
xa c
a c\2
[~]$ sed -n '/\ta\t/p' testfile
a c\2
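As far as I know, \t inside a sed pattern is understood by GNU sed but is not guaranteed by POSIX, so on other systems it may be safer to splice in a literal tab. A minimal sketch, with made-up input and the variable names `tab` and `out` chosen for illustration:

```shell
#!/bin/sh
# Build a literal tab and embed it in the sed address.
tab=$(printf '\t')

# Print only lines containing the sequence TAB, "a", TAB.
out=$(printf '12 3\nxa\tc\n\ta\tc\n' | sed -n "/${tab}a${tab}/p")

printf '%s\n' "$out"
```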
(This answer has been edited following suggestions in the comments. Thank you all.)