Search and replace in bash using regular expressions

I've seen this example:

hello=ho02123ware38384you443d34o3434ingtod38384day
echo ${hello//[0-9]/}

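It follows this syntax: ${variable//pattern/replacement}

Unfortunately, the pattern field doesn't seem to support full regex syntax (if I use . or \s, for example, it tries to match the literal characters).

How can I search/replace a string using full regex syntax?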

Current answer

This can actually be done in pure bash:

hello=ho02123ware38384you443d34o3434ingtod38384day
re='(.*)[0-9]+(.*)'
while [[ $hello =~ $re ]]; do
  hello=${BASH_REMATCH[1]}${BASH_REMATCH[2]}
done
echo "$hello"

yields...

howareyoudoingtodday
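
The loop above can be wrapped in a reusable function. The following is only a sketch under stated assumptions: the name remove_all and its two arguments are hypothetical and not part of the answer, and the function only deletes matches rather than substituting text. It wraps the caller's regex in its own group and guards against patterns that match the empty string, which would otherwise loop forever.

```shell
#!/usr/bin/env bash
# remove_all TEXT REGEX: hypothetical helper (not from the answer) that deletes
# every match of an ERE from TEXT using the same BASH_REMATCH loop.
remove_all() {
    local text=$1
    local re="(.*)($2)(.*)"   # wrap the caller's regex so group 2 is the whole match
    local last
    while [[ $text =~ $re ]]; do
        # Stop if the regex matched the empty string, otherwise the loop never ends.
        [[ -n ${BASH_REMATCH[2]} ]] || break
        # The trailing (.*) is always the last group, even if REGEX has its own groups.
        last=$(( ${#BASH_REMATCH[@]} - 1 ))
        text=${BASH_REMATCH[1]}${BASH_REMATCH[last]}
    done
    printf '%s\n' "$text"
}

result=$(remove_all 'ho02123ware38384you443d34o3434ingtod38384day' '[0-9]+')
echo "$result"   # howareyoudoingtodday
```

Because the leading (.*) is greedy, each pass removes a match near the end of the string, so the number of iterations grows with the number of matches.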

2014-03-07 21:55:27

Other answers

These examples also work in bash, with no need to use sed:

#!/bin/bash
MYVAR=ho02123ware38384you443d34o3434ingtod38384day
MYVAR=${MYVAR//[a-zA-Z]/X} 
echo ${MYVAR//[0-9]/N}

You can also use the character class bracket expressions:

#!/bin/bash
MYVAR=ho02123ware38384you443d34o3434ingtod38384day
MYVAR=${MYVAR//[[:alpha:]]/X} 
echo ${MYVAR//[[:digit:]]/N}

Output

XXNNNNNXXXXNNNNNXXXNNNXNNXNNNNXXXXXXNNNNNXXX

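What @Lanaru wanted to know, however, if I understand the question correctly, is why the "full" or PCRE extensions \s \S \w \W \d \D etc. don't work the way they do in php, ruby, python and so on. These extensions come from Perl-compatible regular expressions (PCRE) and may not be compatible with other forms of shell-based regular expressions.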

These don't work:

#!/bin/bash
hello=ho02123ware38384you443d34o3434ingtod38384day
echo ${hello//\d/}


#!/bin/bash
hello=ho02123ware38384you443d34o3434ingtod38384day
echo $hello | sed 's/\d//g'

Output with all literal "d" characters removed

ho02123ware38384you44334o3434ingto38384ay

But the following does work as expected

#!/bin/bash
hello=ho02123ware38384you443d34o3434ingtod38384day
echo $hello | perl -pe 's/\d//g'

Output

howareyoudoingtodday
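
Where perl is not an option, the POSIX bracket classes cover the most common PCRE shorthands. The following is a minimal sketch; the sample string and the variable names are made up for illustration, and the mapping is approximate (for example, \w in PCRE can be locale and Unicode dependent).

```shell
#!/usr/bin/env bash
# Sample text with two spaces and a tab; $'...' lets \t stand for a real tab.
text=$'foo  bar_1\tbaz 42'

no_space=${text//[[:space:]]/}    # roughly what s/\s//g would do
no_digit=${text//[[:digit:]]/}    # roughly what s/\d//g would do
no_word=${text//[[:alnum:]_]/}    # roughly what s/\w//g would do

echo "$no_space"   # foobar_1baz42

# The same classes also work with bash's =~ operator, which uses POSIX ERE.
re='^[[:alpha:]]+[[:space:]]+[[:alnum:]_]+'
if [[ $text =~ $re ]]; then
    matched=${BASH_REMATCH[0]}
fi
echo "$matched"    # foo  bar_1
```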

Hope that clarifies things a bit more, but if you are not confused yet, why not try this on Mac OS X, which has the REG_ENHANCED flag enabled:

#!/bin/bash
MYVAR=ho02123ware38384you443d34o3434ingtod38384day;
echo $MYVAR | grep -o -E '\d'

On most flavours of *nix you will only see the following output:

d
d
d
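
For comparison, a bracket class gives the same digit match without relying on \d. This is a sketch that assumes a grep supporting the -o option, which both GNU and BSD grep provide although POSIX does not require it; the variable digits is only for illustration.

```shell
#!/usr/bin/env bash
MYVAR=ho02123ware38384you443d34o3434ingtod38384day

# -o prints each match on its own line; [[:digit:]]+ matches whole runs of digits.
digits=$(printf '%s\n' "$MYVAR" | grep -o -E '[[:digit:]]+')
echo "$digits"
```

Each run of digits is printed on a separate line, in the order it appears in the input.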

nJoy!

2014-03-07 21:48:36

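I know this is an ancient thread, but it was my first hit on Google, and I wanted to share the following resub that I put together, which adds support for multiple $1, $2, etc. backreferences...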

#!/usr/bin/env bash

############################################
###  resub - regex substitution in bash  ###
############################################

resub() {
    local match="$1" subst="$2" tmp

    if [[ -z $match ]]; then
        echo "Usage: echo \"some text\" | resub '(.*) (.*)' '\$2 me \${1}time'" >&2
        return 1
    fi

    ### First, convert "$1" to "$BASH_REMATCH[1]" and 'single-quote' for later eval-ing...

    ### Utility function to 'single-quote' a list of strings
    squot() { local a=(); for i in "$@"; do a+=( $(echo \'${i//\'/\'\"\'\"\'}\' )); done; echo "${a[@]}"; }

    tmp=""
    while [[ $subst =~ (.*)\${([0-9]+)}(.*) ]] || [[ $subst =~ (.*)\$([0-9]+)(.*) ]]; do
        tmp="\${BASH_REMATCH[${BASH_REMATCH[2]}]}$(squot "${BASH_REMATCH[3]}")${tmp}"
        subst="${BASH_REMATCH[1]}"
    done
    subst="$(squot "${subst}")${tmp}"

    ### Now start (globally) substituting

    while IFS= read -r line; do
        tmp=""    # reset for each input line so output from earlier lines is not repeated
        while [[ $line =~ $match(.*) ]]; do
            eval tmp='"${tmp}${line%${BASH_REMATCH[0]}}"'"${subst}"
            line="${BASH_REMATCH[$(( ${#BASH_REMATCH[@]} - 1 ))]}"
        done
        echo "${tmp}${line}"
    done
}

resub "$@"

##################
###  EXAMPLES  ###
##################

###  % echo "The quick brown fox jumps quickly over the lazy dog" | resub quick slow
###    The slow brown fox jumps slowly over the lazy dog

###  % echo "The quick brown fox jumps quickly over the lazy dog" | resub 'quick ([^ ]+) fox' 'slow $1 sheep'
###    The slow brown sheep jumps quickly over the lazy dog

###  % animal="sheep"
###  % echo "The quick brown fox 'jumps' quickly over the \"lazy\" \$dog" | resub 'quick ([^ ]+) fox' "\"\$low\" \${1} '$animal'"
###    The "$low" brown 'sheep' 'jumps' quickly over the "lazy" $dog

###  % echo "one two three four five" | resub "one ([^ ]+) three ([^ ]+) five" 'one $2 three $1 five'
###    one four three two five

###  % echo "one two one four five" | resub "one ([^ ]+) " 'XXX $1 '
###    XXX two XXX four five

###  % echo "one two three four five one six three seven eight" | resub "one ([^ ]+) three ([^ ]+) " 'XXX $1 YYY $2 '
###    XXX two YYY four five XXX six YYY seven eight

H/T to @Charles Duffy re: (.*)$match(.*)
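
The mechanism resub builds on is bash's BASH_REMATCH array, where index 0 holds the whole match and indices 1, 2, ... hold the capture groups. A minimal sketch of a single substitution with two backreferences, using made-up sample data and variable names:

```shell
#!/usr/bin/env bash
line='one two three'
re='^([^ ]+) ([^ ]+) (.*)$'   # keep the regex in a variable to avoid quoting surprises

swapped=$line
if [[ $line =~ $re ]]; then
    # Rebuild the string with the first two captured words exchanged.
    swapped="${BASH_REMATCH[2]} ${BASH_REMATCH[1]} ${BASH_REMATCH[3]}"
fi
echo "$swapped"   # two one three
```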

2020-07-24 01:29:52

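If you are making repeated calls and are concerned with performance, this test shows the BASH method is about 15x faster than forking to sed, and probably than forking to any other external process.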

hello=123456789X123456789X123456789X123456789X123456789X123456789X123456789X123456789X123456789X123456789X123456789X

P1=$(date +%s)

for i in {1..10000}
do
   echo $hello | sed s/X//g > /dev/null
done

P2=$(date +%s)
echo $((P2 - P1))

for i in {1..10000}
do
   echo ${hello//X/} > /dev/null
done

P3=$(date +%s)
echo $((P3 - P2))
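
Since date +%s only has one-second resolution, a finer-grained variant of the same comparison can use bash's time keyword. This is a sketch, not the author's benchmark: the function names, the shorter test string and the iteration count of 500 are assumptions chosen to keep the run short, and the measured ratio will vary by machine.

```shell
#!/usr/bin/env bash
hello=123456789X123456789X123456789X
TIMEFORMAT='%R'   # make the time keyword print only elapsed wall-clock seconds

# Hypothetical helpers: each performs the same substitution 500 times.
sed_loop()  { local i; for ((i = 0; i < 500; i++)); do echo "$hello" | sed 's/X//g' > /dev/null; done; }
bash_loop() { local i; for ((i = 0; i < 500; i++)); do echo "${hello//X/}" > /dev/null; done; }

# time writes its report to stderr, so redirect it into the command substitution.
sed_secs=$( { time sed_loop; } 2>&1 )
bash_secs=$( { time bash_loop; } 2>&1 )

echo "sed:  ${sed_secs}s"
echo "bash: ${bash_secs}s"
```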

2017-01-05 21:32:18

Set the var

hello=ho02123ware38384you443d34o3434ingtod38384day

Then echo the var with the pattern replacement applied

echo ${hello//[[:digit:]]/}

This will print:

howareyoudoingtodday

Extra: if you want the opposite (to get the digit characters)

echo ${hello//[![:digit:]]/}

This will print:

021233838444334343438384
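
The pattern in ${var//pattern/} is a glob, so a bracket class matches one character at a time. To treat a whole run of digits as a single match, something like the regex [0-9]+, bash's extglob option provides +(...). A minimal sketch, where the "-" replacement and the variable name collapsed are chosen only for illustration:

```shell
#!/usr/bin/env bash
shopt -s extglob   # must be enabled before the expansion below is parsed

hello=ho02123ware38384you443d34o3434ingtod38384day

# +([[:digit:]]) matches one or more digits, so each run is replaced by one "-".
collapsed=${hello//+([[:digit:]])/-}
echo "$collapsed"   # ho-ware-you-d-o-ingtod-day
```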

2021-08-12 14:01:35
