在Bash中,测试数组是否包含某个值的最简单方法是什么?


当前回答

考虑到:

array=("something to search for" "a string" "test2000")
elem="a string"

然后简单检查一下:

if c=$'\x1E' && p="${c}${elem} ${c}" && [[ ! "${array[@]/#/${c}} ${c}" =~ $p ]]; then
  echo "$elem exists in array"
fi

在哪里

c is element separator
p is regex pattern

(单独分配p,而不是直接在[[]]中使用表达式的原因是为了保持bash 4的兼容性)

其他回答

结合Beorn Harris和loentar的回答,我们得出了一个更有趣的单行测试:

delim=$'\x1F' # define a control code to be used as more or less reliable delimiter
if [[ "${delim}${array[@]}${delim}" =~ "${delim}a string to test${delim}" ]]; then
    echo "contains 'a string to test'"
fi

它不使用额外的函数,不替换测试,并添加了额外的保护,防止使用控制代码作为分隔符偶尔出现错误匹配。


UPD:感谢@ChrisCogdon的注意,这个错误的代码被重写并以https://stackoverflow.com/a/58527681/972463的形式发布。

如何检查一个Bash数组是否包含一个值


假阳性匹配

array=(a1 b1 c1 d1 ee)

[[ ${array[*]} =~ 'a' ]] && echo 'yes' || echo 'no'
# output:
yes

[[ ${array[*]} =~ 'a1' ]] && echo 'yes' || echo 'no'
# output:
yes

[[ ${array[*]} =~ 'e' ]] && echo 'yes' || echo 'no'
# output:
yes

[[ ${array[*]} =~ 'ee' ]] && echo 'yes' || echo 'no'
# output:
yes

精确匹配

为了寻找精确匹配,你的正则表达式模式需要在值的前后添加额外的空格,如(^|[[:space:]])" value "($|[[:space:]])

# Exact match

array=(aa1 bc1 ac1 ed1 aee)

if [[ ${array[*]} =~ (^|[[:space:]])"a"($|[[:space:]]) ]]; then
    echo "Yes";
else
    echo "No";
fi
# output:
No

if [[ ${array[*]} =~ (^|[[:space:]])"ac1"($|[[:space:]]) ]]; then
    echo "Yes";
else
    echo "No";
fi
# output:
Yes

find="ac1"
if [[ ${array[*]} =~ (^|[[:space:]])"$find"($|[[:space:]]) ]]; then
    echo "Yes";
else
    echo "No";
fi
# output:
Yes

有关更多用法示例,示例的来源在这里

保持简单:

Array1=( "item1" "item2" "item3" "item-4" )
var="item3"

count=$(echo ${Array1[@]} | tr ' ' '\n' | awk '$1 == "'"$var"'"{print $0}' | wc -l)
[ $count -eq 0 ] && echo "Not found" || echo "found"

我的版本的正则表达式技术,已经建议:

values=(foo bar)
requestedValue=bar

requestedValue=${requestedValue##[[:space:]]}
requestedValue=${requestedValue%%[[:space:]]}
[[ "${values[@]/#/X-}" =~ "X-${requestedValue}" ]] || echo "Unsupported value"

What's happening here is that you're expanding the entire array of supported values into words and prepending a specific string, "X-" in this case, to each of them, and doing the same to the requested value. If this one is indeed contained in the array, then the resulting string will at most match one of the resulting tokens, or none at all in the contrary. In the latter case the || operator triggers and you know you're dealing with an unsupported value. Prior to all of that the requested value is stripped of all leading and trailing whitespace through standard shell string manipulation.

我相信它是干净而优雅的,尽管如果支持的值数组特别大,我不太确定它的性能如何。

: NeedleInArgs "$needle" "${haystack[@]}"
: NeedleInArgs "$needle" arg1 arg2 .. argN
NeedleInArgs()
{
local a b;
printf -va '\n%q\n' "$1";
printf -vb '%q\n' "${@:2}";
case $'\n'"$b" in (*"$a"*) return 0;; esac;
return 1;
}

使用:

NeedleInArgs "$needle" "${haystack[@]}" && echo "$needle" found || echo "$needle" not found;

对于bash v3.1及以上版本(printf -v支持) 没有分叉,也没有外部程序 没有循环(除了bash中的内部扩展) 适用于所有可能的值和数组,没有异常,没有什么可担心的

也可以直接使用,比如:

if      NeedleInArgs "$input" value1 value2 value3 value4;
then
        : input from the list;
else
        : input not from list;
fi;

对于从v20.5 b到v3.0的bash, printf缺少-v,因此需要额外的2个fork(但不需要执行,因为printf是bash内置的):

NeedleInArgs()
{
case $'\n'"`printf '%q\n' "${@:2}"`" in
(*"`printf '\n%q\n' "$1"`"*) return 0;;
esac;
return 1;
}

注意,我测试了时间:

check call0:  n: t4.43 u4.41 s0.00 f: t3.65 u3.64 s0.00 l: t4.91 u4.90 s0.00 N: t5.28 u5.27 s0.00 F: t2.38 u2.38 s0.00 L: t5.20 u5.20 s0.00
check call1:  n: t3.41 u3.40 s0.00 f: t2.86 u2.84 s0.01 l: t3.72 u3.69 s0.02 N: t4.01 u4.00 s0.00 F: t1.15 u1.15 s0.00 L: t4.05 u4.05 s0.00
check call2:  n: t3.52 u3.50 s0.01 f: t3.74 u3.73 s0.00 l: t3.82 u3.80 s0.01 N: t2.67 u2.67 s0.00 F: t2.64 u2.64 s0.00 L: t2.68 u2.68 s0.00

Call0和call1是对另一个快速pure-bash变体调用的不同变体 Call2在这里。 N=notfound F=firstmatch L=lastmatch 小写字母为短数组,大写字母为长数组

正如您所看到的,这里的这个变体有一个非常稳定的运行时,所以它不太依赖于匹配位置。运行时主要由数组长度决定。搜索变量的运行时高度依赖于匹配位置。所以在边缘情况下,这个变体可以(快得多)。

但非常重要的是,搜索变量的RAM效率更高,因为这里的这个变量总是将整个数组转换为一个大字符串。

所以如果你的内存很紧,你希望大部分比赛都是早期的,那么就不要在这里使用这个。但是,如果您想要一个可预测的运行时,有很长的数组来匹配(期望延迟或根本不匹配),并且双RAM使用也不是太大的问题,那么这里有一些优势。

定时测试脚本:

in_array()
{
    local needle="$1" arrref="$2[@]" item
    for item in "${!arrref}"; do
        [[ "${item}" == "${needle}" ]] && return 0
    done
    return 1
}

NeedleInArgs()
{
local a b;
printf -va '\n%q\n' "$1";
printf -vb '%q\n' "${@:2}";
case $'\n'"$b" in (*"$a"*) return 0;; esac;
return 1;
}

loop1() { for a in {1..100000}; do "$@"; done }
loop2() { for a in {1..1000}; do "$@"; done }

run()
{
  needle="$5"
  arr=("${@:6}")

  out="$( ( time -p "loop$2" "$3" ) 2>&1 )"

  ret="$?"
  got="${out}"
  syst="${got##*sys }"
  got="${got%"sys $syst"}"
  got="${got%$'\n'}"
  user="${got##*user }"
  got="${got%"user $user"}"
  got="${got%$'\n'}"
  real="${got##*real }"
  got="${got%"real $real"}"
  got="${got%$'\n'}"
  printf ' %s: t%q u%q s%q' "$1" "$real" "$user" "$syst"
  [ -z "$rest" ] && [ "$ret" = "$4" ] && return
  printf 'FAIL! expected %q got %q\n' "$4" "$ret"
  printf 'call:   %q\n' "$3"
  printf 'out:    %q\n' "$out"
  printf 'rest:   %q\n' "$rest"
  printf 'needle: %q\n' "$5"
  printf 'arr:   '; printf ' %q' "${@:6}"; printf '\n'
  exit 1
}

check()
{
  printf 'check %q: ' "$1"
  run n 1 "$1" 1 needle a b c d
  run f 1 "$1" 0 needle needle a b c d
  run l 1 "$1" 0 needle a b c d needle
  run N 2 "$1" 1 needle "${rnd[@]}"
  run F 2 "$1" 0 needle needle "${rnd[@]}"
  run L 2 "$1" 0 needle "${rnd[@]}" needle
  printf '\n'
}

call0() { chk=("${arr[@]}"); in_array "$needle" chk; }
call1() { in_array "$needle" arr; }
call2() { NeedleInArgs "$needle" "${arr[@]}"; }

rnd=()
for a in {1..1000}; do rnd+=("$a"); done

check call0
check call1
check call2