What is the simplest way to test whether an array contains a value in Bash?


Current answer

My version of the regular-expression technique that has already been suggested:

values=(foo bar)
requestedValue=bar

# Strip all leading and trailing whitespace.
requestedValue=${requestedValue#"${requestedValue%%[![:space:]]*}"}
requestedValue=${requestedValue%"${requestedValue##*[![:space:]]}"}
[[ "${values[@]/#/X-}" =~ "X-${requestedValue}" ]] || echo "Unsupported value"

What happens here is that the entire array of supported values is expanded into words, each prefixed with a fixed string ("X-" in this case), and the requested value gets the same prefix. If the requested value is contained in the array, the prefixed value matches one of the resulting tokens; otherwise it matches none, the || operator fires, and you know you are dealing with an unsupported value. Note that the prefix anchors only the start of each token, so a requested value that is a leading substring of a supported value (for example "ba" against "bar") also matches. Before all of that, leading and trailing whitespace is stripped from the requested value using standard shell string manipulation.
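
To make the expansion concrete, the following sketch prints the joined string that the regex is matched against. It is only an illustration: the entry "barbaz" is a made-up value, added to show that the prefix anchors the start of a token but not its end, and the space-delimited comparison is one possible way to make the match exact, not part of the answer above.

```bash
#!/usr/bin/env bash
values=(foo barbaz)          # "barbaz" is a hypothetical entry for illustration

# Each element gets the "X-" prefix; [*] joins the results with a space.
joined="${values[*]/#/X-}"
echo "$joined"

# Prefix only: "X-bar" is a substring of "X-barbaz", so this reports a match.
[[ "$joined" =~ "X-bar" ]] && echo "prefix only: match"

# A delimiter on both sides makes the comparison exact
# (assuming no element contains a space).
[[ " $joined " =~ " X-bar " ]] || echo "delimited: no match"
[[ " $joined " =~ " X-foo " ]] && echo "delimited: match"
```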

I believe it is clean and elegant, although I am not sure how well it performs if the array of supported values is particularly large.

Other answers

How to check whether a Bash array contains a value


False positive match

array=(a1 b1 c1 d1 ee)

[[ ${array[*]} =~ 'a' ]] && echo 'yes' || echo 'no'
# output:
yes

[[ ${array[*]} =~ 'a1' ]] && echo 'yes' || echo 'no'
# output:
yes

[[ ${array[*]} =~ 'e' ]] && echo 'yes' || echo 'no'
# output:
yes

[[ ${array[*]} =~ 'ee' ]] && echo 'yes' || echo 'no'
# output:
yes

Exact match

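To look for an exact match, the regex pattern needs to allow whitespace (or the start or end of the string) before and after the value, as in (^|[[:space:]])"value"($|[[:space:]])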

# Exact match

array=(aa1 bc1 ac1 ed1 aee)

if [[ ${array[*]} =~ (^|[[:space:]])"a"($|[[:space:]]) ]]; then
    echo "Yes";
else
    echo "No";
fi
# output:
No

if [[ ${array[*]} =~ (^|[[:space:]])"ac1"($|[[:space:]]) ]]; then
    echo "Yes";
else
    echo "No";
fi
# output:
Yes

find="ac1"
if [[ ${array[*]} =~ (^|[[:space:]])"$find"($|[[:space:]]) ]]; then
    echo "Yes";
else
    echo "No";
fi
# output:
Yes

For more usage examples, the source of these examples is here
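
The exact-match pattern can also be wrapped in a small helper. This is only a sketch: the function name array_has_word is hypothetical, and it assumes that neither the needle nor any element contains whitespace, since the elements are joined with spaces before matching.

```bash
#!/usr/bin/env bash
# array_has_word NEEDLE ELEMENT...
# Hypothetical helper: returns 0 if NEEDLE equals one of the elements.
# Assumes that NEEDLE and the elements contain no whitespace.
array_has_word() {
    local needle=$1
    shift
    (( $# )) || return 1          # an empty list never contains the needle
    local IFS=' '                 # make sure "$*" joins with a single space
    local joined="$*"
    # The quoted "$needle" is matched literally, not as a regex.
    [[ $joined =~ (^|[[:space:]])"$needle"($|[[:space:]]) ]]
}

array=(aa1 bc1 ac1 ed1 aee)
array_has_word "ac1" "${array[@]}" && echo "Yes" || echo "No"   # prints Yes
array_has_word "a"   "${array[@]}" && echo "Yes" || echo "No"   # prints No
```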

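The advantage of this approach is that it does not need to loop over all the elements (at least not explicitly). However, since array_to_string_internal() in array.c still loops over the array elements and concatenates them into a string, it is probably no more efficient than the looping solutions proposed, but it is more readable.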

if [[ " ${array[*]} " =~ " ${value} " ]]; then
    # whatever you want to do when array contains value
    :
fi

if [[ ! " ${array[*]} " =~ " ${value} " ]]; then
    # whatever you want to do when array doesn't contain value
    :
fi

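Note that this gives a false positive if the value you are searching for is one of the words in an array element that contains spaces. For example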

array=("Jack Brown")
value="Jack"

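The regex treats "Jack" as being in the array even though it is not. So if you still want to use this solution, you have to change IFS and the separator characters in the regex, like this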

IFS="|"
array=("Jack Brown" "Jack Smith")
value="Jack"

if [[ "${IFS}${array[*]}${IFS}" =~ "${IFS}${value}${IFS}" ]]; then
    echo "true"
else
    echo "false"
fi

unset IFS # or set back to original IFS if previously set

This will print "false".
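
Because the snippet above changes IFS globally, one way to limit the side effect is to do the join inside a function with a local IFS. The sketch below is an assumption-laden variation, not the answer's own code: contains_joined is a hypothetical name, and the ASCII unit separator (0x1f) is used as the delimiter on the assumption that it never occurs in the data.

```bash
#!/usr/bin/env bash
# contains_joined NEEDLE ELEMENT...
# Hypothetical helper: joins the elements with 0x1f and looks for NEEDLE
# surrounded by that delimiter.
contains_joined() {
    local needle=$1
    shift
    (( $# )) || return 1          # an empty list never contains the needle
    local IFS=$'\x1f'             # local: the caller's IFS is left untouched
    [[ "${IFS}$*${IFS}" =~ "${IFS}${needle}${IFS}" ]]
}

array=("Jack Brown" "Jack Smith")
contains_joined "Jack" "${array[@]}" && echo "true" || echo "false"        # prints false
contains_joined "Jack Smith" "${array[@]}" && echo "true" || echo "false"  # prints true
```

Since IFS is declared local, no "unset IFS" is needed after the call.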

Obviously, this can also be used as a test statement, which allows it to be written as a one-liner

[[ " ${array[*]} " =~ " ${value} " ]] && echo "true" || echo "false"

: NeedleInArgs "$needle" "${haystack[@]}"
: NeedleInArgs "$needle" arg1 arg2 .. argN
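NeedleInArgs()
{
    local a b
    # a: the shell-quoted needle, wrapped in newlines
    printf -v a '\n%q\n' "$1"
    # b: every shell-quoted haystack element, one per line
    printf -v b '%q\n' "${@:2}"
    case $'\n'"$b" in (*"$a"*) return 0;; esac
    return 1
}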

Usage:

NeedleInArgs "$needle" "${haystack[@]}" && echo "$needle" found || echo "$needle" not found;

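Works with bash v3.1 and above (printf -v support); no forks and no external programs; no loops (other than the internal expansions in bash); works for all possible values and arrays, with no exceptions and nothing to worry about.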
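
The reason arbitrary values can be handled is that %q quotes each argument so that the result never contains a literal newline, which leaves the newline free to act as an unambiguous separator. A small sketch to illustrate this (the sample values are made up):

```bash
#!/usr/bin/env bash
haystack=("a b" $'line1\nline2' 'x')   # sample data; one element contains a newline

# One shell-quoted element per line; an embedded newline is written in
# $'...' form instead of as a literal newline.
printf -v quoted '%q\n' "${haystack[@]}"
printf '%s' "$quoted"

# Count the lines: the count equals the number of elements, even though
# one element contains a newline.
lines=0
while IFS= read -r line; do
    lines=$((lines + 1))
done <<< "${quoted%$'\n'}"
echo "elements=${#haystack[@]} lines=$lines"
```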

It can also be used directly, like:

if      NeedleInArgs "$input" value1 value2 value3 value4;
then
        : input from the list;
else
        : input not from list;
fi;

For bash from v2.05b to v3.0, printf lacks -v, so this needs 2 additional forks (but no execs, since printf is a bash builtin):

NeedleInArgs()
{
case $'\n'"`printf '%q\n' "${@:2}"`" in
(*"`printf '\n%q\n' "$1"`"*) return 0;;
esac;
return 1;
}

Note that I tested the timing:

check call0:  n: t4.43 u4.41 s0.00 f: t3.65 u3.64 s0.00 l: t4.91 u4.90 s0.00 N: t5.28 u5.27 s0.00 F: t2.38 u2.38 s0.00 L: t5.20 u5.20 s0.00
check call1:  n: t3.41 u3.40 s0.00 f: t2.86 u2.84 s0.01 l: t3.72 u3.69 s0.02 N: t4.01 u4.00 s0.00 F: t1.15 u1.15 s0.00 L: t4.05 u4.05 s0.00
check call2:  n: t3.52 u3.50 s0.01 f: t3.74 u3.73 s0.00 l: t3.82 u3.80 s0.01 N: t2.67 u2.67 s0.00 F: t2.64 u2.64 s0.00 L: t2.68 u2.68 s0.00

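call0 and call1 are different ways of calling another fast pure-bash variant; call2 is the one shown here. N=notfound, F=firstmatch, L=lastmatch. Lowercase letters are the short array, uppercase letters the long array.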

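As you can see, the variant shown here has a very stable runtime, so it does not depend much on the match position; its runtime is dominated by the array length. The runtime of the searching (looping) variant depends heavily on the match position. So in edge cases the variant shown here can be (much) faster.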

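But, very importantly, the searching variant is much more RAM-efficient, because the variant shown here always converts the whole array into one big string.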

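So if your RAM is tight and you expect mostly early matches, do not use the variant shown here. However, if you want a predictable runtime, have long arrays to match, expect late matches or no match at all, and doubling the RAM use is not much of a concern, then it has some advantages.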

Script used for the timing test:

in_array()
{
    local needle="$1" arrref="$2[@]" item
    for item in "${!arrref}"; do
        [[ "${item}" == "${needle}" ]] && return 0
    done
    return 1
}

NeedleInArgs()
{
    local a b
    printf -v a '\n%q\n' "$1"
    printf -v b '%q\n' "${@:2}"
    case $'\n'"$b" in (*"$a"*) return 0;; esac
    return 1
}

loop1() { for a in {1..100000}; do "$@"; done }
loop2() { for a in {1..1000}; do "$@"; done }

run()
{
  needle="$5"
  arr=("${@:6}")

  out="$( ( time -p "loop$2" "$3" ) 2>&1 )"

  ret="$?"
  got="${out}"
  syst="${got##*sys }"
  got="${got%"sys $syst"}"
  got="${got%$'\n'}"
  user="${got##*user }"
  got="${got%"user $user"}"
  got="${got%$'\n'}"
  real="${got##*real }"
  got="${got%"real $real"}"
  got="${got%$'\n'}"
  rest="$got"   # anything left over is unexpected output from the call
  printf ' %s: t%q u%q s%q' "$1" "$real" "$user" "$syst"
  [ -z "$rest" ] && [ "$ret" = "$4" ] && return
  printf 'FAIL! expected %q got %q\n' "$4" "$ret"
  printf 'call:   %q\n' "$3"
  printf 'out:    %q\n' "$out"
  printf 'rest:   %q\n' "$rest"
  printf 'needle: %q\n' "$5"
  printf 'arr:   '; printf ' %q' "${@:6}"; printf '\n'
  exit 1
}

check()
{
  printf 'check %q: ' "$1"
  run n 1 "$1" 1 needle a b c d
  run f 1 "$1" 0 needle needle a b c d
  run l 1 "$1" 0 needle a b c d needle
  run N 2 "$1" 1 needle "${rnd[@]}"
  run F 2 "$1" 0 needle needle "${rnd[@]}"
  run L 2 "$1" 0 needle "${rnd[@]}" needle
  printf '\n'
}

call0() { chk=("${arr[@]}"); in_array "$needle" chk; }
call1() { in_array "$needle" arr; }
call2() { NeedleInArgs "$needle" "${arr[@]}"; }

rnd=()
for a in {1..1000}; do rnd+=("$a"); done

check call0
check call1
check call2

containsElement () { for e in "${@:2}"; do [[ "$e" = "$1" ]] && return 0; done; return 1; }

This now handles empty arrays correctly.

Below is a small function that does this. The search string is the first argument and the remaining arguments are the array elements:

set +e #otherwise the script will exit on error
containsElement () {
  local e match="$1"
  shift
  for e; do [[ "$e" == "$match" ]] && return 0; done
  return 1
}

A test run of that function could look like this:

$ array=("something to search for" "a string" "test2000")
$ containsElement "a string" "${array[@]}"
$ echo $?
0
$ containsElement "blaha" "${array[@]}"
$ echo $?
1
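
A short sketch of how the function might be used in a conditional, including the empty-array case mentioned above. The function body is repeated so that the example is self-contained; the array names and contents are sample data.

```bash
#!/usr/bin/env bash
containsElement () {
  local e match="$1"
  shift
  for e; do [[ "$e" == "$match" ]] && return 0; done
  return 1
}

empty=()
fruits=("red apple" "pear")   # sample data

if containsElement "pear" "${fruits[@]}"; then
  echo "pear: found"
fi

# Elements are compared whole, so a word inside an element is not a match.
containsElement "apple" "${fruits[@]}" || echo "apple: not found"

# With an empty array the loop body never runs, so the function returns 1.
containsElement "pear" "${empty[@]}" || echo "empty array: not found"
```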