How to define hash tables in Bash?

What is the equivalent of Python dictionaries in Bash? (It should work on both OS X and Linux.)

Current answer

You can further modify the hput()/hget() interface so that you have named hashes, as follows:

hput() {
    eval "$1""$2"='$3'
}

hget() {
    eval echo '"${'"$1$2"'}"'
}

Then:

hput capitals France Paris
hput capitals Netherlands Amsterdam
hput capitals Spain Madrid
echo `hget capitals France` and `hget capitals Netherlands` and `hget capitals Spain`

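This lets you define other maps that don't conflict (for example, 'rcapitals', which looks up a country by its capital city). Either way, though, I think you'll find that all of this performs pretty terribly.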
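As a rough sketch of the reverse map mentioned above, the same two functions can hold a second, independent hash. The name rcapitals comes from the text; the functions are repeated here (with the lookup written as a plain quoted expansion) only so the snippet runs on its own.

```shell
#!/usr/bin/env bash
# Sketch only: values live in ordinary variables named <map><key>,
# so map names and keys must be valid identifier text.
hput() {
    eval "$1""$2"='$3'
}

hget() {
    eval echo '"${'"$1$2"'}"'
}

hput capitals France Paris
hput rcapitals Paris France    # reverse map: capital -> country

hget capitals France     # prints: Paris
hget rcapitals Paris     # prints: France
```

Because each map is just a variable-name prefix, two maps stay separate only as long as one prefix plus key never spells out another map's prefix plus key.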

Edit: a modified version of the above that supports keys with non-alphanumeric characters

hashKey() {
  # replace non-alphanumeric characters with underscore to make keys valid BASH identifiers
  echo "$1_$2" | sed -E "s/[^a-zA-Z0-9]+/_/g" | sed -E "s/^[^a-zA-Z0-9]+|[^a-zA-Z0-9]+\$//g"
}

hashPut() {
  local KEY=`hashKey "$1" "$2"`
  # single quotes defer expansion of $3 to eval, so values with spaces survive
  eval "$KEY"='$3'
}

hashGet() {
  local KEY=`hashKey "$1" "$2"`
  echo "${!KEY}"
}
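A short usage sketch for the key-mangling approach, assuming quoted arguments so that keys with spaces arrive intact. The sample keys and values are illustrative and not from the original answer. It also shows one caveat of the scheme: distinct keys that differ only in punctuation collapse to the same variable name.

```shell
#!/usr/bin/env bash
hashKey() {
  # collapse every run of non-alphanumerics into "_", then trim the ends
  printf '%s_%s\n' "$1" "$2" \
    | sed -E 's/[^a-zA-Z0-9]+/_/g' \
    | sed -E 's/^[^a-zA-Z0-9]+|[^a-zA-Z0-9]+$//g'
}

hashPut() {
  local key
  key=$(hashKey "$1" "$2")
  eval "$key"='$3'          # $3 is expanded by eval, so spaces survive
}

hashGet() {
  local key
  key=$(hashKey "$1" "$2")
  printf '%s\n' "${!key}"   # indirect expansion: value of the variable named by $key
}

hashPut capitals "United Kingdom" "London"
hashPut capitals "St. Lucia" "Castries"
hashGet capitals "United Kingdom"    # prints: London

# Caveat: both keys below map to the variable colors_dark_red.
hashPut colors "dark-red" "#8b0000"
hashPut colors "dark red" "#800000"
hashGet colors "dark-red"            # prints: #800000 (overwritten)
```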

End edit

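If you really want fast hash lookup, there's a terrible, terrible hack that actually works really well. It is this: write your keys and values out to a temporary file, one pair per line, then use 'grep "^$key"' to get them back out, piping the result through cut, awk, sed or whatever to retrieve the value.

Like I said, it sounds terrible, and it sounds like it ought to be slow and do all sorts of unnecessary IO, but in practice it is very fast (disk cache is awesome, isn't it?), even for very large hash tables. You have to enforce key uniqueness yourself, and so on. Even with only a few hundred entries, the output file/grep combination is going to be quite a bit faster; in my experience, several times faster. It also uses less memory.

Here's one way to do it: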

hinit() {
    rm -f /tmp/hashmap.$1
}

hput() {
    echo "$2 $3" >> /tmp/hashmap.$1
}

hget() {
    grep "^$2 " /tmp/hashmap.$1 | awk '{ print $2 };'
}

hinit capitals
hput capitals France Paris
hput capitals Netherlands Amsterdam
hput capitals Spain Madrid

echo `hget capitals France` and `hget capitals Netherlands` and `hget capitals Spain`
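Since key uniqueness is left to the caller, here is one possible sketch of an hput that replaces an existing key instead of appending a duplicate. It is a variation, not the answer's own code: HASHDIR is a hypothetical override for the file location, keys are assumed to contain no whitespace or backslashes, and hget is changed to return the whole remainder of the line so values may contain spaces.

```shell
#!/usr/bin/env bash
HASHDIR=${HASHDIR:-/tmp}            # hypothetical: where the backing files live

hinit() {
    : > "$HASHDIR/hashmap.$1"       # create or truncate the backing file
}

hput() {
    local file="$HASHDIR/hashmap.$1" tmp
    touch "$file"
    tmp=$(mktemp "$HASHDIR/hashmap.XXXXXX") || return 1
    # copy every line whose first field is not the key (the "" forces a
    # string comparison), then append the new pair and swap the files
    awk -v k="$2" '($1 "") != (k "")' "$file" > "$tmp"
    printf '%s %s\n' "$2" "$3" >> "$tmp"
    mv "$tmp" "$file"
}

hget() {
    # exact match on the first field; print the rest of the line
    awk -v k="$2" '($1 "") == (k "") { sub(/^[^ ]+ /, ""); print; exit }' \
        "$HASHDIR/hashmap.$1"
}

hinit capitals
hput capitals France Paris
hput capitals Netherlands Amsterdam
hput capitals Netherlands "The Hague"   # replaces the earlier entry
hget capitals Netherlands               # prints: The Hague
```

Rewriting the file on every hput trades write speed for uniqueness; for write-heavy use, appending and letting hget return the last match may be the better compromise.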

2010-02-08 23:38:29

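Other answers

Two things. First, on any 2.6 kernel you can use memory instead of /tmp by using /dev/shm (Redhat); other distros may vary. Second, hget can be reimplemented using read, as follows: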

function hget {
  local key idx
  while read -r key idx
  do
    if [ "$key" = "$2" ]
    then
      echo "$idx"
      return
    fi
  done < "/dev/shm/hashmap.$1"
}

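In addition, by assuming that all keys are unique, the return short-circuits the read loop and avoids reading through all the entries. If your implementation can have duplicate keys, simply leave out the return. This saves the cost of reading the whole file and of forking both grep and awk. Using /dev/shm for both implementations, timing hget on a 3-entry hash while searching for the last entry gave the following:

Grep/Awk: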

hget() {
    grep "^$2 " /dev/shm/hashmap.$1 | awk '{ print $2 };'
}

$ time echo $(hget FD oracle)
3

real    0m0.011s
user    0m0.002s
sys     0m0.013s

Read / echo:

$ time echo $(hget FD oracle)
3

real    0m0.004s
user    0m0.000s
sys     0m0.004s

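Across multiple invocations I never saw less than a 50% improvement. Because both implementations read from /dev/shm, the difference can be attributed to fork overhead.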
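Since /dev/shm is not available everywhere (OS X, for instance, does not provide it), a hedged sketch of the same read-based lookup with a fallback directory might look like this. HASHDIR is a hypothetical variable name, and the stdin and stdout entries are made-up sample data; only the FD map and its oracle key come from the timing run above.

```shell
#!/usr/bin/env bash
# Keep the map in /dev/shm when it exists and is writable,
# otherwise fall back to TMPDIR or /tmp.
if [ -d /dev/shm ] && [ -w /dev/shm ]; then
    HASHDIR=/dev/shm
else
    HASHDIR=${TMPDIR:-/tmp}
fi

hinit() { : > "$HASHDIR/hashmap.$1"; }
hput()  { printf '%s %s\n' "$2" "$3" >> "$HASHDIR/hashmap.$1"; }

hget() {
    local key value
    # read splits each line into the first word and the remainder,
    # so no grep or awk process is forked
    while read -r key value; do
        if [ "$key" = "$2" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done < "$HASHDIR/hashmap.$1"
    return 1    # key not found
}

hinit FD
hput FD stdin 0
hput FD stdout 1
hput FD oracle 3
hget FD oracle    # prints: 3
```

Returning a non-zero status for a missing key lets callers distinguish "not found" from an empty value.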

2010-08-14 23:45:30

This is what I was looking for:

declare -A hashmap
hashmap["key"]="value"
hashmap["key2"]="value2"
echo "${hashmap["key"]}"
for key in "${!hashmap[@]}"; do echo "$key"; done
for value in "${hashmap[@]}"; do echo "$value"; done
echo hashmap has ${#hashmap[@]} elements

This did not work for me in bash 4.1.5:

animals=( ["moo"]="cow" )
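One likely reason, offered as an assumption because the answer does not show the rest of the script: without declare -A, bash treats animals as an indexed array, evaluates the subscript moo arithmetically (an unset name evaluates to 0), and stores the value at index 0. Declaring the array as associative first should give the expected result on bash 4.0 and later. The woof entry below is illustrative.

```shell
#!/usr/bin/env bash
# Requires bash 4.0+; the stock /bin/bash on OS X is 3.2 and lacks declare -A.
declare -A animals=( ["moo"]="cow" ["woof"]="dog" )

echo "${animals[moo]}"       # prints: cow
echo "${#animals[@]}"        # prints: 2

# For contrast, the same literal without declare -A builds an indexed array:
plain=( ["moo"]="cow" )
echo "${!plain[@]}"          # prints: 0
```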

2011-05-23 00:30:08

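I create hashmaps in bash 3 using dynamic variables. I explained how that works in my answer to: Associative arrays in Shell scripts

You can also take a look at shell_map, which is a HashMap implementation written for bash 3.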

2016-06-03 16:34:44

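Here's a rather contrived but hopefully instructive hash/map/dictionary/associative array example. Say I have an array of strings and I'd like to create a mapping from each word to the number of times it appears in the array.

Sure, there are many ways to do this with piped commands, but the point is to illustrate the core map operations: checking for the existence of a key with -v, adding a key-value mapping, retrieving the value for a key, updating the existing value for a key, and looping over the whole map to print the key-value pairs.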

#!/usr/bin/bash
set -o pipefail

bash --version | head -1

words=(foo foo bar bar foo baz baz foo bar)
declare -A counter=() # create the map

for word in "${words[@]}"; do
    # if the key doesn't yet exist in the map, add it
    if [[ ! -v counter[$word] ]]; then
        counter[$word]=0
    fi

    # look up the value of a key, add one, and store back in the map
    counter[$word]=$((${counter[$word]} + 1))
done

# iterate the map
for key in "${!counter[@]}"; do
    echo "$key ${counter[$key]}"
done

Output:

GNU bash, version 5.1.16(1)-release (x86_64-pc-linux-gnu)
foo 4
bar 3
baz 2
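Two operations the example leaves out are removing a key and printing in a predictable order (bash does not specify the iteration order of an associative array, so the order above may differ between runs or versions). A minimal sketch of both, assuming keys contain no newlines:

```shell
#!/usr/bin/env bash
# Requires bash 4.3+ for [[ -v name[key] ]] on array elements.
declare -A counter=( [foo]=4 [bar]=3 [baz]=2 )

unset 'counter[baz]'                 # remove one key; quotes prevent globbing
if [[ ! -v counter[baz] ]]; then
    echo "baz removed"
fi

# print the remaining pairs ordered by key
for key in "${!counter[@]}"; do
    printf '%s %s\n' "$key" "${counter[$key]}"
done | sort
```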

2022-12-11 04:38:55

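A coworker just mentioned this thread. I've independently implemented hash tables in bash, and the implementation doesn't depend on version 4. From a blog post of mine from March 2010 (before some of the answers here...) entitled Hash tables in bash:

I previously used cksum to hash, but have since translated Java's string hashCode to native bash/zsh.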

# Here's the hashing function
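ht() {
  # hash all arguments joined by spaces, so `ht foo bar` differs from `ht foo`
  local s="$*" h=0 i
  for (( i=0; i < ${#s}; i++ )); do
    # h = 31*h + character code, the recurrence used by Java's String.hashCode
    let "h = ( (h<<5) - h ) + $(printf %d "'${s:i:1}")"
    # keep the low 32 bits so the result is a non-negative array index
    let "h &= 0xFFFFFFFF"
  done
  printf '%s' "$h"
}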

# Example:

myhash[`ht foo bar`]="a value"
myhash[`ht baz baf`]="b value"

echo ${myhash[`ht baz baf`]} # "b value"
echo ${myhash[@]} # "a value b value" though perhaps reversed
echo ${#myhash[@]} # "2" - there are two values (note, zsh doesn't count right)

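It's not bidirectional, and the built-in way is a lot better, but neither should really be used anyway. Bash is for quick one-offs, and such things should only rarely involve the kind of complexity that calls for hashes, except perhaps in your ~/.bashrc and friends.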

2012-10-18 00:39:57
