如何从Linux shell脚本解析YAML文件?

我希望提供一个结构化的配置文件，它对于非技术用户来说尽可能容易编辑(不幸的是它必须是一个文件)，所以我想使用YAML。然而，我找不到任何方法从Unix shell脚本解析这个。

当前回答

如果你需要一个单一的值，你可以使用一个工具将你的YAML文档转换为JSON并提供给jq，例如yq。

sample.yaml的内容:

---
bob:
  item1:
    cats: bananas
  item2:
    cats: apples
  thing:
    cats: oranges

例子:

$ yq -r '.bob["thing"]["cats"]' sample.yaml 
oranges

2019-07-17 09:22:58

其他回答

以下是Stefan Farestam回答的扩展版本:

function parse_yaml {
   local prefix=$2
   local s='[[:space:]]*' w='[a-zA-Z0-9_]*' fs=$(echo @|tr @ '\034')
   sed -ne "s|,$s\]$s\$|]|" \
        -e ":1;s|^\($s\)\($w\)$s:$s\[$s\(.*\)$s,$s\(.*\)$s\]|\1\2: [\3]\n\1  - \4|;t1" \
        -e "s|^\($s\)\($w\)$s:$s\[$s\(.*\)$s\]|\1\2:\n\1  - \3|;p" $1 | \
   sed -ne "s|,$s}$s\$|}|" \
        -e ":1;s|^\($s\)-$s{$s\(.*\)$s,$s\($w\)$s:$s\(.*\)$s}|\1- {\2}\n\1  \3: \4|;t1" \
        -e    "s|^\($s\)-$s{$s\(.*\)$s}|\1-\n\1  \2|;p" | \
   sed -ne "s|^\($s\):|\1|" \
        -e "s|^\($s\)-$s[\"']\(.*\)[\"']$s\$|\1$fs$fs\2|p" \
        -e "s|^\($s\)-$s\(.*\)$s\$|\1$fs$fs\2|p" \
        -e "s|^\($s\)\($w\)$s:$s[\"']\(.*\)[\"']$s\$|\1$fs\2$fs\3|p" \
        -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p" | \
   awk -F$fs '{
      indent = length($1)/2;
      vname[indent] = $2;
      for (i in vname) {if (i > indent) {delete vname[i]; idx[i]=0}}
      if(length($2)== 0){  vname[indent]= ++idx[indent] };
      if (length($3) > 0) {
         vn=""; for (i=0; i<indent; i++) { vn=(vn)(vname[i])("_")}
         printf("%s%s%s=\"%s\"\n", "'$prefix'",vn, vname[indent], $3);
      }
   }'
}

该版本支持字典和列表的-符号和短符号。以下输入:

global:
  input:
    - "main.c"
    - "main.h"
  flags: [ "-O3", "-fpic" ]
  sample_input:
    -  { property1: value, property2: "value2" }
    -  { property1: "value3", property2: 'value 4' }

产生如下输出:

global_input_1="main.c"
global_input_2="main.h"
global_flags_1="-O3"
global_flags_2="-fpic"
global_sample_input_1_property1="value"
global_sample_input_1_property2="value2"
global_sample_input_2_property1="value3"
global_sample_input_2_property2="value 4"

as you can see the - items automatically get numbered in order to obtain different variable names for each item. In bash there are no multidimensional arrays, so this is one way to work around. Multiple levels are supported. To work around the problem with trailing white spaces mentioned by @briceburg one should enclose the values in single or double quotes. However, there are still some limitations: Expansion of the dictionaries and lists can produce wrong results when values contain commas. Also, more complex structures like values spanning multiple lines (like ssh-keys) are not (yet) supported.

A few words about the code: The first sed command expands the short form of dictionaries { key: value, ...} to regular and converts them to more simple yaml style. The second sed call does the same for the short notation of lists and converts [ entry, ... ] to an itemized list with the - notation. The third sed call is the original one that handled normal dictionaries, now with the addition to handle lists with - and indentations. The awk part introduces an index for each indentation level and increases it when the variable name is empty (i.e. when processing a list). The current value of the counters are used instead of the empty vname. When going up one level, the counters are zeroed.

编辑:我已经为此创建了一个github存储库。

2018-08-10 15:24:05

另一种选择是将YAML转换为JSON，然后使用jq与JSON表示进行交互，从其中提取信息或编辑信息。

我写了一个简单的bash脚本，包含这个胶水-见Y2J项目在GitHub上

2015-08-02 12:44:26

我知道这是非常具体的，但我认为我的回答可能对某些用户有帮助。如果你的机器上安装了node和npm，你可以使用js-yaml。首次安装:

npm i -g js-yaml
# or locally
npm i js-yaml

然后在bash脚本中

#!/bin/bash
js-yaml your-yaml-file.yml

如果你正在使用jq，你也可以这样做

#!/bin/bash
json="$(js-yaml your-yaml-file.yml)"
aproperty="$(jq '.apropery' <<< "$json")"
echo "$aproperty"

因为js-yaml将yaml文件转换为json字符串文字。然后，您可以在unix系统中的任何json解析器中使用该字符串。

2018-05-15 12:53:45

你可以用golang写成yq的等价形式:

./go-yg -yamlFile /home/user/dev/ansible-firefox/defaults/main.yml -key
firefox_version

62.0.3

2018-10-28 21:20:23

Whenever you need a solution for "How to work with YAML/JSON/compatible data from a shell script" which works on just about every OS with Python (*nix, OSX, Windows), consider yamlpath, which provides several command-line tools for reading, writing, searching, and merging YAML, EYAML, JSON, and compatible files. Since just about every OS either comes with Python pre-installed or it is trivial to install, this makes yamlpath highly portable. Even more interesting: this project defines an intuitive path language with very powerful, command-line-friendly syntax that enables accessing one or more nodes.

针对您的具体问题，在使用Python的本地包管理器或您的操作系统的包管理器安装yamlpath之后(yamlpath可以通过RPM对某些操作系统提供):

#!/bin/bash
# Read values directly from YAML (or EYAML, JSON, etc) for use in this shell script:
myShellVar=$(yaml-get --query=any.path.no[matter%how].complex source-file.yaml)

# Use the value any way you need:
echo "Retrieved ${myShellVar}"

# Perhaps change the value and write it back:
myShellVar="New Value"
yaml-set --change=/any/path/no[matter%how]/complex --value="$myShellVar" source-file.yaml

不过，您没有指定数据是一个简单的Scalar值，因此让我们提高赌注。如果你想要的结果是一个数组呢?更有挑战性的是，如果它是一个哈希数组，而你只想要每个结果的一个属性呢?进一步假设您的数据实际上分布在多个YAML文件中，并且您需要在单个查询中获得所有结果。这是一个更有趣的问题。所以，假设你有这两个YAML文件:

文件:data1.yaml

---
baubles:
  - name: Doohickey
    sku: 0-000-1
    price: 4.75
    weight: 2.7g
  - name: Doodad
    sku: 0-000-2
    price: 10.5
    weight: 5g
  - name: Oddball
    sku: 0-000-3
    price: 25.99
    weight: 25kg

文件:data2.yaml

---
baubles:
  - name: Fob
    sku: 0-000-4
    price: 0.99
    weight: 18mg
  - name: Doohickey
    price: 10.5
  - name: Oddball
    sku: 0-000-3
    description: This ball is odd

在应用数据2的更改后，如何仅报告库存中每个项目的sku。Yaml到data1。Yaml，所有从一个shell脚本?试试这个:

#!/bin/bash
baubleSKUs=($(yaml-merge --aoh=deep data1.yaml data2.yaml | yaml-get --query=/baubles/sku -))

for sku in "${baubleSKUs[@]}"; do
    echo "Found bauble SKU:  ${sku}"
done

你只需要几行代码就能得到你想要的东西:

Found bauble SKU:  0-000-1
Found bauble SKU:  0-000-2
Found bauble SKU:  0-000-3
Found bauble SKU:  0-000-4

如您所见，yamlpath将非常复杂的问题转化为简单的解决方案。注意，整个查询是作为一个流处理的;查询没有更改YAML文件，也没有临时文件。

I realize this is "yet another tool to solve the same question" but after reading the other answers here, yamlpath appears more portable and robust than most alternatives. It also fully understands YAML/JSON/compatible files and it does not need to convert YAML to JSON to perform requested operations. As such, comments within the original YAML file are preserved whenever you need to change data in the source YAML file. Like some alternatives, yamlpath is also portable across OSes. More importantly, yamlpath defines a query language that is extremely powerful, enabling very specialized/filtered data queries. It can even operate against results from disparate parts of the file in a single query.

If you want to get or set many values in the data at once -- including complex data like hashes/arrays/maps/lists -- yamlpath can do that. Want a value but don't know precisely where it is in the document? yamlpath can find it and give you the exact path(s). Need to merge multiple data file together, including from STDIN? yamlpath does that, too. Further, yamlpath fully comprehends YAML anchors and their aliases, always giving or changing exactly the data you expect whether it is a concrete or referenced value.

免责声明:我编写并维护了yamlpath，它是基于ruamel的。yaml是基于PyYAML的。因此，yamlpath完全符合标准。

2020-10-02 17:32:41

如何从Linux shell脚本解析YAML文件?

推荐文章

最新文章

标签