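How can I parse a YAML file in Python

How can I parse a YAML file in Python?

The easiest and purest method without relying on C headers is PyYaml (documentation), which can be installed via pip install pyyaml: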

#!/usr/bin/env python

import yaml

with open("example.yaml", "r") as stream:
    try:
        print(yaml.safe_load(stream))
    except yaml.YAMLError as exc:
        print(exc)

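And that's it. A plain yaml.load() function also exists, but yaml.safe_load() should always be preferred to avoid introducing the possibility of arbitrary code execution. So unless you explicitly need arbitrary object serialization/deserialization, use safe_load.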
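
To make the difference concrete, here is a minimal sketch of how safe_load reacts to a document that carries a Python-specific tag. The document string and the variable names are made up for illustration; the point is only that safe_load refuses to construct arbitrary Python objects and raises a yaml.YAMLError subclass instead.

```python
import yaml

# Illustrative input: a tag that asks the loader to call os.system.
malicious = "!!python/object/apply:os.system ['echo unsafe']"

try:
    yaml.safe_load(malicious)
    rejected = False
except yaml.YAMLError as exc:
    # safe_load has no constructor for python/* tags, so it raises
    # instead of executing anything.
    rejected = True
    print("safe_load refused the document:", type(exc).__name__)
```

Plain data (mappings, lists, strings, numbers, booleans, null) loads the same way with safe_load, so little is lost by defaulting to it.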

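Note that the PyYaml project supports versions up through the YAML 1.1 specification. If YAML 1.2 specification support is needed, see ruamel.yaml, as noted in this answer.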
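
One place where the YAML 1.1 behaviour shows up in practice is boolean resolution: under 1.1 rules, unquoted words such as yes, no, on and off are treated as booleans. A small sketch, with made-up keys, of what PyYAML returns:

```python
import yaml

# Illustrative document: two unquoted YAML 1.1 booleans and one quoted string.
doc = "enabled: yes\ncountry: NO\nquoted: 'yes'\n"

data = yaml.safe_load(doc)
print(data)
```

If a value such as a country code must stay a string, quoting it in the YAML file avoids the surprise.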

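Additionally, you could use a drop-in replacement for pyyaml, called oyaml, that keeps the keys of your YAML file in the order you wrote them. See the Snyk page for oyaml here.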
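
As a side note on ordering, the following sketch assumes Python 3.7+ (where dicts keep insertion order) and PyYAML 5.1+ (where dump accepts sort_keys). Under those assumptions plain PyYAML can also keep the key order on a load/dump cycle; the keys below are invented for the example.

```python
import yaml

# Keys deliberately not in alphabetical order.
doc = "zebra: 1\napple: 2\nmango: 3\n"

data = yaml.safe_load(doc)
print(list(data))  # keys come back in file order on Python 3.7+

# sort_keys=False stops the dumper from sorting keys alphabetically.
dumped = yaml.safe_dump(data, sort_keys=False)
print(dumped)
```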

2009-11-21 00:23:34

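If you have YAML that conforms to the YAML 1.2 specification (released in 2009), then you should use ruamel.yaml (disclaimer: I am the author of that package). It is essentially a superset of PyYAML, which supports most of YAML 1.1 (from 2005).

If you want to be able to preserve your comments when round-tripping, you certainly should use ruamel.yaml.

Upgrading @Jon's example is easy: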

import ruamel.yaml as yaml

with open("example.yaml") as stream:
    try:
        print(yaml.safe_load(stream))
    except yaml.YAMLError as exc:
        print(exc)

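Use safe_load() unless you really have full control over the input, need it (seldom the case), and know what you are doing.

If you are using pathlib Path for manipulating files, you are better off using the new API ruamel.yaml provides: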

from ruamel.yaml import YAML
from pathlib import Path

path = Path('example.yaml')
yaml = YAML(typ='safe')
data = yaml.load(path)

2016-08-12 16:15:46

Read & write YAML files with Python 2+3 (and unicode)

# -*- coding: utf-8 -*-
import yaml
import io

# Define data
data = {
    'a list': [
        1, 
        42, 
        3.141, 
        1337, 
        'help', 
        u'€'
    ],
    'a string': 'bla',
    'another dict': {
        'foo': 'bar',
        'key': 'value',
        'the answer': 42
    }
}

# Write YAML file
with io.open('data.yaml', 'w', encoding='utf8') as outfile:
    yaml.dump(data, outfile, default_flow_style=False, allow_unicode=True)

# Read YAML file
with io.open('data.yaml', 'r', encoding='utf8') as stream:
    data_loaded = yaml.safe_load(stream)

print(data == data_loaded)

Created YAML file

a list:
- 1
- 42
- 3.141
- 1337
- help
- €
a string: bla
another dict:
  foo: bar
  key: value
  the answer: 42

Common file endings

.yml and .yaml

Alternatives

CSV: Super simple format (read & write)
JSON: Nice for writing human-readable data; VERY commonly used (read & write)
YAML: YAML is a superset of JSON, but easier to read (read & write, comparison of JSON and YAML)
pickle: A Python serialization format (read & write). ⚠️ Using pickle with files from 3rd parties poses an uncontrollable arbitrary code execution risk.
MessagePack (Python package): More compact representation (read & write)
HDF5 (Python package): Nice for matrices (read & write)
XML: exists too *sigh* (read & write)
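
The "superset of JSON" remark can be checked with a small sketch: text produced by json.dumps is handed to PyYAML's safe_load. The sample data is invented, and since PyYAML implements YAML 1.1, this should hold for typical JSON like the sample rather than be taken as a guarantee for every JSON document.

```python
import json
import yaml

# Illustrative data using only JSON-compatible types.
data = {
    "a list": [1, 42, 3.141, "help"],
    "a string": "bla",
    "flag": True,
    "nothing": None,
    "another dict": {"foo": "bar", "the answer": 42},
}

as_json = json.dumps(data)
loaded = yaml.safe_load(as_json)  # parse the JSON text with the YAML loader
print(loaded == data)
```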

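For your application, the following might be important:

Support by other programming languages
Reading / writing performance
Compactness (file size)

See also: Comparison of data serialization formats

In case you are rather looking for a way to make configuration files, you might want to read my short article Configuration files in Python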

2017-02-05 17:07:21

#!/usr/bin/env python

import sys
import yaml

def main(argv):

    with open(argv[0]) as stream:
        try:
            print(yaml.safe_load(stream))
            return 0
        except yaml.YAMLError as exc:
            print(exc)
            return 1

if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))

2017-04-06 13:15:25

First install pyyaml using pip3.

Then import the yaml module and load the file into a dictionary called 'my_dict':

import yaml
with open('filename.yaml') as f:
    my_dict = yaml.safe_load(f)

That's all you need. Now the entire YAML file is in the 'my_dict' dictionary.
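
One caveat worth sketching: the result is a dictionary only when the top level of the YAML file is a mapping. The inline strings below are illustrative stand-ins for file contents.

```python
import yaml

as_mapping = yaml.safe_load("name: demo\nport: 8080\n")  # top-level mapping
as_list = yaml.safe_load("- a\n- b\n")                   # top-level sequence
empty = yaml.safe_load("")                               # empty document

print(type(as_mapping).__name__, type(as_list).__name__, empty)
```

Checking isinstance(my_dict, dict) after loading is a cheap guard if the file comes from somewhere else.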

2017-10-30 18:27:34

I use ruamel.yaml. Details & debate here.

from ruamel import yaml

with open(filename, 'r') as fp:
    read_data = yaml.load(fp)

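Usage of ruamel.yaml is compatible (with some simple solvable problems) with old usages of PyYAML, and, as stated in the link I provided, use

from ruamel import yaml

instead of

import yaml

and it will fix most of your problems.

EDIT: PyYAML is not dead as it turns out, it's just maintained in a different place.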

2018-01-22 13:54:58

Example:

defaults.yaml

url: https://www.google.com

environment.py

from ruamel import yaml

data = yaml.safe_load(open('defaults.yaml'))
data['url']

2018-05-20 07:41:44

To access any element of a YAML file like this:

global:
  registry:
    url: dtr-:5000/
    repoPath:
  dbConnectionString: jdbc:oracle:thin:@x.x.x.x:1521:abcd

You can use the following Python script:

import yaml

with open("/some/path/to/yaml.file", 'r') as f:
    valuesYaml = yaml.load(f, Loader=yaml.FullLoader)

print(valuesYaml['global']['dbConnectionString'])
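
A slightly extended sketch of the same lookup, using an inline copy of the YAML above and safe_load instead of FullLoader. It shows that a key with no value (repoPath) loads as None and that dict.get can supply a fallback for keys that are absent; the key name 'missing' and the fallback 'n/a' are made up for the example.

```python
import yaml

doc = """\
global:
  registry:
    url: dtr-:5000/
    repoPath:
  dbConnectionString: jdbc:oracle:thin:@x.x.x.x:1521:abcd
"""

values = yaml.safe_load(doc)

connection = values["global"]["dbConnectionString"]
repo_path = values["global"]["registry"]["repoPath"]   # empty value -> None
missing = values.get("global", {}).get("missing", "n/a")  # hypothetical key

print(connection, repo_path, missing)
```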

2020-10-28 18:21:30

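The read_yaml_file function returns all data as a dictionary.

import yaml

# ProjectPaths is a helper from the author's own project; it is not defined here.
def read_yaml_file(full_path=None, relative_path=None):
    if relative_path is not None:
        resource_file_location_local = ProjectPaths.get_project_root_path() + relative_path
    else:
        resource_file_location_local = full_path

    file_artifacts = {}  # fall back to an empty dict if parsing fails
    with open(resource_file_location_local, 'r') as stream:
        try:
            # safe_load returns None for an empty file, hence the "or {}"
            file_artifacts = yaml.safe_load(stream) or {}
        except yaml.YAMLError as exc:
            print(exc)
    return dict(file_artifacts.items())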

2021-08-18 22:18:51

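I made my own script for this. Feel free to use it, as long as you keep the attribution. The script can parse YAML from a file (function load), parse YAML from a string (function loads), and convert a dictionary into YAML (function dumps). It respects all variable types.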

# © didlly AGPL-3.0 License - github.com/didlly

def is_float(string: str) -> bool:
    try:
        float(string)
        return True
    except ValueError:
        return False


def is_integer(string: str) -> bool:
    try:
        int(string)
        return True
    except ValueError:
        return False


def load(path: str) -> dict:
    with open(path, "r") as yaml:
        levels = []
        data = {}
        indentation_str = ""

        for line in yaml.readlines():
            if line.replace(line.lstrip(), "") != "" and indentation_str == "":
                indentation_str = line.replace(line.lstrip(), "").rstrip("\n")
            if line.strip() == "":
                continue
            elif line.rstrip()[-1] == ":":
                key = line.strip()[:-1]
                quoteless = (
                    is_float(key)
                    or is_integer(key)
                    or key == "True"
                    or key == "False"
                    or ("[" in key and "]" in key)
                )

                if len(line.replace(line.strip(), "")) // 2 < len(levels):
                    if quoteless:
                        levels[len(line.replace(line.strip(), "")) // 2] = f"[{key}]"
                    else:
                        levels[len(line.replace(line.strip(), "")) // 2] = f"['{key}']"
                else:
                    if quoteless:
                        levels.append(f"[{line.strip()[:-1]}]")
                    else:
                        levels.append(f"['{line.strip()[:-1]}']")
                if quoteless:
                    exec(
                        f"data{''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}]"
                        + " = {}"
                    )
                else:
                    exec(
                        f"data{''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}']"
                        + " = {}"
                    )

                continue

            key = line.split(":")[0].strip()
            value = ":".join(line.split(":")[1:]).strip()

            if (
                is_float(value)
                or is_integer(value)
                or value == "True"
                or value == "False"
                or ("[" in value and "]" in value)
            ):
                if (
                    is_float(key)
                    or is_integer(key)
                    or key == "True"
                    or key == "False"
                    or ("[" in key and "]" in key)
                ):
                    exec(
                        f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}] = {value}"
                    )
                else:
                    exec(
                        f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}'] = {value}"
                    )
            else:
                if (
                    is_float(key)
                    or is_integer(key)
                    or key == "True"
                    or key == "False"
                    or ("[" in key and "]" in key)
                ):
                    exec(
                        f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}] = '{value}'"
                    )
                else:
                    exec(
                        f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}'] = '{value}'"
                    )
    return data


def loads(yaml: str) -> dict:
    levels = []
    data = {}
    indentation_str = ""

    for line in yaml.split("\n"):
        if line.replace(line.lstrip(), "") != "" and indentation_str == "":
            indentation_str = line.replace(line.lstrip(), "")
        if line.strip() == "":
            continue
        elif line.rstrip()[-1] == ":":
            key = line.strip()[:-1]
            quoteless = (
                is_float(key)
                or is_integer(key)
                or key == "True"
                or key == "False"
                or ("[" in key and "]" in key)
            )

            if len(line.replace(line.strip(), "")) // 2 < len(levels):
                if quoteless:
                    levels[len(line.replace(line.strip(), "")) // 2] = f"[{key}]"
                else:
                    levels[len(line.replace(line.strip(), "")) // 2] = f"['{key}']"
            else:
                if quoteless:
                    levels.append(f"[{line.strip()[:-1]}]")
                else:
                    levels.append(f"['{line.strip()[:-1]}']")
            if quoteless:
                exec(
                    f"data{''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}]"
                    + " = {}"
                )
            else:
                exec(
                    f"data{''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}']"
                    + " = {}"
                )

            continue

        key = line.split(":")[0].strip()
        value = ":".join(line.split(":")[1:]).strip()

        if (
            is_float(value)
            or is_integer(value)
            or value == "True"
            or value == "False"
            or ("[" in value and "]" in value)
        ):
            if (
                is_float(key)
                or is_integer(key)
                or key == "True"
                or key == "False"
                or ("[" in key and "]" in key)
            ):
                exec(
                    f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}] = {value}"
                )
            else:
                exec(
                    f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}'] = {value}"
                )
        else:
            if (
                is_float(key)
                or is_integer(key)
                or key == "True"
                or key == "False"
                or ("[" in key and "]" in key)
            ):
                exec(
                    f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}] = '{value}'"
                )
            else:
                exec(
                    f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}'] = '{value}'"
                )

    return data


def dumps(yaml: dict, indent="") -> str:
    """A procedure which converts the dictionary passed to the procedure into it's yaml equivalent.

    Args:
        yaml (dict): The dictionary to be converted.

    Returns:
        data (str): The dictionary in yaml form.
    """

    data = ""

    for key in yaml.keys():
        if type(yaml[key]) == dict:
            data += f"\n{indent}{key}:\n"
            data += dumps(yaml[key], f"{indent}  ")
        else:
            data += f"{indent}{key}: {yaml[key]}\n"

    return data


print(load("config.yml"))

Example

config.yml

level 0 value: 0

level 1:
  level 1 value: 1
  level 2:
    level 2 value: 2

level 1 2:
  level 1 2 value: 1 2
  level 2 2:
    level 2 2 value: 2 2

Output

{'level 0 value': 0, 'level 1': {'level 1 value': 1, 'level 2': {'level 2 value': 2}}, 'level 1 2': {'level 1 2 value': '1 2', 'level 2 2': {'level 2 2 value': '2 2'}}}
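
For comparison, here is a sketch of loading the same config.yml content with PyYAML's safe_load (an alternative to the hand-written parser, not part of it). The content is inlined as a string so the snippet is self-contained.

```python
import yaml

config_text = """\
level 0 value: 0

level 1:
  level 1 value: 1
  level 2:
    level 2 value: 2

level 1 2:
  level 1 2 value: 1 2
  level 2 2:
    level 2 2 value: 2 2
"""

config = yaml.safe_load(config_text)
print(config)
```

Values such as 1 2 are not valid numbers, so they come back as strings, while 0, 1 and 2 come back as integers.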

2022-03-21 16:30:54
