Parsing JSON with Unix tools

I'm trying to parse JSON returned from a curl request, like so:

curl 'http://twitter.com/users/username.json' |
    sed -e 's/[{}]/''/g' | 
    awk -v k="text" '{n=split($0,a,","); for (i=1; i<=n; i++) print a[i]}'

The above splits the JSON into fields, for example:

% ...
"geo_enabled":false
"friends_count":245
"profile_text_color":"000000"
"status":"in_reply_to_screen_name":null
"source":"web"
"truncated":false
"text":"My status"
"favorited":false
% ...

How do I print a specific field (denoted by the -v k=text)?

Current answer

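One interesting tool that hasn't been covered in the existing answers is gron, written in Go, whose tagline is "Make JSON greppable!" That is exactly what it does.

Essentially, gron breaks your JSON down into discrete assignments, each showing the absolute "path" to a value. Its primary advantage over other tools like jq is that it lets you search for a value without knowing how deeply the record is nested, and without breaking the original JSON structure.

For example, to search for the 'twitter_username' field from the following link, I just do: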

% gron 'https://api.github.com/users/lambda' | fgrep 'twitter_username'
json.twitter_username = "unlambda";
% gron 'https://api.github.com/users/lambda' | fgrep 'twitter_username' | gron -u
{
  "twitter_username": "unlambda"
}

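As simple as that. Note how gron -u (short for ungron) reconstructs the JSON from the search path. fgrep is used only to filter the output down to the needed paths; it treats the search expression as a fixed string rather than a regular expression (it is essentially grep -F).

Another example, searching for a string to see where in the nested structure the record sits: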

% echo '{"foo":{"bar":{"zoo":{"moo":"fine"}}}}' | gron | fgrep "fine"
json.foo.bar.zoo.moo = "fine";
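
To make the "discrete assignments" idea concrete, here is a minimal Python sketch of the same flattening. It is not gron's implementation, only an illustration; the function name flatten is made up for this example, and it assumes every object key is a plain identifier (gron itself also handles keys that need quoting).

```python
import json

def flatten(value, path="json"):
    """Yield gron-style 'path = value;' lines for a parsed JSON value."""
    if isinstance(value, dict):
        yield f"{path} = {{}};"
        for key, child in value.items():
            # Assumption: keys are simple identifiers, so dot notation is enough.
            yield from flatten(child, f"{path}.{key}")
    elif isinstance(value, list):
        yield f"{path} = [];"
        for index, child in enumerate(value):
            yield from flatten(child, f"{path}[{index}]")
    else:
        # Scalars are written back out as JSON literals.
        yield f"{path} = {json.dumps(value)};"

doc = json.loads('{"foo":{"bar":{"zoo":{"moo":"fine"}}}}')
for line in flatten(doc):
    if "fine" in line:  # plays the role of fgrep "fine"
        print(line)
```

Because every line carries its full path, a plain substring match is enough to locate a value at any depth.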

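It also supports streaming JSON with its -s command-line flag, so you can continuously gron an input stream for a matching record. gron also has zero runtime dependencies: you can download a binary for Linux, Mac, Windows or FreeBSD and run it.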
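
For readers without gron at hand, the general idea of scanning a stream of JSON records for a match can be sketched in Python. This assumes one JSON object per input line and uses a crude substring test; it does not reproduce gron's output format, and the name matching_records is hypothetical.

```python
import io
import json

def matching_records(lines, needle):
    """Yield each parsed record whose JSON text contains `needle`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if needle in json.dumps(record):
            yield record

# A small in-memory stream stands in for sys.stdin here.
stream = io.StringIO('{"id": 1, "text": "My status"}\n{"id": 2, "text": "other"}\n')
for record in matching_records(stream, "My status"):
    print(record["id"])
```

In a real pipeline you would pass sys.stdin instead of the StringIO object, so records are handled as they arrive.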

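More usage examples and tips can be found on the official GitHub page, under Advanced Usage.

As for why you might use gron over other JSON parsing tools, see the author's note on the project page:

Why shouldn't I just use jq?

jq is awesome, and a lot more powerful than gron, but with that power comes complexity. gron aims to make it easier to use the tools you already know, like grep and sed.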

2021-01-18 09:04:45

Other answers

Niet is a tool that helps you extract data from a JSON or YAML file directly in your shell or Bash CLI.

pip install niet

Consider a JSON file named project.json with the following contents:

{
  "project": {
    "meta": {
      "name": "project-sample"
    }
  }
}

You can use Niet like this:

PROJECT_NAME=$(niet project.json project.meta.name)
echo ${PROJECT_NAME}

Output:

project-sample
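
The dotted-path lookup can also be sketched in a few lines of Python if installing a tool is not an option. This is not Niet's implementation, just an illustration of the idea; the function name lookup is made up, and it assumes the path only passes through JSON objects (no array indexes).

```python
import json

def lookup(data, dotted_path):
    """Follow a dot-separated path such as 'project.meta.name' through nested dicts."""
    current = data
    for key in dotted_path.split("."):
        current = current[key]  # raises KeyError if a segment is missing
    return current

project = json.loads('{"project": {"meta": {"name": "project-sample"}}}')
print(lookup(project, "project.meta.name"))
```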

2018-02-12 15:37:32

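Using Node.js

If the system has Node.js installed, you can use the -p (print) and -e (evaluate) script flags together with JSON.parse to pull out any value you need.

A simple example, using the JSON string { "foo": "bar" } and pulling out the value of "foo":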

node -pe 'JSON.parse(process.argv[1]).foo' '{ "foo": "bar" }'

Output:

bar

Because we have access to cat and other utilities, we can use this on files:

node -pe 'JSON.parse(process.argv[1]).foo' "$(cat foobar.json)"

Output:

bar

Or on any other source, such as a URL that returns JSON:

node -pe 'JSON.parse(process.argv[1]).name' "$(curl -s https://api.github.com/users/trevorsenior)"

Output:

Trevor Senior

2013-08-27 15:11:22

Following MartinR and Boecko's lead:

curl -s 'http://twitter.com/users/username.json' | python -mjson.tool

This gives you output that is extremely grep-friendly. Very convenient:

curl -s 'http://twitter.com/users/username.json' | python -mjson.tool | grep my_key
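
If grepping the pretty-printed text is too loose (for example, the key name also appears inside a string value), the same search can be done on the parsed structure. A small sketch follows; the function name find_key and the sample data are made up for illustration, loosely modeled on the fields in the question.

```python
import json

def find_key(value, wanted):
    """Yield every value stored under the key `wanted`, at any depth."""
    if isinstance(value, dict):
        for key, child in value.items():
            if key == wanted:
                yield child
            yield from find_key(child, wanted)
    elif isinstance(value, list):
        for child in value:
            yield from find_key(child, wanted)

sample = json.loads('{"friends_count": 245, "status": {"source": "web", "text": "My status"}}')
print(list(find_key(sample, "text")))
```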

2011-06-12 08:04:15

For more complex JSON parsing, I suggest using the Python jsonpath module (by Stefan Goessner).

Install it:

sudo easy_install -U jsonpath

Use it. Example file.json (from http://goessner.net/articles/JsonPath):

{ "store":
  { "book": [
      { "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
      },
      { "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
      },
      { "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
      },
      { "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
      }
    ],
    "bicycle": {
      "color": "red",
      "price": 19.95
    }
  }
}

Parse it (extract all book titles with price < 10):

cat file.json | python -c "import sys, json, jsonpath; print('\n'.join(jsonpath.jsonpath(json.load(sys.stdin), 'store.book[?(@.price < 10)].title')))"

Will output:

Sayings of the Century
Moby Dick

Note: The above command line does not include error checking. For a full solution with error checking, you should create a small Python script and wrap the code in try-except.
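
When the structure of the document is known in advance, the same filter can be written with only the standard library. The sketch below is an alternative to the jsonpath expression, not a replacement for the module; the helper name cheap_titles is made up, and the data is a trimmed copy of the example file.

```python
import json

store_json = """
{ "store": { "book": [
    { "title": "Sayings of the Century", "price": 8.95 },
    { "title": "Sword of Honour", "price": 12.99 },
    { "title": "Moby Dick", "price": 8.99 },
    { "title": "The Lord of the Rings", "price": 22.99 }
] } }
"""

def cheap_titles(data, limit):
    """Return the titles of all books priced below `limit`."""
    return [book["title"] for book in data["store"]["book"] if book["price"] < limit]

print("\n".join(cheap_titles(json.loads(store_json), 10)))
```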

2014-04-01 08:57:52

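Parsing JSON is painful in a shell script. With a more appropriate language, create a tool that extracts JSON attributes in a way consistent with shell scripting conventions. You can use your new tool to solve the immediate shell scripting problem and then add it to your kit for future situations.

For example, consider a tool jsonlookup such that if I say jsonlookup access token id, it returns the attribute id defined within the attribute token defined within the attribute access, read from standard input, which is presumably JSON data. If the attribute doesn't exist, the tool prints nothing (exit status 1). If parsing fails, it exits with status 2 and writes a message to standard error. If the lookup succeeds, the tool prints the attribute's value.

Having created a Unix tool for the precise purpose of extracting JSON values, you can easily use it in shell scripts: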

access_token=$(curl <some horrible crap> | jsonlookup access token id)

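Any language will do for the implementation of jsonlookup. Here is a fairly concise Python version:

#!/usr/bin/env python3
# jsonlookup: follow the keys given as arguments through the JSON document
# read from standard input and print the value found.

import sys
import json

try:
    rep = json.load(sys.stdin)
except ValueError:
    # Parse failure: report on stderr and exit with status 2.
    sys.stderr.write(sys.argv[0] + ": unable to parse JSON from stdin\n")
    sys.exit(2)

for key in sys.argv[1:]:
    # Missing attribute (or a non-object on the path): no output, exit status 1.
    if not isinstance(rep, dict) or key not in rep:
        sys.exit(1)
    rep = rep[key]

print(rep)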

2014-02-02 21:28:01
