How to get a simple JSON data file from a complex JSON file using jq

【Posted】2017-01-27 14:46:24 【Question】

I have a complex JSON file, nested four to five levels deep, and I am trying to use jq to get the following result. Any help would be greatly appreciated:

{
  "Name": "unix-global",
  "Title": "AWS cli should be installed",
  "desc": "System Package aws-cli should be installed",
  "result": "passed"
}
{
  "Name": "unix-global",
  "Title": "AWS cli should be installed",
  "desc": "Service besclient should be installed",
  "result": "failed"
}

This is a JSON file that I obtained by running an InSpec profile. The real goal is to extract only the information I need into a simple JSON so that I can eventually update an AWS Redshift database.

{
  "version": "1.7.1",
  "profiles": [{
    "name": "java",
    "title": "InSpec Java in system",
    "maintainer": "awim",
    "copyright": "awim / mtaqwim",
    "copyright_email": "muhammadtaqwiem@gmail.com",
    "license": "All Rights Reserved",
    "summary": "An InSpec Compliance Profile",
    "version": "0.0.1",
    "supports": [],
    "controls": [{
      "title": "identify java in system",
      "desc": "identify java in PATH system",
      "impact": 0.3,
      "refs": [],
      "tags": {},
      "code": "control 'java-1.0' do\n  impact 0.3\n  title 'identify java in system'\n  desc 'identify java in PATH system'\n\n  describe java_info do\n    it { should exist }\n    its(:version) { should match '1.7' }\n  end\nend",
      "source_location": {
        "ref": "inspec/java/controls/java_1.0.rb",
        "line": 6
      },
      "id": "java-1.0",
      "results": [{
        "status": "passed",
        "code_desc": "java_info should exist",
        "run_time": 0.000895896,
        "start_time": "2017-01-20 05:04:47 +0000"
      }, {
        "status": "passed",
        "code_desc": "java_info version should match \"1.7\"",
        "run_time": 0.067581113,
        "start_time": "2017-01-20 05:04:47 +0000"
      }]
    }, {
      "title": "run java from specific path",
      "desc": "run java from specific path",
      "impact": 1.0,
      "refs": [],
      "tags": {},
      "code": "control 'java-2.0' do\n  impact 1.0\n  title 'run java from specific path'\n  desc 'run java from specific path'\n\n  describe java_info(java_path) do\n    it { should exist }\n    its(:version) { should match '1.7' }\n  end\nend",
      "source_location": {
        "ref": "inspec/java/controls/java_2.0.rb",
        "line": 8
      },
      "id": "java-2.0",
      "results": [{
        "status": "skipped",
        "code_desc": "java_info",
        "skip_message": "Can't find file \"/opt/jdk/current\"",
        "resource": "java_info",
        "run_time": 1.6512e-05,
        "start_time": "2017-01-20 05:04:47 +0000"
      }]
    }, {
      "title": "identify java home",
      "desc": "identify java home match to specific path",
      "impact": 0.1,
      "refs": [],
      "tags": {},
      "code": "control 'java-3.0' do\n  impact 0.1\n  title 'identify java home'\n  desc 'identify java home match to specific path'\n\n  describe java_info(java_path) do\n    its(:java_home) { should match java_path }\n  end\nend",
      "source_location": {
        "ref": "inspec/java/controls/java_3.0.rb",
        "line": 8
      },
      "id": "java-3.0",
      "results": [{
        "status": "skipped",
        "code_desc": "java_info",
        "skip_message": "Can't find file \"/opt/jdk/current\"",
        "resource": "java_info",
        "run_time": 6.139e-06,
        "start_time": "2017-01-20 05:04:47 +0000"
      }]
    }],
    "groups": [{
      "title": "which(UNIX)/where(Windows) java installed",
      "controls": ["java-1.0"],
      "id": "controls/java_1.0.rb"
    }, {
      "title": "which(UNIX)/where(Windows) java installed",
      "controls": ["java-2.0"],
      "id": "controls/java_2.0.rb"
    }, {
      "title": "which(UNIX)/where(Windows) java installed",
      "controls": ["java-3.0"],
      "id": "controls/java_3.0.rb"
    }],
    "attributes": []
  }],
  "other_checks": [],
  "statistics": {
    "duration": 0.069669698
  }
}

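【Comments】

You should a) describe exactly what the simplified JSON should contain, and b) show how you tried to extract it and how that attempt failed, rather than expecting someone to write the code for you.

@user5188385 - Rest assured that jq is an excellent choice for your task, so I suggest you learn the basics, and when you have a more specific question, provide a minimal, complete, verifiable example following the guidelines at ***.com/help/mcve

The first code block is the result I expect... the Name, Title, desc, and result from the complex JSON (in the second code block).

Try this code on jqplay.org with the complex JSON: .profiles[0].name, .profiles[0].controls[].title, .profiles[0].controls[].results[].status

【Answer 1】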

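Here is a jq filter that solves the problem. Note that the "pipes" between the filters are essential. You must flatten each parent array first and only then flatten its child array; otherwise you get their Cartesian product (which is very bad).

jq '.profiles[]
     | { Name: .name, Controls: .controls[] }
     | { Name: .Name, Desc: .Controls.desc, Title: .Controls.title, Results: .Controls.results[] }
     | { Name: .Name, Desc: .Desc, Title: .Title, StartTime: .Results.start_time, RunTime: .Results.run_time, Result: .Results.status }'

Line breaks were added to the code for clarity.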

Output:

{
  "Name": "java",
  "Desc": "identify java in PATH system",
  "Title": "identify java in system",
  "StartTime": "2017-01-20 05:04:47 +0000",
  "RunTime": 0.000895896,
  "Result": "passed"
}
…etc
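
To illustrate why the pipes matter, here is a small sketch on a made-up input (the miniature document and the titles `c1` and `c2` are hypothetical, not taken from the question). When two `[]` iterations sit inside a single object construction, jq combines every title with every status; piping the parent iteration into the child iteration keeps each result attached to its own control.

```shell
#!/bin/sh
# Hypothetical miniature input: two controls, one result each.
doc='{"controls":[{"title":"c1","results":[{"status":"passed"}]},{"title":"c2","results":[{"status":"failed"}]}]}'

# Two independent [] iterations in one object construction:
# every title is paired with every status (2 x 2 = 4 objects).
cartesian=$(printf '%s' "$doc" |
  jq -c '{ Title: .controls[].title, Result: .controls[].results[].status }')

# Flatten the parent first, then its children:
# each status stays with its own title (2 objects).
piped=$(printf '%s' "$doc" |
  jq -c '.controls[] | { Title: .title, Result: .results[].status }')

printf 'cartesian:\n%s\n' "$cartesian"
printf 'piped:\n%s\n' "$piped"
```

The first filter reports `c1` as both passed and failed, which is wrong for this input; the piped form yields exactly one row per actual result.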

Once you have flattened it to this point, I would consider saving it as CSV, since that is somewhat simpler to load into Redshift.

jq '.profiles[]
     | { Name: .name, Controls: .controls[] }
     | { Name: .Name, Desc: .Controls.desc, Title: .Controls.title, Results: .Controls.results[] }
     | [ .Name, .Desc, .Title, .Results.start_time, .Results.run_time, .Results.status ]
     | @csv'

Output:

"\"java\",\"identify java in PATH system\",\"identify java in system\",\"2017-01-20 05:04:47 +0000\",0.000895896,\"passed\""
…etc
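
The rows shown above carry an extra layer of quoting because jq prints the `@csv` result as a JSON string by default. The following is a minimal sketch, assuming the report is saved as `report.json` and the CSV should go to `results.csv` (both file names, and the trimmed-down sample data, are placeholders): passing `-r` (`--raw-output`) makes jq emit plain CSV rows that can be redirected to a file.

```shell
#!/bin/sh
# Hypothetical, trimmed-down report; substitute the real InSpec JSON output.
cat > report.json <<'EOF'
{"profiles":[{"name":"java","controls":[{"title":"identify java in system","desc":"identify java in PATH system","results":[{"status":"passed","run_time":0.5,"start_time":"2017-01-20 05:04:47 +0000"}]}]}]}
EOF

# -r prints each @csv string raw instead of as a quoted JSON string.
jq -r '.profiles[]
       | { Name: .name, Controls: .controls[] }
       | { Name: .Name, Desc: .Controls.desc, Title: .Controls.title, Results: .Controls.results[] }
       | [ .Name, .Desc, .Title, .Results.start_time, .Results.run_time, .Results.status ]
       | @csv' report.json > results.csv

cat results.csv
```

String fields stay double-quoted and numbers are left bare, so the file can be treated as ordinary CSV by whatever loader is used on the Redshift side.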

【Discussion】

Thank you very much, Joe. This is exactly what I was looking for. Thanks also for the tip about saving as CSV before loading into Redshift.
