JsonPath Study Notes

Posted by 二木成林

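Overview

Introduction

JsonPath is a path expression language for extracting content from JSON documents, in the same way that XPath extracts content from HTML or XML.

These notes cover the json-path package for Java. Other languages such as JavaScript can also use JsonPath, but through different third-party libraries. The library used below is specific to Java, while the path expression syntax is largely shared across implementations.

Extracting a few individual values with a general-purpose parser such as Gson takes a fair amount of code, whereas JsonPath needs only a short path expression, which keeps the code concise.

In practice, the need to extract content from JSON comes up most often when writing web crawlers.

Official site: JsonPath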

Installation

The Maven coordinates are:


<dependency>
    <groupId>com.jayway.jsonpath</groupId>
    <artifactId>json-path</artifactId>
    <version>2.6.0</version>
</dependency>

If you see the error Caused by: java.lang.NoClassDefFoundError: net/minidev/json/writer/JsonReaderI, add the following dependencies as well. Note that the test scope makes them visible to test code only; remove it if JsonPath is used outside tests.


<dependency>
    <groupId>net.minidev</groupId>
    <artifactId>asm</artifactId>
    <version>1.0.2</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>net.minidev</groupId>
    <artifactId>json-smart</artifactId>
    <version>2.2.1</version>
    <scope>test</scope>
</dependency>

Getting Started

The root member object in JsonPath is $. A path can be written in two notations:

  • Dot notation: $.store.book[0].title
  • Bracket notation: $['store']['book'][0]['title']
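
To make the correspondence between the two notations concrete, here is a small plain-Java sketch that rewrites a simple dot-notation path in bracket notation. The class NotationSketch and the method toBracketNotation are hypothetical names invented for this illustration and are not part of the json-path library. The sketch only handles plain property names and numeric indexes, and rejects anything else.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NotationSketch {
    // Matches either a ".name" step (group 1) or an "[index]" step (group 2).
    private static final Pattern STEP =
            Pattern.compile("\\.([A-Za-z_][A-Za-z0-9_]*)|\\[(\\d+)\\]");

    /** Hypothetical helper: rewrites a simple dot-notation path in bracket notation. */
    static String toBracketNotation(String dotPath) {
        if (!dotPath.startsWith("$")) {
            throw new IllegalArgumentException("path must start with $: " + dotPath);
        }
        StringBuilder out = new StringBuilder("$");
        Matcher m = STEP.matcher(dotPath);
        int pos = 1; // position just after the leading '$'
        while (m.find()) {
            if (m.start() != pos) {
                // Something between two steps was not recognised (e.g. "..", "*").
                throw new IllegalArgumentException("unsupported syntax at offset " + pos);
            }
            if (m.group(1) != null) {
                out.append("['").append(m.group(1)).append("']"); // property name
            } else {
                out.append('[').append(m.group(2)).append(']');   // array index
            }
            pos = m.end();
        }
        if (pos != dotPath.length()) {
            throw new IllegalArgumentException("unsupported syntax at offset " + pos);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints $['store']['book'][0]['title']
        System.out.println(toBracketNotation("$.store.book[0].title"));
    }
}
```

Both strings address the same value, so either form can be passed to the library; bracket notation is mainly useful when a property name contains characters that dot notation cannot express.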

A minimal example:

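import com.jayway.jsonpath.JsonPath;
import org.junit.Test; // JUnit 4 assumed; with JUnit 5 import org.junit.jupiter.api.Test instead

// The class must not be named Test, or it would shadow the @Test annotation.
public class HelloJsonPathTest {
    @Test
    public void hello() {
        String json = "{\"msg\":\"hello world\"}";
        // Read the value of the msg property of the root object
        String msg = JsonPath.read(json, "$.msg");
        System.out.println(msg);// hello world
    }
}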

Syntax

The JSON document used for testing is:

{
    "store": {
        "book": [
            {
                "category": "reference",
                "author": "Nigel Rees",
                "title": "Sayings of the Century",
                "price": 8.95
            },
            {
                "category": "fiction",
                "author": "Evelyn Waugh",
                "title": "Sword of Honour",
                "price": 12.99
            },
            {
                "category": "fiction",
                "author": "Herman Melville",
                "title": "Moby Dick",
                "isbn": "0-553-21311-3",
                "price": 8.99
            },
            {
                "category": "fiction",
                "author": "J. R. R. Tolkien",
                "title": "The Lord of the Rings",
                "isbn": "0-395-19395-8",
                "price": 22.99
            }
        ],
        "bicycle": {
            "color": "red",
            "price": 19.95
        }
    },
    "expensive": 10
}
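Operators

JsonPath supports the following operators:

Operator | Description
$ | The root element to query. This starts every path expression.
@ | The current node being processed by a filter predicate.
* | Wildcard. Available anywhere a name or number is required.
.. | Deep scan. Available anywhere a name is required.
.<name> | Dot-notated child.
['<name>' (, '<name>')] | Bracket-notated child or children.
[<number> (, <number>)] | Array index or indexes, starting from 0.
[start:end] | Array slice operator.
[?(<expression>)] | Filter expression. The expression must evaluate to a boolean value.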
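
Negative indexes and the [start:end] slice are the operators most easily misread, so here is a plain-Java sketch of how they select elements from the four-book array in the sample document. It does not call the json-path library: SliceSketch, resolve and slice are hypothetical helpers written only for this illustration, and they assume the usual convention that a negative index counts from the end and that end is exclusive.

```java
import java.util.Arrays;
import java.util.List;

final class SliceSketch {
    /** Hypothetical helper: resolves a possibly negative index against a list length. */
    static int resolve(int index, int length) {
        return index < 0 ? length + index : index;
    }

    /** Hypothetical helper: mimics [start:end]; a null bound means it was omitted. */
    static <T> List<T> slice(List<T> items, Integer start, Integer end) {
        int n = items.size();
        int from = start == null ? 0 : resolve(start, n);
        int to = end == null ? n : resolve(end, n);
        // Clamp both bounds into [0, n] and make sure the range is not inverted.
        from = Math.max(0, Math.min(n, from));
        to = Math.max(from, Math.min(n, to));
        return items.subList(from, to);
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList(
                "Sayings of the Century", "Sword of Honour", "Moby Dick", "The Lord of the Rings");
        System.out.println(titles.get(resolve(-2, titles.size()))); // like book[-2]: Moby Dick
        System.out.println(slice(titles, null, 2));  // like book[:2]: the first two titles
        System.out.println(slice(titles, 1, 2));     // like book[1:2]: only index 1
        System.out.println(slice(titles, -2, null)); // like book[-2:]: the last two titles
        System.out.println(slice(titles, 2, null));  // like book[2:]: index 2 through the end
    }
}
```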

Examples:

import com.jayway.jsonpath.JsonPath;
import org.junit.Test; // JUnit 4 assumed; with JUnit 5 import org.junit.jupiter.api.Test instead

import java.util.List;
import java.util.Map;

public class JsonPathTest {
    private String json = "{\"store\":{\"book\":[{\"category\":\"reference\",\"author\":\"Nigel Rees\",\"title\":\"Sayings of the Century\",\"price\":8.95},{\"category\":\"fiction\",\"author\":\"Evelyn Waugh\",\"title\":\"Sword of Honour\",\"price\":12.99},{\"category\":\"fiction\",\"author\":\"Herman Melville\",\"title\":\"Moby Dick\",\"isbn\":\"0-553-21311-3\",\"price\":8.99},{\"category\":\"fiction\",\"author\":\"J. R. R. Tolkien\",\"title\":\"The Lord of the Rings\",\"isbn\":\"0-395-19395-8\",\"price\":22.99}],\"bicycle\":{\"color\":\"red\",\"price\":19.95}},\"expensive\":10}";

    @Test
    public void test01() {
        // Get the author of every book under store.book
        List<String> authors = JsonPath.read(json, "$.store.book[*].author");
        System.out.println(authors);// ["Nigel Rees","Evelyn Waugh","Herman Melville","J. R. R. Tolkien"]
    }

    @Test
    public void test02() {
        // Get every author value anywhere in the document
        List<String> authors = JsonPath.read(json, "$..author");
        System.out.println(authors);// ["Nigel Rees","Evelyn Waugh","Herman Melville","J. R. R. Tolkien"]
    }

    @Test
    public void test03() {
        // Get everything under store: the book array and the bicycle
        Object obj = JsonPath.read(json, "$.store.*");
        System.out.println(obj);// [[{"category":"reference","author":"Nigel Rees","title":"Sayings of the Century","price":8.95},{"category":"fiction","author":"Evelyn Waugh","title":"Sword of Honour","price":12.99},{"category":"fiction","author":"Herman Melville","title":"Moby Dick","isbn":"0-553-21311-3","price":8.99},{"category":"fiction","author":"J. R. R. Tolkien","title":"The Lord of the Rings","isbn":"0-395-19395-8","price":22.99}],{"color":"red","price":19.95}]
    }

    @Test
    public void test04() {
        // Get every price value under store (decimal numbers are read as Double)
        List<Double> prices = JsonPath.read(json, "$.store..price");
        System.out.println(prices);// [8.95,12.99,8.99,22.99,19.95]
    }

    @Test
    public void test05() {
        // Get the third book in the book array; indexes start at 0.
        // Because the path contains a deep scan (..), the result is a list, not a single Map.
        List<Map<String, Object>> book = JsonPath.read(json, "$..book[2]");
        System.out.println(book);// [{"category":"fiction","author":"Herman Melville","title":"Moby Dick","isbn":"0-553-21311-3","price":8.99}]
    }

    @Test
    public void test06() {
        // Get the second-to-last book; negative indexes count from the end, starting at -1
        List<Map<String, Object>> book = JsonPath.read(json, "$..book[-2]");
        System.out.println(book);// [{"category":"fiction","author":"Herman Melville","title":"Moby Dick","isbn":"0-553-21311-3","price":8.99}]
    }

    @Test
    public void test07() {
        // Get the first two books, i.e. the books at indexes 0 and 1
        List<Map<String, Object>> books = JsonPath.read(json, "$..book[0,1]");
        System.out.println(books);// [{"category":"reference","author":"Nigel Rees","title":"Sayings of the Century","price":8.95},{"category":"fiction","author":"Evelyn Waugh","title":"Sword of Honour","price":12.99}]
    }

    @Test
    public void test08() {
        // Get the books from index 0 (inclusive) to index 2 (exclusive), i.e. the half-open range [0, 2)
        List<Map<String, Object>> books = JsonPath.read(json, "$..book[:2]");
        System.out.println(books);// [{"category":"reference","author":"Nigel Rees","title":"Sayings of the Century","price":8.95},{"category":"fiction","author":"Evelyn Waugh","title":"Sword of Honour","price":12.99}]
    }

    @Test
    public void test09() {
        // Get the books from index 1 (inclusive) to index 2 (exclusive), i.e. the half-open range [1, 2)
        List<Map<String, Object>> books = JsonPath.read(json, "$..book[1:2]");
        System.out.println(books);// [{"category":"fiction","author":"Evelyn Waugh","title":"Sword of Honour","price":12.99}]
    }

    @Test
    public void test10() {
        // Get the last two books in the book array
        List<Map<String, Object>> books = JsonPath.read(json, "$..book[-2:]");
        System.out.println(books);// [{"category":"fiction","author":"Herman Melville","title":"Moby Dick","isbn":"0-553-21311-3","price":8.99},{"category":"fiction","author":"J. R. R. Tolkien","title":"The Lord of the Rings","isbn":"0-395-19395-8","price":22.99}]
    }

    @Test
    public void test11() {
        // Get the books from the third one (index 2) through the last one, inclusive
        List<Map<String, Object>> books = JsonPath.read(json, "$..book[2:]");
        System.out.println(books);// [{"category":"fiction","author":"Herman Melville","title":"Moby Dick","isbn":"0-553-21311-3","price":8.99},{"category":"fiction","author":"J. R. R. Tolkien","title":"The Lord of the Rings","isbn":"0-395-19395-8","price":22.99}]
    }

    @Test
    public void test12() {
        // Get all books that have an isbn property
        List<Map<String, Object>> books = JsonPath.read(json, "$..book[?(@.isbn)]");
        System.out.println(books);// [{"category":"fiction","author":"Herman Melville","title":"Moby Dick","isbn":"0-553-21311-3","price":8.99},{"category":"fiction","author":"J. R. R. Tolkien","title":"The Lord of the Rings","isbn":"0-395-19395-8","price":22.99}]
    }

    @Test
    public void test13() {
        // Get all books whose price is less than 10
        List<Map<String, Object>> books = JsonPath.read(json, "$..book[?(@.price < 10)]");
        System.out.println(books);// [{"category":"reference","author":"Nigel Rees","title":"Sayings of the Century","price":8.95},{"category":"fiction","author":"Herman Melville","title":"Moby Dick","isbn":"0-553-21311-3","price":8.99}]
    }
}

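Functions

JsonPath also supports functions, which can be called at the end of a path. A function takes the output of the path expression as its input, and its output type is determined by the function itself:

Function | Description | Output type
min() | Minimum value of an array of numbers | Double
max() | Maximum value of an array of numbers | Double
avg() | Average value of an array of numbers | Double
stddev() | Standard deviation of an array of numbers | Double
length() | Length of an array | Integer
sum() | Sum of an array of numbers | Double
keys() | Property keys of an object | Set
concat(X) | A concatenated version of the path output with a new item | like input
append(X) | Adds an item to the output array of the path expression | like input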
Examples:

import com.jayway.jsonpath.JsonPath;
import org.junit.Test; // JUnit 4 assumed; with JUnit 5 import org.junit.jupiter.api.Test instead

import java.util.Set;

public class JsonPathTest {
    private String json = "{\"store\":{\"book\":[{\"category\":\"reference\",\"author\":\"Nigel Rees\",\"title\":\"Sayings of the Century\",\"price\":8.95},{\"category\":\"fiction\",\"author\":\"Evelyn Waugh\",\"title\":\"Sword of Honour\",\"price\":12.99},{\"category\":\"fiction\",\"author\":\"Herman Melville\",\"title\":\"Moby Dick\",\"isbn\":\"0-553-21311-3\",\"price\":8.99},{\"category\":\"fiction\",\"author\":\"J. R. R. Tolkien\",\"title\":\"The Lord of the Rings\",\"isbn\":\"0-395-19395-8\",\"price\":22.99}],\"bicycle\":{\"color\":\"red\",\"price\":19.95}},\"expensive\":10}";

    @Test
    public void test01() {
        // Get the number of books
        int length = JsonPath.read(json, "$..book.length()");
        System.out.println(length);
    }

    @Test
    public void test02() {
        // Get the lowest book price
        double min = JsonPath.read(json, "$..book..price.min()");
        System.out.println(min);// 8.95
    }

    @Test
    public void test03() {
        // Get the highest book price
        double max = JsonPath.read(json, "$..book..price.max()");
        System.out.println(max);// 22.99
    }

    @Test
    public void test04() {
        // Get the average book price
        double avg = JsonPath.read(json, "$..book..price.avg()");
        System.out.println(avg);// 13.48
    }

    @Test
    public void test05() {
        // Get the standard deviation of the book prices (a statistics concept that is rarely needed)
        Object obj = JsonPath.read(json, "$..book..price.stddev()");
        System.out.println(obj);
    }

    @Test
    public void test06() {
        // Get the total price of all books
        double sum = JsonPath.read(json, "$..book..price.sum()");
        System.out.println(sum);// 53.92
    }

    @Test
    public void test07() {
        // Get all property keys of the third book (index 2)
        Set<String> set = JsonPath.read(json, "$.store.book[2].keys()");
        System.out.println(set);// [category, author, title, isbn, price]
    }
}

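Filter Operators

Filters are logical expressions used to filter arrays. A typical filter is [?(@.age > 18)], where @ represents the current item being processed. More complex filters can be built with the logical operators && and ||. String literals must be enclosed in single or double quotes ([?(@.color == 'blue')] or [?(@.color == "blue")]).

Operator | Description
== | left is equal to right (note that 1 is not equal to '1')
!= | left is not equal to right
< | left is less than right
<= | left is less than or equal to right
> | left is greater than right
>= | left is greater than or equal to right
=~ | left matches the regular expression, e.g. [?(@.name =~ /foo.*?/i)]
in | left exists in right, e.g. [?(@.size in ['S', 'M'])]
nin | left does not exist in right
subsetof | left is a subset of right, e.g. [?(@.sizes subsetof ['S','M','L'])]
anyof | left has an intersection with right, e.g. [?(@.sizes anyof ['M','L'])]
noneof | left has no intersection with right, e.g. [?(@.sizes noneof ['M','L'])]
size | the size of left (array or string) matches right
empty | left (array or string) is empty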
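
The set-style operators (in, nin, subsetof, anyof, noneof) are easy to mix up, so the sketch below restates each one as a one-line check over ordinary Java collections. It is only an illustration of the semantics described in the table, not the library's implementation; FilterOperatorSketch and its method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

final class FilterOperatorSketch {
    /** in: the left value is one of the values on the right. */
    static <T> boolean in(T left, List<T> right) {
        return right.contains(left);
    }

    /** nin: the left value is none of the values on the right. */
    static <T> boolean nin(T left, List<T> right) {
        return !right.contains(left);
    }

    /** subsetof: every element on the left also appears on the right. */
    static <T> boolean subsetOf(List<T> left, List<T> right) {
        return right.containsAll(left);
    }

    /** anyof: left and right share at least one element. */
    static <T> boolean anyOf(List<T> left, List<T> right) {
        return !Collections.disjoint(left, right);
    }

    /** noneof: left and right share no element. */
    static <T> boolean noneOf(List<T> left, List<T> right) {
        return Collections.disjoint(left, right);
    }

    public static void main(String[] args) {
        List<String> sizes = Arrays.asList("S", "M");
        System.out.println(in("S", sizes));                                // true
        System.out.println(nin("XL", sizes));                              // true
        System.out.println(subsetOf(sizes, Arrays.asList("S", "M", "L"))); // true
        System.out.println(anyOf(sizes, Arrays.asList("M", "L")));         // true
        System.out.println(noneOf(sizes, Arrays.asList("M", "L")));        // false
        // The number 1 and the string "1" are different values, as the == row notes.
        System.out.println(Objects.equals(1, "1"));                        // false
    }
}
```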

Examples:

import com.jayway.jsonpath.JsonPath;
import org.junit.Test; // JUnit 4 assumed; with JUnit 5 import org.junit.jupiter.api.Test instead

import java.util.List;
import java.util.Map;

public class JsonPathTest {
    private String json = "{\"store\":{\"book\":[{\"category\":\"reference\",\"author\":\"Nigel Rees\",\"title\":\"Sayings of the Century\",\"price\":8.95},{\"category\":\"fiction\",\"author\":\"Evelyn Waugh\",\"title\":\"Sword of Honour\",\"price\":12.99},{\"category\":\"fiction\",\"author\":\"Herman Melville\",\"title\":\"Moby Dick\",\"isbn\":\"0-553-21311-3\",\"price\":8.99},{\"category\":\"fiction\",\"author\":\"J. R. R. Tolkien\",\"title\":\"The Lord of the Rings\",\"isbn\":\"0-395-19395-8\",\"price\":22.99}],\"bicycle\":{\"color\":\"red\",\"price\":19.95}},\"expensive\":10}";

    @Test
    public void test01() {
        // Find books whose category equals "fiction"
        List<Map<String, Object>> books = JsonPath.read(json, "$.store.book[?(@.category=='fiction')]");
        System.out.println(books);// [{"category":"fiction","author":"Evelyn Waugh","title":"Sword of Honour","price":12.99},{"category":"fiction","author":"Herman Melville","title":"Moby Dick","isbn":"0-553-21311-3","price":8.99},{"category":"fiction","author":"J. R. R. Tolkien","title":"The Lord of the Rings","isbn":"0-395-19395-8","price":22.99}]
    }

    @Test
    public void test02() {
        // Find books whose category does not equal "fiction"
        List<Map<String, Object>> books = JsonPath.read(json, "$.store.book[?(@.category!='fiction')]");
        System.out.println(books);// [{"category":"reference","author":"Nigel Rees","title":"Sayings of the Century","price":8.95}]
    }

    @Test
    public void test03() {
        // Find books whose price is less than 10
        List<Map<String, Object>> books = JsonPath.read(json, "$.store.book[?(@.price<10)]");
        System.out.println(books);// [{"category":"reference","author":"Nigel Rees","title":"Sayings of the Century","price":8.95},{"category":"fiction","author":"Herman Melville","title":"Moby Dick","isbn":"0-553-21311-3","price":8.99}]
    }

    @Test
    public void test04() {
        // Find books whose price is greater than 10
        List<Map<String, Object>> books = JsonPath.read(json, "$.store.book[?(@.price>10)]");
        System.out.println(books);// [{"category":"fiction","author":"Evelyn Waugh","title":"Sword of Honour","price":12.99},{"category":"fiction","author":"J. R. R. Tolkien","title":"The Lord of the Rings","isbn":"0-395-19395-8","price":22.99}]
    }

    @Test
    public void test05() {
        // Use a case-insensitive regular expression to find books whose title contains "of the"
        List<Map<String, Object>> books = JsonPath.read(json, "$.store.book[?(@.title =~ /.*of the.*/i)]");
        System.out.println(books);// [{"category":"reference","author":"Nigel Rees","title":"Sayings of the Century","price":8.95},{"category":"fiction","author":"J. R. R. Tolkien","title":"The Lord of the Rings","isbn":"0-395-19395-8","price":22.99}]
    }

    @Test
    public void test06() {
        // Find books whose category is in ["reference","abc"]
        List<
