[Elasticsearch] Implementing GROUP BY queries in Elasticsearch 6 with Java

Posted by 一杯糖不加咖啡


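[Repost] The previous post summarized how to use count, distinct, and count(distinct) in Elasticsearch 6. This post continues with aggregation queries, specifically the equivalent of GROUP BY in MySQL.
Shared entity
All the queries below return their results in the same entity class: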

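import java.io.Serializable;
import lombok.Data;

/**
 * Result of grouping by a single field.
 *
 * @date : 2020-11-18 15:02
 */
@Data
public class AggregationForOneDTO implements Serializable {
    /**
     * Value of the grouping field.
     */
    private String key;
    /**
     * Number of documents counted for this group.
     */
    private Integer count;
}
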
  1. Group by with count
    The equivalent SQL in MySQL is:
select field1,count(field2) from table_name group by field1;

The corresponding Elasticsearch code for the SQL above is:

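// Imports for Elasticsearch 6.x; they belong at the top of the enclosing test class.
// restHighLevelClient and log are fields of that class.
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import com.alibaba.fastjson.JSON;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.BucketOrder;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
import org.elasticsearch.search.builder.SearchSourceBuilder;

/**
 * Groups the documents of the given index by one field and counts the documents in each group.
 */
@Test
public void testCountGroupBy() {
    SearchRequest searchRequest = new SearchRequest();
    searchRequest.indices("indexName").types("indexType");
    TermsAggregationBuilder aggregation = AggregationBuilders
            // aggregation name (alias), used later to read the result
            .terms("uid")
            // field to group by
            .field("uid.keyword")
            // order buckets by document count, descending
            .order(BucketOrder.count(false))
            // number of buckets to return; the default is only the top 10
            .size(100);
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.aggregation(aggregation);
    searchRequest.source(searchSourceBuilder);
    List<AggregationForOneDTO> result = new ArrayList<>();
    try {
        // execute the query
        SearchResponse response = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        log.info("response is {}", response);
        Terms uidAggregation = response.getAggregations().get("uid");
        for (Terms.Bucket bucket : uidAggregation.getBuckets()) {
            AggregationForOneDTO aggregationForOne = new AggregationForOneDTO();
            // getDocCount() returns a long; the DTO stores an Integer
            aggregationForOne.setCount((int) bucket.getDocCount());
            aggregationForOne.setKey(bucket.getKeyAsString());
            result.add(aggregationForOne);
        }
    } catch (IOException e) {
        log.error("[EsClientConfig.groupByField][error][fail to query]", e);
    }
    log.info("result is {}", JSON.toJSONString(result));
}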

In the result, each key is a value of the grouping field, and count is the number of documents that have that value.
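To make the shape of that result concrete, the following is a minimal in-memory sketch of what a terms aggregation with descending count order and a size limit computes. It does not call Elasticsearch; the class name `GroupByCountSketch`, the method `groupByCount`, and the sample uid values are hypothetical, and ties are broken by key here only to keep the output deterministic.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class GroupByCountSketch {

    /**
     * Groups the values, counts each group, sorts by count descending
     * (ties by key ascending) and keeps at most `size` groups.
     */
    public static Map<String, Long> groupByCount(List<String> values, int size) {
        // select value, count(*) ... group by value
        Map<String, Long> counts = values.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        Comparator<Map.Entry<String, Long>> byCountDesc =
                (a, b) -> Long.compare(b.getValue(), a.getValue());

        // LinkedHashMap keeps the sorted order of the buckets
        Map<String, Long> result = new LinkedHashMap<>();
        counts.entrySet().stream()
                .sorted(byCountDesc.thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(size)
                .forEach(e -> result.put(e.getKey(), e.getValue()));
        return result;
    }

    public static void main(String[] args) {
        List<String> uids = Arrays.asList("u1", "u2", "u1", "u3", "u1", "u2");
        System.out.println(groupByCount(uids, 100)); // prints {u1=3, u2=2, u3=1}
    }
}
```

Each map entry plays the role of one bucket: the map key corresponds to `key` in AggregationForOneDTO and the map value to `count`.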

  2. Group by with distinct count
    The equivalent SQL in MySQL is:
select field1,count(distinct (field2)) from table_name group by field1;

The corresponding Elasticsearch query code is:

// Aggregation-specific imports for Elasticsearch 6.x (top of the enclosing test class):
import org.elasticsearch.search.aggregations.metrics.cardinality.Cardinality;
import org.elasticsearch.search.aggregations.metrics.cardinality.CardinalityAggregationBuilder;

@Test
public void testCountDistinctGroupBy() {
    SearchRequest searchRequest = new SearchRequest();
    searchRequest.indices("indexName").types("indexType");
    // de-duplicated field: cardinality() takes the aggregation name, field() the field name
    CardinalityAggregationBuilder aggregationBuilder =
            AggregationBuilders.cardinality("distinct_count").field("field_distinct");
    // grouping field: terms() takes the aggregation name, field() the field name
    TermsAggregationBuilder aggregation = AggregationBuilders.terms("group_by_field")
            // field to group by
            .field("field_group")
            .subAggregation(aggregationBuilder)
            .size(100)
            // order buckets by the distinct count, descending;
            // the argument is the name of the sub-aggregation, not of the field
            .order(BucketOrder.aggregation("distinct_count", false));
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.aggregation(aggregation);
    searchRequest.source(searchSourceBuilder);
    List<AggregationForOneDTO> result = new ArrayList<>();
    try {
        // execute the query
        SearchResponse response = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        // aggregations are looked up by the names given when building them
        Terms groupAggregation = response.getAggregations().get("group_by_field");
        for (Terms.Bucket bucket : groupAggregation.getBuckets()) {
            // read the sub-aggregation value directly instead of round-tripping through JSON
            Cardinality distinctCount = bucket.getAggregations().get("distinct_count");
            AggregationForOneDTO aggregationForOne = new AggregationForOneDTO();
            // getValue() returns a long; the DTO stores an Integer
            aggregationForOne.setCount((int) distinctCount.getValue());
            aggregationForOne.setKey(bucket.getKeyAsString());
            result.add(aggregationForOne);
        }
    } catch (IOException e) {
        log.error("[EsClientConfig.groupByField][error][fail to query]", e);
    }
    log.info("result is {}", JSON.toJSONString(result));
}

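The result has the same shape as in the first query, except that count is the number of distinct values of the de-duplicated field within each group.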
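As a rough illustration of what count(distinct) per group means, here is a small in-memory sketch. It does not call Elasticsearch; the `Doc` class, the method `distinctCountByGroup`, and the sample data are hypothetical. It counts exactly, whereas the Elasticsearch cardinality aggregation returns an approximate count, so the two can differ on fields with very many distinct values.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class GroupByDistinctCountSketch {

    /** Hypothetical document: one grouping field and one field to de-duplicate. */
    public static final class Doc {
        final String group;
        final String value;

        public Doc(String group, String value) {
            this.group = group;
            this.value = value;
        }
    }

    /**
     * For each group, counts the distinct values, sorts by that count descending
     * (ties by group key ascending) and keeps at most `size` groups.
     */
    public static Map<String, Integer> distinctCountByGroup(List<Doc> docs, int size) {
        // collect the set of distinct values seen in each group
        Map<String, Set<String>> distinctValues = docs.stream()
                .collect(Collectors.groupingBy(d -> d.group,
                        Collectors.mapping(d -> d.value, Collectors.toSet())));

        Comparator<Map.Entry<String, Set<String>>> byDistinctDesc =
                (a, b) -> Integer.compare(b.getValue().size(), a.getValue().size());

        // LinkedHashMap keeps the sorted order of the groups
        Map<String, Integer> result = new LinkedHashMap<>();
        distinctValues.entrySet().stream()
                .sorted(byDistinctDesc.thenComparing(
                        Map.Entry.<String, Set<String>>comparingByKey()))
                .limit(size)
                .forEach(e -> result.put(e.getKey(), e.getValue().size()));
        return result;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc("a", "x"), new Doc("a", "x"), new Doc("a", "y"),
                new Doc("b", "x"), new Doc("b", "y"), new Doc("b", "z"));
        System.out.println(distinctCountByGroup(docs, 100)); // prints {b=3, a=2}
    }
}
```

Group "a" contains three documents but only two distinct values, which is the difference between this query and the plain count in the first example.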
