Elasticsearch之使用RestClient实现script正则countsource查询

Posted 2021-08-30 你是小KS

tags:

篇首语：本文由小常识网(cha138.com)小编为大家整理，主要介绍了Elasticsearch之使用RestClient实现script正则countsource查询相关的知识，希望对你有一定的参考价值。

当前版本elasticsearch 7.13.4

1. 声明

当前内容主要为本人学习和使用RestClietn实现script、正则、count、source查询,主要参考：官方文档

主要涉及

使用script实现脚本查询
使用正则进行匹配查询
使用count查询文档数量
使用source只查询返回的_source中的内容

当前文章基于前面博文：Es操作

2. 基本的script查询

官方的：

但是本人用postman的为：

  "query": {
    "bool": {
      "filter": {
        "script": {
          //"script": "double price = doc['price'].value;if(doc['bookName'].value == 'javaSE') {return price>10;}return false;"
          "script": "double price = doc['price'].value;long count = doc['count'].value;if( count > 1 ) {return price>10;}return false;"
        }
      }
    }
  }
}

所以测试发现使用双引号中就可以使用脚本，但是脚本中不可使用doc[字符串]

如果使用doc[‘具有字符串属性的字段’]，直接报错，还有就是默认写入整数实际返回为long,小数默认为double

"caused_by": 
{
   "type": "illegal_argument_exception",
    "reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [bookName] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
}

具体java操作

public static void main(String[] args) throws IOException {
		RestClient restClient = RestClient.builder(new HttpHost("localhost", 9200, "http")).build();
		// Sniffer 默认为5分钟一次更新节点信息（主要为控制节点，判断节点是否可以访问）
		// sniffer的主要作用就是按照指定的时间间隔方式修改restClient的setNodes方法
		Sniffer sniffer = Sniffer.builder(restClient).build();
		// 手动设置为1分钟更新一次节点
		// Sniffer.builder(restClient).setSniffIntervalMillis(60*1000);
		try {
			// 	查询操作：使用script方式进行匹配查询.，脚本查询必须返回boolean类型
			selectDataUsingScript(restClient);
			
		} finally {
			// 注意关闭顺序
			sniffer.close();
			restClient.close();
		}

	}
/**
	 * 
	 * @author hy
	 * @createTime 2021-08-01 08:47:55
	 * @description 当前内容主要为使用脚本方式查询es中的数据
	 * @param restClient
	 * @throws IOException 
	 *
	 */
	private static void selectDataUsingScript(RestClient restClient) throws IOException {
		// 查询书名为java se且价格大于10的,且count数量大于1的，使用脚本方式查询
		Request request = new Request("GET", "/book/java/_search");
		request.setJsonEntity("{\\"query\\": {\\"bool\\": {\\"filter\\": {\\"script\\": {\\"script\\": \\"double price = doc['price'].value;long count = doc['count'].value;if( count > 1 ) {return price>10;}return false;\\"\\n}}}}}");
		Response response = restClient.performRequest(request);
		System.out.println(response);
		String result = getResponseContent(response);
		System.out.println(result);
		// 注意使用官方的"""script"""方式是不能查询的，必须使用"script":"script方式"才可以查询
		// 查询注意事项，不可使用带有索引的字符串的使用doc['属性']来获取属性会报错的
	}

/**
	 * 
	 * @author hy
	 * @createTime 2021-07-31 13:51:11
	 * @description 获取带有返回值的响应数据，例如select查询操作
	 * @param response
	 * @return
	 * @throws UnsupportedOperationException
	 * @throws IOException
	 *
	 */
	private static String getResponseContent(Response response) throws UnsupportedOperationException, IOException {
		if (response == null) {
			return null;
		}
		HttpEntity entity = response.getEntity();
		StringBuilder builder = new StringBuilder();
		if (entity != null) {
			InputStream content = entity.getContent();
			BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(content));
			String line = null;
			while ((line = bufferedReader.readLine()) != null) {
				builder.append(line);
			}
		}
		return builder.toString();
	}

结果为

八月 01, 2021 11:12:22 上午 org.elasticsearch.client.RestClient logResponse
警告: request [GET http://localhost:9200/book/java/_search] returned 2 warnings: [299 Elasticsearch-7.13.4-c5f60e894ca0c61cdbae4f5a686d9f08bcefc942 "Elasticsearch built-in security features are not enabled. Without authentication, your cluster could be accessible to anyone. See https://www.elastic.co/guide/en/elasticsearch/reference/7.13/security-minimal-setup.html to enable security."],[299 Elasticsearch-7.13.4-c5f60e894ca0c61cdbae4f5a686d9f08bcefc942 "[types removal] Specifying types in search requests is deprecated."]
八月 01, 2021 11:12:22 上午 org.elasticsearch.client.RestClient logResponse
警告: request [GET http://localhost:9200/_nodes/http?timeout=1000ms] returned 1 warnings: [299 Elasticsearch-7.13.4-c5f60e894ca0c61cdbae4f5a686d9f08bcefc942 "Elasticsearch built-in security features are not enabled. Without authentication, your cluster could be accessible to anyone. See https://www.elastic.co/guide/en/elasticsearch/reference/7.13/security-minimal-setup.html to enable security."]
Response{requestLine=GET /book/java/_search HTTP/1.1, host=http://localhost:9200, response=HTTP/1.1 200 OK}
{"took":2,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":1,"relation":"eq"},"max_score":0.0,"hits":[{"_index":"book","_type":"java","_id":"1","_score":0.0,"_source":{"bookName":"java se","count":5,"id":1,"price":18.5,"publishDate":"2021-07-31 12:12:12"}}]}}

2. 正则查询


	/**
	 * 
	 * @author hy
	 * @createTime 2021-07-31 16:30:29
	 * @description 使用正则的方式匹配查询
	 * @param restClient
	 * @throws IOException
	 *
	 */
	private static void selectDataUsingRegex(RestClient restClient) throws IOException {
		// ?表示匹配一个字符,*表示匹配0到n个字符
		// String encode = URLEncoder.encode("publishDate:202?*", "UTF-8"); // 可以匹配日期类型
		// String encode = URLEncoder.encode("publishDate:202?-*", "UTF-8");// 无法匹配2021-07-31 12:12:12
		// String encode = URLEncoder.encode("publishDate:202?\\\\-*", "UTF-8");// 无法匹配2021-07-31 12:12:12
		// String encode = URLEncoder.encode("bookName:java", "UTF-8"); // 直接查询bookName中有java的.可以匹配java se中的空格,就相当于模糊查询(只要有即可)
		 String encode = URLEncoder.encode("bookName:*java*", "UTF-8"); // 直接查询bookName中有java的.可以匹配java se中的空格
		// String encode = URLEncoder.encode("bookName:java *", "UTF-8"); // 可以匹配java se
		// String encode = URLEncoder.encode("bookName:java\\\\ *", "UTF-8"); //无法匹配java se
		// String encode = URLEncoder.encode("bookName:java\\\\ se", "UTF-8"); //可以匹配java se
		// String encode = URLEncoder.encode("bookName:ja?a\\\\ ?e", "UTF-8"); // 无法匹配java se
		// 使用正则匹配
		// String encode = URLEncoder.encode("bookName:/ja.*/", "UTF-8"); // 可以匹配 java se
		// java的正则可以匹配
		// String encode = URLEncoder.encode("bookName:/ja[a-z|A-Z]+\\\\W+[a-z|A-Z]+/", "UTF-8");  // 不可以匹配 java se，感觉有问题
		// String encode = URLEncoder.encode("bookName:/ja.*\\\\W+[a-z|A-Z]+/", "UTF-8");  // 不可以匹配 java se，感觉有问题
		// String encode = URLEncoder.encode("bookName:/ja.*\\\\W[a-z|A-Z]+/", "UTF-8"); // 不可匹配
		// String encode = URLEncoder.encode("bookName:/ja.*\\\\W?[a-z|A-Z]+/", "UTF-8");//可匹配
		// String encode = URLEncoder.encode("bookName:/ja.*\\\\W{1}[a-z|A-Z]+/", "UTF-8");// 不可匹配
		
		
		// 这个正则使用java是可以执行的
		Request request = new Request("GET", "/book/java/_search?q="+encode);
		
		
		Response response = restClient.performRequest(request);
		System.out.println(response);
		String result = getResponseContent(response);
		System.out.println(result);

	}

这里可以分为带有//的正则和不带有//的匹配查询操作，但是需要注意的是和java中的正则匹配可能不一样
例如：ja[a-z|A-Z]+\\\\W+[a-z|A-Z]+

而es中却是

这个是有点不一样的

3. count查询

/**
	 * 
	 * @author hy
	 * @createTime 2021-08-01 09:36:20
	 * @description 使用count查询
	 * @param restClient
	 * @throws IOException
	 *
	 */
	private static void selectDataUsingCount(RestClient restClient) throws IOException {
		// 查询书名为java se且价格大于10的,且count数量大于1的，使用count查询
		Request request = new Request("GET", "/book/java/_count");
		request.setJsonEntity("{\\"query\\": {\\"bool\\": {\\"filter\\": {\\"script\\": {\\"script\\": \\"double price = doc['price'].value;long count = doc['count'].value;if( count > 1 ) {return price>10;}return false;\\"\\n}}}}}");
		Response response = restClient.performRequest(request);
		System.out.println(response);
		String result = getResponseContent(response);
		System.out.println(result);
		// {"count":1,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0}}
	}

结果为：{"count":1,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0}}

_count表示查询匹配的数量

4. source查询

这个查询如果_id存在，那么就会返回_source中的内容，否则就会报错，还可以通过_source指定返回的_source中的内容

/**
	 * 
	 * @author hy
	 * @createTime 2021-08-01 09:36:20
	 * @description 只查询_source数据,不需要什么took之类的
	 * @param restClient
	 * @throws IOException
	 *
	 */
	private static void selectDataUsingSource(RestClient restClient) throws IOException {
		// 使用_source指定当前返回的_source中的结果集
		//Request request = new Request("GET", "/book/java/_search");
		//request.setJsonEntity("{\\"_source\\": [\\"bookName\\"]}");
		// {"took":6,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":1,"relation":"eq"},"max_score":1.0,"hits":[{"_index":"book","_type":"java","_id":"1","_score":1.0,"_source":{"bookName":"java se"}}]}}
		
		//	注意如果使用这个查询一个不存在的那么就会报错异常。{"error":{"root_cause":[{"type":"resource_not_found_exception","reason":"Document not found [book]/[java]/[2]"}],"type":"resource_not_found_exception","reason":"Document not found [book]/[java]/[2]"},"status":404}
		Request request = new Request("GET", "/book/java/1/_source");//{"bookName":"java se","count":5,"id":1,"price":18.5,"publishDate":"2021-07-31 12:12:12"}
		// 文档_id=1存在则正常返回
		// Request request = new Request("GET", "/book/java/2/_source"); //_id=2不存在则直接报错
		// request.setJsonEntity("{\\"_source\\": [\\"bookName\\"]}"); 错误不支持使用的时候有body
		Response response = restClient.performRequest(request);
		System.out.println(response);
		String result = getResponseContent(response);
		System.out.println(result);
		
	}

1._search中的jsonbody设置了_source可以控制_source返回的字段属性内容

2.如果直接使用_id/_source那么直接返回_source中的内容,如果_id不存在，那么直接报错，且这样使用的时候不能设置jsonBody

以上是关于Elasticsearch之使用RestClient实现script正则countsource查询的主要内容，如果未能解决你的问题，请参考以下文章