hive映射es

Posted 上官沐雪

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了hive映射es相关的知识,希望对你有一定的参考价值。

hive映射es

1.再hive中建里映射表

CREATE EXTERNAL TABLE `hive_es_dt_basic4_id`(
  `id` string COMMENT 'from deserializer', 
  `country` string COMMENT 'from deserializer', 
  `regstatus` string COMMENT 'from deserializer', 
  `districtcode` string COMMENT 'from deserializer', 
  `regcapital` string COMMENT 'from deserializer', 
  `city` string COMMENT 'from deserializer', 
  `emaillist` string COMMENT 'from deserializer', 
  `citycode` string COMMENT 'from deserializer', 
  `companyname` string COMMENT 'from deserializer', 
  `opto` string COMMENT 'from deserializer', 
  `phonelist` string COMMENT 'from deserializer', 
  `socialcreditcode` string COMMENT 'from deserializer')
ROW FORMAT SERDE 
  'org.elasticsearch.hadoop.hive.EsSerDe' 
STORED BY 
  'org.elasticsearch.hadoop.hive.EsStorageHandler' 
WITH SERDEPROPERTIES ( 
  'serialization.format'='1')
LOCATION
  'hdfs://nameservice1/user/hive/warehouse/customs.db/hive_es_dt_basic3'
TBLPROPERTIES (
  'es.index.auto.create'='false', 
  'es.read.metadata'='true',
  'es.read.metadata.field'='_metadata',
  'es.mapping.id' = 'id',
  'es.mapping.names'='country:country,regStatus:regStatus,districtCode:districtCode,regCapital:regCapital,city:city,emailList:emailList,cityCode:cityCode,companyName:companyName,opTo:opTo,phoneList:phoneList,socialCreditCode:socialCreditCode', 
  'es.nodes'='172.16.98.113,172.16.98.149,172.16.98.150,172.16.98.151,172.16.98.152', 
  'es.port'='9200', 
  'es.resource'='dt_basic4/dt_basic');

2.再hive建表,可把数据导入到此表

  CREATE EXTERNAL TABLE `hive_es_keywordexpand_v7_lastbak`(
  `id` string, 
  `wordexregion` string  , 
  `searchresultcount` bigint, 
  `isad` bigint , 
  `bidwordcompanycount` bigint , 
  `pageurl` string , 
  `longkeywordcount` bigint , 
  `wordroot` string  , 
  `baiduindex` double  , 
  `mobileindex` double , 
  `qihooindex` double  , 
  `baidupc` bigint , 
  `baidumobile` bigint , 
  `competition` bigint , 
  `keytype` int , 
  `status` int  , 
  `semreason` string  , 
  `semprice` string , 
  `createtime` bigint , 
  `updatetime` bigint   )
COMMENT '关键词'
row format delimited fields terminated by '\\001' 
STORED AS ORC TBLPROPERTIES ("orc.compress"="SNAPPY");

若缺少jar包可添加相应加包:

add jar /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/hive/auxlib/elasticsearch-hadoop-6.3.0.jar;
add jar /data/jar/httpclient-4.5.5.jar;
add jar /data/jar/org.apache.commons.httpclient.jar;

下面是es常用指令:
1.es使用postman 复制索引到另一个索引

http://121.46.193.39:9200/_reindex/

  "source": 
    "index": "enexpand",
	"size":10000
  ,
  "dest": 
    "index": "enexpand_bak"
  

下面是带es带子集的索引导入

drop table hive_es_dt_basic6_test;
CREATE EXTERNAL TABLE `hive_es_dt_basic6_test`(
  `phone` string COMMENT 'from deserializer', 
  `phone_source` string COMMENT 'from deserializer', 
  `routing` string COMMENT 'from deserializer', 
  `my_join_field` map<string,string> COMMENT 'from deserializer')
ROW FORMAT SERDE 
  'org.elasticsearch.hadoop.hive.EsSerDe' 
STORED BY 
  'org.elasticsearch.hadoop.hive.EsStorageHandler' 
WITH SERDEPROPERTIES ( 
  'serialization.format'='1')
LOCATION
  'hdfs://nameservice1/user/hive/warehouse/customs.db/hive_es_dt_basic'
TBLPROPERTIES (
  'es.index.auto.create'='false', 
  'es.mapping.names'='phone:phone,phone_source:phone_source,my_join_field:my_join_field', 
  'es.mapping.routing'='routing',
  'es.mapping.join'='my_join_field',
  'es.nodes'='ips', 
  'es.port'='9200', 
  'es.read.metadata'='true', 
  'es.read.metadata.field'='_metadata', 
  'es.resource'='dt_basic6/dt_basic', 
  'last_modified_by'='root', 
  'last_modified_time'='1627226340', 
  'transient_lastDdlTime'='1627226340');
  
  
  insert into table hive_es_dt_basic6_test 
select '1212121','111','1',map('name','phone','parent','1');

注:hive预查入es的表最好使用hive默认的存储方式

以上是关于hive映射es的主要内容,如果未能解决你的问题,请参考以下文章

hive映射Elasticsearch时间date类型问题

hive映射Elasticsearch时间date类型问题

Hive 集成 ElasticSearch

Spark+ES+ClickHouse 构建DMP用户画像

从Hive导入数据到ES

hive访问es的问题