SparkSQL E-commerce User Profiling: Visualizing the User Profile Data
Posted by 大码王
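8.1 Data visualization approach
Key idea: integrate Phoenix with HBase so that the NoSQL database can be operated through familiar SQL statements.
8.2 Creating a Phoenix mapping table for the HBase table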
CREATE TABLE IF NOT EXISTS "itcast_adm_personas_hbase_2017_01_01"( rw varchar(100) primary key, "basicInfo"."user_name" VARCHAR(100), "basicInfo"."user_sex" VARCHAR(100), "basicInfo"."user_birthday" VARCHAR(100), "basicInfo"."user_age" VARCHAR(100), "basicInfo"."constellation" VARCHAR(100), "basicInfo"."province" VARCHAR(100), "basicInfo"."city" VARCHAR(100), "basicInfo"."city_level" VARCHAR(100), "basicInfo"."hex_mail" VARCHAR(100), "basicInfo"."op_mail" VARCHAR(100), "basicInfo"."hex_phone" VARCHAR(100), "basicInfo"."fore_phone" VARCHAR(100), "basicInfo"."op_phone" VARCHAR(100), "basicInfo"."add_time" VARCHAR(100), "basicInfo"."login_ip" VARCHAR(100), "basicInfo"."login_source" VARCHAR(100), "basicInfo"."request_user" VARCHAR(100), "basicInfo"."total_mark" VARCHAR(100), "basicInfo"."used_mark" VARCHAR(100), "basicInfo"."level_name" VARCHAR(100), "basicInfo"."blacklist" VARCHAR(100), "basicInfo"."is_married" VARCHAR(100), "basicInfo"."education" VARCHAR(100), "basicInfo"."monthly_money" VARCHAR(100), "basicInfo"."profession" VARCHAR(100), "basicInfo"."sex_model" VARCHAR(100), "basicInfo"."is_pregnant_woman" VARCHAR(100), "basicInfo"."is_have_children" VARCHAR(100), "basicInfo"."children_sex_rate" VARCHAR(100), "basicInfo"."children_age_rate" VARCHAR(100), "basicInfo"."is_have_car" VARCHAR(100), "basicInfo"."potential_car_user_rate" VARCHAR(100), "basicInfo"."phone_brand" VARCHAR(100), "basicInfo"."phone_brand_level" VARCHAR(100), "basicInfo"."phone_cnt" VARCHAR(100), "basicInfo"."change_phone_rate" VARCHAR(100), "basicInfo"."majia_flag" VARCHAR(100), "basicInfo"."majie_account_cnt" VARCHAR(100), "basicInfo"."loyal_model" VARCHAR(100), "basicInfo"."shopping_type_model" VARCHAR(100), "basicInfo"."figure_model" VARCHAR(100), "basicInfo"."stature_model" VARCHAR(100), "order"."first_order_time" VARCHAR(100), "order"."last_order_time" VARCHAR(100), "order"."first_order_ago" VARCHAR(100), "order"."last_order_ago" VARCHAR(100), "order"."month1_hg_order_cnt" VARCHAR(100), 
"order"."month1_hg_order_amt" VARCHAR(100), "order"."month2_hg_order_cnt" VARCHAR(100), "order"."month2_hg_order_amt" VARCHAR(100), "order"."month3_hg_order_cnt" VARCHAR(100), "order"."month3_hg_order_amt" VARCHAR(100), "order"."month1_order_cnt" VARCHAR(100), "order"."month1_order_amt" VARCHAR(100), "order"."month2_order_cnt" VARCHAR(100), "order"."month2_order_amt" VARCHAR(100), "order"."month3_order_cnt" VARCHAR(100), "order"."month3_order_amt" VARCHAR(100), "order"."max_order_amt" VARCHAR(100), "order"."min_order_amt" VARCHAR(100), "order"."total_order_cnt" VARCHAR(100), "order"."total_order_amt" VARCHAR(100), "order"."user_avg_amt" VARCHAR(100), "order"."month3_user_avg_amt" VARCHAR(100), "order"."common_address" VARCHAR(100), "order"."common_paytype" VARCHAR(100), "order"."month1_cart_cnt" VARCHAR(100), "order"."month1_cart_goods_cnt" VARCHAR(100), "order"."month1_cart_submit_cnt" VARCHAR(100), "order"."month1_cart_rate" VARCHAR(100), "order"."month1_cart_cancle_cnt" VARCHAR(100), "order"."return_cnt" VARCHAR(100), "order"."return_amt" VARCHAR(100), "order"."reject_cnt" VARCHAR(100), "order"."reject_amt" VARCHAR(100), "order"."last_return_time" VARCHAR(100), "order"."school_order_cnt" VARCHAR(100), "order"."company_order_cnt" VARCHAR(100), "order"."home_order_cnt" VARCHAR(100), "order"."forenoon_order_cnt" VARCHAR(100), "order"."afternoon_order_cnt" VARCHAR(100), "order"."night_order_cnt" VARCHAR(100), "order"."morning_order_cnt" VARCHAR(100), "category"."first_category_id" VARCHAR(100), "category"."first_category_name" VARCHAR(100), "category"."second_category_id" VARCHAR(100), "category"."second_catery_name" VARCHAR(100), "category"."third_category_id" VARCHAR(100), "category"."third_category_name" VARCHAR(100), "category"."month1_category_cnt" VARCHAR(100), "category"."month1_category_amt" VARCHAR(100), "category"."month3_category_cnt" VARCHAR(100), "category"."month3_category_amt" VARCHAR(100), "category"."month6_category_cnt" VARCHAR(100), 
"category"."month6_category_amt" VARCHAR(100), "category"."total_category_cnt" VARCHAR(100), "category"."total_category_amt" VARCHAR(100), "category"."month1_cart_category_cnt" VARCHAR(100), "category"."month3_cart_category_cnt" VARCHAR(100), "category"."month6_cart_category_cnt" VARCHAR(100), "category"."total_cart_category_cnt" VARCHAR(100), "category"."last_category_time" VARCHAR(100), "category"."last_category_ago" VARCHAR(100), "visit"."latest_pc_visit_date" VARCHAR(100), "visit"."latest_app_visit_date" VARCHAR(100), "visit"."latest_pc_visit_session" VARCHAR(100), "visit"."latest_pc_cookies" VARCHAR(100), "visit"."latest_pc_pv" VARCHAR(100), "visit"."latest_pc_browser_name" VARCHAR(100), "visit"."latest_pc_visit_os" VARCHAR(100), "visit"."latest_app_name" VARCHAR(100), "visit"."latest_app_visit_os" VARCHAR(100), "visit"."latest_visit_ip" VARCHAR(100), "visit"."latest_city" VARCHAR(100), "visit"."latest_province" VARCHAR(100), "visit"."first_pc_visit_date" VARCHAR(100), "visit"."first_app_visit_date" VARCHAR(100), "visit"."first_pc_visit_session" VARCHAR(100), "visit"."first_pc_cookies" VARCHAR(100), "visit"."first_pc_pv" VARCHAR(100), "visit"."first_pc_browser_name" VARCHAR(100), "visit"."first_pc_visit_os" VARCHAR(100), "visit"."first_app_name" VARCHAR(100), "visit"."first_app_visit_os" VARCHAR(100), "visit"."first_visit_ip" VARCHAR(100), "visit"."first_city" VARCHAR(100), "visit"."first_province" VARCHAR(100), "visit"."day7_app_cnt" VARCHAR(100), "visit"."day15_app_cnt" VARCHAR(100), "visit"."month1_app_cnt" VARCHAR(100), "visit"."month2_app_cnt" VARCHAR(100), "visit"."month3_app_cnt" VARCHAR(100), "visit"."day7_pc_cnt" VARCHAR(100), "visit"."day15_pc_cnt" VARCHAR(100), "visit"."month1_pc_cnt" VARCHAR(100), "visit"."month2_pc_cnt" VARCHAR(100), "visit"."month3_pc_cnt" VARCHAR(100), "visit"."month1_pc_days" VARCHAR(100), "visit"."month1_pc_pv" VARCHAR(100), "visit"."month1_pc_avg_pv" VARCHAR(100), "visit"."month1_pc_diff_ip_cnt" VARCHAR(100), 
"visit"."month1_pc_diff_cookie_cnt" VARCHAR(100), "visit"."month1_pc_common_ip" VARCHAR(100), "visit"."month1_pc_common_cookie" VARCHAR(100), "visit"."month1_pc_common_browser_name" VARCHAR(100), "visit"."month1_pc_common_os" VARCHAR(100), "visit"."month1_hour025_cnt" VARCHAR(100), "visit"."month1_hour627_cnt" VARCHAR(100), "visit"."month1_hour829_cnt" VARCHAR(100), "visit"."month1_hour10212_cnt" VARCHAR(100), "visit"."month1_hour13214_cnt" VARCHAR(100), "visit"."month1_hour15217_cnt" VARCHAR(100), "visit"."month1_hour18219_cnt" VARCHAR(100), "visit"."month1_hour20221_cnt" VARCHAR(100), "visit"."month1_hour22223_cnt" VARCHAR(100) );
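A few points about this statement are worth noting.
Before creating the mapping table, be aware of how Phoenix handles case: unquoted identifiers are normalized to upper case, while identifiers in double quotes are case-sensitive. If the table name is not wrapped in double quotes, the resulting name is upper case no matter how you typed it. To create table or column names that contain lower-case letters, as the HBase names here do, wrap them in double quotes. You can create either a read-write table (CREATE TABLE) or a read-only one (CREATE VIEW). The differences, along with a few other notes:
- Read-write table: column families that you declare but that do not yet exist are created automatically, and an empty key value is added to existing rows.
- Read-only table: the column families you declare must already exist.
- IF NOT EXISTS ensures that if the table has already been created, its configuration is not overwritten.
- The column that serves as the rowkey is marked with PRIMARY KEY.
- A column in a column family is written as columnFamily.columnName.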
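The quoting rule above can be illustrated with a small Java sketch. `PhoenixIdentifiers`, `normalize`, and `qualified` are hypothetical names introduced only for this illustration. They are not part of the Phoenix API; they only mimic the rule that unquoted names are upper-cased and quoted names keep their case.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of the Phoenix API.
final class PhoenixIdentifiers {
    private PhoenixIdentifiers() {}

    // Mimics how a single identifier is resolved:
    // a double-quoted name keeps its case, an unquoted name is upper-cased.
    static String normalize(String identifier) {
        String id = identifier.trim();
        boolean quoted = id.length() >= 2 && id.startsWith("\"") && id.endsWith("\"");
        if (quoted) {
            return id.substring(1, id.length() - 1);
        }
        return id.toUpperCase(Locale.ROOT);
    }

    // Builds the quoted columnFamily.columnName form used in the statement above.
    static String qualified(String family, String column) {
        return "\"" + family + "\".\"" + column + "\"";
    }
}
```

Under this rule, writing `basicInfo.user_name` without quotes would resolve to `BASICINFO.USER_NAME`, which would not match the lower-case and mixed-case HBase names, so the statement quotes every family and column name.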
8.3 Building the Maven project
8.4 Querying and displaying user profiles
Combine query conditions across different dimensions to filter out the users who meet them.
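As a minimal sketch of such a combined query, the Java class below builds a parameterized SELECT against the mapping table from 8.2 and runs it over JDBC. It is not the original project's code: `PersonaQuery`, `buildSql`, and `findUsers` are hypothetical names, and the JDBC URL is assumed to have the form `jdbc:phoenix:<zookeeper quorum>` with the Phoenix client jar on the classpath. Filters are matched by equality because every column in the table is declared as VARCHAR.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; names are not from the original project.
final class PersonaQuery {
    static final String TABLE = "\"itcast_adm_personas_hbase_2017_01_01\"";

    private PersonaQuery() {}

    // Builds "SELECT rw FROM <table> WHERE "cf"."col" = ? AND ..." from filters
    // keyed by "family.column". Identifiers cannot be bound as parameters, so
    // keys should come from a fixed whitelist; values are bound with "?".
    static String buildSql(Map<String, String> filters) {
        StringBuilder sql = new StringBuilder("SELECT rw FROM ").append(TABLE);
        String separator = " WHERE ";
        for (String key : filters.keySet()) {
            String[] parts = key.split("\\.", 2);
            if (parts.length != 2 || key.contains("\"")) {
                throw new IllegalArgumentException("expected family.column, got: " + key);
            }
            sql.append(separator)
               .append('"').append(parts[0]).append("\".\"")
               .append(parts[1]).append("\" = ?");
            separator = " AND ";
        }
        return sql.toString();
    }

    // Runs the query and returns the matching rowkeys.
    // jdbcUrl is an assumption, e.g. "jdbc:phoenix:" followed by the ZooKeeper quorum.
    static List<String> findUsers(String jdbcUrl, Map<String, String> filters)
            throws SQLException {
        List<String> rowKeys = new ArrayList<>();
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             PreparedStatement ps = conn.prepareStatement(buildSql(filters))) {
            int index = 1;
            for (String value : filters.values()) {
                ps.setString(index++, value);
            }
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    rowKeys.add(rs.getString(1));
                }
            }
        }
        return rowKeys;
    }
}
```

Passing the filters in a `LinkedHashMap` keeps the order of the WHERE clauses stable, which makes the generated SQL easier to log and compare. A front end could collect the selected dimensions into that map and display the returned rowkeys.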