Kekuai Logistics Big Data Project (50): Initializing the Project Framework
Posted by Lansonli
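Initializing the Project Framework
I. Set Up the Project
groupId | artifactId | Module | How it is added |
cn.it | it-logistics-parent | Parent project | Create |
cn.it | logistics-common | Common module | Create |
cn.it | logistics-etl | Real-time ETL processing module | Create |
cn.it | logistics-offline | Offline metrics computation module | Create |
cn.it | logistics-generate | Data generator module | Import |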
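As a rough sketch of how this layout maps onto Maven, the parent POM would use `pom` packaging and list the child modules. The `1.0-SNAPSHOT` version is an assumption borrowed from the `logistics-common` dependency declared later in this article, and the module directory names are assumed to match the artifact IDs.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>cn.it</groupId>
    <artifactId>it-logistics-parent</artifactId>
    <!-- Assumed version; the article does not state the parent's version -->
    <version>1.0-SNAPSHOT</version>
    <!-- A parent (aggregator) project must use pom packaging -->
    <packaging>pom</packaging>

    <!-- Each entry is a directory name relative to the parent POM (assumed to equal the artifactId) -->
    <modules>
        <module>logistics-common</module>
        <module>logistics-etl</module>
        <module>logistics-offline</module>
        <module>logistics-generate</module>
    </modules>
</project>
```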
II. Import Dependencies
1. Parent project dependencies
<repositories>
<repository>
<id>mvnrepository</id>
<url>https://mvnrepository.com/</url>
<layout>default</layout>
</repository>
<repository>
<id>cloudera</id>
<url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>
<repository>
<id>elastic.co</id>
<url>https://artifacts.elastic.co/maven</url>
</repository>
</repositories>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<!-- SDK -->
<java.version>1.8</java.version>
<scala.version>2.11</scala.version>
<!-- Junit -->
<junit.version>4.12</junit.version>
<!-- HTTP Version -->
<http.version>4.5.11</http.version>
<!-- Hadoop -->
<hadoop.version>3.0.0-cdh6.2.1</hadoop.version>
<!-- Spark -->
<spark.version>2.4.0-cdh6.2.1</spark.version>
<!-- Spark Graph Visual -->
<gs.version>1.3</gs.version>
<breeze.version>1.0</breeze.version>
<jfreechart.version>1.5.0</jfreechart.version>
<!-- Parquet -->
<parquet.version>1.9.0-cdh6.2.1</parquet.version>
<!-- Kudu -->
<kudu.version>1.9.0-cdh6.2.1</kudu.version>
<!-- Hive -->
<hive.version>2.1.1-cdh6.2.1</hive.version>
<!-- Kafka -->
<kafka.version>2.1.0-cdh6.2.1</kafka.version>
<!-- ClickHouse -->
<clickhouse.version>0.2.2</clickhouse.version>
<!-- ElasticSearch -->
<es.version>7.6.1</es.version>
<!-- JSON Version -->
<fastjson.version>1.2.62</fastjson.version>
<!-- Apache Commons Version -->
<commons-io.version>2.6</commons-io.version>
<commons-lang3.version>3.10</commons-lang3.version>
<commons-beanutils.version>1.9.4</commons-beanutils.version>
<!-- JDBC Drivers Version-->
<ojdbc.version>12.2.0.1</ojdbc.version>
<mysql.version>5.1.44</mysql.version>
<!-- Other -->
<jtuple.version>1.2</jtuple.version>
<!-- Maven Plugins Version -->
<maven-compiler-plugin.version>3.1</maven-compiler-plugin.version>
<maven-surefire-plugin.version>2.19.1</maven-surefire-plugin.version>
<maven-shade-plugin.version>3.2.1</maven-shade-plugin.version>
</properties>
<dependencyManagement>
<dependencies>
<!-- Test -->
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>${junit.version}</version>
<scope>test</scope>
</dependency>
<!-- JDBC -->
<dependency>
<groupId>com.oracle.jdbc</groupId>
<artifactId>ojdbc8</artifactId>
<version>${ojdbc.version}</version>
<systemPath>E:/softs/db/jdbc-drivers/ojdbc8-12.2.0.1.jar</systemPath>
<scope>system</scope>
</dependency>
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>${mysql.version}</version>
</dependency>
<!-- Http -->
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<version>${http.version}</version>
</dependency>
<!-- Apache Kafka -->
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka_${scala.version}</artifactId>
<version>${kafka.version}</version>
<exclusions>
<exclusion>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- Spark -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_${scala.version}</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql-kafka-0-10_2.11</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>org.apache.parquet</groupId>
<artifactId>parquet-common</artifactId>
<version>${parquet.version}</version>
</dependency>
<dependency>
<groupId>net.jpountz.lz4</groupId>
<artifactId>lz4</artifactId>
<version>1.3.0</version>
</dependency>
<!-- Graph Visual -->
<dependency>
<groupId>org.graphstream</groupId>
<artifactId>gs-core</artifactId>
<version>${gs.version}</version>
</dependency>
<dependency>
<groupId>org.graphstream</groupId>
<artifactId>gs-ui</artifactId>
<version>${gs.version}</version>
</dependency>
<dependency>
<groupId>org.scalanlp</groupId>
<artifactId>breeze_${scala.version}</artifactId>
<version>${breeze.version}</version>
</dependency>
<dependency>
<groupId>org.scalanlp</groupId>
<artifactId>breeze-viz_${scala.version}</artifactId>
<version>${breeze.version}</version>
</dependency>
<dependency>
<groupId>org.jfree</groupId>
<artifactId>jfreechart</artifactId>
<version>${jfreechart.version}</version>
</dependency>
<!-- JSON -->
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>${fastjson.version}</version>
</dependency>
<!-- Kudu -->
<dependency>
<groupId>org.apache.kudu</groupId>
<artifactId>kudu-client</artifactId>
<version>${kudu.version}</version>
</dependency>
<dependency>
<groupId>org.apache.kudu</groupId>
<artifactId>kudu-spark2_2.11</artifactId>
<version>${kudu.version}</version>
</dependency>
<!-- Hive -->
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-jdbc</artifactId>
<version>${hive.version}</version>
</dependency>
<!-- Clickhouse -->
<dependency>
<groupId>ru.yandex.clickhouse</groupId>
<artifactId>clickhouse-jdbc</artifactId>
<version>${clickhouse.version}</version>
<exclusions>
<exclusion>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
</exclusion>
<exclusion>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- ElasticSearch -->
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>${es.version}</version>
</dependency>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>${es.version}</version>
</dependency>
<dependency>
<groupId>org.elasticsearch.plugin</groupId>
<artifactId>x-pack-sql-jdbc</artifactId>
<version>${es.version}</version>
</dependency>
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch-spark-20_2.11</artifactId>
<version>${es.version}</version>
</dependency>
<!-- Alibaba Json: fastjson is already declared in the JSON entry above, so it is not repeated here -->
<!-- Apache Commons -->
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
<version>${commons-io.version}</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>${commons-lang3.version}</version>
</dependency>
<dependency>
<groupId>commons-beanutils</groupId>
<artifactId>commons-beanutils</artifactId>
<version>${commons-beanutils.version}</version>
</dependency>
<!-- Other -->
<dependency>
<groupId>org.javatuples</groupId>
<artifactId>javatuples</artifactId>
<version>${jtuple.version}</version>
</dependency>
<!--<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<version>4.5.3</version>
</dependency>-->
<dependency>
<groupId>commons-httpclient</groupId>
<artifactId>commons-httpclient</artifactId>
<version>3.0.1</version>
</dependency>
</dependencies>
</dependencyManagement>
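The `<properties>` block above declares versions for three Maven plugins, but this excerpt does not show where they are used. One plausible place, sketched here as an assumption rather than the project's actual configuration, is a `<pluginManagement>` section in the parent POM, so that child modules can reference the plugins without repeating versions.

```xml
<build>
    <pluginManagement>
        <plugins>
            <!-- Compile for the Java level declared in ${java.version} -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>${maven-compiler-plugin.version}</version>
                <configuration>
                    <source>${java.version}</source>
                    <target>${java.version}</target>
                    <encoding>${project.build.sourceEncoding}</encoding>
                </configuration>
            </plugin>
            <!-- Runs the JUnit tests -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>${maven-surefire-plugin.version}</version>
            </plugin>
            <!-- Version only; any shading configuration would go in the module that builds the fat JAR -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>${maven-shade-plugin.version}</version>
            </plugin>
        </plugins>
    </pluginManagement>
</build>
```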
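Note: update the path to the Oracle driver JAR.
1. Copy the file "\资料\oracle连接驱动ojdbc8-12.2.0.1.jar" from the course materials to any directory on your local disk.
2. Change the driver path in the POM to the location of the JAR on your machine: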
<dependency>
<groupId>com.oracle.jdbc</groupId>
<artifactId>ojdbc8</artifactId>
<version>${ojdbc.version}</version>
<systemPath>E:/softs/db/jdbc-drivers/ojdbc8-12.2.0.1.jar</systemPath>
<scope>system</scope>
</dependency>
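A hard-coded `systemPath` only works on a machine that has the JAR at exactly that location. One option, shown here as a sketch, is to move the path into a property that each developer can override. The property name `ojdbc.jar.path` is made up for this example and is not part of the original project.

```xml
<properties>
    <!-- Hypothetical property; defaults to the path used in this article -->
    <ojdbc.jar.path>E:/softs/db/jdbc-drivers/ojdbc8-12.2.0.1.jar</ojdbc.jar.path>
</properties>

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>com.oracle.jdbc</groupId>
            <artifactId>ojdbc8</artifactId>
            <version>${ojdbc.version}</version>
            <scope>system</scope>
            <!-- Must resolve to an absolute path -->
            <systemPath>${ojdbc.jar.path}</systemPath>
        </dependency>
    </dependencies>
</dependencyManagement>
```

With this in place, the path can be overridden without editing the POM, for example `mvn -Dojdbc.jar.path=/opt/drivers/ojdbc8-12.2.0.1.jar compile` (the `/opt/drivers` location is only an illustration).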
2. Common module dependencies
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
</properties>
<dependencies>
<!-- Test -->
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<scope>test</scope>
</dependency>
<!-- JDBC -->
<dependency>
<groupId>com.oracle.jdbc</groupId>
<artifactId>ojdbc8</artifactId>
<scope>system</scope>
</dependency>
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
</dependency>
<!-- Http -->
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
</dependency>
<!-- Java Tuples -->
<dependency>
<groupId>org.javatuples</groupId>
<artifactId>javatuples</artifactId>
</dependency>
<!-- Alibaba Json -->
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
</dependency>
<!-- Apache Commons -->
<dependency>
<groupId>commons-beanutils</groupId>
<artifactId>commons-beanutils</artifactId>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
</dependency>
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
</dependency>
<!-- Apache Kafka -->
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka_${scala.version}</artifactId>
</dependency>
<!-- Spark -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_${scala.version}</artifactId>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql-kafka-0-10_2.11</artifactId>
</dependency>
<dependency>
<groupId>org.apache.parquet</groupId>
<artifactId>parquet-common</artifactId>
</dependency>
<!-- Graph Visual -->
<dependency>
<groupId>org.graphstream</groupId>
<artifactId>gs-core</artifactId>
</dependency>
<dependency>
<groupId>org.graphstream</groupId>
<artifactId>gs-ui</artifactId>
</dependency>
<dependency>
<groupId>org.scalanlp</groupId>
<artifactId>breeze_${scala.version}</artifactId>
</dependency>
<dependency>
<groupId>org.scalanlp</groupId>
<artifactId>breeze-viz_${scala.version}</artifactId>
</dependency>
<dependency>
<groupId>org.jfree</groupId>
<artifactId>jfreechart</artifactId>
</dependency>
<!-- Kudu -->
<dependency>
<groupId>org.apache.kudu</groupId>
<artifactId>kudu-client</artifactId>
</dependency>
<dependency>
<groupId>org.apache.kudu</groupId>
<artifactId>kudu-spark2_2.11</artifactId>
</dependency>
<!-- Clickhouse -->
<dependency>
<groupId>ru.yandex.clickhouse</groupId>
<artifactId>clickhouse-jdbc</artifactId>
</dependency>
<!-- ElasticSearch -->
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
</dependency>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
</dependency>
<dependency>
<groupId>org.elasticsearch.plugin</groupId>
<artifactId>x-pack-sql-jdbc</artifactId>
</dependency>
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch-spark-20_2.11</artifactId>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<source>8</source>
<target>8</target>
</configuration>
</plugin>
</plugins>
</build>
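The dependencies above omit `<version>` because they rely on the parent's `<dependencyManagement>`. That only resolves if the module POM declares the parent. A minimal sketch, assuming the coordinates from the table at the start of this article and a `1.0-SNAPSHOT` parent version (the article does not state it):

```xml
<!-- Goes near the top of logistics-common/pom.xml -->
<parent>
    <groupId>cn.it</groupId>
    <artifactId>it-logistics-parent</artifactId>
    <!-- Assumed version; must match the parent POM -->
    <version>1.0-SNAPSHOT</version>
    <!-- relativePath defaults to ../pom.xml, which fits a module directory directly under the parent -->
</parent>

<!-- groupId and version are inherited from the parent unless overridden -->
<artifactId>logistics-common</artifactId>
```

Whatever coordinates the common module ends up with, the `logistics-common` dependency declared by the other modules has to use exactly the same groupId, artifactId and version.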
3. Real-time ETL module dependencies
<repositories>
<repository>
<id>mvnrepository</id>
<url>https://mvnrepository.com/</url>
<layout>default</layout>
</repository>
<repository>
<id>cloudera</id>
<url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>
<repository>
<id>elastic.co</id>
<url>https://artifacts.elastic.co/maven</url>
</repository>
</repositories>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
<!-- SDK -->
<java.version>1.8</java.version>
<scala.version>2.11</scala.version>
<!-- Spark -->
<spark.version>2.4.0-cdh6.2.1</spark.version>
</properties>
<dependencies>
<dependency>
<groupId>cn.it.logistics.common</groupId>
<artifactId>logistics-common</artifactId>
<version>1.0-SNAPSHOT</version>
</dependency>
<!-- Structured Streaming -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_${scala.version}</artifactId>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql-kafka-0-10_2.11</artifactId>
</dependency>
<dependency>
<groupId>org.apache.parquet</groupId>
<artifactId>parquet-common</artifactId>
</dependency>
<dependency>
<groupId>net.jpountz.lz4</groupId>
<artifactId>lz4</artifactId>
</dependency>
<dependency>
<groupId>org.jfree</groupId>
<artifactId>jfreechart</artifactId>
</dependency>
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
</dependency>
<!-- kudu -->
<dependency>
<groupId>org.apache.kudu</groupId>
<artifactId>kudu-client</artifactId>
</dependency>
<dependency>
<groupId>org.apache.kudu</groupId>
<artifactId>kudu-spark2_2.11</artifactId>
</dependency>
<!-- Other -->
<dependency>
<groupId>org.javatuples</groupId>
<artifactId>javatuples</artifactId>
</dependency>
<dependency>
<groupId>commons-httpclient</groupId>
<artifactId>commons-httpclient</artifactId>
<version>3.0.1</version>
</dependency>
</dependencies>
4. Offline metrics computation module dependencies
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
<!-- SDK -->
<java.version>1.8</java.version>
<scala.version>2.11</scala.version>
<!-- Spark -->
<spark.version>2.4.0-cdh6.2.1</spark.version>
</properties>
<dependencies>
<dependency>
<groupId>cn.it.logistics.common</groupId>
<artifactId>logistics-common</artifactId>
<version>1.0-SNAPSHOT</version>
</dependency>
<!-- Structured Streaming -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_${scala.version}</artifactId>
</dependency>
<dependency>
<groupId>org.apache.parquet</groupId>
<artifactId>parquet-common</artifactId>
</dependency>
<dependency>
<groupId>net.jpountz.lz4</groupId>
<artifactId>lz4</artifactId>
</dependency>
<dependency>
<groupId>org.jfree</groupId>
<artifactId>jfreechart</artifactId>
</dependency>
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
</dependency>
<!-- kudu -->
<dependency>
<groupId>org.apache.kudu</groupId>
<artifactId>kudu-client</artifactId>
</dependency>
<dependency>
<groupId>org.apache.kudu</groupId>
<artifactId>kudu-spark2_2.11</artifactId>
</dependency>
<!-- Other -->
<dependency>
<groupId>org.javatuples</groupId>
<artifactId>javatuples</artifactId>
</dependency>
<dependency>
<groupId>commons-httpclient</groupId>
<artifactId>commons-httpclient</artifactId>
<version>3.0.1</version>
</dependency>
</dependencies>
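III. Import Modules
1. Import the data generator module into the project
Import the logistics-generate module from "4.资料\3.数据生成器模块\logistics-generate" in the course materials into the project.
Note: the table-data directory must be marked as a resources directory.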
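Marking a directory as a resources root in the IDE only affects that IDE's project settings. To make it part of the Maven build as well, the directory can be declared in the module's POM. This sketch assumes `table-data` sits directly under the `logistics-generate` module root, which the article does not specify.

```xml
<!-- In logistics-generate/pom.xml -->
<build>
    <resources>
        <!-- Keep the default resources directory; declaring <resources> replaces the default list -->
        <resource>
            <directory>src/main/resources</directory>
        </resource>
        <!-- Assumed location of table-data, relative to the module root -->
        <resource>
            <directory>table-data</directory>
        </resource>
    </resources>
</build>
```

Files under both directories are then copied to the root of the classpath at build time, so the generator would load them by file name rather than through a `table-data/` prefix.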
- 📢Blog homepage: https://lansonli.blog.csdn.net
- 📢This article was written by Lansonli and first published on the CSDN blog🙉