Hadoop + Spark + Kudu

Posted 述而不做

1. Compatible versions of Spark and Kudu

Spark Integration Known Issues and Limitations
Spark 2.2+ requires Java 8 at runtime even though Kudu Spark 2.x integration is Java 7 compatible. Spark 2.2 is the default dependency version as of Kudu 1.5.0.

Kudu tables with a name containing uppercase or non-ASCII characters must be assigned an alternate name when registered as a temporary table.

Kudu tables with a column name containing uppercase or non-ASCII characters may not be used with SparkSQL. Columns may be renamed in Kudu to work around this issue.
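The two naming rules above can be checked before a table is registered. The following is a minimal sketch in plain Python, not part of Kudu or Spark: the helper names `needs_alias` and `suggest_alias` are hypothetical, and replacing every other character with an underscore is only one possible aliasing convention.

```python
def needs_alias(name: str) -> bool:
    """Return True if `name` contains an uppercase or non-ASCII character.

    This mirrors the rule quoted above; it does not call Kudu or Spark.
    """
    return any(ch.isupper() or ord(ch) > 127 for ch in name)


def suggest_alias(table_name: str) -> str:
    """Build a lowercase, ASCII-only alias for a temporary table.

    Assumption: any character that is not an ASCII letter, digit, or
    underscore is replaced with "_". This convention is illustrative only.
    """
    lowered = table_name.lower()
    return "".join(
        ch if ch.isascii() and (ch.isalnum() or ch == "_") else "_"
        for ch in lowered
    )


if __name__ == "__main__":
    for name in ["my_table", "MyTable", "impala::default.MyTable"]:
        print(name, needs_alias(name), suggest_alias(name))
```

The same `needs_alias` check applies to column names, although for columns the workaround described above is to rename them in Kudu rather than to alias them.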

<> and OR predicates are not pushed to Kudu, and instead will be evaluated by the Spark task. Only LIKE predicates with a suffix wildcard are pushed to Kudu, meaning that LIKE "FOO%" is pushed down but LIKE "FOO%BAR" isn’t.
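To make the LIKE rule concrete, here is a small sketch in plain Python that tests whether a pattern has the "suffix wildcard only" shape described above. It is not Kudu's or Spark's implementation, and the function name `has_only_suffix_wildcard` is hypothetical. Treating `_` as a wildcard and requiring a non-empty prefix are assumptions based on standard SQL LIKE semantics; escape characters are ignored.

```python
def has_only_suffix_wildcard(pattern: str) -> bool:
    """Return True if the pattern is a literal prefix followed by one '%'.

    Patterns with no wildcard at all are outside what the note covers,
    so this sketch simply returns False for them.
    """
    if not pattern.endswith("%"):
        return False
    prefix = pattern[:-1]
    # Assumption: '_' is also a LIKE wildcard, so it must not appear in
    # the prefix, and the prefix must not be empty.
    return bool(prefix) and "%" not in prefix and "_" not in prefix


if __name__ == "__main__":
    for pattern in ["FOO%", "FOO%BAR", "%FOO"]:
        print(pattern, has_only_suffix_wildcard(pattern))
```

A predicate that fails this check still returns correct results. According to the note above, it is evaluated by the Spark task instead of being pushed to Kudu.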

Kudu does not support every type supported by Spark SQL. For example, Date and complex types are not supported.

Kudu tables may only be registered as temporary tables in SparkSQL. Kudu tables may not be queried using HiveContext.

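In short: Spark 2.2 calls for Kudu 1.5.0, the release in which Spark 2.2 became the default dependency version, and it needs Java 8 at runtime.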
