Unable to start Apache Flink 1.7 on Google Cloud Dataproc

I spun up a Dataproc cluster with Hadoop 2.9.2, downloaded Flink 1.7.2, and tried to start it with the following command:

./bin/yarn-session.sh  -n 2

This fails with the following error message:

Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
2019-02-15 12:56:05,679 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.address, localhost
2019-02-15 12:56:05,680 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.port, 6123
2019-02-15 12:56:05,681 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.heap.size, 1024m
2019-02-15 12:56:05,681 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.heap.size, 1024m
2019-02-15 12:56:05,681 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2019-02-15 12:56:05,681 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: parallelism.default, 1
2019-02-15 12:56:05,682 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: rest.port, 8081
2019-02-15 12:56:06,241 WARN  org.apache.hadoop.util.NativeCodeLoader                       - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-02-15 12:56:06,323 INFO  org.apache.flink.runtime.security.modules.HadoopModule        - Hadoop user set to robert (auth:SIMPLE)
2019-02-15 12:56:06,534 ERROR org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Error while running the Flink Yarn session.
java.lang.NoClassDefFoundError: javax/ws/rs/ext/MessageBodyReader
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at org.apache.hadoop.yarn.util.timeline.TimelineUtils.<clinit>(TimelineUtils.java:50)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:179)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.getClusterDescriptor(FlinkYarnSessionCli.java:984)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.createDescriptor(FlinkYarnSessionCli.java:273)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.createClusterDescriptor(FlinkYarnSessionCli.java:451)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:588)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$2(FlinkYarnSessionCli.java:810)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:810)
Caused by: java.lang.ClassNotFoundException: javax.ws.rs.ext.MessageBodyReader
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 49 more
Answer

To resolve this, I had to export the following variable:

export HADOOP_CLASSPATH=`hadoop classpath`
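
A minimal sketch of the fix applied to the original command, run from the Flink 1.7.2 directory on the cluster's master node; the export puts the full Hadoop/YARN classpath, including the jsr311 jar that provides javax.ws.rs.ext.MessageBodyReader and that the YARN timeline client in the stack trace needs, onto Flink's classpath before the session starts:

# Put the Hadoop and YARN jars on Flink's classpath, then start the YARN session.
export HADOOP_CLASSPATH=`hadoop classpath`
./bin/yarn-session.sh -n 2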
Another answer

Generally, the best way to install Flink is to use the initialization action, which installs Flink via apt-get from Dataproc's own distribution, so you get a Flink build that has already been cross-compiled and tested against the given Dataproc version. In that case you don't need to modify any environment variables to make it work.
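
As a rough sketch only (the cluster name is a placeholder and the gs:// path is an assumption about where the public Dataproc initialization actions were hosted at the time, not something verified here), creating a cluster with the Flink init action could look like:

# Create a Dataproc cluster that installs Flink via the public init action
# (bucket path and cluster name are illustrative assumptions).
gcloud dataproc clusters create my-flink-cluster \
  --initialization-actions gs://dataproc-initialization-actions/flink/flink.sh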

If you do want to use a specific Flink version with a Dataproc version whose distribution does not ship that Flink release, then installing from a downloaded tarball, as you did, may indeed be necessary. When installing Flink from a tarball, you can also look at one of the earlier versions of the initialization action to see what kind of configuration may be needed. For example, based on an earlier init action, you may also want to export HADOOP_CONF_DIR=/etc/hadoop/conf.
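
For a manual tarball install, a sketch of the environment setup before starting the YARN session, mirroring what the earlier init actions exported (the paths follow the Dataproc defaults visible in the log above):

# Environment for a Flink tarball installed manually on a Dataproc node.
export HADOOP_CONF_DIR=/etc/hadoop/conf        # Dataproc's Hadoop configuration directory
export HADOOP_CLASSPATH=`hadoop classpath`     # Hadoop/YARN jars needed by the YARN client
./bin/yarn-session.sh -n 2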
