Error when running Zeppelin connected to AWS Glue
【Title】: Error when running Zeppelin connected to AWS Glue
【Posted】: 2020-01-24 04:00:01
【Question】: I followed the steps in the tutorial at https://docs.aws.amazon.com/glue/latest/dg/dev-endpoint-tutorial-local-notebook.html
The connection between my local Zeppelin and AWS Glue works without problems. However, when I run a test command in Zeppelin, it gives me this error:
%pyspark
import sys
org.apache.thrift.transport.TTransportException
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_interpret(RemoteInterpreterService.java:266)
at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.interpret(RemoteInterpreterService.java:250)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:373)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:97)
at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:406)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:329)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Log from the Spark server:
ERROR [2019-09-24 12:45:09,757] (pool-2-thread-8 Job.java[run]:188) - Job failed
org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:401)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:97)
at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:406)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:329)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.thrift.transport.TTransportException
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_interpret(RemoteInterpreterService.java:266)
at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.interpret(RemoteInterpreterService.java:250)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:373)
... 11 more
ERROR [2019-09-24 12:45:09,774] (pool-2-thread-8 RemoteScheduler.java[getStatus]:281) - Unknown status
java.lang.IllegalArgumentException: No enum constant org.apache.zeppelin.scheduler.Job.Status.UNKNOWN
at java.lang.Enum.valueOf(Enum.java:238)
at org.apache.zeppelin.scheduler.Job$Status.valueOf(Job.java:51)
at org.apache.zeppelin.scheduler.RemoteScheduler$JobStatusPoller.getStatus(RemoteScheduler.java:271)
at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:342)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
ERROR [2019-09-24 12:45:09,775] (pool-2-thread-8 NotebookServer.java[afterStatusChange]:2056) - Error
org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:401)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:97)
at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:406)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:329)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.thrift.transport.TTransportException
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_interpret(RemoteInterpreterService.java:266)
at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.interpret(RemoteInterpreterService.java:250)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:373)
... 11 more
WARN [2019-09-24 12:45:09,775] (pool-2-thread-8 NotebookServer.java[afterStatusChange]:2064) - Job 20190924-110806_1690051421 is finished, status: ERROR, exception: org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException, result: org.apache.thrift.transport.TTransportException
INFO [2019-09-24 12:45:09,783] (pool-2-thread-8 SchedulerFactory.java[jobFinished]:137) - Job paragraph_1569294486122_1120959405 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreterexisting_process1435285773
Any idea what is going wrong?
【Comments】:
【Answer 1】: I solved this by selecting the older Spark 2.2 version when creating the dev endpoint.
【Comments】:
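Glad to be able to help :)
【Answer 2】: AWS appears to have updated its endpoints to support newer versions of the Zeppelin notebook and newer Glue versions, but the AWS Glue dev endpoint documentation has not yet been updated to reflect this.
If you are running Zeppelin 0.7.3, you need to initialize your endpoint with Spark 2.2 (Glue version 0.9), as Conandor noted.
If you want to run the latest Zeppelin version, 0.8.2, it works with a dev endpoint on Spark 2.4, Python 3 (Glue version 1.0).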
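The pairings in this answer can be written down as a small lookup table. This is only a sketch of what the answer reports, not an official AWS compatibility matrix; the names `COMPATIBLE_ENDPOINTS` and `endpoint_for` are made up for illustration.

```python
# Version pairings as reported in this answer (not an official AWS matrix).
# Keys are Zeppelin versions; values describe the dev endpoint to create.
COMPATIBLE_ENDPOINTS = {
    "0.7.3": {"glue_version": "0.9", "spark": "2.2"},
    "0.8.2": {"glue_version": "1.0", "spark": "2.4", "python": "3"},
}


def endpoint_for(zeppelin_version):
    """Return the endpoint settings reported to work, or None if unknown."""
    return COMPATIBLE_ENDPOINTS.get(zeppelin_version)


if __name__ == "__main__":
    for version in ("0.7.3", "0.8.2", "0.9.0"):
        print(version, "->", endpoint_for(version))
```

A version that is not in the table returns `None`, which here simply means nobody in this thread reported on it.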
【Comments】:
【Answer 3】: I ran into the same problem and have not solved it yet. I did go through some troubleshooting steps, though:
Restart Zeppelin:
sudo bash
./zeppelin-daemon.sh stop
lsof -t -i tcp:8080 | xargs kill # Had to use this when zeppelin wouldn't die
./zeppelin-daemon.sh start
Completely reinstall Zeppelin, making sure no previous configuration is loaded
Check the Java version. Zeppelin only supports Java 8:
java -version
Check the Zeppelin version. AWS Glue only supports 0.7.3
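The Java check in the steps above could be scripted roughly as follows. This is a sketch, assuming `java` is on the PATH and prints its usual `version "..."` line; the function names are hypothetical, and the Java 8 requirement is the one stated in this answer.

```python
import re
import subprocess


def java_major_version(version_output):
    """Extract the major Java version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("could not find a version string")
    first, second = match.group(1), match.group(2)
    # Java 8 and earlier report themselves as 1.x (for example 1.8.0_221).
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def installed_java_major_version():
    """Run `java -version` and parse its output."""
    # `java -version` writes to stderr, not stdout.
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return java_major_version(result.stderr)
```

Calling `installed_java_major_version()` and comparing the result with 8 gives a quick yes/no before digging into Zeppelin itself.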
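If you happen to find a solution, please post it!
I can't comment on your answer, but if this solves the problem, you are a god. You have saved us hours of development time (running these jobs in Python scripts and waiting out the cold start times).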
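【Comments】:
【Answer 4】: If you get a TTransportException, it means Zeppelin cannot connect to the metastore.
You need to specify the "Catalog options".
After that you can use Spark 2.4, Python 3 and Zeppelin 0.8.
【Comments】: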
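【Answer 5】: This cost me hours. In my case, another process was occupying my tunnel port, 9907. I used lsof -i :9907
and kill -9 PID
to kill the process on 9907, then reconnected to the dev endpoint, and everything started up as expected.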
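A rough Python equivalent of the `lsof` check is to try connecting to the local port. This is a sketch only: `port_in_use` is a made-up helper, it assumes the tunnel listens on 127.0.0.1, and unlike `lsof` it does not tell you which process holds the port.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 9907 is the tunnel port mentioned in this answer.
    if port_in_use(9907):
        print("Port 9907 is taken; find the owner with: lsof -i :9907")
    else:
        print("Port 9907 is free")
```

Running this before opening the SSH tunnel shows whether a stale process is still bound to the port.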
【Comments】: