Spark-Submit Error: Name or service not known

Posted: 2016-06-12 04:19:13

Question:

I am running PySpark code on an Amazon machine.

Code in the pyspark shell:
a=open("test.txt")
s=sc.parallelize(a)
print(s.count())

Because of some issues, I cannot use sc.textFile("test.txt") directly.

Code in the Python file:

from pyspark import SparkContext

sc = SparkContext()
with open("test.txt") as f:
    # Distribute the lines of the local file as an RDD and count them.
    s = sc.parallelize(f)
    print(s.count())

When I try spark-submit test.py, I get the error "Name or service not known":

ubuntu@10-0-0-32:~/Deepak/projects$ spark-submit test1.py
16/06/12 03:44:53 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/06/12 03:44:59 ERROR : 10-0-0-32: 10-0-0-32: Name or service not known
java.net.UnknownHostException: 10-0-0-32: 10-0-0-32: Name or service not known
    at java.net.InetAddress.getLocalHost(InetAddress.java:1496)
    at tachyon.util.network.NetworkAddressUtils.getLocalIpAddress(NetworkAddressUtils.java:355)
    at tachyon.util.network.NetworkAddressUtils.getLocalHostName(NetworkAddressUtils.java:320)
    at tachyon.conf.TachyonConf.<init>(TachyonConf.java:122)
    at tachyon.conf.TachyonConf.<init>(TachyonConf.java:111)
    at tachyon.Version.<clinit>(Version.java:27)
    at tachyon.Constants.<clinit>(Constants.java:328)
    at tachyon.hadoop.AbstractTFS.<clinit>(AbstractTFS.java:63)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at java.lang.Class.newInstance(Class.java:383)
    at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:373)
    at java.util.ServiceLoader$1.next(ServiceLoader.java:445)
    at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2364)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2375)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
    at org.apache.spark.SparkContext.addFile(SparkContext.scala:1362)
    at org.apache.spark.SparkContext.addFile(SparkContext.scala:1340)
    at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:491)
    at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:491)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:491)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
    at py4j.Gateway.invoke(Gateway.java:214)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
    at py4j.GatewayConnection.run(GatewayConnection.java:209)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.UnknownHostException: 10-0-0-32: Name or service not known
    at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
    at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:922)
    at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1316)
    at java.net.InetAddress.getLocalHost(InetAddress.java:1492)
    ... 40 more
16/06/12 03:44:59 ERROR SparkContext: Error initializing SparkContext.
java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider tachyon.hadoop.TFS could not be instantiated
    at java.util.ServiceLoader.fail(ServiceLoader.java:224)
    at java.util.ServiceLoader.access$100(ServiceLoader.java:181)
    at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:377)
    at java.util.ServiceLoader$1.next(ServiceLoader.java:445)
    at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2364)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2375)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
    at org.apache.spark.SparkContext.addFile(SparkContext.scala:1362)
    at org.apache.spark.SparkContext.addFile(SparkContext.scala:1340)
    at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:491)
    at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:491)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:491)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
    at py4j.Gateway.invoke(Gateway.java:214)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
    at py4j.GatewayConnection.run(GatewayConnection.java:209)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ExceptionInInitializerError
    at tachyon.Constants.<clinit>(Constants.java:328)
    at tachyon.hadoop.AbstractTFS.<clinit>(AbstractTFS.java:63)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at java.lang.Class.newInstance(Class.java:383)
    at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:373)
    ... 27 more
Caused by: java.lang.RuntimeException: java.net.UnknownHostException: 10-0-0-32: 10-0-0-32: Name or service not known
    at org.spark-project.guava.base.Throwables.propagate(Throwables.java:160)
    at tachyon.util.network.NetworkAddressUtils.getLocalIpAddress(NetworkAddressUtils.java:398)
    at tachyon.util.network.NetworkAddressUtils.getLocalHostName(NetworkAddressUtils.java:320)
    at tachyon.conf.TachyonConf.<init>(TachyonConf.java:122)
    at tachyon.conf.TachyonConf.<init>(TachyonConf.java:111)
    at tachyon.Version.<clinit>(Version.java:27)
    ... 35 more
Caused by: java.net.UnknownHostException: 10-0-0-32: 10-0-0-32: Name or service not known
    at java.net.InetAddress.getLocalHost(InetAddress.java:1496)
    at tachyon.util.network.NetworkAddressUtils.getLocalIpAddress(NetworkAddressUtils.java:355)
    ... 39 more
Caused by: java.net.UnknownHostException: 10-0-0-32: Name or service not known
    at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
    at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:922)
    at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1316)
    at java.net.InetAddress.getLocalHost(InetAddress.java:1492)
    ... 40 more
16/06/12 03:44:59 WARN MetricsSystem: Stopping a MetricsSystem that is not running
Traceback (most recent call last):
  File "/home/ubuntu/Deepak/projects/test1.py", line 2, in <module>
    sc = SparkContext("local", "test1", pyFiles=['test1.py'])
  File "/home/ubuntu/spark-1.6.0-bin-hadoop2.4/python/lib/pyspark.zip/pyspark/context.py", line 115, in __init__
  File "/home/ubuntu/spark-1.6.0-bin-hadoop2.4/python/lib/pyspark.zip/pyspark/context.py", line 172, in _do_init
  File "/home/ubuntu/spark-1.6.0-bin-hadoop2.4/python/lib/pyspark.zip/pyspark/context.py", line 235, in _initialize_context
  File "/home/ubuntu/spark-1.6.0-bin-hadoop2.4/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 1064, in __call__
  File "/home/ubuntu/spark-1.6.0-bin-hadoop2.4/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider tachyon.hadoop.TFS could not be instantiated
    at java.util.ServiceLoader.fail(ServiceLoader.java:224)
    at java.util.ServiceLoader.access$100(ServiceLoader.java:181)
    at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:377)
    at java.util.ServiceLoader$1.next(ServiceLoader.java:445)
    at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2364)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2375)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
    at org.apache.spark.SparkContext.addFile(SparkContext.scala:1362)
    at org.apache.spark.SparkContext.addFile(SparkContext.scala:1340)
    at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:491)
    at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:491)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:491)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
    at py4j.Gateway.invoke(Gateway.java:214)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
    at py4j.GatewayConnection.run(GatewayConnection.java:209)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ExceptionInInitializerError
    at tachyon.Constants.<clinit>(Constants.java:328)
    at tachyon.hadoop.AbstractTFS.<clinit>(AbstractTFS.java:63)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at java.lang.Class.newInstance(Class.java:383)
    at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:373)
    ... 27 more
Caused by: java.lang.RuntimeException: java.net.UnknownHostException: 10-0-0-32: 10-0-0-32: Name or service not known
    at org.spark-project.guava.base.Throwables.propagate(Throwables.java:160)
    at tachyon.util.network.NetworkAddressUtils.getLocalIpAddress(NetworkAddressUtils.java:398)
    at tachyon.util.network.NetworkAddressUtils.getLocalHostName(NetworkAddressUtils.java:320)
    at tachyon.conf.TachyonConf.<init>(TachyonConf.java:122)
    at tachyon.conf.TachyonConf.<init>(TachyonConf.java:111)
    at tachyon.Version.<clinit>(Version.java:27)
    ... 35 more
Caused by: java.net.UnknownHostException: 10-0-0-32: 10-0-0-32: Name or service not known
    at java.net.InetAddress.getLocalHost(InetAddress.java:1496)
    at tachyon.util.network.NetworkAddressUtils.getLocalIpAddress(NetworkAddressUtils.java:355)
    ... 39 more
Caused by: java.net.UnknownHostException: 10-0-0-32: Name or service not known
    at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
    at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:922)
    at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1316)
    at java.net.InetAddress.getLocalHost(InetAddress.java:1492)
    ... 40 more
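
The innermost cause in the trace above is java.net.InetAddress.getLocalHost() failing to resolve the machine's own hostname (10-0-0-32). The same condition can be checked outside Spark. What follows is a minimal Python sketch, not part of the original post; the function name check_local_hostname and its resolver parameter are made up for illustration.

```python
import socket


def check_local_hostname(hostname=None, resolver=socket.gethostbyname):
    """Try to resolve a hostname, roughly as a local-host lookup would.

    Returns (hostname, ip) on success, or (hostname, None) if the name
    cannot be resolved. `resolver` is injectable (hypothetical parameter)
    so the check can be exercised without touching real name resolution.
    """
    if hostname is None:
        # Default to this machine's own hostname, e.g. "10-0-0-32".
        hostname = socket.gethostname()
    try:
        return hostname, resolver(hostname)
    except socket.gaierror:
        # Python's counterpart of a failed name lookup
        # ("Name or service not known").
        return hostname, None
```

Calling check_local_hostname() with no arguments on the affected machine would presumably return None as the second element until the hostname can be resolved, for example through an entry in /etc/hosts.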

Answer 1:

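I added the hostname to the /etc/hosts file.

Previously I had the entry like this:

IP ubuntu(username) alias

I changed it to:

IP hostname alias_name

The confusing part is that, because I am using an Amazon machine, my IP and my hostname are essentially the same (the hostname in the log above is 10-0-0-32).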
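
To make the change concrete: the lookup can succeed once some line in /etc/hosts lists the machine's hostname after an IP address. Below is a small Python sketch that checks this against the text of a hosts file. The helper name hosts_maps_hostname, the alias "myalias", and the sample contents are assumptions for illustration; in particular, the address 10.0.0.32 is only inferred from the hostname 10-0-0-32 in the log and is not stated in the post.

```python
def hosts_maps_hostname(hosts_text, hostname):
    """Return the IP that `hosts_text` maps `hostname` to, or None.

    `hosts_text` is the content of a hosts file: one entry per line,
    an IP address followed by one or more names; '#' starts a comment.
    """
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        entry = line.split("#", 1)[0].strip()
        if not entry:
            continue
        fields = entry.split()
        ip, names = fields[0], fields[1:]
        if hostname in names:
            return ip
    return None


# Hypothetical "before" and "after" contents mirroring the answer.
before = "127.0.0.1 localhost\n10.0.0.32 ubuntu myalias\n"
after = "127.0.0.1 localhost\n10.0.0.32 10-0-0-32 myalias\n"

print(hosts_maps_hostname(before, "10-0-0-32"))  # None
print(hosts_maps_hostname(after, "10-0-0-32"))   # 10.0.0.32
```

The comparison here is an exact string match, which is a simplification: real resolvers treat hostnames case-insensitively.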
