PySpark: java.lang.NoSuchMethodError: net.jpountz.lz4.LZ4BlockInputStream.<init>(Ljava/io/InputStream;Z)V
Posted: 2018-10-29 09:53:25

This question has not been answered for PySpark specifically, so I am asking it again.
I am running the simple wordcount.py example with PySpark from the Spark download. The code is below. I did a mvn clean install and, following this suggestion, added the dependency below to the pom.xml under examples in the Spark folder, then ran mvn install again.
<dependency>
    <groupId>net.jpountz.lz4</groupId>
    <artifactId>lz4</artifactId>
    <version>1.3.0</version>
</dependency>
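One thing worth checking (not in the original post) is which jar the driver JVM actually loads net.jpountz.lz4.LZ4BlockInputStream from: Spark 2.3 bundles a newer lz4-java that keeps the same net.jpountz.lz4 package, so a manually added lz4 1.3.0 can shadow it. A minimal diagnostic sketch via py4j follows; the app name is arbitrary, and getCodeSource() may return null for classes on the boot classpath.

from pyspark.sql import SparkSession

# Hypothetical diagnostic session; the app name is just for illustration.
spark = SparkSession.builder.appName("lz4-classpath-check").getOrCreate()

# Ask the driver JVM which code source supplied the conflicting class.
lz4_cls = spark._jvm.java.lang.Class.forName("net.jpountz.lz4.LZ4BlockInputStream")
code_source = lz4_cls.getProtectionDomain().getCodeSource()
# getCodeSource() is null (None in py4j) for classes loaded from the boot classpath.
print(code_source.getLocation().toString() if code_source is not None else "boot classpath")

spark.stop()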
from __future__ import print_function
import sys
from operator import add
from pyspark.sql import SparkSession

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: wordcount <file>", file=sys.stderr)
        sys.exit(-1)

    spark = SparkSession\
        .builder\
        .appName("PythonWordCount")\
        .getOrCreate()

    lines = spark.read.text(sys.argv[1]).rdd.map(lambda r: r[0])
    counts = lines.flatMap(lambda x: x.split(' ')) \
        .map(lambda x: (x, 1)) \
        .reduceByKey(add)
    output = counts.collect()
    for (word, count) in output:
        print("%s: %i" % (word, count))

    spark.stop()
The error I get is:
2018-10-29 15:19:01 ERROR Utils:91 - Uncaught exception in thread stdout writer for python
java.lang.NoSuchMethodError: net.jpountz.lz4.LZ4BlockInputStream.<init>(Ljava/io/InputStream;Z)V
at org.apache.spark.io.LZ4CompressionCodec.compressedInputStream(CompressionCodec.scala:122)
at org.apache.spark.serializer.SerializerManager.wrapForCompression(SerializerManager.scala:163)
at org.apache.spark.serializer.SerializerManager.wrapStream(SerializerManager.scala:124)
at org.apache.spark.shuffle.BlockStoreShuffleReader$$anonfun$2.apply(BlockStoreShuffleReader.scala:50)
at org.apache.spark.shuffle.BlockStoreShuffleReader$$anonfun$2.apply(BlockStoreShuffleReader.scala:50)
at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:417)
at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:61)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at org.apache.spark.api.python.PythonRDD$.writeIteratorToStream(PythonRDD.scala:204)
at org.apache.spark.api.python.PythonRunner$$anon$2.writeIteratorToStream(PythonRunner.scala:407)
at org.apache.spark.api.python.BasePythonRunner$WriterThread$$anonfun$run$1.apply(PythonRunner.scala:215)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1988)
at org.apache.spark.api.python.BasePythonRunner$WriterThread.run(PythonRunner.scala:170)
Exception in thread "stdout writer for python" java.lang.NoSuchMethodError: net.jpountz.lz4.LZ4BlockInputStream.<init>(Ljava/io/InputStream;Z)V
at org.apache.spark.io.LZ4CompressionCodec.compressedInputStream(CompressionCodec.scala:122)
at org.apache.spark.serializer.SerializerManager.wrapForCompression(SerializerManager.scala:163)
at org.apache.spark.serializer.SerializerManager.wrapStream(SerializerManager.scala:124)
at org.apache.spark.shuffle.BlockStoreShuffleReader$$anonfun$2.apply(BlockStoreShuffleReader.scala:50)
at org.apache.spark.shuffle.BlockStoreShuffleReader$$anonfun$2.apply(BlockStoreShuffleReader.scala:50)
at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:417)
at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:61)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at org.apache.spark.api.python.PythonRDD$.writeIteratorToStream(PythonRDD.scala:204)
at org.apache.spark.api.python.PythonRunner$$anon$2.writeIteratorToStream(PythonRunner.scala:407)
at org.apache.spark.api.python.BasePythonRunner$WriterThread$$anonfun$run$1.apply(PythonRunner.scala:215)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1988)
at org.apache.spark.api.python.BasePythonRunner$WriterThread.run(PythonRunner.scala:170)
^C2018-10-29 15:19:10 INFO SparkContext:54 - Invoking stop() from shutdown hook
Traceback (most recent call last):
File "/usr/local/Cellar/spark-2.3.0/spark/examples/src/main/python/wordcount.py", line 40, in <module>
output = counts.collect()
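The failing frame, LZ4CompressionCodec.compressedInputStream, sits on the shuffle-read path and is only reached because spark.io.compression.codec defaults to lz4. As a rough sketch (a workaround that sidesteps the LZ4BlockInputStream constructor, not a fix for the underlying jar conflict), the session can be built with a different codec:

from pyspark.sql import SparkSession

# Assumption: switching the IO/shuffle codec avoids the lz4 decompression path entirely.
# "snappy" is one of the codecs Spark supports out of the box.
spark = SparkSession.builder \
    .appName("PythonWordCount") \
    .config("spark.io.compression.codec", "snappy") \
    .getOrCreate()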
Comments:
Why are you even using Maven? Are you trying to build Spark? If so, follow Building Spark; otherwise download a pre-built binary.

I am not trying to build Spark. I also tried the same thing in Scala and got the same error: scala> counts.collect() 2018-11-02 15:44:06 ERROR Executor:91 - Exception in task 0.0 in stage 9.0 (TID 18) java.lang.NoSuchMethodError: net.jpountz.lz4.LZ4BlockInputStream.

I upgraded Spark to 2.3.2 and the error appears to be resolved.
Example:
scala> val lines = sc.parallelize(Array(('a', 1), ('a', 1), ('b', 1)))
lines: org.apache.spark.rdd.RDD[(Char, Int)] = ParallelCollectionRDD[0] at parallelize at <console>:24

scala> val y = lines.reduceByKey((x,y) => (x+y))
y: org.apache.spark.rdd.RDD[(Char, Int)] = ShuffledRDD[1] at reduceByKey at <console>:25

scala> y.collect()
res0: Array[(Char, Int)] = Array((a,2), (b,1))

It works!
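For the PySpark side of the question, a roughly equivalent sanity check is sketched below (not part of the original answer); the reduceByKey forces a shuffle read, which is exactly where the LZ4BlockInputStream error surfaced.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ShuffleSanityCheck").getOrCreate()
# A tiny reduceByKey exercises the shuffle-read code path that raised NoSuchMethodError.
pairs = spark.sparkContext.parallelize([('a', 1), ('a', 1), ('b', 1)])
print(pairs.reduceByKey(lambda x, y: x + y).collect())  # expected: [('a', 2), ('b', 1)]
spark.stop()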