python pyspark-wordcount-ide.py
Preface: compiled by the editors of 小常识网 (cha138.com), this post presents the Python script pyspark-wordcount-ide.py, a PySpark word count that can be launched directly from an IDE.
import sys

# Only the import can raise ImportError, so the try block is limited to it.
try:
    from pyspark import SparkConf, SparkContext
    print("Successfully imported Spark Modules")
except ImportError as e:
    print("Can not import Spark Modules", e)
    sys.exit(1)

# Run Spark locally; the input and output paths come from the command line.
conf = SparkConf().setAppName("Word Count").setMaster("local")
sc = SparkContext(conf=conf)
inputPath = sys.argv[1]
outputPath = sys.argv[2]

# Reach the Hadoop FileSystem API through the Py4J gateway to check the paths.
Path = sc._gateway.jvm.org.apache.hadoop.fs.Path
FileSystem = sc._gateway.jvm.org.apache.hadoop.fs.FileSystem
Configuration = sc._gateway.jvm.org.apache.hadoop.conf.Configuration
fs = FileSystem.get(Configuration())

if not fs.exists(Path(inputPath)):
    print("Input path does not exist")
else:
    # saveAsTextFile fails if the output directory exists, so remove it first.
    if fs.exists(Path(outputPath)):
        fs.delete(Path(outputPath), True)
    # Split lines into words, pair each word with 1, and sum the counts per word.
    (sc.textFile(inputPath)
        .flatMap(lambda l: l.split(" "))
        .map(lambda w: (w, 1))
        .reduceByKey(lambda t, e: t + e)
        .saveAsTextFile(outputPath))
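For readers who want to check the counting logic without a Spark installation, here is a minimal plain-Python sketch of what the flatMap, map and reduceByKey chain computes. The function name word_count and the sample lines are illustrative assumptions and are not part of the original script; the sketch works on one in-memory list rather than a distributed RDD.

```python
from collections import defaultdict
from itertools import chain


def word_count(lines):
    """Plain-Python stand-in for the flatMap -> map -> reduceByKey chain.

    Hypothetical helper for illustration; not part of the original script.
    """
    # flatMap: split every line on single spaces and flatten the results.
    words = chain.from_iterable(line.split(" ") for line in lines)
    # map: pair each word with an initial count of 1.
    pairs = ((word, 1) for word in words)
    # reduceByKey: add up the counts of all pairs that share the same key.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


if __name__ == "__main__":
    sample = ["to be or not to be", "be  quick"]  # made-up sample input
    print(word_count(sample))
```

Because the split is on a single space, two consecutive spaces yield an empty string, which is counted as a word of its own; the Spark script behaves the same way. Splitting with split() and no argument would likely be a reasonable change if empty tokens are unwanted.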