Installing PySpark on Ubuntu
Posted by tanshoudong
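1. Install JDK 1.8 (not covered here).
2. Run pip install pyspark in a terminal (the simplest installation method given on the official site).
The output looks like this: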
Collecting pyspark
  Downloading https://files.pythonhosted.org/packages/ee/2f/709df6e8dc00624689aa0a11c7a4c06061a7d00037e370584b9f011df44c/pyspark-2.3.1.tar.gz (211.9MB)
    100% |████████████████████████████████| 211.9MB 8.3kB/s
Requirement already satisfied: py4j==0.10.7 in ./anaconda3/lib/python3.6/site-packages (from pyspark)
Building wheels for collected packages: pyspark
  Running setup.py bdist_wheel for pyspark ... done
  Stored in directory: /home/tan/.cache/pip/wheels/37/48/54/f1b63f0dbb729e20c92f1bbcf1c53c03b300e0b93ca1781526
Successfully built pyspark
Installing collected packages: pyspark
Successfully installed pyspark-2.3.1
Once the installation finishes, typing pyspark in the terminal fails at startup:
~$ pyspark
JAVA_HOME is not set
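Solution:
First find where pyspark is installed. Re-running pip install prints the site-packages directory that contains it (the paths are relative to the home directory):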
~$ pip install pyspark
Requirement already satisfied: pyspark in ./anaconda3/lib/python3.6/site-packages
Requirement already satisfied: py4j==0.10.7 in ./anaconda3/lib/python3.6/site-packages (from pyspark)
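The same lookup can be done from Python without re-running pip. The following is a minimal sketch, not the author's method: the helper names find_package_dir and find_env_script are hypothetical, and the bin/load-spark-env.sh location inside the package is an assumption about how the pip distribution is laid out.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_package_dir(package: str = "pyspark") -> Optional[Path]:
    """Return the directory an installed package lives in, or None if it is absent."""
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return None
    # spec.origin points at the package's __init__.py; its parent is the package directory.
    return Path(spec.origin).parent


def find_env_script(package_dir: Path) -> Optional[Path]:
    """Return bin/load-spark-env.sh under package_dir if present (assumed layout)."""
    candidate = package_dir / "bin" / "load-spark-env.sh"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    pkg_dir = find_package_dir()
    if pkg_dir is None:
        print("pyspark is not installed for this interpreter")
    else:
        print("pyspark directory:", pkg_dir)
        print("env script:", find_env_script(pkg_dir))
```

Run it with the same Python interpreter that pip installed into (here, the Anaconda one); otherwise the lookup may report that pyspark is missing.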
With the path in hand, open the load-spark-env.sh file (in the bin directory of the pyspark package) and add the JDK installation path:
export JAVA_HOME=/home/tan/jdk1.8.0_181
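Before restarting, it can help to confirm that the path really points at a JDK. The sketch below is only an illustration: check_java_home is a hypothetical helper, and it assumes a usable JDK has an executable at bin/java under JAVA_HOME. It reads the environment of the current process, so it reflects a value exported in the shell rather than one written only into load-spark-env.sh.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def check_java_home(env: Optional[Mapping[str, str]] = None) -> str:
    """Describe whether JAVA_HOME points at a directory containing bin/java."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return "JAVA_HOME is not set"
    java_bin = Path(java_home) / "bin" / "java"
    if not java_bin.is_file():
        return f"JAVA_HOME is set to {java_home}, but {java_bin} does not exist"
    return f"JAVA_HOME looks usable: {java_home}"


if __name__ == "__main__":
    print(check_java_home())
```

To check the path from this post without exporting anything, pass it explicitly: check_java_home({"JAVA_HOME": "/home/tan/jdk1.8.0_181"}).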
Save the file and run pyspark in the terminal again. This time it starts successfully:
~$ pyspark
Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 18:10:19)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
2018-07-29 12:37:48 WARN Utils:66 - Your hostname, tan-Precision-Tower-3620 resolves to a loopback address: 127.0.1.1; using 192.168.0.100 instead (on interface enp0s31f6)
2018-07-29 12:37:48 WARN Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2018-07-29 12:37:48 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.3.1
      /_/

Using Python version 3.6.4 (default, Jan 16 2018 18:10:19)
SparkSession available as 'spark'.
>>>
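Seeing the banner shows that the shell launches; running a tiny job confirms that Python and the JVM can actually exchange work. The following is a rough sketch under stated assumptions: run_smoke_test is a hypothetical helper, the local[1] master and the install-check application name are arbitrary choices, and the function returns None rather than failing when pyspark cannot be found.

```python
import importlib.util


def run_smoke_test():
    """Sum the integers 0..9 on a local Spark session; return None if pyspark is missing."""
    if importlib.util.find_spec("pyspark") is None:
        return None
    from pyspark.sql import SparkSession

    # local[1] runs Spark inside this process with one worker thread.
    spark = (
        SparkSession.builder
        .master("local[1]")
        .appName("install-check")
        .getOrCreate()
    )
    try:
        return spark.sparkContext.parallelize(range(10)).sum()
    finally:
        # Always release the JVM-side resources, even if the job fails.
        spark.stop()
```

Call run_smoke_test() from a plain Python session or a script. Inside the pyspark shell a session already exists as spark, so spark.sparkContext.parallelize(range(10)).sum() can be typed directly at the >>> prompt.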
That's all.