Installing Spark for Data Science in Python and R


# Python

Create a dedicated Python environment using Anaconda (or Miniconda):

```
conda create -n spark python=3
```

Activate the environment:

```
source activate spark
```

Install ipython:

```
conda install ipython
```

Now install pyspark:

```
pip install pyspark
```

Fire up an ipython terminal:

```
ipython
```

Verify that you can import the package with `import pyspark`.

To confirm it works, type `pyspark.` and hit Tab; ipython will list the package's available modules and functions.

# R

http://spark.rstudio.com/

Install the package from CRAN and load it:

```
install.packages("sparklyr")
library(sparklyr)
```

The `sparklyr` package includes utilities that manage the Spark installation for you:

```
spark_install(version = "2.1.0")
```

Then establish a local connection:

```
sc <- spark_connect(master = "local")
```

Running `class(sc)` should return:

```
[1] "spark_connection"       "spark_shell_connection" "DBIConnection"  
```
