Installing Spark for Data Science in Python and R


# Python

Create a dedicated Python environment using Anaconda (or Miniconda):

```
conda create -n spark python=3
```

Activate the environment:

```
source activate spark
```

Install ipython:

```
conda install ipython
```

Now install pyspark:

```
pip install pyspark
```

Fire up an ipython terminal:

```
ipython
```

Verify that you can import the package with `import pyspark`.

To confirm it works, type `pyspark.` and hit Tab; ipython will list the package's available modules and functions.

# R

http://spark.rstudio.com/

Install the package from CRAN and load it:

```
install.packages("sparklyr")
library(sparklyr)
```

The `sparklyr` package includes utilities that manage the Spark installation for you:

```
spark_install(version = "2.1.0")
```

Then establish a local connection:

```
sc <- spark_connect(master = "local")
```

Running `class(sc)` should return:

```
[1] "spark_connection"       "spark_shell_connection" "DBIConnection"  
```
