Set up your own cloud analysis machine with Google Compute Engine (GCE) and open source tools
...
...
...
### System Installation
- Get a spare 2+ GB USB flash device / SD card
- Download an Ubuntu flavour ISO file. You can also directly [buy](https://shop.canonical.com/product_info.php?products_id=1206) a bootable USB stick.
- If you run Windows, the easiest way is to download [Rufus](https://rufus.akeo.ie/), run it and follow the instructions. There are also graphical instructions on the [Canonical website](https://www.ubuntu.com/download/desktop/create-a-usb-stick-on-windows).
- Once the writing process is done (it should take around 10 minutes), plug the USB drive into the computer you want to install Ubuntu on, and turn it on.
If needed, change the BIOS settings to boot from the USB drive. You can usually enter the BIOS menu by pressing the Delete key while booting.
- Select Install Ubuntu Server
- Select the preferred language
- Choose the preferred location
- Choose a keyboard layout, or let the system detect it by pressing a few keys when prompted
- The system will now detect the hardware and load the corresponding additional drivers.
- With multiple network interfaces installed, the installer asks which one is to be considered primary. If the selection is a wireless adapter on a protected network, the installer asks for the corresponding ESSID and password before proceeding.
- Enter the hostname that identifies your system on the network
- When asked about a proxy, you may leave the line empty.
- The system will now try to configure the network and load additional components.
- Create a personal (non-root) user with a full (real) name, a username (don't use the word admin, as it is a reserved name on Ubuntu) and a password
- Choose whether to encrypt the home directory for the above user (the answer should probably be YES on a portable machine)
- Check the time zone
- It's now time to deal with the hard part: partitioning the disk. Most users should simply take the guided route and let the system use the entire disk and configure LVM (usually the default choice). The choice of disk depends on the particular hardware, though most users have just one disk. It can also be useful to leave some space unused for future needs. In any case, DO take note of the disk name (it should be sdb, as sda may have been reserved for the USB drive).
- After you confirm writing the changes to disk, the system begins the actual installation of the OS.
- Choose how you want to update the system
- Choose which additional software to install. I would recommend: SSH, LAMP, Samba (I prefer to leave Lubuntu and PostgreSQL as separate installations)
- If you chose to install LAMP, you will be asked for a password for the MySQL root user. It is not that important if you plan to block root login afterwards.
- When asked whether to install GRUB on the master boot record, answer NO. Then, at the prompt asking where to install GRUB instead, type or select the correct disk name
- The next message box is the last one: remove the CD/USB/card and restart the system.

After the login, most of the subsequent steps have to be executed as a "super-user", which in practice means writing `sudo` before every command and entering the password after the first one. While it is possible to log in as the root user and avoid writing `sudo` each time, it is not advisable (in fact, we are later going to disable login for the root user altogether).
If you did not install the OpenSSH server during the system installation but now want to connect to the system remotely with an SSH client, you can install it now: `sudo apt-get install ssh openssh-server`
I tend to suggest that newcomers use the pre-installed nano for its apparent simplicity, but if you want the more popular vim, I suggest installing the nox version: `sudo apt-get install nano vim-nox`
### Change ip mode to static address
- run `ifconfig` to check the name of the network card used by the system
- `sudo nano /etc/network/interfaces`
- add the following lines:
```
# bring the interface up at boot
auto eno1
iface eno1 inet static
address 192.168.x.yyy
netmask 255.255.255.0
network 192.168.x.0
broadcast 192.168.x.255
gateway 192.168.x.1
dns-nameservers 8.8.8.8 8.8.4.4
```
where `eno1` is the interface name reported by `ifconfig`, x identifies the local network, and yyy is the fixed internal IP address requested for the server.
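The static-address stanza can also be generated instead of typed by hand. What follows is a minimal sketch, assuming a /24 network whose gateway is the `.1` host, as in the example above. `render_static_iface` is a hypothetical helper name, and it only prints text, so nothing is written to `/etc/network/interfaces`.

```shell
# Print an /etc/network/interfaces stanza for a static address on a /24 network.
# Assumption: the gateway is the .1 host of the same /24 network.
render_static_iface() {
    iface="$1"
    address="$2"
    prefix="${address%.*}"   # first three octets, e.g. 192.168.1
    cat <<EOF
auto ${iface}
iface ${iface} inet static
address ${address}
netmask 255.255.255.0
network ${prefix}.0
broadcast ${prefix}.255
gateway ${prefix}.1
dns-nameservers 8.8.8.8 8.8.4.4
EOF
}

# Preview the stanza for a sample interface and address.
render_static_iface eno1 192.168.1.50
```

Once the preview looks right, the output could be appended to the real file with `render_static_iface eno1 192.168.1.50 | sudo tee --append /etc/network/interfaces`.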
...
...
...
...
....
#### Installation
- add the Neo4j key to the apt package manager (`sudo` belongs on `apt-key`, the command that needs root):
```
wget -qO - http://debian.neo4j.org/neotechnology.gpg.key | sudo apt-key add -
```
- add Neo4j to the apt sources list:
```
echo 'deb http://debian.neo4j.org/repo stable/' | sudo tee --append /etc/apt/sources.list.d/neo4j.list
```
- update the apt package list:
```
sudo apt-get update
```
- install neo4j:
```
sudo apt-get install neo4j
```
- ensure the server is running:
```
sudo service neo4j-service status
```
- the server listens on the default port 7474
#### Access to the web
- open the neo4j config file for edit:
```
sudo nano /etc/neo4j/neo4j-server.properties
```
- Uncomment the line
```
#org.neo4j.server.webserver.address=0.0.0.0
```
to allow connections from any external address
- restart the Neo4j service:
```
sudo service neo4j-service restart
```
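The uncommenting step can be scripted with `sed` instead of done in an editor. This is only a sketch: `enable_neo4j_remote` is a hypothetical helper name, and the demo runs on a throwaway file that imitates the relevant lines rather than on the real `/etc/neo4j/neo4j-server.properties`.

```shell
# Uncomment the webserver address line so Neo4j listens on all interfaces.
# Reads the file given as $1 and prints the edited text; the file itself is untouched.
enable_neo4j_remote() {
    sed -E 's/^#[[:space:]]*(org\.neo4j\.server\.webserver\.address=0\.0\.0\.0)/\1/' "$1"
}

# Demo on a temporary file that mimics the relevant part of the config.
sample="$(mktemp)"
printf '%s\n' '# let the webserver only listen on the specified IP' \
              '#org.neo4j.server.webserver.address=0.0.0.0' \
              'org.neo4j.server.webserver.port=7474' > "$sample"
enable_neo4j_remote "$sample"
```

Ordinary comment lines are left alone because the pattern only matches a `#` that is directly followed by the property name.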
...
#### Installation and configuration
If not already installed, proceed with all the following:
```
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install mysql-server
```
When asked about the *root* password, leave it blank, as we'll change it later.
When the installation finishes, run `sudo mysql_secure_installation`, then:
- enter a strong password for *root*
- remove the *anonymous* user
- disallow *root* remote login
- delete test database
- and finally reload privileges.
#### Adding user(s)
- Open the terminal and login as root: `mysql -u root -p`
- Create the user:
- `CREATE USER 'username'@'localhost' IDENTIFIED BY 'pwd';`
- Grant privileges:
- all privileges: `GRANT ALL ON *.* TO 'username'@'localhost';`
- admin privileges:
- common user:
- Update the server: `FLUSH PRIVILEGES;`
- However, for a remote user to connect, even with the correct privileges, the previous commands have to be repeated with the client's IP address in place of localhost, or with '%', which means *any host*. In short, to let a user connect from anywhere, the commands are the following:
```
CREATE USER 'username'@'%' IDENTIFIED BY 'pwd';
GRANT ALL ON *.* TO 'username'@'%';
```
- create `.my.cnf` with the credentials: `sudo nano /etc/mysql/.my.cnf`, then add the following lines for each group:

      [groupname]
      host =
      user =
      password =

  If that does not work, try saving the file in the home folder of the user, or add "default.file = location" to the R connection string
- install web interface (dbninja?)
- run a script (a text file with SQL statements) to add tables to a database. Leave `-p` without a value so that the password is prompted for, because a word placed after `-p ` is read as the database name:

      mysql -u username -p dbname < fpath/fname
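Since the option file holds a password in clear text, it is worth creating it with owner-only permissions. The sketch below assumes nothing beyond standard shell tools. `write_mycnf` is a hypothetical helper, and the group name, host, user and password in the demo are placeholders.

```shell
# Write a MySQL option file holding one [group] section, readable by its owner only.
# Hypothetical helper: arguments are file, group, host, user, password.
write_mycnf() (
    umask 077   # the function body runs in a subshell, so the caller's umask is unchanged
    cat > "$1" <<EOF
[$2]
host = $3
user = $4
password = $5
EOF
    chmod 600 "$1"
)

# Demo on a temporary file with placeholder credentials.
demo="$(mktemp)"
write_mycnf "$demo" client localhost username pwd
cat "$demo"
```

The `mysql` client reads the `[client]` group of `~/.my.cnf` on its own. A file stored elsewhere can be passed with `--defaults-file=`, which has to be the first option on the command line.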
#### Configure and tweaking the server
First of all, install the standard program for monitoring performance: `sudo apt-get install mysqltuner`
Run this program periodically to get insight into the server's efficiency and clues about how to tweak the configuration.
Now open the configuration file: `sudo nano /etc/mysql/my.cnf`
- To expose MySQL on a specific address, scroll to the line `bind-address = 127.0.0.1` (that is, `localhost`, the machine itself) and replace it with the IP address of the server's network interface on which MySQL should accept connections. To accept connections on every interface, write `bind-address = 0.0.0.0`
- To change the port the server listens on, scroll down to the two lines `port = 3306`, and put the desired number in both the [client] and [mysqld] sections.
- Change the default storage engine to MyISAM by adding the row `default_storage_engine=MYISAM`
- Check that the default character set is utf8 and the default collation is utf8_unicode_ci:
```
[client]
default-character-set=utf8
[mysql]
default-character-set=utf8
[mysqld]
collation-server = utf8_unicode_ci
init-connect='SET NAMES utf8'
character-set-server = utf8
```
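- tweak the buffer and connection parameters so that the maximum memory MySQL could potentially use (as reported by `mysqltuner`) stays below 1/k of the installed RAM, where k reflects the share of memory you are willing to give to the database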
To ensure the changes take effect, remember to restart the server: `sudo service mysql restart`
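The RAM check can be done with a little arithmetic. The sketch below uses a common rough estimate, not an exact formula: global buffers plus per-connection buffers multiplied by the maximum number of connections. Both function names and all the figures (512 MB of global buffers, 3 MB per connection, 150 connections, 4096 MB of RAM, k = 2) are illustrative assumptions.

```shell
# Rough worst-case MySQL memory in MB:
# global buffers + per-connection buffers * max_connections.
mysql_max_memory_mb() {
    echo $(( $1 + $2 * $3 ))
}

# Succeeds when the estimate ($1) is below total RAM ($2) divided by k ($3).
fits_ram_budget() {
    [ "$1" -lt $(( $2 / $3 )) ]
}

# Illustrative figures only.
peak="$(mysql_max_memory_mb 512 3 150)"
echo "estimated peak: ${peak} MB"
if fits_ram_budget "$peak" 4096 2; then
    echo "within budget"
else
    echo "over budget"
fi
```

In practice the inputs would come from the values in `my.cnf` or from the `mysqltuner` report rather than being typed in.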
#### Web interface
While not compulsory, having a web interface to access the server from a browser anytime anywhere in the world could be quite helpful.
My preferred one is [DB Ninja](http://www.dbninja.com/), which is proprietary software but free for personal use, and also for work as long as it is used from a single computer at any one time.
- Install Apache: `sudo apt-get install apache2`. To check whether the server is working, open a browser and visit `http://your_IP/`.
- Install PHP: `sudo apt-get install php5 libapache2-mod-php5 php5-mcrypt`. To check whether PHP is working, create a file `index.php` in `/var/www/html` with the code `<?php phpinfo(); ?>`, and load the previous URL again. You should be greeted by a lot of information about PHP.
- Install some auxiliary library: `sudo apt-get install libapache2-mod-auth-mysql php5-mysql php5-json`
- Restart the server: `sudo service apache2 restart`.
- Install DBNinja:
- download the file `wget http://www.dbninja.com/download/dbninja.zip`
- unzip the file `unzip dbninja.zip`
- move the unzipped content to a new directory, possibly with a different name `mv dbninja /var/www/html/mysqlmanager`
- set the correct read/write permissions on the `_users` directory in the destination directory `chmod 777 /var/www/html/mysqlmanager/_users/`
- open the URL `http://your_IP/mysqlmanager/` and follow the instructions
- rename the `_users/admin` directory to something unique, to force any attacker to guess the username in addition to the password.
...
...
library(devtools)
# install_github('RcppCore/Rcpp') # Install this if <RPostgres> is not installing
# install_github('rstats-db/DBI') # Install this if <RPostgres> is not installing
# install_github('rstats-db/RPostgres')
install_github("slowkow/ggrepel") # https://github.com/slowkow/ggrepel ggrepel provides geoms for ggplot2 to repel overlapping text labels
install_github('rstudio/rmarkdown')
install_github('swarm-lab/editR')
install_github('ramnathv/rCharts')
install_github('ramnathv/rMaps')
install_github('hrbrmstr/waffle')
install_github('ramnathv/slidifyLibraries')
install_github('ramnathv/slidify')
install_github('rstudio/shinydashboard')
install_github('trestletech/shinyTable')
install_github("daattali/shinyjs") # https://github.com/daattali/shinyjs
install_github('ThomasSiegmund/shinyTypeahead') # https://github.com/ThomasSiegmund/shinyTypeahead
install_github("rstudio/profvis") # http://rpubs.com/wch/123888
install_github('skardhamar/rga')
install_github('jcheng5/googleCharts')
install_github('twitter/AnomalyDetection')
install_github('hadley/bigvis')
install_github('leeper/rio')
install_github('56north/hexamapmaker')
install_github('jennybc/googlesheets') # https://github.com/jennybc/googlesheets ## also on CRAN for the stable version.
install_github('hadley/xml2')
install_github('trestletech/plumber') # https://github.com/trestletech/plumber
# HTMLWIDGETS: http://www.htmlwidgets.org/showcase_plotly.html
# ALL htmlwidgets works in shiny with the standard function:
# - ui: xxxOutput('plot_id'),
# - server: output$plot_id <- renderxxx({ ... })
# where xxx is the name of the package
install_github('ramnathv/htmlwidgets')
install.packages('rglwidget')
install_github('rstudio/leaflet') # http://rstudio.github.io/leaflet/
install_github('rstudio/dygraphs') # http://rstudio.github.io/dygraphs/
install_github('ropensci/plotly') # https://plot.ly/r/
install_github('jbkunst/highcharter') # http://jkunst.com/highcharter/
install_github('dataknowledge/visNetwork') # http://dataknowledge.github.io/visNetwork/ ## also on CRAN for the stable version.
install_github('christophergandrud/networkD3') # http://christophergandrud.github.io/networkD3/ ## also on CRAN for the stable version
install_github('rstudio/d3heatmap') # https://github.com/rstudio/d3heatmap
install_github('rstudio/DT') # http://rstudio.github.io/DT/
install_github('bwlewis/rthreejs') # https://github.com/bwlewis/rthreejs ===> The package itself is called just <threejs>
install_github('rich-iannone/DiagrammeRsvg')
install_github('rich-iannone/DiagrammeR') # http://rich-iannone.github.io/DiagrammeR/
install_github('hrbrmstr/metricsgraphics') # http://hrbrmstr.github.io/metricsgraphics/
install_github('renkun-ken/formattable') # http://renkun.me/formattable/
install_github('bokeh/rbokeh') # http://hafen.github.io/rbokeh/
install_github('smartinsightsfromdata/rpivotTable') # https://github.com/smartinsightsfromdata/rpivotTable
install_github('htmlwidgets/sparkline') # https://github.com/htmlwidgets/sparkline
install_github('hrbrmstr/streamgraph') # http://hrbrmstr.github.io/streamgraph/
install_github('jrowen/rhandsontable') # http://jrowen.github.io/rhandsontable/
install_github('kbroman/qtlcharts') # http://kbroman.org/qtlcharts/
install_github('hrbrmstr/taucharts') # https://github.com/hrbrmstr/taucharts
install_github('timelyportfolio/rcdimple') # https://github.com/timelyportfolio/rcdimple
install_github('garthtarr/pairsD3') # http://github.com/garthtarr/pairsD3
install_github('timelyportfolio/parcoords') # https://github.com/timelyportfolio/parcoords
install_github('timelyportfolio/svgPanZoom') # https://github.com/timelyportfolio/svgPanZoom
install_github('rstudio/crosstalk') # dependency for D3TableFilter
install_github('ThomasSiegmund/D3TableFilter') # https://github.com/ThomasSiegmund/D3TableFilter
install_github('timelyportfolio/listviewer') # https://github.com/timelyportfolio/listviewer
install_github('jcheng5/bubbles') # https://github.com/jcheng5/bubbles
install_github('timelyportfolio/sunburstR') # https://github.com/timelyportfolio/sunburstR
install_github('armish/coffeewheel') # https://github.com/armish/coffeewheel
install_github('jbkunst/d3wordcloud') # https://github.com/jbkunst/d3wordcloud, http://rpubs.com/jbkunst/133106
install_github('timelyportfolio/sortableR') # https://github.com/timelyportfolio/sortableR
install_github('ramnathv/rChartsCalmap') # https://github.com/ramnathv/rChartsCalmap
install_github('durtal/calheatmapR') # http://durtal.github.io/calheatmapR/index.html
install_github('garthtarr/edgebundleR') # https://github.com/garthtarr/edgebundleR
install_github('juba/scatterD3') # https://github.com/juba/scatterD3
install_github('timelyportfolio/comicR') # http://timelyportfolio.github.io/buildingwidgets/week18/readme.html
install_github('timelyportfolio/katexR') # http://www.buildingwidgets.com/blog/2015/2/5/week-05-katex-in-r
install_github('timelyportfolio/loryR') # http://timelyportfolio.github.io/buildingwidgets/week19/readme.html
install_github('timelyportfolio/d3vennR') # http://www.buildingwidgets.com/blog/2015/6/5/week-22-d3vennr
install_github('timelyportfolio/d3hiveR') # http://www.buildingwidgets.com/blog/2015/7/11/week-27-d3hiver
install_github('adymimos/rWordCloud') # https://github.com/adymimos/rWordCloud
install_github('htmlwidgets/knob') # https://github.com/htmlwidgets/knob
install_github('timelyportfolio/parsetR') #
install_github('timelyportfolio/stockchartR') #
install_github('timelyportfolio/gifrecordeR') #
install_github('timelyportfolio/mapshaper_htmlwidget') #
install_github('cmpolis/datacomb', subdir = 'pkg') #
install_github('timelyportfolio/timelineR') #
install_github('timelyportfolio/d3radarR') #
install_github('timelyportfolio/functionplotR') #
install_github('timelyportfolio/railroadR') #
install_github('richfitz/remoji') #
install_github('timelyportfolio/summarytrees@htmlwidget')
####################################
# # # ===> C R A N <=== # # #
# # CLASSICS
pkg.lst <- c(
'Amelia', 'car', 'caret', 'classInt', 'corrplot', 'colourpicker', 'data.table', 'devtools', 'diffobj',
'forecast', 'foreach', 'funModeling', 'ggvis', 'glmnet', 'gmodels', 'googleVis', 'gridExtra',
'Hmisc', 'httr', 'janitor', 'listviewer', 'lme4', 'metafor', 'mgcv', 'mlr', 'modelr', 'multcomp',
'nlme', 'parallel', 'plyr', 'psych',
'randomForest', 'RColorBrewer', 'Rcpp', 'reshape2', 'rio', 'RMySQL',
'scales', 'simpletable', 'sjPlot', 'sjmisc', 'sparklyr', 'survival',
'tidyquant', 'tidyverse', 'validate', 'vcd', 'viridis', 'xtable'
)
install.packages(pkg.lst, dependencies = TRUE)
# 1) Note that "tidyverse" include:
# - core: dplyr, ggplot2, purrr, readr, tibble, tidyr (these are always loaded when loading the tidyverse library)
# - plus: broom, feather, forcats, haven, hms, httr, jsonlite, lubridate, magrittr, modelr, readxl, rvest, stringr, xml2
# 2) Note that "tidyquant" include: PerformanceAnalytics, quantmod, TTR, xts, zoo
# # required for reading SPATIAL OBJECTS, plotting MAPS and SPATIAL ANALYSIS
pkg.lst <- c(
'cartography', 'cshapes', 'fields', 'gdistance', 'geojsonio', 'geosphere', 'ggmap', 'GISTools',
'mapmisc', 'maps', 'maptools', 'mapview',
'quickmapr', 'raster', 'rgdal', 'rgeos', 'RgoogleMaps', 'rworldmap', 'rworldxtra', 'sf', 'sp', 'tigris', 'tmap', 'tmaptools'
)
install.packages(pkg.lst, dependencies = TRUE)
# # required for plotting NETWORKS
pkg.lst <- c('igraph', 'network', 'networkDynamic', 'sna')
install.packages(pkg.lst, dependencies = TRUE)
# # GGPLOT EXTENSIONS: http://www.ggplot2-exts.org/gallery/, https://www.ggplot2-exts.org/ggiraph.html, https://github.com/ggplot2-exts/ggplot2-exts.github.io
pkg.lst <- c('ggplot2',
'geomnet', 'GGally', 'ggalt', 'ggedit', 'ggExtra', 'ggfortify', 'ggforce', 'ggiraph', 'ggnetwork', 'ggpmisc', 'ggQC',
'ggraph', 'ggrepel', 'ggtern', 'ggthemes', 'waffle'
)
install.packages(pkg.lst, dependencies = TRUE)
# geomnet: , https://github.com/sctyner/geomnet
# ggally: install_github("ggobi/ggally"), https://ggobi.github.io/ggally/docs.html
# ggalt: install_github("hrbrmstr/ggalt")
# ggedit: install_github("metrumresearchgroup/ggedit"), https://metrumresearchgroup.github.io/ggedit/
# ggExtra: install_github("daattali/ggExtra")
# ggforce: install_github('thomasp85/ggforce'), https://github.com/thomasp85/ggforce
# ggfortify: install_github('sinhrks/ggfortify'), https://journal.r-project.org/archive/accepted/tang-horikoshi-li.pdf
# ggiraph: install_github('davidgohel/ggiraph'), http://davidgohel.github.io/ggiraph/introduction.html
# ggnetwork: install_github("briatte/ggnetwork"), https://briatte.github.io/ggnetwork/, http://curleylab.psych.columbia.edu/netviz/
# ggpmisc: , https://bitbucket.org/aphalo/ggpmisc/src
# ggQC: install_github("kenithgrey/ggQC"), http://ggqc.r-bar.net/index.html
# ggraph: install_github('thomasp85/ggraph')
# ggrepel: install_github("slowkow/ggrepel")
# ggtern: install_git('https://bitbucket.org/nicholasehamilton/ggtern'), https://github.com/nicholasehamilton/ggtern, http://www.ggtern.com/
# ggthemes: install_github('jrnold/ggthemes'), https://github.com/jrnold/ggthemes
# waffle: , https://github.com/hrbrmstr/waffle
# # SHINY, RMARKDOWN, INTERACTIVE REPORTING
pkg.lst <- c(
'bookdown', 'bsplus', 'commonmark', 'flexdashboard', 'htmlTable', 'knitr', 'prettydoc',
'revealjs', 'rmarkdown', 'rmdformats', 'rmdshower', 'rsconnect',
'shiny', 'shinycssloaders', 'shinydashboard', 'shinyDND', 'shinyjqui', 'shinyjs', 'shinythemes', 'shinyWidgets',
'tufte', 'tufterhandout'
)
install.packages(pkg.lst, dependencies = TRUE)
# install_github('rstudio/rmarkdown') # http://rmarkdown.rstudio.com/
# install_github('daattali/shinyjs') # https://github.com/daattali/shinyjs
#
#
#
#
#
# # HTMLWIDGETS: http://gallery.htmlwidgets.org/, http://www.htmlwidgets.org/showcase_leaflet.html
pkg.lst <- c(
'htmlwidgets', 'DiagrammeR', 'DT', 'dygraphs', 'edgebundleR', 'formattable', 'googleway',
'highcharter', 'leaflet', 'mapview', 'networkD3', 'qtlcharts', 'pairsD3', 'plotly',
'rAmCharts', 'rbokeh', 'rhandsontable', 'scatterD3', 'sunburstR', 'timevis', 'tmap', 'visNetwork'
)
install.packages(pkg.lst, dependencies = TRUE)
# DiagrammeR: install_github('rich-iannone/DiagrammeR') http://rich-iannone.github.io/DiagrammeR/
# DT: install_github('rstudio/DT') http://rstudio.github.io/DT/
# dygraphs: install_github('rstudio/dygraphs') http://rstudio.github.io/dygraphs/
# edgebundleR: install_github('garthtarr/edgebundleR') https://github.com/garthtarr/edgebundleR
# formattable: install_github('renkun-ken/formattable') http://renkun.me/formattable/
# googleway: install_github('SymbolixAU/googleway') https://github.com/SymbolixAU/googleway
# highcharter: install_github('jbkunst/highcharter') http://jkunst.com/highcharter/
# leaflet: install_github('rstudio/leaflet') http://rstudio.github.io/leaflet/
# listviewer: install_github('timelyportfolio/listviewer') http://github.com/timelyportfolio/listviewer
# mapview: install_github('environmentalinformatics-marburg/mapview', ref = 'develop') # https://github.com/environmentalinformatics-marburg/mapview
# networkD3: http://christophergandrud.github.io/networkD3/
# pairsD3: install_github('garthtarr/pairsD3') https://github.com/garthtarr/pairsD3
# plotly: install_github('ropensci/plotly') https://plot.ly/r/
# qtlcharts: install_github('kbroman/qtlcharts') http://kbroman.org/qtlcharts/
# rAmCharts: install_github('datastorm-open/rAmCharts') http://datastorm-open.github.io/introduction_ramcharts/
# rbokeh: install_github('bokeh/rbokeh') http://hafen.github.io/rbokeh/
# rhandsontable: install_github('jrowen/rhandsontable') http://jrowen.github.io/rhandsontable/
# scatterD3: install_github('juba/scatterD3') https://github.com/juba/scatterD3
# slickR: install_github('metrumresearchgroup/slickR') https://metrumresearchgroup.github.io/slickR
# sunburstR: install_github('timelyportfolio/sunburstR') https://github.com/timelyportfolio/sunburstR, http://www.buildingwidgets.com/blog/2015/7/2/week-26-sunburstr
# timevis: install_github('daattali/timevis') https://github.com/daattali/timevis
# tmap: install_github('mtennekes/tmap', subdir = 'pkg') https://github.com/mtennekes/tmap
# visNetwork: install_github('datastorm-open/visNetwork') http://datastorm-open.github.io/visNetwork/
# crosstalk:
# # OTHERS
pkg.lst <- c('ndtv')
install.packages(pkg.lst, dependencies = TRUE)
# ndtv:
###################################
# # # ===> G I T H U B <=== # # #
library(devtools)
# # GENERICS
# # GGPLOT
install_github('hadley/ggplot2') # ggplot dev:
install_github("dgrtwo/gganimate") # gganimate: https://github.com/dgrtwo/gganimate
install_github("robjohnnoble/ggmuller") # ggmuller: https://thesefewlines.wordpress.com/2016/08/20/how-to-ggmuller/
install_github("guiastrennec/ggplus") # ggplus: https://github.com/guiastrennec/ggplus
install_github("lionel-/ggstance") # ggstance: https://github.com/lionel-/ggstance
install_github('Ather-Energy/ggTimeSeries') # ggTimeSeries: https://github.com/Ather-Energy/ggTimeSeries
install_github("sachsmc/plotROC") # plotROC: https://github.com/sachsmc/plotROC
# # SHINY
# # HTMLWIDGETS
install_github('jcheng5/bubbles') # bubbles: https://github.com/jcheng5/bubbles
install_github('Kitware/candela', subdir='R/candela', dependencies = TRUE) # candela: https://candela.readthedocs.io/en/latest/index.html
install_github('neuhausi/canvasXpress') # canvasXpress: https://github.com/neuhausi/canvasXpress/
install_github('yutannihilation/chartist') # chartist: https://github.com/yutannihilation/chartist
install_github('armish/coffeewheel') # coffeewheel: https://github.com/armish/coffeewheel, https://www.jasondavies.com/coffee-wheel/
install_github('rstudio/d3heatmap') # d3heatmap: https://github.com/rstudio/d3heatmap
install_github(c('rstudio/crosstalk', 'ThomasSiegmund/D3TableFilter')) # D3TableFilter: https://github.com/ThomasSiegmund/D3TableFilter
install_github("timelyportfolio/exportwidget") # exportwidget: https://github.com/timelyportfolio/exportwidget
install_github('prpatil/healthvis') # healthvis: https://github.com/prpatil/healthvis
install_github('56north/hexamapmaker') # hexamapmaker: https://github.com/56north/hexamapmaker
install_github('hrbrmstr/metricsgraphics') # metricsgraphics: http://hrbrmstr.github.io/metricsgraphics/
install_github("dgrapov/networkly") # networkly: https://github.com/dgrapov/networkly, http://dgrapov.github.io/networkly/
install_github('timelyportfolio/parcoords') # parcoords: https://github.com/timelyportfolio/parcoords
install_github('smartinsightsfromdata/rpivotTable') # rpivotTable: https://github.com/smartinsightsfromdata/rpivotTable
install_github('ramnathv/rChartsCalmap') # rChartsCalmap: http://cal-heatmap.com/
install_github('bwlewis/rthreejs') # rthreejs: https://github.com/bwlewis/rthreejs
install_github('htmlwidgets/sparkline') # sparklines: https://github.com/htmlwidgets/sparkline
install_github('hrbrmstr/streamgraph') # streamgraph: http://hrbrmstr.github.io/streamgraph/
install_github('hrbrmstr/taucharts') # taucharts: https://github.com/hrbrmstr/taucharts
install_github('lchiffon/wordcloud2') # wordcloud2: https://github.com/lchiffon/wordcloud2
by can be found [here]().
### CRAN
All packages should be installed as **su** to ensure a unique shared library between normal users and the shiny-srv user, and avoid duplication and possible mismatches in versions:
sudo su
R
install.packages("pkg_name")
q()
exit
For multiple installations, the single installation line can be replaced by the following:
dep.pkg <- c(...) # list of packages
pkgs.not.installed <- dep.pkg[!dep.pkg %in% rownames(installed.packages())] # check without loading the packages
if( length(pkgs.not.installed) > 0 ) install.packages(pkgs.not.installed, dependencies = TRUE)
Although devtools is not strictly needed to install packages from CRAN, install it first, because some packages have dependencies that must be compiled from source.
pkgs <- c(
'broom', 'Cairo', 'circlize', 'classInt', 'colourpicker', 'data.table', 'DT', 'dygraphs', 'e1071', 'flexdashboard', 'forcats', 'forecast', 'extrafont',
'GGally', 'geojsonio', 'ggplot2', 'ggiraph', 'ggmap', 'ggparallel', 'ggrepel', 'ggspatial', 'ggthemes', 'glmnet',
'highcharter', 'htmltools', 'jsonlite', 'leaflet', 'leaflet.extras', 'lme4', 'lubridate',
'mapview', 'maptools', 'mgcv', 'mlr', 'modelr', 'multcomp', 'nlme', 'odbc', 'openxlsx', 'party', 'plyr', 'pool', 'quantmod',
'RColorBrewer', 'rgeos', 'rgdal', 'rmapshaper', 'rmarkdown', 'rbokeh', 'RMySQL', 'rpart', 'rpart.plot', 'rpivotTable', 'rvest',
'scales', 'sf', 'shinydashboard', 'shinyjs', 'shinythemes', 'shinyWidgets', 'showtext', 'sp', 'spdplyr', 'stringr',
'tidyverse', 'tmap', 'vcd', 'viridis', 'xml2', 'xts', 'zoo'
)
install.packages(pkgs, dependencies = TRUE)
- Subsequently, it is better to first list the packages that are not already installed:
pkgs.not.installed <- pkgs[!pkgs %in% rownames(installed.packages())] # check without loading the packages
if(length(pkgs.not.installed) > 0) install.packages(pkgs.not.installed, dependencies = TRUE)
sqldf, qcc, reshape2, randomForest, ggvis, rgl, diagrammeR, network3D, googleVis, googlesheets, car, glmnet, survival, caret, xtable,
maps, diffobj, feather, foreach, gmodels, highcharter, Hmisc, mice, nnet, e1071, kernLab,
### GitHub
library(devtools)
install_github('bhaskarvk/leaflet.extras')
install_github('rstudio/pool')
### Bioconductor
```
source('http://bioconductor.org/biocLite.R')
biocLite('SVGAnnotation')
biocLite('IRanges')
biocLite('Rgraphviz')
biocLite('AnnotationDbi')
```
- devtools:
```
sudo apt-get install curl libssl-dev libcurl4-gnutls-dev
```
- RMySQL:
```
sudo apt-get install libmysqlclient-dev
```
- rgdal/rgeos/spdplyr:
```
sudo add-apt-repository ppa:ubuntugis/ppa
sudo apt-get update
sudo apt-get install gdal-bin
sudo apt-get install libgdal-dev libgeos-dev libproj-dev
```
- sf (must be installed AFTER previous deps):
```
sudo apt-get install libudunits2-dev
```
- geojsonio/tmap/rmapshaper (must be installed AFTER previous deps):
```
sudo apt-get install libv8-3.14-dev
sudo apt-get install libprotobuf-dev
sudo apt-get install protobuf-compiler
```
- Cairo/gdtools:
```
sudo apt-get install libcairo2-dev libxt-dev
```
- RcppGSL:
```
sudo apt-get install libgsl0-dev
```
- GMP:
```
sudo apt-get install libgmp3-dev
```
- rgl:
```
sudo apt-get install r-cran-rgl libcgal-dev libglu1-mesa-dev
```
- rJava:
```
sudo apt-get install openjdk-8-*
sudo apt-get install r-cran-rjava
sudo R CMD javareconf
```
### Installation
- Install first the shiny package from inside R
sudo su
R
install.packages('shiny')
q()
exit
- download Shiny Server by visiting [this page](https://www.rstudio.com/products/shiny/download-server/) and copying the address of the current version:
wget https://download3.rstudio.org/ubuntu-12.04/x86_64/shiny-server-1.5.3.838-amd64.deb
- install Shiny Server: `sudo gdebi shiny-server-1.5.3.838-amd64.deb`
### Populating the server directory
#### Creating a common group
Because of the way permissions work in Linux, and because the path `/srv/shiny-server` is created by the user shiny, files copied there directly by another user cannot be opened by the shiny user.
A workaround is to create a group, say shiny-apps, add shiny and all the necessary users to it, and give the group the correct permissions:
```
sudo groupadd shiny-apps
sudo usermod -aG shiny-apps shiny
sudo usermod -aG shiny-apps username
cd /srv/shiny-server
sudo chown -R username:shiny-apps .
sudo chmod g+w .
sudo chmod g+s .
```
Afterwards you can copy apps from any location in the home folder to the server directory:
```
sudo mkdir /srv/shiny-server/<APP-NAME>
sudo cp -R /home/<USER>/<APP-PATH>/* /srv/shiny-server/<APP-NAME>/
```
#### Using GitHub repositories
### Error logs
Shiny Server error logs can be found at these locations:
- for the server: `/var/log/shiny-server.log`
- for the apps: `/var/log/shiny-server/*.log`
- install auxiliary Ubuntu libraries:
sudo apt-get install gdebi-core
sudo apt-get install libapparmor1
- download Rstudio Server:
wget https://s3.amazonaws.com/rstudio-dailybuilds/rstudio-server-1.0.143-amd64.deb
For the correct file name, visit [this page](http://www.rstudio.com/products/rstudio/download/preview/ '') and copy the address behind the link *RStudio Server x.yy.zzzz - Ubuntu 12.04+/Debian 8+ (64-bit)*
- install Rstudio Server: sudo gdebi rstudio-server-1.0.143-amd64.deb
- check the installation has
- check that the git executable has been found by opening, in RStudio, Tools => Global Options => Git
if it is missing, install git:
```
sudo apt-get install git-core
```
- There should be no need, but just in case, here is how to start, stop, restart, or check the status of the server, respectively:
```
sudo service rstudio-server start
sudo service rstudio-server stop
sudo service rstudio-server restart
sudo service rstudio-server status
```
- add the CRAN repository to the system file:
```
sudo sh -c 'echo "deb http://cran.rstudio.com/bin/linux/ubuntu xenial/" >> /etc/apt/sources.list'
```
- add the public key of *Michael Rutter* to secure apt:
```
gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9
gpg -a --export E084DAB9 | sudo apt-key add -
```
- update and upgrade apt-get:
```
sudo apt-get update
sudo apt-get upgrade
```
- install *R*:
```
sudo apt-get install r-base
sudo apt-get install r-base-dev
```
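The repository line above hard-codes the `xenial` release. A small sketch that takes the codename as a parameter is shown below. `cran_apt_line` is a hypothetical helper, and it only prints the line without touching any apt file.

```shell
# Print the CRAN apt source line for a given Ubuntu codename.
cran_apt_line() {
    printf 'deb http://cran.rstudio.com/bin/linux/ubuntu %s/\n' "$1"
}

# Preview the line for the release used in this guide.
cran_apt_line xenial
```

On another release the codename could be taken from `lsb_release -cs`, for example `cran_apt_line "$(lsb_release -cs)"`, provided that CRAN publishes binaries for that release.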
...
### Deploy Shiny Server with Nginx Basic Authorization
The trick is to have Shiny serve only to localhost, and
have Nginx listen on port 80, forward requests to Shiny on localhost, and serve only users with a password.
install nginx
$ sudo apt-get install nginx
allow port 80, and check firewall status
$ sudo ufw allow 80
$ sudo ufw status
nginx may not be running right after installation, so start the server to check that it was installed correctly
$ sudo service nginx start
before proceeding, stop both nginx and shiny-server
$ sudo service nginx stop
$ sudo service shiny-server stop
backup nginx configuration
$
edit nginx configuration
$ sudo nano /etc/nginx/sites-available/default
replace its content with the following
server {
    listen 80;
    location / {
        proxy_pass http://127.0.0.1:3838/;
        proxy_redirect http://127.0.0.1:3838/ $scheme://$host/;
        auth_basic "Username and Password are required";
        auth_basic_user_file /etc/nginx/.htpasswd;
    }
}
edit shiny configuration
$
add 127.0.0.1 after 3838
create usernames and passwords for access
$ cd /etc/nginx
$ sudo htpasswd -c /etc/nginx/.htpasswd username
restart Nginx and Shiny
$ sudo service nginx start
$ sudo service shiny-server start
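The backup and edit steps above can also be scripted. The following is only a sketch under stated assumptions: `backup_file` and `write_site` are hypothetical helper names, the `.bak` suffix is an arbitrary choice, and the demonstration writes to a scratch directory rather than to `/etc/nginx/sites-available/default`.

```shell
#!/bin/sh
# backup_file FILE (hypothetical helper): keep a copy next to the original.
backup_file() {
  cp -p "$1" "$1.bak"
}

# write_site CONF_PATH [SHINY_PORT] (hypothetical helper):
# write the password-protected reverse-proxy block shown above.
write_site() {
  conf="$1"; port="${2:-3838}"
  # \$scheme and \$host are escaped so that nginx, not the shell, expands them.
  cat > "$conf" <<EOF
server {
    listen 80;
    location / {
        proxy_pass http://127.0.0.1:${port}/;
        proxy_redirect http://127.0.0.1:${port}/ \$scheme://\$host/;
        auth_basic "Username and Password are required";
        auth_basic_user_file /etc/nginx/.htpasswd;
    }
}
EOF
}

# Demonstration in a scratch directory.
demo_dir="$(mktemp -d)"
demo_site="$demo_dir/default"
echo "# original configuration" > "$demo_site"
backup_file "$demo_site"
write_site "$demo_site"
cat "$demo_site"
```

On the real server the script would have to run as root against the actual configuration file, and `sudo nginx -t` can be used to check the syntax before the service is started again.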
### Uninstall Apache2
- stop any running instance of Apache2
```
sudo service apache2 stop
```
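- uninstall Apache2 and its related packages. Use the *purge* option (instead of *remove*) so that any configuration files they created are deleted as well. Note that *apache2-utils* provides the `htpasswd` command used for Nginx basic authorization, so leave it out of the list if you still need it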
```
sudo apt-get purge apache2 apache2-utils apache2.2-bin apache2-common
```
- remove any other dependencies that were installed with Apache2, but are no longer used by any other package.
```
sudo apt-get autoremove
```
- check whether any files directly belonging to Apache2 are still present (the output should be just `apache2:`, with no paths after it)
```
whereis apache2
```
- based on the previous output, manually remove whatever is left, as in the following example
```
sudo rm -rf /etc/apache2
```
- check that apache2 is no longer recognized as a service (the command should now fail)
```
sudo service apache2 start
```
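As a complement to the checks above, a small sketch can report whether the Apache executables are still reachable. `is_on_path` and `report` are hypothetical helpers, and the check only looks at the current user's `PATH`, so it is an assumption that the binaries would be found there.

```shell
#!/bin/sh
# is_on_path NAME (hypothetical helper): succeed when NAME resolves on PATH.
is_on_path() {
  command -v "$1" >/dev/null 2>&1
}

# report NAME: print a one-line summary for NAME.
report() {
  if is_on_path "$1"; then
    echo "$1: still present at $(command -v "$1")"
  else
    echo "$1: not found"
  fi
}

report apache2
report apache2ctl
```

After a successful purge both lines should say that the command was not found.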
### Change services ports numbers, and port forwarding in router configuration
HTTP 80 =>
HTTPS =>
SSH 22 => 7345
WEBMIN => 4948
SAMBA =>
MYSQL =>
XRDP =>
RSTUDIO =>
SHINY =>
NEO4J =>
CALIBRE =>
### Change SSH port & Disable SSH root access
- Open SSH configuration file: `sudo nano /etc/ssh/sshd_config`
- Insert/change the following lines:
```
Port xxxx
Protocol 2
PermitRootLogin no
DenyUsers root
AllowUsers username
HostKey /etc/ssh/ssh_host_zzz_key
UsePrivilegeSeparation yes
RSAAuthentication yes
PubkeyAuthentication yes
```
- Restart the service afterwards: `sudo service ssh restart`
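Editing the file by hand works, but the same changes can be sketched as a script. This is an illustration only: `set_option` is a hypothetical helper, it assumes GNU `sed` (as shipped with Ubuntu), it rewrites every line that mentions the key at the start (including commented ones), and the demonstration runs on a scratch file, not on `/etc/ssh/sshd_config`. The port 7345 is the one listed in the port table earlier in these notes.

```shell
#!/bin/sh
# set_option FILE KEY VALUE (hypothetical helper):
# rewrite each line that sets KEY (commented out or not) to "KEY VALUE",
# or append the directive when the file does not mention KEY at all.
set_option() {
  file="$1"; key="$2"; value="$3"
  if grep -qE "^[#[:space:]]*${key}[[:space:]]" "$file"; then
    sed -i -E "s|^[#[:space:]]*${key}[[:space:]].*|${key} ${value}|" "$file"
  else
    printf '%s %s\n' "$key" "$value" >> "$file"
  fi
}

# Demonstration on a scratch file that mimics two default lines.
demo_conf="$(mktemp)"
printf '%s\n' '#Port 22' 'PermitRootLogin yes' > "$demo_conf"
set_option "$demo_conf" Port 7345
set_option "$demo_conf" PermitRootLogin no
set_option "$demo_conf" AllowUsers username
cat "$demo_conf"
```

Before restarting the service on a real machine, `sudo sshd -t` checks that the configuration file is valid, which helps avoid being locked out by a typo.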
### Enable UFW Firewall
- BEFORE enabling the firewall, allow SSH connections in ufw, both through the OpenSSH profile and on the new port xxxx:
  - `sudo ufw allow OpenSSH`
  - `sudo ufw allow xxxx`
- enable the firewall: `sudo ufw enable`
- allow all the other connections that the server needs to respond to, such as HTTP (80), HTTPS (443) and FTP (21)
- check the firewall: `sudo ufw status`
- for more detail, read [this guide](https://www.digitalocean.com/community/tutorials/how-to-set-up-a-firewall-with-ufw-on-ubuntu-16-04)
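The order of the steps above matters, because enabling the firewall before SSH is allowed can cut off the current session. A minimal sketch of that sequence follows; the `run` wrapper and the `DRY_RUN` variable are hypothetical, the script only prints the commands unless `DRY_RUN=0` is set, and the SSH port 7345 (taken from the port table earlier in these notes) and the choice of ports 80 and 443 are assumptions.

```shell
#!/bin/sh
# run CMD...: print the command when DRY_RUN is 1 (the default here),
# execute it for real when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

SSH_PORT="${SSH_PORT:-7345}"   # assumed custom SSH port

configure_firewall() {
  # SSH first, so that enabling the firewall cannot lock us out.
  run sudo ufw allow OpenSSH
  run sudo ufw allow "$SSH_PORT"
  run sudo ufw enable
  # Then the other services the server must answer on.
  for port in 80 443; do
    run sudo ufw allow "$port"
  done
  run sudo ufw status
}

configure_firewall
```

Reviewing the printed commands first, and only then rerunning with `DRY_RUN=0`, keeps the example safe to try on a remote machine.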
### Webmin
#### Installation
- Open the sources list for editing: `sudo nano /etc/apt/sources.list`
- Add the following lines at the end of the file:
```
deb http://download.webmin.com/download/repository sarge contrib
deb http://webmin.mirror.somersettechsolutions.co.uk/repository sarge contrib
```
- Install the GPG key to access the repository:
```
wget http://www.webmin.com/jcameron-key.asc
sudo apt-key add jcameron-key.asc
```
- Update packages list: `sudo apt-get update`
- Install webmin: `sudo apt-get install webmin`
#### Secure the access
- Webmin listens on port 10000 by default, so that is the port to allow through the firewall initially: `sudo ufw allow 10000`
- Navigate to the URL https://url:10000/, then enter your username and password to log in to the Webmin console.
- Enable SSL access: Webmin -> Webmin Configuration -> SSL Encryption
- Before changing the port, make sure Webmin listens on IPv6:
  - To find out whether Webmin is listening on IPv6, type: `netstat -anp | grep 10000`
  - Ensure the Perl IPv6 Socket module is installed: `sudo apt-get install libsocket6-perl`
  - Check whether IPv6 is enabled in Webmin: `sudo grep "ipv6" /etc/webmin/miniserv.conf`
  - If there is no output, configure Webmin to listen on IPv6 (a plain `>>` redirection would not run with root privileges, hence `tee`): `echo "ipv6=1" | sudo tee -a /etc/webmin/miniserv.conf`
  - Restart Webmin: `sudo service webmin restart`
- Change the default port to some random number xxxxx: Webmin -> Webmin Configuration -> Ports and Addresses
- If you want to access the Webmin console from a remote system, allow the new port through the firewall: `sudo ufw allow xxxxx`
- Remove access to the standard port 10000: `sudo ufw deny 10000`
#### Request certificate
#### Add Two Factor Authentication (2FA)
- Go to [Google Compute Engine (**GCE**)](https://cloud.google.com/)
- Click the **TRY IT FREE** button in the upper right corner, which should take you to [this page](https://console.cloud.google.com/freetrial)
- If you need or want to create a new account, click the *More options* link at the bottom. Otherwise, enter your credentials and log in.
- Fill in the form with the usual personal information.
- There you are!
- Everything you build lives under a *project*. After signing up, a *first project* has already been created for you. At this point you can:
  - simply work on it as it is,
  - just rename the project: from the left menu click *HOME*, then under the *DASHBOARD* tab, at the bottom of the first card, click *Go to project settings*, change the name as you like and click *Save* on the right of the box.
  - create a new project: click the name of the current project at the top of the page, then the *plus* sign in the upper right corner of the pop-up window.
- From the left menu choose Compute Engine > VM instances, click the *Create instance* button, then:
  - Give your future VM a suitable name
  - Choose one of the **europe-west2** zones, which are in London
  - Under *Machine type* choose *Customise*, and then **1 Core + 4GB memory**. More cores and RAM can be added later, once they are needed and we are able to manage them.
  - In the *Boot disk* section click *Change*, then choose **Ubuntu 16.04 LTS** as OS and **SSD** as *Boot disk type*, with a *size* of at least **25GB**.
  - In the Firewall section, select both **Allow HTTP traffic** and **Allow HTTPS traffic**.
- Finally, click the *Create* button to actually create the VM. It will take a few minutes. The process is complete when, in the subsequent window, a green tick appears beside the name of the new machine.
- Now click the machine name's link, near the green tick, to open the VM instance configuration page. Scroll down, and click the link *default* under *Network interfaces / Network*. In the following page, scroll down to the *Firewall rules* section. We are going to add multiple rules; each one requires clicking the *Add firewall rule* button and entering the following information:
| Name | Targets | Source IP ranges| Specified protocols and ports |
| ---- | --- | --- | --- |
| rstudio-server | ALL | 0.0.0.0/0 | tcp:8787 |
| shiny-server | ALL | 0.0.0.0/0 | tcp:3838 |
| mysql-server | ALL | 0.0.0.0/0 | tcp:3306 |
| postgres-server | ALL | 0.0.0.0/0 | tcp:5432 |
| neo4j-server | ALL | 0.0.0.0/0 | tcp:7474 |
| jupyter-nb | ALL | 0.0.0.0/0 | tcp:8888 |
| zeppelin-nb | ALL | 0.0.0.0/0 | tcp:8080 |
| webmin | ALL | 0.0.0.0/0 | tcp:10000 |
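The rules in the table can also be created from the command line rather than through the console. The sketch below only prints one `gcloud compute firewall-rules create` command per row so that they can be reviewed first; it assumes the `gcloud` CLI is installed and authenticated against the right project, and that the network is the one named *default*. The `RULES` variable and `print_firewall_commands` function are hypothetical names.

```shell
#!/bin/sh
# Rule names and ports copied from the table above, as name:port pairs.
RULES="rstudio-server:8787 shiny-server:3838 mysql-server:3306 postgres-server:5432 neo4j-server:7474 jupyter-nb:8888 zeppelin-nb:8080 webmin:10000"

# print_firewall_commands: emit one gcloud command per rule, for review.
print_firewall_commands() {
  for rule in $RULES; do
    name="${rule%%:*}"   # text before the colon
    port="${rule##*:}"   # text after the colon
    echo "gcloud compute firewall-rules create $name --network default --allow tcp:$port --source-ranges 0.0.0.0/0"
  done
}

print_firewall_commands
```

Once the output looks right it can be piped to `sh`. Narrowing `--source-ranges` to your own address range, especially for the database ports, is a safer choice than `0.0.0.0/0` where it is practical.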
The following notes describe how to build a web-based data analytics stack from a group of robust open source software tools:
- the Linux [Ubuntu](http://www.ubuntu.com/) operating system
- the [Nginx](http://nginx.org/en/) HTTP and reverse proxy server (which replaces the more common [Apache](http://httpd.apache.org/) server due to its increased performance and security)
- a choice of [MySQL](http://www.mysql.com/) and [Postgres](https://www.postgresql.org/) relational databases, the graph database [Neo4j](https://neo4j.com/) and the big data cluster-computing processing framework [Spark](https://spark.apache.org/)
- a diverse mix of programming languages:
- the [R](https://cran.r-project.org/) statistical language, equipped with hundreds of [packages](https://cran.r-project.org/web/views/) for all kinds of analytics tasks, and its web-based IDE [RStudio Server](https://www.rstudio.com/products/rstudio/#Server)
- the [Python](https://www.python.org/) general purpose language, boosted by the scientific and analytics [Scipy](https://www.scipy.org/) stack
- the [Scala](https://www.scala-lang.org) functional language and its interactive build tool [SBT](http://www.scala-sbt.org/)
- the [Jupyter](http://jupyter.org/) Python Web-based notebook
- the [Zeppelin](https://zeppelin.apache.org/) Apache Web-based notebook
- the R [Shiny Server](https://www.rstudio.com/products/shiny/) platform for deploying dynamic and interactive content and visualization
In this guide, we will learn how to set up this stack on Ubuntu 16.04 using a VM created on the [Google Compute Engine](https://cloud.google.com/compute/), but other providers will do as well: [DO - Digital Ocean](https://www.digitalocean.com/), [AWS - Amazon Web Services](https://aws.amazon.com/) and [Microsoft Azure](https://azure.microsoft.com/en-gb/), just to mention the big ones.
Before you start building your cloud machine using this guide, you should have a separate, non-root user account set up on your server.
<p align="right"><small><i> updated on 15-09-2017 </i></small></p>
### Prerequisites
- [Introduction](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-01-introduction-md)
- [VM on GCE](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-02-vm-on-gce-md)
- [Server stuff](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-03-server-stuff-md): ports, firewall, webmin, ...
- [nginx](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-04-nginx-md)
- [Docker](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-05-docker-md)
### R stack
- [core](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-11-r-core-md)
- [RStudio Server](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-12-rstudio-server-md)
- [Shiny Server](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-13-shiny-server-md)
- [Linux dependencies](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-14-linux-dependencies-md)
- [Some packages to start with...](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-15-r-packages-md)
- [Additional packages for a data viz powerhouse](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-16-data-viz-powerhouse-md)
- [Additional fonts](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-18-additional-fonts-md)
- [Configurations](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-19-configurations-md)
### Databases:
- [MySQL](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-21-mysql-md)
- [PostgreSQL](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-22-postgresql-md)
- [Neo4j](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-23-neo4j-md)
### Python
- [SciPy stack](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-31-python-scypy-stack-md)
- [Jupyter Notebook](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-32-jupyter-notebook-md)
### Big Data, High Performance
- [Scala](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-41-scala-md)
- [Spark](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-42-spark-md)
- [Zeppelin](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-43-zeppelin-notebook-md)
### Whatever else
- [Install Linux on a physical machine](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-91-physical-machine-md)
- [Additional Resources](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-97-additional-resources-md)
- [Credits](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-98-credits-md)
- [About](https://gist.github.com/lvalnegri/74a4e941afa7cd3a8d2b625c62adf916#file-99-about-md)