Installing pyspider on Ubuntu 14.04

Posted by zxpo


sudo apt-get install libcurl4-openssl-dev libxml2-dev libxslt1-dev

sudo apt-get install phantomjs

Activate the virtual environment (Python 3.6.7)

pip install pyspider

Then run pyspider.

 

If you hit MySQL-related errors, run the following commands first.

sudo apt-get purge 'mysql*'

sudo apt-get autoremove 

sudo apt-get autoclean

sudo apt-get dist-upgrade

Deployment

This document is based on MySQL + RabbitMQ.

config.json

Although you can specify the parameters on the command line, a config file is a better choice.

{
  "taskdb": "mysql+taskdb://username:password@host:port/taskdb",
  "projectdb": "mysql+projectdb://username:password@host:port/projectdb",
  "resultdb": "mysql+resultdb://username:password@host:port/resultdb",
  "message_queue": "amqp://username:password@host:port/%2F",
  "webui": {
    "username": "some_name",
    "password": "some_passwd",
    "need-auth": true
  }
}
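As a rough illustration of how the pieces of this file fit together, the sketch below builds the same structure in Python and serializes it to JSON. The function name `build_config`, its parameters, and the sample hosts, ports, and credentials are all hypothetical and not part of pyspider; only the key names and URI layout come from the config above.

```python
import json


def build_config(db_user, db_password, db_host, db_port,
                 mq_user, mq_password, mq_host, mq_port,
                 webui_user, webui_password):
    """Return a dict with the same keys as the config.json shown above.

    Hypothetical helper for illustration; pyspider itself only reads the file.
    """
    def db_uri(kind):
        # The URI type (taskdb/projectdb/resultdb) is also used as the
        # database name, mirroring the example config.
        return f"mysql+{kind}://{db_user}:{db_password}@{db_host}:{db_port}/{kind}"

    return {
        "taskdb": db_uri("taskdb"),
        "projectdb": db_uri("projectdb"),
        "resultdb": db_uri("resultdb"),
        # %2F is the URL-encoded form of "/", kept as in the example config.
        "message_queue": f"amqp://{mq_user}:{mq_password}@{mq_host}:{mq_port}/%2F",
        "webui": {
            "username": webui_user,
            "password": webui_password,
            "need-auth": True,
        },
    }


if __name__ == "__main__":
    # Sample values are placeholders (assumptions), not recommended settings.
    config = build_config("user", "secret", "127.0.0.1", 3306,
                          "guest", "guest", "127.0.0.1", 5672,
                          "some_name", "some_passwd")
    print(json.dumps(config, indent=2))
```

Redirecting the printed output to config.json gives a file that can be passed to pyspider with `-c config.json`.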

The type part of a database connection URI (the word after `+` in the scheme, as in `mysql+taskdb://`) should be one of `taskdb`, `projectdb`, `resultdb`.
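To make the URI convention concrete, here is a minimal sketch that splits such a URI into its parts with the standard library and checks the type. It is not pyspider's own parser; `split_db_uri` and `VALID_TYPES` are hypothetical names, and the sample URI uses placeholder credentials.

```python
from urllib.parse import urlparse

# The three database types named in the text above.
VALID_TYPES = {"taskdb", "projectdb", "resultdb"}


def split_db_uri(uri):
    """Split 'engine+type://user:password@host:port/database' into a dict.

    Raises ValueError when the type part is missing or not a known type.
    """
    parsed = urlparse(uri)
    # The scheme carries both pieces, e.g. "mysql+taskdb".
    engine, sep, dbtype = parsed.scheme.partition("+")
    if not sep or dbtype not in VALID_TYPES:
        raise ValueError(f"unknown database type in URI scheme: {parsed.scheme!r}")
    return {
        "engine": engine,
        "type": dbtype,
        "username": parsed.username,
        "password": parsed.password,
        "host": parsed.hostname,
        "port": parsed.port,
        "database": parsed.path.lstrip("/"),
    }


if __name__ == "__main__":
    print(split_db_uri("mysql+taskdb://user:secret@127.0.0.1:3306/taskdb"))
```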

Running

Run each component on its own using its subcommand. You can append & to a command to run it in the background, and use screen or nohup to keep it running after your SSH session ends. Managing the components with Supervisor is recommended.

# start **only one** scheduler instance
pyspider -c config.json scheduler

# phantomjs
pyspider -c config.json phantomjs

# start as many fetcher / processor / result_worker instances as you need
pyspider -c config.json --phantomjs-proxy="localhost:25555" fetcher
pyspider -c config.json processor
pyspider -c config.json result_worker

# start webui, set `--scheduler-rpc` if scheduler is not running on the same host as webui
pyspider -c config.json webui
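When the number of fetcher, processor, and result_worker instances varies between machines, it can help to generate the command lines instead of typing them. The sketch below only builds and prints the commands listed above; it does not start anything. The function `component_commands` and its parameters are hypothetical, and the default proxy address is the one used in the fetcher command above.

```python
import shlex


def component_commands(config_path="config.json",
                       phantomjs_proxy="localhost:25555",
                       fetchers=1, processors=1, result_workers=1):
    """Return the argv list for every component, with exactly one scheduler."""
    base = ["pyspider", "-c", config_path]
    commands = [
        base + ["scheduler"],   # only one scheduler instance
        base + ["phantomjs"],
    ]
    # Scale the worker components independently.
    commands += [base + [f"--phantomjs-proxy={phantomjs_proxy}", "fetcher"]
                 for _ in range(fetchers)]
    commands += [base + ["processor"] for _ in range(processors)]
    commands += [base + ["result_worker"] for _ in range(result_workers)]
    commands.append(base + ["webui"])
    return commands


if __name__ == "__main__":
    for argv in component_commands(fetchers=2):
        # Quote each argument so the line can be pasted into a shell.
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Each printed line can be pasted into a shell, or used as the command of a Supervisor program entry.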

You can list all options by running pyspider --help, and the options of a subcommand by running, for example, pyspider webui --help.

The "webui" object in the JSON holds the options for the webui subcommand. You can add options for other components in the same way.

To run each pyspider component in its own process, you need at least one database service. pyspider currently supports MySQL, MongoDB and PostgreSQL; choose one of them.

You also need a message queue service to connect the components. You can use RabbitMQ, Beanstalk or Redis as the message queue.

pip install --allow-all-external pyspider[all]

Even if you have already installed pyspider with pip, you still need to install pyspider[all] to pull in the requirements for MySQL/MongoDB/RabbitMQ.
