Python 3 Web Scraping from Beginner to Master | Environment Setup

Posted COCOgsta


Course video source: Cui Qingcai (崔庆才), "Python3爬虫入门到精通"

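Installing Python

Anaconda

Mirror in China: Index of /anaconda/archive/ | Tsinghua Open Source Mirror (清华大学开源软件镜像站)

Run conda list to see every installed package; Anaconda ships with most of what is needed, so you rarely have to install anything else.

When something is missing, either pip or conda can install it.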
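
The same inventory that conda list prints can also be read from inside Python. The following is a minimal sketch, assuming Python 3.8 or later (where importlib.metadata is part of the standard library); the function name installed_packages is made up for this example.

```python
from importlib import metadata


def installed_packages():
    """Return (name, version) pairs for the distributions visible to this interpreter."""
    found = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name:  # skip entries whose metadata is missing or broken
            found.add((name, meta.get("Version")))
    # Sort case-insensitively by name, similar to how pip list orders its output.
    return sorted(found, key=lambda item: item[0].lower())


if __name__ == "__main__":
    for name, version in installed_packages():
        print(f"{name}=={version}")
```

Running it with the Anaconda interpreter and with a plain python.org interpreter shows how differently the two environments are stocked.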

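Official installer

Download the executable installer (64-bit). During installation, add Python to the PATH environment variable (the install path can be customized).

IDE

PyCharm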

Ubuntu

sudo apt-get install python3-dev build-essential libssl-dev libffi-dev libxml2 libxml2-dev libxslt1-dev zlib1g-dev
sudo apt-get install python3
sudo apt-get install python3-pip

Run pip3 to confirm that pip is installed; it prints its usage help.

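macOS

Homebrew (the Ruby installer below has since been replaced by a Bash script; brew.sh lists the current command)
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
brew install python3

Run python3 to enter the interpreter.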

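Installing MongoDB

Windows

On Windows, download the build for Windows Server 2008.

Under Server\3.4, create a data folder, and inside it create a db folder.

In the bin directory, Shift+right-click, choose "Open command window here", and run mongod --dbpath C:\MongoDB\Server\3.4\data\db to start MongoDB.

Open localhost:27017 in a browser; MongoDB answers with a short message saying you are trying to access MongoDB over HTTP on the native driver port, which confirms the server is running.

In the bin directory, open a second command window the same way and run mongo to enter the interactive mongo client. Typing db returns test, the current database.

db.test.insert({'a':'b'}) inserts one document.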

Under the data folder, create a logs folder, and inside it create an empty file named mongo.log.

Run cmd.exe as administrator, cd C:\MongoDB\Server\3.4\bin, then run mongod --bind_ip 0.0.0.0 --logpath C:\MongoDB\Server\3.4\data\logs\mongo.log --logappend --dbpath C:\MongoDB\Server\3.4\data\db --port 27017 --serviceName "MongoDB" --serviceDisplayName "MongoDB" --install to register MongoDB as a Windows service.
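
The service command is long enough that a mistyped path is easy to miss. As a rough sketch, the argument list can be assembled from a single installation root; the helper name mongod_install_args and its defaults are assumptions for illustration, while the flags are exactly the ones used in the command above.

```python
from pathlib import PureWindowsPath


def mongod_install_args(root, port=27017, service="MongoDB"):
    """Build the mongod argument list that registers a Windows service."""
    root = PureWindowsPath(root)  # handles backslash paths on any OS
    return [
        str(root / "bin" / "mongod.exe"),
        "--bind_ip", "0.0.0.0",                     # listen on all interfaces
        "--logpath", str(root / "data" / "logs" / "mongo.log"),
        "--logappend",                              # append instead of overwriting
        "--dbpath", str(root / "data" / "db"),
        "--port", str(port),
        "--serviceName", service,
        "--serviceDisplayName", service,
        "--install",
    ]


if __name__ == "__main__":
    print(" ".join(mongod_install_args(r"C:\MongoDB\Server\3.4")))
```

From an administrator prompt, the list could be passed to subprocess.run(args, check=True) instead of being typed by hand.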

Open the Windows Services panel; the MongoDB service is now listed there. Right-click it and choose Start.
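
To confirm that the service is actually accepting connections, a plain TCP check is enough. This is a small standard-library sketch; the function name is_port_open is made up here, and 27017 is the port used throughout this section. The same check works for Redis (6379) and MySQL (3306).

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or host not resolvable
        return False


if __name__ == "__main__":
    print("MongoDB reachable:", is_port_open("localhost", 27017))
```

A True result only shows that something is listening on the port; it does not prove that the listener is MongoDB.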

Robomongo is a GUI client for browsing MongoDB data.

Download it from https://robomongo.org/download

Ubuntu

sudo apt-get install mongodb

mongod (the db folder is created automatically)

mongo enters the interactive shell

show dbs

use local

db.test.insert({'a':'b'}) inserts a document

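macOS

Make sure Homebrew is installed.

brew install mongodb

On my own machine (Mac OS X 10.11) the brew download failed, so I downloaded the Mac build from the MongoDB website instead.

Create a folder named java in the home directory, extract the download, and copy it into that folder.

Run open -e .bash_profile and add the MongoDB entries to .bash_profile.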

guoliangs-MacBook-Pro-15-inch:~ guoliang$ cat .bash_profile
# mongodb
MONGODB_HOME=/Users/guoliang/java/mongodb-osx-x86_64-3.4.19
# maven
export M2_HOME=/usr/local/apache-maven-3.2.2
export PATH=$PATH:$M2_HOME/bin:$MONGODB_HOME/bin
# java
# export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_162.jdk/Contents/Home
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
# tomcat
export PATH=$PATH:/Library/tomcat/bin
# mysql
export PATH=${PATH}:/usr/local/mysql/bin
# Scala
SCALA_HOME="/Library/scala-2.12.5/"
export PATH=$PATH:$SCALA_HOME/bin
# added by Anaconda3 5.1.0 installer
export PATH="/anaconda3/bin:$PATH"
# added by Anaconda3 4.4.0 installer
export PATH="/Users/guoliang/anaconda/bin:$PATH"
# added by Anaconda3 4.2.0 installer
export PATH="/Users/guoliang/anaconda/bin:$PATH"
# added by Anaconda3 4.2.0 installer
export PATH="/Users/guoliang/anaconda/bin:$PATH"
# added by Anaconda3 4.4.0 installer
export PATH="/Users/guoliang/anaconda/bin:$PATH"
guoliangs-MacBook-Pro-15-inch:~ guoliang$ 

source .bash_profile applies the configuration.

mongod --version shows the version.

In a terminal, change to the /Users/guoliang/java/mongodb-osx-x86_64-3.4.19 directory.

mkdir data

mkdir log

mongod --dbpath data --logpath log/mongod.log --logappend --fork

mongo enters the interactive shell, same as on Linux.

Installing Redis

Windows

Download Redis-x64-3.2.100.msi from https://github.com/MSOpenTech/redis/releases

Download redis-desktop-manager-0.8.8.384.exe from https://github.com/uglide/RedisDesktopManager/releases. Open Redis Desktop Manager, click "Connect to Redis Server", and set Host to localhost.

Ubuntu

sudo apt-get install redis-server -y

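redis-cli enters the Redis command-line client

set 'a' 'b'

get 'a'

sudo vi /etc/redis/redis.conf

Comment out bind 127.0.0.1 so that Redis accepts remote connections.

Uncomment requirepass foobared to require a password for connections; the password in that sample line is foobared.

sudo service redis restart

redis-cli

get 'a' is now rejected because the client has not authenticated.

redis-cli -a foobared

get 'a' returns the stored value as expected.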
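
The two redis.conf edits described above can also be written as a small text transformation, which makes it explicit which lines change. This is only a sketch: the function name enable_remote_access is hypothetical, it assumes the stock lines bind 127.0.0.1 and # requirepass foobared appear in the file, and it works on a string rather than touching /etc/redis/redis.conf.

```python
import re


def enable_remote_access(conf_text):
    """Comment out `bind 127.0.0.1` and uncomment `requirepass foobared`."""
    out = []
    for line in conf_text.splitlines():
        if re.match(r"^\s*bind\s+127\.0\.0\.1\b", line):
            line = "# " + line.lstrip()       # disable the loopback-only bind
        elif re.match(r"^\s*#\s*requirepass\s+foobared\s*$", line):
            line = "requirepass foobared"     # activate the sample password line
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "bind 127.0.0.1\n# requirepass foobared\nport 6379\n"
    print(enable_remote_access(sample), end="")
```

On a machine that really is reachable from outside, replace foobared with a password of your own before restarting the service.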

macOS

brew install redis

redis-cli

set 'name' 'Mike'

The configuration file is /usr/local/etc/redis.conf; edit it the same way as on Linux.

brew services list

brew services restart redis

redis-cli

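Installing MySQL

Windows

Search Baidu for mysql; the Baidu Software Center offers mysql-5.7.17.msi for download.

Search Baidu for mysql-front and download it. Host is localhost, and the password is the one set during the MySQL installation (123456 here).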

Ubuntu

sudo apt-get install mysql-server mysql-client

When prompted, set the root password to 123456.

mysql -uroot -p

show databases;

use mysql;

select * from db;

vi /etc/mysql/mysql.conf.d/mysqld.cnf

Comment out the bind-address line so that MySQL accepts remote connections.

sudo service mysql restart

macOS

brew install mysql

mysql -uroot -p

The password is root.

show databases;

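Running Multiple Python Versions Side by Side

About environment variables

Windows

where python lists the path of every python executable found on PATH.

By default, python and pip resolve to whichever copy comes first in the PATH environment variable.

Rename python.exe in the python36 directory to python3.exe, rename the python.exe under anaconda to python-conda.exe, and rename python.exe in the python27 directory to python2.exe.

Rename the pip executables the same way. Note that pip-conda.exe -V does not run as-is; you need to copy pip-script.py to pip-conda-script.py.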
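
The "first match on PATH wins" rule can be inspected directly. The sketch below walks PATH in order and lists every candidate, so the first line of output is what a bare python command would launch; the function name candidates_on_path is made up for this example, and it deliberately ignores Windows details such as PATHEXT.

```python
import os


def candidates_on_path(names, path=None):
    """List every file on PATH matching one of names, in lookup order."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        for name in names:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate):
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    for hit in candidates_on_path(["python", "python.exe"]):
        print(hit)
```

After the renames, calling it with ["python3.exe", "python2.exe", "python-conda.exe"] should show exactly one hit per name.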

Ubuntu and macOS

echo $PATH

whereis python2, whereis python3; on macOS use which instead

ln -s /usr/bin/python3.5 /usr/bin/python3

ln -s /usr/bin/python2.7 /usr/bin/python2

PyCharm settings

Choose the interpreter yourself.
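
With several interpreters installed, it helps to confirm which one a script is really running under, whether it is launched from PyCharm or from a terminal. A minimal standard-library sketch (the function name describe_interpreter is an assumption for illustration):

```python
import platform
import sys


def describe_interpreter():
    """Report which Python executable and version is running this code."""
    return {
        "executable": sys.executable,                       # full path to the interpreter
        "version": platform.python_version(),               # e.g. "3.6.x"
        "implementation": platform.python_implementation(),
        "is_python3": sys.version_info.major == 3,
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```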

Installing Common Scraping Libraries

Windows

urllib, re

Both are built in; no installation is needed.
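
As a quick check that both built-in modules are available, the sketch below pulls links out of an HTML string with re and resolves them with urllib.parse. It needs no network access; the function name extract_links and the sample HTML are made up for this example, and a regular expression is only a rough stand-in for the real parsers installed later in this section.

```python
import re
from urllib.parse import urljoin, urlparse

# Capture the value of every href="..." or href='...' attribute.
HREF_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)


def extract_links(html, base_url):
    """Return absolute http(s) links found in href attributes of html."""
    links = []
    for href in HREF_RE.findall(html):
        absolute = urljoin(base_url, href)  # resolve relative links
        if urlparse(absolute).scheme in ("http", "https"):
            links.append(absolute)
    return links


if __name__ == "__main__":
    page = '<a href="/news">News</a> <a href="mailto:x@example.com">Mail</a>'
    print(extract_links(page, "http://www.baidu.com/"))
```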

requests

pip install requests

selenium

pip install selenium

chromedriver

Download chromedriver_win32.zip from https://chromedriver.storage.googleapis.com/index.html?path=2.22/

Put the extracted chromedriver.exe into the Python installation directory, for example c:/python36/Scripts, so that it is on PATH.

Chrome on the Windows machine must be version 53.0.

In the Python interpreter:
from selenium import webdriver
driver = webdriver.Chrome()
driver.get('http://www.baidu.com')
driver.page_source
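
WebDriver's get expects a fully qualified URL, scheme included, so a bare host name such as www.baidu.com is not accepted. A tiny guard like the following sketch can normalize addresses before they reach the driver; the helper name ensure_scheme is made up for this example.

```python
def ensure_scheme(url, default="http"):
    """Prefix url with default:// when it does not already carry a scheme."""
    if "://" in url:
        return url
    return f"{default}://{url}"


if __name__ == "__main__":
    print(ensure_scheme("www.baidu.com"))
    print(ensure_scheme("https://www.baidu.com"))
```

It would be used as driver.get(ensure_scheme('www.baidu.com')).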

phantomjs

Download phantomjs-2.1.1-windows.zip from phantomjs.org/download.html

Extract it and add its bin directory to the PATH environment variable.

In the Python interpreter:
from selenium import webdriver
driver = webdriver.PhantomJS()
driver.get('http://www.baidu.com')
driver.page_source

lxml

pip install lxml, or download the wheel from lxml · PyPI and install it offline with pip install x:/xx/lxml-3.7.3-cp36-cp36m-win_amd64.whl. Offline installation requires pip install wheel first.
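
When installing from a downloaded wheel, the file name itself says which interpreter it was built for: the wheel naming convention is name-version-pythontag-abitag-platformtag.whl, with an optional build tag after the version. The sketch below splits the name used above and compares its Python tag with the running interpreter; both function names are assumptions for illustration.

```python
import sys


def parse_wheel_filename(filename):
    """Split a wheel file name into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) not in (5, 6):  # 6 when an optional build tag is present
        raise ValueError("unexpected wheel name: " + filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


def matches_running_cpython(python_tag):
    """True if python_tag names the CPython version running this code."""
    current = f"cp{sys.version_info.major}{sys.version_info.minor}"
    return current in python_tag.split(".")


if __name__ == "__main__":
    info = parse_wheel_filename("lxml-3.7.3-cp36-cp36m-win_amd64.whl")
    print(info, matches_running_cpython(info["python_tag"]))
```

A cp36 wheel only installs on CPython 3.6, and win_amd64 only on 64-bit Windows, so pick the file whose tags match your interpreter.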

beautifulsoup

pip install beautifulsoup4

In the Python interpreter:
from bs4 import BeautifulSoup
soup = BeautifulSoup('<html></html>', 'lxml')

pyquery

pip install pyquery

In the Python interpreter:
from pyquery import PyQuery as pq
doc = pq('<html>Hello</html>')
result = doc('html').text()

pymysql

pip install pymysql

In the Python interpreter:
import pymysql
conn = pymysql.connect(host='localhost', user='root', password='123456', port=3306, db='mysql')
cursor = conn.cursor()
cursor.execute('select * from db')
cursor.fetchone()
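
pymysql follows the Python DB-API, so the connect / cursor / execute / fetchone sequence above is the same pattern used by other database drivers. The sketch below deliberately swaps in the standard library's sqlite3 with an in-memory database, so it runs without a MySQL server; the table and column names are made up for this example. One difference to keep in mind: sqlite3 uses ? as the parameter placeholder, while pymysql uses %s.

```python
import sqlite3


def demo_dbapi():
    """Run the connect / cursor / execute / fetchone cycle on an in-memory database."""
    conn = sqlite3.connect(":memory:")  # stand-in for pymysql.connect(...)
    try:
        cursor = conn.cursor()
        cursor.execute("CREATE TABLE pages (url TEXT, title TEXT)")
        # Pass values as parameters instead of formatting them into the SQL string.
        cursor.execute(
            "INSERT INTO pages (url, title) VALUES (?, ?)",
            ("http://www.baidu.com", "Baidu"),
        )
        conn.commit()
        cursor.execute("SELECT url, title FROM pages WHERE title = ?", ("Baidu",))
        return cursor.fetchone()
    finally:
        conn.close()


if __name__ == "__main__":
    print(demo_dbapi())
```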

pymongo

pip install pymongo

In the Python interpreter:
import pymongo
client = pymongo.MongoClient('localhost')
db = client['newtestdb']
db['table'].insert_one({'name': 'Bob'})
db['table'].find_one({'name': 'Bob'})

redis

pip install redis

In the Python interpreter:
import redis
r = redis.Redis('localhost', 6379)
r.set('name', 'Bob')
r.get('name')

flask

pip install flask

django

pip install django

jupyter

pip install jupyter

Linux and macOS

pip install requests selenium lxml beautifulsoup4 pyquery pymysql pymongo redis flask django jupyter
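
Once the installs finish, a quick way to see whether anything was missed is to ask Python which modules it can locate. This is a minimal sketch: the MODULES list uses import names (beautifulsoup4 is imported as bs4), jupyter is left out because it is launched as a command rather than imported, and the function name missing_modules is made up for this example.

```python
from importlib.util import find_spec

# Import names of the libraries installed in this section.
MODULES = [
    "requests", "selenium", "lxml", "bs4", "pyquery",
    "pymysql", "pymongo", "redis", "flask", "django",
]


def missing_modules(names):
    """Return the names that cannot be found by the import system."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(MODULES)
    print("All libraries found" if not missing else "Missing: " + ", ".join(missing))
```

find_spec only locates a module without importing it, so the check is fast and has no side effects.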
