爬虫学习之-操作mysql

Posted brady-wang

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了爬虫学习之-操作mysql相关的知识,希望对你有一定的参考价值。

在操作数据库的时候,python2中一般使用mysqldb,但在python3中已经不在支持mysqldb了,我们可以用pymysql和mysql.connector。本文的所有操作都是在python3的pymysql下完成的。

一、基本操作

(1) 查询

import pymysql

conn = pymysql.connect(host=\'127.0.0.1\', user=\'root\', passwd=\'root\', db=\'test\')
cur = conn.cursor()

# 查询
sql = "select * from info"
reCount = cur.execute(sql)  # 返回受影响的行数
print(reCount)
data = cur.fetchall()  # 返回数据,返回的是tuple类型
print(data)
"""
((1, \'mj\', \'tokyo\'), (2, \'alex\', \'newyork\'), (3, \'tommy\', \'beijing\'))
"""

cur.close()
conn.close()

(2) 修改

import pymysql

conn = pymysql.connect(host=\'127.0.0.1\', user=\'root\', passwd=\'root\', db=\'test\')
cur = conn.cursor()

# 插入数据
sql2 = "insert into info(NAME,address ) VALUES(%s,%s)"  # sql语句,%s是占位符(%s是唯一的,不论什么数据类型都使用%s)用来防止sql注入
params = (\'eric\', \'wuhan\')  # 参数
reCount = cur.execute(sql2, params)

# 批量插入
li = [(\'a1\', \'b1\'), (\'a2\', \'b2\')]
sql3 = \'insert into info(NAME ,address) VALUES (%s,%s)\'
reCount = cur.executemany(sql3, li)

conn.commit()  # 提交,执行多条命令只需要commit一次就行了

cur.close()
conn.close()

(3) 返回dict类型的数据

import pymysql

conn = pymysql.connect(host=\'127.0.0.1\', user=\'root\', passwd=\'root\', db=\'test\')
# cur = conn.cursor()
cur = conn.cursor(cursor=pymysql.cursors.DictCursor)    #创建cursor的时候,指定1其返回的cursor类型为dict

# 查询
sql = "select * from info"
reCount = cur.execute(sql)  # 返回受影响的行数
print(reCount)
data = cur.fetchall()  # 返回数据,返回的是tuple类型
print(data)

cur.close()
conn.close()

"""
[{\'address\': \'tokyo\', \'name\': \'mj\', \'id\': 1}, {\'address\': \'newyork\', \'name\': \'alex\', \'id\': 2}, {\'address\': \'beijing\', \'name\': \'tommy\', \'id\': 3}]
"""

(4)获取自增id

通过cur.lastrowid来获取自增id


# 插入数据
sql = "insert into info(NAME,address ) VALUES(%s,%s)" 
params = (\'eric\', \'/usr/bin/a.txt\')  
reCount = cur.execute(sql, params)
conn.commit()

new_id = cur.lastrowid  #获取自增id,提交完之后才能取到值
print(new_id)

二、cursor定位

使用fechone来逐条获取数据

data = cur.fetchone()
print(data)  

data = cur.fetchone()
print(data)  

data = cur.fetchone()
print(data)  

"""
(1, \'mj\', \'tokyo\')
(2, \'alex\', \'newyork\')
(3, \'tommy\', \'beijing\')
"""

(1) 绝对定位

cur.scroll(0,mode=\'absolute\')
data = cur.fetchone()
print(data) 

cur.scroll(0,mode=\'absolute\')

data = cur.fetchone()
print(data) 

data = cur.fetchone()
print(data)
"""
(1, \'mj\', \'tokyo\')
(1, \'mj\', \'tokyo\')
(2, \'alex\', \'newyork\')
"""

(2) 相对定位

cur.scroll(-1,mode=\'relative\')
data = cur.fetchone()
print(data) 

data = cur.fetchone()
print(data)  

cur.scroll(-1,mode=\'relative\')

data = cur.fetchone()
print(data)
"""
(1, \'mj\', \'tokyo\')
(2, \'alex\', \'newyork\')
(2, \'alex\', \'newyork\')
"""

三、解耦

这里简单实现一个用户登录的功能,以便对整个业务结构有一个整体的认识。

项目目录结构

index.py

#!/usr/bin/env python3
# -*- coding:utf-8 -*-

from model.admin import Admin


def main():
    usr = input("username:")
    pwd = input("password:")
    admin = Admin()
    result = admin.CheckValidate(usr, pwd)
    if not result:  # 一般会把简单的逻辑放在上面,复杂的逻辑放下面
        print("登录失败!")
    else:
        print("登陆成功!进入后台管理界面..")


if __name__ == "__main__":
    main()

admin.py

#!/usr/bin/env python3
# -*- coding:utf-8 -*-
from utility.sql_helper import MySqlHelper


class Admin():
    def __init__(self):
        self.__helper = MySqlHelper()

    def CheckValidate(self,username,password):
        sql = "select * from admin where username=%s and password=%s"
        params=(username,password)
        return self.__helper.getOne(sql,params)

sql_helper.py

#!/usr/bin/env python3
# -*- coding:utf-8 -*-
import pymysql
import conf


class MySqlHelper(object):
    def __init__(self):
        self.__conn_dict = conf.conn_dict  # 把数据库连接信心提取到conf中

    def getDict(self, sql, params):
        conn = pymysql.connect(**self.__conn_dict)
        cur = conn.cursor(cursor=pymysql.cursors.DictCursor)
        cur.execute(sql, params)
        data = cur.fetchall()
        cur.close()
        conn.close()
        return data

    def getOne(self, sql, params):
        conn = pymysql.connect(**self.__conn_dict)  # 加**后表示传入的是字典里的数据,否则报错
        cur = conn.cursor(cursor=pymysql.cursors.DictCursor)
        cur.execute(sql, params)
        data = cur.fetchone()
        cur.close()
        conn.close()
        return data

conf.py

#!/usr/bin/env python3
# -*- coding:utf-8 -*-

conn_dict = dict(host=\'127.0.0.1\', user=\'root\', passwd=\'root\', db=\'test\')
 

以上是关于爬虫学习之-操作mysql的主要内容,如果未能解决你的问题,请参考以下文章

爬虫学习之webmagic源码剖析

爬虫学习之第四章爬虫进阶之多线程爬虫

爬虫概念与编程学习之如何爬取视频网站页面(用HttpClient)

爬虫概念与编程学习之如何爬取网页源代码

python学习之爬虫一

爬虫学习之HTTP协议初步了解