Writing data to MongoDB in Scrapy

Posted ptwg



1. Enable the pipeline in settings.py

ITEM_PIPELINES = {
    'tianmao.pipelines.TianmaoPipeline': 300,
}

2. Add the MongoDB configuration to settings.py

# MongoDB connection settings
HOST = "127.0.0.1"  # server address
PORT = 27017        # default MongoDB port
USER = "username"
PWD = "password"
DB = "database name"
TABLE = "collection name"

3. Import pymongo in pipelines.py and write the items to the database

from pymongo import MongoClient
class TianmaoPipeline(object):
    def __init__(self, host, port, user, pwd, db, table):
        self.host = host
        self.port = port
        self.user = user
        self.pwd = pwd
        self.db = db
        self.table = table

    @classmethod
    def from_crawler(cls, crawler):
        # Settings keys must be passed as strings
        host = crawler.settings.get("HOST")
        port = crawler.settings.get("PORT")
        user = crawler.settings.get("USER")
        pwd = crawler.settings.get("PWD")
        db = crawler.settings.get("DB")
        table = crawler.settings.get("TABLE")
        return cls(host, port, user, pwd, db, table)

    def open_spider(self, spider):
        # Build the connection URI from the values loaded out of settings.py
        self.client = MongoClient("mongodb://%s:%s@%s:%s" % (self.user, self.pwd, self.host, self.port))

    def close_spider(self, spider):
        self.client.close()

    def process_item(self, item, spider):
        # save() is deprecated in recent pymongo versions; use insert_one() instead
        self.client[self.db][self.table].insert_one(dict(item))
        return item
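One caveat with the `%`-formatted URI above: if the username or password contains characters such as `@` or `:`, the resulting URI is malformed. Pymongo recommends percent-escaping credentials with `urllib.parse.quote_plus`. A minimal sketch (the helper name `build_mongo_uri` is illustrative, not part of the pipeline above):

```python
from urllib.parse import quote_plus


def build_mongo_uri(user, pwd, host, port):
    # Percent-escape the credentials so characters like '@' or ':'
    # in the username/password cannot corrupt the URI structure.
    return "mongodb://%s:%s@%s:%s" % (quote_plus(user), quote_plus(pwd), host, port)


print(build_mongo_uri("username", "p@ss:word", "127.0.0.1", 27017))
# mongodb://username:p%40ss%3Aword@127.0.0.1:27017
```

The escaped string can be passed to `MongoClient()` exactly as in `open_spider` above.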

 
