Getting Links from a Del.icio.us Search

Posted 2021-02-26


Find great websites by scraping the links returned by a delicious.com search.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# (C) 2009 HalOtis Marketing
# written by Matt Warren
# http://halotis.com/
 
"""
Scraper for Del.icio.us SERP.
 
This pulls the results that match a query on http://del.icio.us.
"""
 
import urllib2
import re
 
from BeautifulSoup import BeautifulSoup
 
def get_delicious_results(query, page_limit=10):
 
    page = 1
    links = []
 
    while page < page_limit:
        url = 'http://delicious.com/search?p=' + '%20'.join(query.split()) + '&context=all&lc=1&page=' + str(page)
        req = urllib2.Request(url)
        html = urllib2.urlopen(req).read()
        soup = BeautifulSoup(html)
 
        # Look for a "next page" link to decide whether to keep paging.
        next_link = soup.find('a', attrs={'class': re.compile('.*next$', re.I)})
 
        # links is a list of (url, title) tuples
        tagged = soup.findAll('a', attrs={'class': re.compile('.*taggedlink.*', re.I)})
        links += [(link['href'], ''.join(link.findAll(text=True))) for link in tagged]
 
        if next_link:
            page = page + 1
        else:
            break
 
    return links
 
if __name__=='__main__':
    links = get_delicious_results('halotis marketing')
    print links
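
The script above targets Python 2 (urllib2 and BeautifulSoup 3). As a rough sketch of the same two steps, building the search URL and pulling (url, title) pairs out of the returned page, here is a Python 3 version that uses only the standard library. The names build_search_url, TaggedLinkParser and parse_results are hypothetical helpers introduced for illustration; the URL parameters and the "taggedlink" / "next" class names are taken from the script above and are assumed, not verified against the current site.

```python
from html.parser import HTMLParser
from urllib.parse import quote, urlencode


def build_search_url(query, page):
    """Build the search URL from the original script, with the query URL-encoded.

    Hypothetical helper: the parameter names (p, context, lc, page) are copied
    from the original script and assumed to still be what the site expects.
    """
    params = {"p": query, "context": "all", "lc": 1, "page": page}
    # quote_via=quote encodes spaces as %20, matching the original script.
    return "http://delicious.com/search?" + urlencode(params, quote_via=quote)


class TaggedLinkParser(HTMLParser):
    """Collect (href, title) tuples from <a> tags whose class contains 'taggedlink'.

    Also records whether an <a> whose class ends in 'next' was seen, which the
    original script uses as the signal that another results page exists.
    """

    def __init__(self):
        super().__init__()
        self.links = []
        self.has_next = False
        self._href = None  # href of the result anchor currently being read
        self._text = []    # text fragments collected inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        css = (attrs.get("class") or "").lower()
        if "taggedlink" in css and attrs.get("href"):
            self._href = attrs["href"]
            self._text = []
        elif css.endswith("next"):
            self.has_next = True

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text)))
            self._href = None


def parse_results(html):
    """Return (links, has_next) for one page of search-result HTML."""
    parser = TaggedLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links, parser.has_next
```

To fetch a live page, the HTML string could come from urllib.request.urlopen(build_search_url(query, page)).read().decode('utf-8', 'replace'). Keeping the parsing separate from the download means the extraction logic can be checked against a saved HTML snippet without touching the network.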
