python: from http://knarfeh.com/2016/03/11/leetcode-%E7%AC%94%E8%AE%B0%E8%AF%B4%E6%98%8E/
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import urllib2
import re
from bs4 import BeautifulSoup
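import sys

# Python 2 only: make UTF-8 the default codec so the unicode template
# below can be written to a file without an explicit encode step.
reload(sys)
sys.setdefaultencoding("utf-8")
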
leetcode_md = u"""title: "%s"
date: 2014-03-11 00:33:34
tags: [algorithms, leetcode, %s]
---
### Description
---
Description goes here
<!--more-->
### Analysis
---
Analysis goes here
### Solution 1 (C++)
---
### Solution 2 (Java)
---
### Solution 3 (Python)
---
### Related problems
---
%s
### [Problem source](%s)
"""


def get_tag_content(tag):
    u"""
    Return the children of a BeautifulSoup tag (tag.contents)
    joined into a single unicode string.
    """
    return "".join([unicode(x) for x in tag.contents])


def get_attr(dom, attr, defaultValue=""):
    u"""
    Return the given attribute of a BeautifulSoup tag.
    If the tag is None or lacks the attribute, return the default value.
    """
    if dom is None:
        return defaultValue
    return dom.get(attr, defaultValue)


leetcode_problems = 'https://leetcode.com/problemset/algorithms/'
html = urllib2.urlopen(leetcode_problems)
content = html.read()
soup = BeautifulSoup(content, 'lxml')
problem_list = soup.select('table.table tbody tr')
# print problem_list
for item in problem_list[:5]:
    # Parse one table row: problem id, name and link.
    soup = BeautifulSoup(str(item), 'lxml')
    problem_id = get_tag_content(soup.select('td')[1])
    problem_name = get_tag_content(soup.select('td a')[0]).replace(' ', '-')
    href = get_attr(soup.select('td a')[0], 'href')
    problem_href = 'https://leetcode.com' + href
    filename = 'leetcode-' + str(problem_id) + '-' + str(problem_name) + ".md"
    problem_name_md = 'leetcode-' + str(problem_id) + '-' + str(problem_name)

    # Fetch the problem page and collect its tag links and similar-problem links.
    html = urllib2.urlopen(problem_href)
    content = html.read()
    soup = BeautifulSoup(content, 'lxml')
    problem_tag_list = []
    similar_problem_list = []
    if len(soup.select('span.hidebutton')) > 0:
        problem_tag_list = soup.select('span.hidebutton')[0].select('a')
    if len(soup.select('span.hidebutton')) > 1:
        similar_problem_list = soup.select('span.hidebutton')[1].select('a')

    tags = []
    for tag_item in problem_tag_list:
        soup = BeautifulSoup(str(tag_item), 'lxml')
        tag = get_tag_content(soup.select('a')[0]).strip().replace(' ', '-')
        tags.append(tag)

    similar_problem = {}
    for similar_item in similar_problem_list:
        soup = BeautifulSoup(str(similar_item), 'lxml')
        similar_problem_name = get_tag_content(soup.select('a')[0]).strip()
        href = get_attr(soup.select('a')[0], 'href')
        similar_problem_href = 'https://leetcode.com' + href
        similar_problem[similar_problem_name] = similar_problem_href

    # Fill in the Markdown template and write one file per problem.
    title = problem_name_md
    md_tags = ', '.join(tags)
    similar_problem_md = ''
    for key, value in similar_problem.items():
        # The two trailing spaces force a line break in Markdown.
        similar_problem_md += ('[' + key + ']' + '(' + value + ')  \n')
    now_leetcode_md = leetcode_md % (title, md_tags, similar_problem_md, problem_href)
    print(u"Finished " + filename)
    with open(filename, 'w') as f:
        f.write(now_leetcode_md)
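
The script above is written for Python 2 (`urllib2`, `unicode`, `reload(sys)`). As a rough illustration only, the naming and Markdown-rendering steps could be sketched in Python 3 as below, with the network and HTML parsing left out. The function names, the shortened template, and the sample values in the usage note are all assumptions made for this sketch and do not come from the original script.

```python
# Python 3 sketch of the file-naming and template-filling steps.
# All names here (POST_TEMPLATE, slugify, related_links, render_post,
# write_post) are hypothetical helpers, not part of the original script.
import io
import os

# Shortened stand-in for the original leetcode_md template.
POST_TEMPLATE = (
    'title: "{title}"\n'
    "tags: [algorithms, leetcode, {tags}]\n"
    "---\n"
    "### Related problems\n"
    "---\n"
    "{related}"
    "### [Problem source]({source})\n"
)


def slugify(name):
    """Trim surrounding whitespace and replace inner spaces with hyphens."""
    return name.strip().replace(" ", "-")


def related_links(problems):
    """Render (name, url) pairs as Markdown links, one per line.

    The two spaces before each newline produce a Markdown hard line break.
    """
    return "".join("[{}]({})  \n".format(name, url) for name, url in problems)


def render_post(problem_id, name, tags, problems, source):
    """Return (filename, markdown_text) for one problem."""
    title = "leetcode-{}-{}".format(problem_id, slugify(name))
    text = POST_TEMPLATE.format(
        title=title,
        tags=", ".join(slugify(t) for t in tags),
        related=related_links(problems),
        source=source,
    )
    return title + ".md", text


def write_post(directory, filename, text):
    """Write the post as UTF-8 explicitly rather than changing the default codec."""
    path = os.path.join(directory, filename)
    with io.open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return path
```

With sample values, `render_post(1, "Two Sum", ["Array", "Hash Table"], [("3Sum", "https://leetcode.com/problems/3sum/")], "https://leetcode.com/problems/two-sum/")` returns the filename `leetcode-1-Two-Sum.md` together with the rendered text. Passing an explicit `encoding` to `io.open` is one way to avoid the `sys.setdefaultencoding` workaround, which does not exist in Python 3.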