Python: CSS selector and XPath usage examples
# -*- coding: utf-8 -*-
# @Time : 2018/6/17 7:16 PM
# @Author : 李新星
# @Site :
# @File : study_CSS_xpath_selector.py
# @Software: PyCharm
from scrapy.selector import Selector
s0 = """
<div class="entry-header">
<h1>2016 腾讯软件开发面试题(部分)</h1>
</div>
"""
response = Selector(text=s0)
print(response.css(".entry-header h1").extract())
print(response.css(".entry-header h1::text").extract())
print(response.xpath("//div[@class='entry-header']/h1/text()").extract())
# ['<h1>2016 腾讯软件开发面试题(部分)</h1>']
# ['2016 腾讯软件开发面试题(部分)']
# ['2016 腾讯软件开发面试题(部分)']
s1 = """
<p class="entry-meta-hide-on-mobile">
2017/02/18 ·
<a href="http://blog.jobbole.com/category/career/" rel="category tag">职场</a>
·
<a href="#article-comment"> 9 评论 </a>
·
<a href="http://blog.jobbole.com/tag/%e9%9d%a2%e8%af%95/">面试</a>
</p>
"""
response = Selector(text=s1)
print(response.css(".entry-meta-hide-on-mobile::text").extract())
print(response.xpath("//p[@class='entry-meta-hide-on-mobile']/text()").extract())
# ['\n 2017/02/18 ·\n ', '\n ·\n ', '\n ·\n ', '\n']
# ['\n 2017/02/18 ·\n ', '\n ·\n ', '\n ·\n ', '\n']
s2 = """
<span data-post-id="110287" class=" btn-bluet-bigger href-style vote-post-up register-user-only ">
<i class="fa fa-thumbs-o-up"></i>
<h10 id="110287votetotal">2</h10>
赞
</span>
"""
response = Selector(text=s2)
print(response.css(".vote-post-up h10::text").extract())
print(response.xpath("//span[contains(@class,'vote-post-up')]/h10/text()").extract())
# ['2']
# ['2']
s3 = """
<span data-book-type="1" data-site-id="2" data-item-id="110287" data-item-type="1" class=" btn-bluet-bigger href-style bookmark-btn register-user-only ">
<i class="fa fa-bookmark-o "></i>
28 收藏
</span>
"""
response = Selector(text=s3)
print(response.css(".bookmark-btn::text").extract())
print(response.xpath("//span[contains(@class,'bookmark-btn')]/text()").extract())
# ['\n ', ' \n 28 收藏\n']
# ['\n ', ' \n 28 收藏\n']
s4 = """
<a href="#article-comment">
<span class="btn-bluet-bigger href-style hide-on-480">
<i class="fa fa-comments-o"></i>
9 评论
</span>
</a>
"""
response = Selector(text=s4)
print(response.css("a[href='#article-comment'] span::text").extract())
print(response.xpath("//a[@href='#article-comment']/span/text()").extract())
# ['\n ', ' \n 9 评论\n ']
# ['\n ', ' \n 9 评论\n ']
s5 = """
<p class="entry-meta-hide-on-mobile">
<a href="#article-comment"> 9 评论 </a>
<h10 id="110287votetotal">2</h10>
<a href="http://blog.jobbole.com/category/career/" rel="category tag">职场</a>
<a href="#article-comment"> 9 评论 </a>
·
<a href="http://blog.jobbole.com/tag/%e9%9d%a2%e8%af%95/">面试</a>
</p>
<a href="#article-comment"> 9 评论 </a>
<a href="#article-comment"> 9 评论 </a>
<h10 id="110287votetotal">2</h10>
<a href="#article-comment"> 9 评论 </a>
"""
response = Selector(text=s5)
print(response.css("p>a::text").extract())
print(response.css("p+a::text").extract())
# [' 9 评论 ', '职场', ' 9 评论 ', '面试']
# [' 9 评论 ']
s6 = """
<p class="entry-meta-hide-on-mobile" id="1189381312">
<a href="#article-comment"> 9 评论 </a>
<h10 id="110287votetotal">2</h10>
<a href="http://blog.jobbole.com/category/career/" rel="category tag">职场</a>
</p>
<p class="entry-meta-hide-on-mobile" id="13123141241">
<a href="#article-comment"> 9 评论 </a>
<h10 id="110287votetotal">2</h10>
<a href="http://blog.jobbole.com/category/career/" rel="category tag">职场</a>
</p>
"""
response = Selector(text=s6)
print(response.css("p#1189381312>a::text").extract())
print(response.css("p#1189381312 > a::text").extract())
# [' 9 评论 ', '职场']
# [' 9 评论 ', '职场']
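# Get the value of an attribute
# Assumption: the original listing never defines s7; this tag is reconstructed from the selector and output below.
s7 = '<meta name="csrf-token" content="OcAFkXBwRt03gmZi85JJdu9056tSjxq8H4JpYfL0uZi9svi4JxzUFZbtagbDTGoyRMdyf8H04KoMWGfRW6dGjg==">'
response = Selector(text=s7)
print(response.css("meta[name='csrf-token']::attr(content)").extract_first())
# 'OcAFkXBwRt03gmZi85JJdu9056tSjxq8H4JpYfL0uZi9svi4JxzUFZbtagbDTGoyRMdyf8H04KoMWGfRW6dGjg=='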
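
Several of the `::text` and `text()` results printed in the listing are lists of raw text nodes padded with newlines and spaces, for example `['\n ', ' \n 28 收藏\n']` for the bookmark button. A minimal sketch of one way to tidy such lists with the standard library only follows. The helper names `clean_text_nodes` and `first_int` are hypothetical, chosen for illustration, and are not part of Scrapy.

```python
import re


def clean_text_nodes(nodes):
    """Strip each text node and drop the ones that are empty after stripping."""
    stripped = (node.strip() for node in nodes)
    return [text for text in stripped if text]


def first_int(nodes):
    """Return the first run of digits found in the nodes as an int, or None."""
    for node in nodes:
        match = re.search(r"\d+", node)
        if match:
            return int(match.group())
    return None


# Text nodes copied from the printed output of the bookmark example (s3).
bookmark_nodes = ['\n ', ' \n 28 收藏\n']
print(clean_text_nodes(bookmark_nodes))  # ['28 收藏']
print(first_int(bookmark_nodes))         # 28
```

When working on a live selector rather than an already extracted list, Scrapy's own `.re_first()` method on the selector result may be a more direct route to the same number.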