Python CSS Selector and XPath Usage Examples


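Introduction: this article was compiled by the editors of 小常识网 (cha138.com). It walks through examples of using CSS selectors and XPath in Python with Scrapy's Selector, and we hope it serves as a useful reference.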

# -*- coding: utf-8 -*-
# @Time    : 2018/6/17 7:16 PM
# @Author  : 李新星
# @Site    : 
# @File    : study_CSS_xpath_selector.py
# @Software: PyCharm

from scrapy.selector import Selector


s0 = """
<div class="entry-header">
    <h1>2016 腾讯软件开发面试题(部分)</h1>
</div>
"""
response = Selector(text=s0)
print(response.css(".entry-header h1").extract())
print(response.css(".entry-header h1::text").extract())
print(response.xpath("//div[@class='entry-header']/h1/text()").extract())
# ['<h1>2016 腾讯软件开发面试题(部分)</h1>']
# ['2016 腾讯软件开发面试题(部分)']
# ['2016 腾讯软件开发面试题(部分)']


s1 =  """
<p class="entry-meta-hide-on-mobile">
    2017/02/18 ·
    <a href="http://blog.jobbole.com/category/career/" rel="category tag">职场</a>
    ·
    <a href="#article-comment"> 9 评论 </a>
    ·
    <a href="http://blog.jobbole.com/tag/%e9%9d%a2%e8%af%95/">面试</a>
</p>
"""

response = Selector(text=s1)
print(response.css(".entry-meta-hide-on-mobile::text").extract())
print(response.xpath("//p[@class='entry-meta-hide-on-mobile']/text()").extract())
# ['\n    2017/02/18 ·\n    ', '\n    ·\n    ', '\n    ·\n    ', '\n']
# ['\n    2017/02/18 ·\n    ', '\n    ·\n    ', '\n    ·\n    ', '\n']
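The text nodes above still carry newlines, indentation, and the `·` separator. A minimal sketch of cleaning them with plain Python follows. The list literal is copied from the output above; the helper name `clean_text_nodes` and the `%Y/%m/%d` date format are assumptions made for illustration, not part of the original script.

```python
from datetime import datetime

# Text nodes as returned by the ::text / text() queries on s1 above.
raw_nodes = ['\n    2017/02/18 ·\n    ', '\n    ·\n    ', '\n    ·\n    ', '\n']


def clean_text_nodes(nodes):
    """Strip whitespace and the '·' separator, dropping nodes that end up empty."""
    cleaned = (node.replace("·", "").strip() for node in nodes)
    return [text for text in cleaned if text]


texts = clean_text_nodes(raw_nodes)
print(texts)  # ['2017/02/18']

# Assumed date format: year/month/day, as it appears in the sample markup.
created = datetime.strptime(texts[0], "%Y/%m/%d").date()
print(created)  # 2017-02-18
```

In a real spider the same cleanup would normally run on the list returned by `extract()` before the value is stored.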


s2 = """
<span data-post-id="110287" class=" btn-bluet-bigger href-style vote-post-up   register-user-only ">
    <i class="fa  fa-thumbs-o-up"></i>
    <h10 id="110287votetotal">2</h10> 
    赞
</span>
"""
response = Selector(text=s2)
print(response.css(".vote-post-up h10::text").extract())
print(response.xpath("//span[contains(@class,'vote-post-up')]/h10/text()").extract())
# ['2']
# ['2']
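The two queries above are not strictly equivalent. The CSS class selector matches a whole class token, while XPath `contains(@class, ...)` is a plain substring test, so it also matches any class name that merely includes the given text. The sketch below imitates both behaviours in plain Python to show the difference; the function names are hypothetical. A common XPath idiom for a whole-token match is `contains(concat(' ', normalize-space(@class), ' '), ' vote-post-up ')`.

```python
# class attribute of the <span> in s2 above
class_attr = " btn-bluet-bigger href-style vote-post-up   register-user-only "


def contains_substring(attr, needle):
    """Mimics XPath contains(@class, needle): a plain substring test."""
    return needle in attr


def has_class_token(attr, name):
    """Mimics the CSS .name test: name must be one whole whitespace-separated token."""
    return name in attr.split()


print(contains_substring(class_attr, "vote-post-u"))  # True
print(has_class_token(class_attr, "vote-post-u"))     # False
print(has_class_token(class_attr, "vote-post-up"))    # True
print(contains_substring(class_attr, "post"))         # True: partial matches slip through
```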


s3 = """
<span data-book-type="1" data-site-id="2" data-item-id="110287" data-item-type="1" class=" btn-bluet-bigger href-style bookmark-btn  register-user-only ">
    <i class="fa fa-bookmark-o  "></i> 
    28 收藏
</span>
"""
response = Selector(text=s3)
print(response.css(".bookmark-btn::text").extract())
print(response.xpath("//span[contains(@class,'bookmark-btn')]/text()").extract())
# ['\n    ', ' \n    28 收藏\n']
# ['\n    ', ' \n    28 收藏\n']


s4 = """
<a href="#article-comment">
    <span class="btn-bluet-bigger href-style hide-on-480">
        <i class="fa fa-comments-o"></i> 
        9 评论
    </span>
</a>
"""
response = Selector(text=s4)
print(response.css("a[href='#article-comment'] span::text").extract())
print(response.xpath("//a[@href='#article-comment']/span/text()").extract())
# ['\n        ', ' \n        9 评论\n    ']
# ['\n        ', ' \n        9 评论\n    ']
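The bookmark and comment counts arrive embedded in whitespace and label text. One way to pull out the number, sketched with the standard `re` module, is shown below. The list literals are copied from the s3 and s4 outputs above; the helper name `first_int` and its default of 0 are assumptions.

```python
import re

# Text nodes copied from the s3 and s4 outputs above.
bookmark_nodes = ['\n    ', ' \n    28 收藏\n']
comment_nodes = ['\n        ', ' \n        9 评论\n    ']


def first_int(nodes, default=0):
    """Return the first run of digits found in the joined text nodes, or default."""
    match = re.search(r"\d+", "".join(nodes))
    return int(match.group()) if match else default


print(first_int(bookmark_nodes))        # 28
print(first_int(comment_nodes))         # 9
print(first_int(['\n    ', ' 收藏 ']))  # 0 (no digits, falls back to the default)
```

Falling back to a default keeps the spider from failing on posts that show the label without a count.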



s5 =  """
<p class="entry-meta-hide-on-mobile">
    <a href="#article-comment"> 9 评论 </a>
    <h10 id="110287votetotal">2</h10>
    <a href="http://blog.jobbole.com/category/career/" rel="category tag">职场</a>
    <a href="#article-comment"> 9 评论 </a>
    ·
    <a href="http://blog.jobbole.com/tag/%e9%9d%a2%e8%af%95/">面试</a>
</p>
<a href="#article-comment"> 9 评论 </a>
<a href="#article-comment"> 9 评论 </a>
<h10 id="110287votetotal">2</h10>
<a href="#article-comment"> 9 评论 </a>
"""
response = Selector(text=s5)
print(response.css("p>a::text").extract())
print(response.css("p+a::text").extract())
# [' 9 评论 ', '职场', ' 9 评论 ', '面试']
# [' 9 评论 ']
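The difference between the two results comes from the combinators: `p>a` selects every `<a>` that is a direct child of a `<p>`, while `p+a` selects only an `<a>` that immediately follows a `<p>` under the same parent. The sketch below reproduces those two rules by hand with the standard library's `xml.etree.ElementTree` on a simplified, well-formed fragment. The fragment and the function names are hypothetical, and this is an illustration of the semantics rather than of how Scrapy evaluates selectors.

```python
import xml.etree.ElementTree as ET

# A small well-formed fragment (hypothetical, simplified from s5) wrapped in a root element.
doc = ET.fromstring(
    "<root>"
    "<p><a>in-p-1</a><h10>2</h10><a>in-p-2</a></p>"
    "<a>after-p</a>"
    "<a>later</a>"
    "</root>"
)


def child_links(root):
    """p > a : every <a> that is a direct child of a <p>."""
    return [a.text for p in root.iter("p") for a in p.findall("a")]


def adjacent_links(root):
    """p + a : an <a> that is the element immediately following a <p> under the same parent."""
    found = []
    for parent in root.iter():
        children = list(parent)
        for prev, cur in zip(children, children[1:]):
            if prev.tag == "p" and cur.tag == "a":
                found.append(cur.text)
    return found


print(child_links(doc))     # ['in-p-1', 'in-p-2']
print(adjacent_links(doc))  # ['after-p']
```

Only the first `<a>` after the closing `</p>` qualifies for `p+a`; the later ones follow another `<a>` or an `<h10>`, not a `<p>`.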


s6 =  """
<p class="entry-meta-hide-on-mobile" id="1189381312">
    <a href="#article-comment"> 9 评论 </a>
    <h10 id="110287votetotal">2</h10>
    <a href="http://blog.jobbole.com/category/career/" rel="category tag">职场</a>
</p>
<p class="entry-meta-hide-on-mobile" id="13123141241">
    <a href="#article-comment"> 9 评论 </a>
    <h10 id="110287votetotal">2</h10>
    <a href="http://blog.jobbole.com/category/career/" rel="category tag">职场</a>
</p>

"""
response = Selector(text=s6)
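# Note: standard CSS identifiers may not start with a digit, so "#1189381312" is not
# strictly valid CSS even though the original reports the output below;
# "p[id='1189381312'] > a::text" is a safer spelling.
# Whitespace around ">" is optional, so both queries select the same nodes.
print(response.css("p#1189381312>a::text").extract())
print(response.css("p#1189381312 > a::text").extract())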
# [' 9 评论 ', '职场']
# [' 9 评论 ', '职场']


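# Get the value of an attribute
# s7 was never defined in the original listing; this minimal sample is
# reconstructed from the expected output below.
s7 = """
<meta name="csrf-token" content="OcAFkXBwRt03gmZi85JJdu9056tSjxq8H4JpYfL0uZi9svi4JxzUFZbtagbDTGoyRMdyf8H04KoMWGfRW6dGjg==">
"""
response = Selector(text=s7)
print(response.css("meta[name='csrf-token']::attr(content)").extract_first())
# 'OcAFkXBwRt03gmZi85JJdu9056tSjxq8H4JpYfL0uZi9svi4JxzUFZbtagbDTGoyRMdyf8H04KoMWGfRW6dGjg=='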
