JsoupXpathTest.java

Posted by tangyongathuse


import cn.wanghaomiao.xpath.exception.NoSuchAxisException;
import cn.wanghaomiao.xpath.exception.XpathSyntaxErrorException;
import cn.wanghaomiao.xpath.model.JXDocument;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.io.File;
import java.io.IOException;
import java.util.List;

/**
 * Created by Administrator on 2017/5/27.
 */

public class JsoupXpathTest {
    // Example XPath expressions, using http://www.cnblogs.com/ as the target page:
    // "//a/@href";
    // "//div[@id='paging_block']/div/a[text()='Next >']/@href";
    // "//div[@id='paging_block']/div/a[text()*='Next']/@href";
    // "//h1/text()";
    // "//h1/allText()";
    // "//h1//text()";
    // "//div/a";
    // "//div[@id='post_list']/div[position()<3]/div/h3/allText()";
    // "//div[@id='post_list']/div[first()]/div/h3/allText()";
    // "//div[@id='post_list']/div[1]/div/h3/allText()";
    // "//div[@id='post_list']/div[last()]/div/h3/allText()";
    // Find entries with more than 1000 comments (purely to demonstrate a complex XPath;
    // predicates can be nested in many ways, which makes for more thorough testing):
    // "//div[@id='post_list']/div[./div/div/span[@class='article_view']/a/num()>1000]/div/h3/allText()";
    // Axis support:
    // "//div[@id='post_list']/div[self::div/div/div/span[@class='article_view']/a/num()>1000]/div/h3/allText()";
    // "//div[@id='post_list']/div[2]/div/p/preceding-sibling::h3/allText()";
    // "//div[@id='post_list']/div[2]/div/p/preceding-sibling::h3/allText()|//div[@id='post_list']/div[1]/div/h3/allText()";

    public static void main(String[] args) throws IOException {
//        String xpath="//div[text()='工商注册']/text()";
//        String xpath="//div[@id='post_list']";


        // Active expression: every row of the first table in the block that follows
        // the "获投信息" (funding information) label.
        String xpath="//div/span[text()='获投信息']/parent::*/following-sibling::*[1]/div[1]/div[2]/table[1]/tr[position()>=1]";
//        String xpath="//span[@class='details_1221_d05_d02_s01']/text()|//span[@class='details_1221_d05_d02_s02']/text()";
//        String xpath="//span[@class='details_1221_d05_d02_s01']|//span[@class='details_1221_d05_d02_s02']";

//        String xpath="//div[@id='post_list']/div[./div/div/span[@class='article_view']/a/num()>1000]/div/h3/allText()";
        // Data source: URL
        /*Document doc = Jsoup.connect("http://www.cnblogs.com/").get();
        JXDocument jxDocument = new JXDocument(doc);
        List<Object> rs = null;
        try {
            rs = jxDocument.sel(xpath);
        } catch (XpathSyntaxErrorException e) {
            e.printStackTrace();
        }
        for (Object o:rs){
            if (o instanceof Element){
                int index = ((Element) o).siblingIndex();
                System.out.println(index);
            }
            System.out.println("\n"+o.toString()+"\n");
        }*/

        // Data source: local file
//        Document doc1 = Jsoup.parse(new File("D:\\Test\\228.html"), "UTF-8");
        Document doc1 = Jsoup.parse(new File("D:\\Test\\It桔子\\2.html"), "UTF-8");
        JXDocument jxDocument = new JXDocument(doc1);
        List<Object> rs1;
        try {
            rs1 = jxDocument.sel(xpath);
        } catch (XpathSyntaxErrorException e) {
            e.printStackTrace();
            // Nothing to print if the expression is invalid; returning here also
            // avoids iterating over an unassigned result list.
            return;
        }
        //System.out.println("\n"+rs1.toString()+"\n");
        for (Object o : rs1) {
            // For element results, print the node's index among its siblings first.
            if (o instanceof Element) {
                int index = ((Element) o).siblingIndex();
                System.out.println(index);
            }
            System.out.println(o.toString());
        }


    }
}
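
Several of the expressions listed in the comments above (position(), last(), the preceding-sibling:: axis) are plain XPath 1.0, so they can be tried without JsoupXpath or a saved page. What follows is a minimal sketch, assuming the JDK's built-in javax.xml.xpath API and a hypothetical, well-formed fragment standing in for the cnblogs post list; the class name, the SAMPLE markup, and the select helper are all made up for illustration. Functions such as allText(), num(), first() and the text()*= operator do not exist in standard XPath 1.0 and are not covered here. Unlike Jsoup, the JDK parser only accepts well-formed XML, so this is not a substitute for the code above on real HTML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class StandardXpathSketch {

    // Hypothetical, well-formed stand-in for the "post_list" markup (not the real page).
    static final String SAMPLE =
        "<div id='post_list'>"
      + "<div><div><h3>First post</h3><p>a</p></div></div>"
      + "<div><div><h3>Second post</h3><p>b</p></div></div>"
      + "<div><div><h3>Third post</h3><p>c</p></div></div>"
      + "</div>";

    // Evaluates a standard XPath 1.0 expression and returns the text of each matched node.
    static List<String> select(String xml, String expression) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> texts = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            texts.add(nodes.item(i).getTextContent());
        }
        return texts;
    }

    public static void main(String[] args) throws XPathExpressionException {
        // Positional predicate: the first two entries.
        System.out.println(select(SAMPLE, "//div[@id='post_list']/div[position()<3]/div/h3"));
        // last(): the final entry.
        System.out.println(select(SAMPLE, "//div[@id='post_list']/div[last()]/div/h3"));
        // Axis: the h3 that precedes the p inside the second entry.
        System.out.println(select(SAMPLE, "//div[@id='post_list']/div[2]/div/p/preceding-sibling::h3"));
    }
}
```

With the sample fragment, the three calls print [First post, Second post], [Third post] and [Second post] respectively.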

  
