json: JSON Schema for Micro Web Scraper configuration files

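Opening note: this article was compiled by the editors of 小常识网 (cha138.com). It presents a JSON Schema for Micro Web Scraper configuration files, in the hope that it serves as a useful reference.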

{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "title": "Micro Web Scraper",
    "description": "describes requests.get() args/kwargs & nested XPaths",
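    "type": "object",
    "properties": {
        "_url":     { "type": "string" },
        "_params":  { "type": "object" },
        "_headers": { "type": "object" }
    },
    "patternProperties": {
        "^[^_].+": { "anyOf": [
            { "$ref": "#/definitions/xpath" },
            { "$ref": "#/definitions/xpath_sequence" }
        ]}
    },
    "additionalProperties": false,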
    "definitions": {
        "xpath": { "type": "string" },
        "xpath_object": {
            "type": "object",
            "patternProperties": {
                "^[^_].+": { "anyOf": [
                    { "$ref": "#/definitions/xpath" },
                    { "$ref": "#/definitions/xpath_sequence" }
            ]}},
            "additionalProperties": false
        },
        "xpath_sequence": {
            "type": "array",
            "items": [
                { "$ref": "#/definitions/xpath" },
                { "anyOf": [
                    { "$ref": "#/definitions/xpath_object" },
                    { "$ref": "#/definitions/xpath_sequence" }
                ]}
            ],
            "minItems": 2,
            "maxItems": 2
        }
    }
}
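
To make the intended shape concrete, here is a minimal sketch in Python (standard library only) that checks a configuration by hand against the structure the schema describes: optional `_url`, `_params` and `_headers` at the top level, and every other key holding either an XPath string or a two-item sequence of an XPath followed by a nested object or sequence. The function names, the example URL and the example XPaths are hypothetical and exist only for illustration. Reading a sequence as "select nodes with the first XPath, then apply the nested entries to them" is an assumption about how the scraper uses it, not something the schema states. This is not a substitute for a real JSON Schema validator.

```python
import re

# Key pattern copied from the schema's patternProperties entry.
XPATH_KEY = re.compile(r"^[^_].+")

# Top-level request settings and the Python types matching their JSON types.
REQUEST_KEYS = {"_url": str, "_params": dict, "_headers": dict}


def is_xpath_value(value):
    """True for an xpath string or a valid xpath_sequence."""
    return isinstance(value, str) or is_xpath_sequence(value)


def is_xpath_object(value):
    """An object whose keys match XPATH_KEY and whose values are xpath values."""
    return isinstance(value, dict) and all(
        isinstance(key, str) and XPATH_KEY.search(key) and is_xpath_value(child)
        for key, child in value.items()
    )


def is_xpath_sequence(value):
    """A two-item array: an xpath string, then a nested object or sequence."""
    if not isinstance(value, list) or len(value) != 2:
        return False
    head, tail = value
    return isinstance(head, str) and (
        is_xpath_object(tail) or is_xpath_sequence(tail)
    )


def is_valid_config(config):
    """Check a whole configuration: request settings plus xpath entries."""
    if not isinstance(config, dict):
        return False
    for key, value in config.items():
        if key in REQUEST_KEYS:
            if not isinstance(value, REQUEST_KEYS[key]):
                return False
        elif not (XPATH_KEY.search(key) and is_xpath_value(value)):
            return False
    return True


# Hypothetical configuration; the URL and XPaths are made up for illustration.
example = {
    "_url": "https://example.com/articles",
    "_headers": {"User-Agent": "micro-web-scraper"},
    "page_title": "//title/text()",
    "articles": [
        "//article",
        {"heading": ".//h2/text()", "links": [".//a", {"href": "@href"}]},
    ],
}

print(is_valid_config(example))
```

Keys that start with an underscore but are not one of the three request settings are rejected, as are sequences that do not have exactly two items, mirroring `additionalProperties`, `minItems` and `maxItems` in the schema.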
