RDF and SPARQL
Posted by 宣之于口
RDF (Resource Description Framework)
RDF is a W3C standard for describing web resources.
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:si="http://www.runoob.com/rdf/">
<rdf:Description rdf:about="http://www.runoob.com">
<si:title>runoob.com</si:title>
<si:author>Jan Egil Refsnes</si:author>
</rdf:Description>
</rdf:RDF>
Definition: a triple <s, p, o>
- s (subject): URIs (incl. rdf:type) and blank nodes
- p (predicate): URIs (incl. rdf:type)
- o (object): URIs (incl. rdf:type), blank nodes, and literals
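The position rules above can be sketched in Python. The class names (`IRI`, `BlankNode`, `Literal`, `Triple`) and the `si:title` IRI expansion are choices made for this illustration and do not come from any RDF library.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    text: str


Subject = Union[IRI, BlankNode]
Object = Union[IRI, BlankNode, Literal]


@dataclass(frozen=True)
class Triple:
    s: Subject
    p: IRI
    o: Object

    def __post_init__(self):
        # Enforce which kinds of term may appear in each position.
        if not isinstance(self.s, (IRI, BlankNode)):
            raise TypeError("subject must be an IRI or a blank node")
        if not isinstance(self.p, IRI):
            raise TypeError("predicate must be an IRI")
        if not isinstance(self.o, (IRI, BlankNode, Literal)):
            raise TypeError("object must be an IRI, a blank node, or a literal")


# The si:title statement from the RDF/XML example, written as one triple.
title = Triple(
    IRI("http://www.runoob.com"),
    IRI("http://www.runoob.com/rdf/title"),
    Literal("runoob.com"),
)
```

Trying to build `Triple(Literal("x"), ...)` raises `TypeError`, because a literal may only appear in the object position.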
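I. Elements
<rdf:RDF>
The root element of an RDF document. It marks the XML document as an RDF document and carries the reference to the RDF namespace.
<rdf:Description>
Identifies a resource through its about attribute and contains the elements that describe that resource.
Example: RDF defines only the framework; elements such as artist must be defined by someone else (here, in the cd namespace).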
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:cd="http://www.recshop.fake/cd#">
<rdf:Description rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
<cd:artist>Bob Dylan</cd:artist>
...
</rdf:Description>
</rdf:RDF>
<rdf:Description rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
<cd:artist rdf:resource="http://www.recshop.fake/cd/dylan" />
...
</rdf:Description>
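The two snippets above differ in the object of cd:artist: a literal in the first, a resource (via rdf:resource) in the second. A minimal sketch with Python's standard `xml.etree.ElementTree` shows how both forms turn into triples. The function name `triples_from_rdfxml` is made up here, and the parser only handles flat rdf:Description blocks like these, not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Both cd:artist variants from the text, placed in one document.
DOC = """<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cd="http://www.recshop.fake/cd#">
  <rdf:Description rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
    <cd:artist>Bob Dylan</cd:artist>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
    <cd:artist rdf:resource="http://www.recshop.fake/cd/dylan" />
  </rdf:Description>
</rdf:RDF>"""


def triples_from_rdfxml(text):
    """Return (subject, predicate, object, kind) tuples for flat rdf:Description blocks."""
    root = ET.fromstring(text)
    out = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree writes qualified names as "{namespace}local".
            predicate = prop.tag.replace("{", "").replace("}", "")
            resource = prop.get(f"{{{RDF_NS}}}resource")
            if resource is not None:
                out.append((subject, predicate, resource, "iri"))
            else:
                out.append((subject, predicate, prop.text, "literal"))
    return out


for triple in triples_from_rdfxml(DOC):
    print(triple)
```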
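II. RDF serialization
With the representation and types of RDF in place, how do we create an RDF dataset and serialize it, that is, how do we store and transmit RDF data? The main serialization formats today are RDF/XML, N-Triples, Turtle, RDFa, and JSON-LD.
- Turtle is probably the most widely used RDF serialization. It is more compact than RDF/XML and more readable than N-Triples.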
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex: <http://www.cs.man.ac.uk/> .
ex:sattler
    foaf:title "Dr." ;
    foaf:knows ex:bparsia ;
    foaf:knows
        [
            foaf:title "Count" ;
            foaf:lastName "Dracula"
        ] .
- JSON-LD, short for "JSON for Linking Data", stores RDF data as key-value pairs.
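To see what Turtle's abbreviations save, the ex:sattler example can be spelled out as the five triples it stands for and printed in N-Triples form. This is a hand-written sketch, not a Turtle parser: the triples are listed by hand, `_:b0` is an arbitrary label chosen here for the anonymous `[ ... ]` node, and the helper names are invented.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
EX = "http://www.cs.man.ac.uk/"

# The five triples abbreviated by the Turtle example.
# "_:b0" is an arbitrary label for the anonymous [ ... ] node.
TRIPLES = [
    (EX + "sattler", FOAF + "title", '"Dr."'),
    (EX + "sattler", FOAF + "knows", EX + "bparsia"),
    (EX + "sattler", FOAF + "knows", "_:b0"),
    ("_:b0", FOAF + "title", '"Count"'),
    ("_:b0", FOAF + "lastName", '"Dracula"'),
]


def term(t):
    """Format one term: blank nodes and quoted literals pass through, IRIs get <>."""
    if t.startswith("_:") or t.startswith('"'):
        return t
    return f"<{t}>"


def to_ntriples(triples):
    """One line per triple, each terminated by ' .'."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


print(to_ntriples(TRIPLES))
```

Every line repeats full IRIs, which is why N-Triples is easy to process line by line but harder to read than Turtle.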
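III. RDF Schema
RDF Schema does not provide application-specific classes and properties itself; it provides the framework for describing them. Its core vocabulary includes: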
rdfs:subClassOf
rdfs:subPropertyOf
rdfs:domain
rdfs:range
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xml:base="http://www.animals.fake/animals#">
<rdfs:Class rdf:ID="animal" />
<rdfs:Class rdf:ID="horse">
<rdfs:subClassOf rdf:resource="#animal"/>
</rdfs:Class>
</rdf:RDF>
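What the RDFS vocabulary buys you is inference. The sketch below derives rdf:type facts from rdfs:domain, rdfs:range and rdfs:subClassOf. Only the horse/animal subclass link comes from the schema above; the `rides` property, the `rider` class and the instance data are hypothetical, and the code assumes at most one superclass per class to stay short.

```python
SCHEMA = {
    "subClassOf": {"horse": "animal"},  # from the schema above
    "domain": {"rides": "rider"},       # hypothetical property, for illustration
    "range": {"rides": "horse"},        # hypothetical
}
FACTS = [("tom", "rides", "star")]      # hypothetical instance data


def infer_types(facts, schema):
    """Derive (node, class) pairs from rdfs:domain, rdfs:range and rdfs:subClassOf."""
    types = set()
    for s, p, o in facts:
        if p in schema["domain"]:
            types.add((s, schema["domain"][p]))
        if p in schema["range"]:
            types.add((o, schema["range"][p]))
    # Propagate each type up the subclass chain until nothing new appears.
    changed = True
    while changed:
        changed = False
        for node, cls in list(types):
            parent = schema["subClassOf"].get(cls)
            if parent is not None and (node, parent) not in types:
                types.add((node, parent))
                changed = True
    return types


print(sorted(infer_types(FACTS, SCHEMA)))
```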
IV. Advanced RDF usage
Reference: RDF
1. Reification
RDF reification vocabulary
rdf:Statement rdf:subject rdf:predicate rdf:object
Take a simple triple: <ex:a> <ex:b> <ex:c> .
Its reification is written as:
_:xxx rdf:type rdf:Statement .
_:xxx rdf:subject <ex:a> .
_:xxx rdf:predicate <ex:b> .
_:xxx rdf:object <ex:c> .
Example:
:Tolkien :wrote :LordOfTheRings .
# A separate resource can stand for the statement itself, so you can say further things about the statement, e.g. add "Wikipedia said that"
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
_:x rdf:type rdf:Statement .
_:x rdf:subject :Tolkien .
_:x rdf:predicate :wrote .
_:x rdf:object :LordOfTheRings .
:Wikipedia :said _:x .
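The reification pattern is mechanical, so it can be sketched as a small Python function. The name `reify`, the `_:stmtN` labelling scheme and the `:said` provenance triple (read as "Wikipedia said this statement") are choices made for this sketch.

```python
import itertools

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
_counter = itertools.count()


def reify(triple):
    """Return (node, triples): a fresh blank node and the four reification triples."""
    s, p, o = triple
    node = f"_:stmt{next(_counter)}"
    return node, [
        (node, RDF + "type", RDF + "Statement"),
        (node, RDF + "subject", s),
        (node, RDF + "predicate", p),
        (node, RDF + "object", o),
    ]


node, statement = reify((":Tolkien", ":wrote", ":LordOfTheRings"))
# Provenance is attached to the statement node, not to :Tolkien itself.
annotated = statement + [(":Wikipedia", ":said", node)]

for t in annotated:
    print(t)
```

Note that the reified form does not assert the original triple; it only describes it.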
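V. SPARQL
References: the Wikidata Query Service user manual and Wikidata:SPARQL tutorial
SPARQL is a recursive acronym for SPARQL Protocol and RDF Query Language. It is designed for accessing and manipulating RDF data and is one of the core technologies of the Semantic Web.
Some SPARQL keywords
- SELECT: the variables to return.
- WHERE: the graph pattern to match. It plays the same role as WHERE in SQL.
- FROM: the RDF dataset to query.
- PREFIX: abbreviations for IRIs.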
1. Wikidata
Subject: Q30 (United States); predicate: P36 (capital); object: Q61 (Washington, D.C.)
wd:Q30 wdt:P36 wd:Q61 .
Prefixes
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wds: <http://www.wikidata.org/entity/statement/>
PREFIX wdv: <http://www.wikidata.org/value/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX bd: <http://www.bigdata.com/rdf#>
SELECT ?s ?desc WHERE {
  ?s wdt:P279 wd:Q7725634 .
  OPTIONAL {
    ?s rdfs:label ?desc .
    FILTER (lang(?desc) = "en")
  }
}
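A prefixed name is only shorthand: the part before the colon is replaced by the IRI declared in the matching PREFIX line. A minimal Python sketch of that expansion, using three of the prefixes listed above (the function name `expand` is made up here):

```python
PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",
    "wdt": "http://www.wikidata.org/prop/direct/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(name, prefixes=PREFIXES):
    """Turn a prefixed name such as 'wd:Q30' into a full IRI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {name!r}")
    return prefixes[prefix] + local


# The triple wd:Q30 wdt:P36 wd:Q61 with every term written out in full.
triple = tuple(expand(t) for t in ("wd:Q30", "wdt:P36", "wd:Q61"))
print(triple)
```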
2. Basic usage
Patterns take the subject, predicate, object form: SPO (Subject, Predicate, Object), also known as a semantic triple.
# x, y, m, n and f stand for concrete IRIs
SELECT ?a ?b ?c
WHERE {
  x y ?a .
  m n ?b .
  ?b f ?c .
}
Applied to Wikidata:
SELECT ?child
WHERE {
  # ?child has father Bach
  ?child wdt:P22 wd:Q1339 .
}
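How a WHERE clause binds variables can be sketched with a tiny in-memory matcher: each triple pattern is tried against every triple, and variables (names starting with `?`) must keep the same value across patterns. `match`, `solve` and the `ex:` data are invented for this illustration; a real SPARQL engine is far more elaborate.

```python
def match(pattern, triple, binding):
    """Extend binding so that pattern equals triple; return None on a conflict."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # A variable binds on first use and must agree afterwards.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def solve(patterns, data):
    """Evaluate a list of triple patterns (a basic graph pattern) against data."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for binding in solutions
            for triple in data
            if (extended := match(pattern, triple, binding)) is not None
        ]
    return solutions


# Made-up data, not real Wikidata content.
DATA = [
    ("ex:alice", "ex:father", "ex:bob"),
    ("ex:carol", "ex:father", "ex:bob"),
    ("ex:bob", "ex:father", "ex:dave"),
]

# One pattern: who has father ex:bob?
children = solve([("?child", "ex:father", "ex:bob")], DATA)

# Two patterns joined on ?parent: whose father has father ex:dave?
grandchildren = solve(
    [("?child", "ex:father", "?parent"), ("?parent", "ex:father", "ex:dave")],
    DATA,
)
print(children)
print(grandchildren)
```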
3. Advanced usage
Multiple conditions
SELECT ?child ?childLabel
WHERE {
  # P22: father, P25: mother
  ?child wdt:P22 wd:Q1339 ;
         wdt:P25 wd:Q57487 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]" . }
}
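Note the difference between ; and ,
- ; repeats the subject with a new predicate and object: romeo loves juliet; kills romeo.
- , repeats the subject and predicate with a new object: romeo kills tybalt, romeo.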
Nested patterns (anonymous nodes written with [ ])
SELECT ?grandChild ?grandChildLabel
WHERE {
  # Bach has a child ?child, ?child has a child ?grandChild.
  wd:Q1339 wdt:P40 [ wdt:P40 ?grandChild ] .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]" . }
}
Filtering
SELECT ?book ?title
WHERE {
  ?book dc:title ?title .
  ?book inv:price ?price .
  FILTER ( ?price < 15 )
  ?book inv:quantity ?num .
  FILTER ( ?num > 0 )
}
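Note the difference between NOT IN and MINUS:
# MINUS removes every ?person that has occupation Q1028181
MINUS { ?person wdt:P106 wd:Q1028181 . }
# NOT IN only rejects solutions where ?item itself is Q1028181; a person who also has other occupations can still match
FILTER ( ?item NOT IN ( wd:Q1028181 ) )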
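The difference shows up as soon as a person has more than one occupation. The Python sketch below mimics both forms on made-up data (the `ex:` names are hypothetical): the NOT IN filter discards individual rows, while MINUS discards every person matched by the excluded pattern.

```python
# Made-up data: anna has two occupations, ben only the excluded one.
OCCUPATIONS = [
    ("ex:anna", "ex:painter"),
    ("ex:anna", "ex:poet"),
    ("ex:ben", "ex:painter"),
    ("ex:cleo", "ex:poet"),
]

# FILTER (?occupation NOT IN (ex:painter)) drops individual rows,
# so anna survives through her (anna, poet) row.
not_in = {person for person, occ in OCCUPATIONS if occ not in ("ex:painter",)}

# MINUS { ?person ex:occupation ex:painter } drops every person
# that matches the pattern at all, so anna is removed too.
painters = {person for person, occ in OCCUPATIONS if occ == "ex:painter"}
minus = {person for person, _ in OCCUPATIONS} - painters

print(sorted(not_in))
print(sorted(minus))
```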
UNION and DISTINCT
# ?person is a student of [x] UNION [x] has student ?person; DISTINCT makes each ?person appear only once
SELECT DISTINCT ?person
WHERE {
  ?person wdt:P31 wd:Q5 .
  { ?person wdt:P1066 [ wdt:P106 wd:Q1028181 ] . }
  UNION
  { [ wdt:P106 wd:Q1028181 ] wdt:P802 ?person . }
}
OPTIONAL
# OPTIONAL marks a pattern as optional: a solution is kept even when the pattern has no match
# data
person:a foaf:name "Alice" .
person:a foaf:nick "A-online" .
person:b foaf:name "Bob" .
# query
SELECT ?name ?nick
WHERE {
  ?x foaf:name ?name .
  OPTIONAL { ?x foaf:nick ?nick }
}
# answer
?name   | ?nick
"Alice" | "A-online"
"Bob"   | (unbound)
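OPTIONAL behaves like a left join: the required pattern decides which rows exist, and the optional pattern only adds values where it can. A small Python sketch on the same Alice/Bob data (the function names `required` and `optional` are made up here, and `None` stands in for an unbound variable):

```python
NAMES = {"person:a": "Alice", "person:b": "Bob"}
NICKS = {"person:a": "A-online"}


def required(names, nicks):
    """Both patterns mandatory: people without a nick disappear."""
    return [(name, nicks[p]) for p, name in names.items() if p in nicks]


def optional(names, nicks):
    """Nick pattern optional: every named person stays, nick may be None."""
    return [(name, nicks.get(p)) for p, name in names.items()]


print(required(NAMES, NICKS))
print(optional(NAMES, NICKS))
```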
ASK: returns true/false
# query
ASK {
  ?x foaf:name ?name .
  OPTIONAL { ?x foaf:nick ?nick }
}
# answer
true
CONSTRUCT: builds new triples from a template
CONSTRUCT { ?person vc:FN ?name }
WHERE {
  ?person foaf:name ?name .
}
# answer
person:a vc:FN "Alice" .
person:b vc:FN "Bob" .
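ASK and CONSTRUCT match patterns the same way SELECT does and differ only in what they return: a boolean, or new triples built from a template. A rough Python equivalent on the Alice/Bob data, with hypothetical function names:

```python
DATA = [
    ("person:a", "foaf:name", "Alice"),
    ("person:a", "foaf:nick", "A-online"),
    ("person:b", "foaf:name", "Bob"),
]


def ask_has_name(data):
    """ASK counterpart: True as soon as one foaf:name triple exists."""
    return any(p == "foaf:name" for _, p, _ in data)


def construct_vcard_names(data):
    """CONSTRUCT counterpart: one vc:FN triple per foaf:name triple."""
    return [(s, "vc:FN", o) for s, p, o in data if p == "foaf:name"]


print(ask_has_name(DATA))
print(construct_vcard_names(DATA))
```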
Instances and classes
- instance of (P31)
- subclass of (P279)
SELECT ?work ?workLabel
WHERE {
  ?work wdt:P31 wd:Q838948 . # instance of work of art
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]" . }
}
?item wdt:P31/wdt:P279* ?class.
This means that there is one "instance of" and then any number of "subclass of" statements between the item and the class.
SELECT ?work ?workLabel
WHERE {
  ?work wdt:P31/wdt:P279* wd:Q838948 . # instance of any subclass of work of art
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]" . }
}
Note: path operators work like regular-expression operators
- '*': zero or more
- '+': one or more
- '|': alternative (either path)
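The path P31/P279* can be read as "take one instance-of step, then follow zero or more subclass-of steps". A minimal Python sketch of that evaluation follows; the `ex:` identifiers stand in for Wikidata items and are made up, as are the function names.

```python
# Made-up identifiers standing in for Wikidata items.
INSTANCE_OF = {
    "ex:MonaLisa": {"ex:Painting"},
    "ex:David": {"ex:Sculpture"},
    "ex:Bach": {"ex:Human"},
}
SUBCLASS_OF = {
    "ex:Painting": {"ex:VisualArtwork"},
    "ex:Sculpture": {"ex:VisualArtwork"},
    "ex:VisualArtwork": {"ex:WorkOfArt"},
}


def closure(start, edges):
    """Classes reachable from start in zero or more steps (the * operator)."""
    seen = {start}
    stack = [start]
    while stack:
        for nxt in edges.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def instances_of(target, instance_of, subclass_of):
    """Items matching ?item P31/P279* target."""
    return {
        item
        for item, classes in instance_of.items()
        if any(target in closure(c, subclass_of) for c in classes)
    }


print(sorted(instances_of("ex:WorkOfArt", INSTANCE_OF, SUBCLASS_OF)))
```

Because `*` includes zero steps, a direct instance of the target class also matches; with `+` at least one subclass step would be required.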
4. Examples
1. Canadian subjects with no English article in Wikipedia
#added before 2019-02
SELECT ?item ?itemLabel ?cnt WHERE {
  {
    SELECT ?item (COUNT(?sitelink) AS ?cnt) WHERE {
      ?item wdt:P27|wdt:P205|wdt:P17 wd:Q16 . #Canadian subjects.
      MINUS { ?item wdt:P106 wd:Q488111 . } #Minus occupations that would be inappropriate in most situations.
      MINUS { ?item wdt:P106 wd:Q3286043 . }
      MINUS { ?item wdt:P106 wd:Q4610556 . }
      ?sitelink schema:about ?item .
      FILTER NOT EXISTS {
        ?article schema:about ?item .
        ?article schema:isPartOf <https://en.wikipedia.org/> . #Targeting the Wikipedia language in which the subject has no article.
      }
    } GROUP BY ?item ORDER BY DESC (?cnt) LIMIT 1000 #Sorted by number of articles in other languages. Result limited to 1000 rows to avoid a timeout error.
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en,fr,es,de" . } #Service to resolve labels in (fallback) languages: automatic user language, English, French, Spanish, German.
}
ORDER BY DESC (?cnt)
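Queries like the ones in this post can also be sent to the Wikidata Query Service over HTTP. The sketch below uses only the Python standard library; the helper names and the User-Agent string are placeholders chosen here, and `run_query` needs network access, so it is defined but not called.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"


def build_url(query):
    """Encode the SPARQL text as a GET request asking for JSON results."""
    return ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})


def run_query(query, user_agent="rdf-sparql-notes-example/0.1"):
    """Send the query and return the list of result bindings (needs network access)."""
    request = urllib.request.Request(build_url(query), headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


QUERY = "SELECT ?child WHERE { ?child wdt:P22 wd:Q1339 . }"
url = build_url(QUERY)
print(url)
```

Each binding returned by `run_query` is a dictionary keyed by variable name, so the children would be read as `row["child"]["value"]`.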