ATLA 2016 - Long Beach

What is Linked Data?

Libraries and cultural heritage organizations are shifting from MARC21 and XML metadata standards to Linked Data vocabularies. Library Linked Data is an international effort to bring machine-readable library data to the web. Based on RDF (Resource Description Framework) graphs, Library Linked Data is made up of a series of statements, called triples, that take the form subject - predicate - object.

Linked Data represents information as RDF triples, i.e. subject-predicate-object statements, using vocabularies like BIBFRAME and Schema.org. RDA and MARC can also be reformulated into Linked Data triples.

IRIs: the "Link" in Linked Data

The power of Linked Data is deceptively simple and rests on the idea behind the common URL, the Uniform Resource Locator, which most people use every day on the web. The Internationalized Resource Identifier, or IRI, extends the URL concept into a global identifier and, unlike a URL, can contain Unicode characters, including those of non-European languages. An IRI serves as the critical linking structure between local and global information sources. NOTE: an IRI doesn't have to resolve to an existing web resource to be useful; it only has to be uniquely identifiable, like a URL.
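
To make the relationship between an IRI and an ordinary URL concrete, here is a minimal sketch using only Python's standard library. The example IRI (a Japanese Wikipedia address) and the helper name iri_to_uri are assumptions made for illustration, and the function percent-encodes only the path, which simplifies the full IRI-to-URI mapping.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def iri_to_uri(iri):
    """Percent-encode non-ASCII characters in the path of an IRI.

    Simplified sketch: the host, query, and fragment are passed through
    unchanged, so it only suits IRIs shaped like the example below.
    """
    parts = urlsplit(iri)
    # "/" stays literal; "%" stays literal so existing escapes are not doubled
    encoded_path = quote(parts.path, safe="/%")
    return urlunsplit(
        (parts.scheme, parts.netloc, encoded_path, parts.query, parts.fragment)
    )


# Hypothetical IRI containing Japanese characters
iri = "https://ja.wikipedia.org/wiki/宗教"
uri = iri_to_uri(iri)

print(uri)           # ASCII-only form
print(unquote(uri))  # decoding gives back the original IRI
```

Both forms identify the same resource; the IRI is simply the version that keeps the characters readable.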


What is a Graph?

A graph data structure is made up of nodes (also known as vertices or points) with connections between the nodes called edges or directed edges (also referred to as arcs or lines).

Graphs are used to describe transportation systems, computer networks, and social relationships. Graphs easily support heterogeneous environments with different vocabularies that can scale to billions (if not trillions) of nodes and edges.
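
As a minimal sketch of the idea, a directed graph can be held in Python as an adjacency list that maps each node to the nodes its edges point to. The class name DirectedGraph and the transit links below are invented for illustration and do not come from any real dataset.

```python
from collections import defaultdict


class DirectedGraph:
    """A directed graph stored as an adjacency list."""

    def __init__(self):
        # Maps each node to the set of nodes its outgoing edges point to
        self._adjacency = defaultdict(set)

    def add_edge(self, source, target):
        """Add a directed edge, creating either node if it is new."""
        self._adjacency[source].add(target)
        self._adjacency[target]  # touching the key registers the target node

    def nodes(self):
        """Return every node that appears at either end of an edge."""
        return set(self._adjacency)

    def neighbors(self, node):
        """Return the nodes reachable from `node` by a single edge."""
        return set(self._adjacency.get(node, ()))


# Invented one-way transit links, for illustration only
transit = DirectedGraph()
transit.add_edge("Long Beach", "Los Angeles")
transit.add_edge("Los Angeles", "Pasadena")
transit.add_edge("Los Angeles", "Long Beach")

print(sorted(transit.nodes()))
print(sorted(transit.neighbors("Los Angeles")))
```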


RDF Graphs

Linked Data in libraries is constructed using Resource Description Framework (RDF) graphs, a type of directed graph in which each node is an IRI (Internationalized Resource Identifier, a generalization of URLs and URIs), a blank node (a placeholder for a resource that has no identifier of its own), or a literal value. Originating in World Wide Web Consortium (W3C) specifications, RDF graphs are built from three-element statements called triples that model relationships between resources, IRIs, and descriptive information.

In an RDF triple, the first element is called the subject and represents a resource; the second element, the predicate, describes an aspect of the subject and connects it to the third element, the object, which holds a value. One or more triples make up an RDF graph.
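
The structure can be sketched in plain Python, with each triple as a three-field tuple and the graph as a set of those tuples. This is only an illustration: plain strings stand in for IRIs and literal values, and the names Triple, RELIGION, and graph are assumptions.

```python
from collections import namedtuple

# One statement in the graph: subject - predicate - object
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

RELIGION = "https://en.wikipedia.org/wiki/Religion"

# A graph is a set of triples, so a repeated statement is stored once
graph = {
    Triple(RELIGION, "http://www.w3.org/2000/01/rdf-schema#label", "Religion"),
    Triple(
        RELIGION,
        "http://id.loc.gov/ontologies/bibframe/Topic",
        "http://id.loc.gov/authorities/subjects/sh85112549",
    ),
}

# Adding the same statement again leaves the graph unchanged
graph.add(Triple(RELIGION, "http://www.w3.org/2000/01/rdf-schema#label", "Religion"))

for triple in sorted(graph):
    print(triple.subject, triple.predicate, triple.object)
```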

The big shift for libraries …

From managing records (MARC) to managing triples (BIBFRAME, Schema.org)



Subjects

A subject must be either an IRI or a blank node and represents the resource or entity being described.

Predicates

A relationship between the subject and object is a predicate, also called a property. A predicate can only be an IRI.

Objects

An object must be an IRI, a blank node, or a literal value.
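
The rules for the three positions can be expressed as a small check. The sketch below is illustrative only: the classes IRI, BlankNode, and Literal and the function is_valid_triple are hypothetical names, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    value: str
    lang: str = ""


def is_valid_triple(subject, predicate, obj):
    """Check each position of a triple against the kinds of term it allows."""
    return (
        isinstance(subject, (IRI, BlankNode))           # no literal subjects
        and isinstance(predicate, IRI)                  # predicates are always IRIs
        and isinstance(obj, (IRI, BlankNode, Literal))  # objects may be any term
    )


religion = IRI("https://en.wikipedia.org/wiki/Religion")
label = IRI("http://www.w3.org/2000/01/rdf-schema#label")

print(is_valid_triple(religion, label, Literal("Religion", lang="en")))
print(is_valid_triple(Literal("Religion"), label, religion))
```

The first call passes because a literal is allowed as an object; the second fails because a literal can never be a subject.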


RDF Serializations

RDF graphs can be serialized (converted into text) in a number of different formats, including XML and JSON, along with RDF-specific formats called Turtle and N-Triples.

RDF/XML

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
   xmlns:bf="http://id.loc.gov/ontologies/bibframe/"
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
>
  <rdf:Description rdf:about="https://en.wikipedia.org/wiki/Religion">
    <rdfs:label>Religion</rdfs:label>
    <bf:Topic rdf:resource="http://id.loc.gov/authorities/subjects/sh85112549"/>
  </rdf:Description>
</rdf:RDF>
        

Turtle (Terse RDF Triple Language)

@prefix bf: <http://id.loc.gov/ontologies/bibframe/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<https://en.wikipedia.org/wiki/Religion> rdfs:label "Religion" ;
    bf:Topic <http://id.loc.gov/authorities/subjects/sh85112549> .
        

JSON-LD

[
  {
    "@id": "https://en.wikipedia.org/wiki/Religion",
    "http://id.loc.gov/ontologies/bibframe/Topic": [
      {
        "@id": "http://id.loc.gov/authorities/subjects/sh85112549"
      }
    ],
    "http://www.w3.org/2000/01/rdf-schema#label": [
      {
        "@value": "Religion"
      }
    ]
  }
]
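
Because JSON-LD is ordinary JSON, the triples in the document above can be recovered with Python's built-in json module. The following is a minimal sketch, not a JSON-LD processor: the function name expanded_jsonld_to_triples is hypothetical, and it understands only the "@id" and "@value" keys used in this example.

```python
import json

EXPANDED_JSON_LD = """
[
  {
    "@id": "https://en.wikipedia.org/wiki/Religion",
    "http://id.loc.gov/ontologies/bibframe/Topic": [
      {"@id": "http://id.loc.gov/authorities/subjects/sh85112549"}
    ],
    "http://www.w3.org/2000/01/rdf-schema#label": [
      {"@value": "Religion"}
    ]
  }
]
"""


def expanded_jsonld_to_triples(document):
    """Return (subject, predicate, object) tuples from expanded JSON-LD."""
    triples = []
    for node in document:
        subject = node["@id"]
        for predicate, values in node.items():
            if predicate.startswith("@"):
                continue  # keywords such as "@id" are not predicates
            for value in values:
                # An "@id" entry is an IRI object; otherwise take the literal
                obj = value["@id"] if "@id" in value else value["@value"]
                triples.append((subject, predicate, obj))
    return triples


triples = expanded_jsonld_to_triples(json.loads(EXPANDED_JSON_LD))
for triple in triples:
    print(triple)
```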
        

N-Triples

<https://en.wikipedia.org/wiki/Religion> <http://id.loc.gov/ontologies/bibframe/Topic> <http://id.loc.gov/authorities/subjects/sh85112549> .
<https://en.wikipedia.org/wiki/Religion> <http://www.w3.org/2000/01/rdf-schema#label> "Religion" .
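
N-Triples puts one complete triple on each line, which makes it the easiest serialization to read back by hand. The sketch below parses the two lines above with a regular expression. It is a deliberately limited illustration: parse_ntriples is a hypothetical helper that accepts only IRIs and plain literals, and a real project would use a full parser such as the one in RDFLib.

```python
import re

# Matches: <subject IRI> <predicate IRI> (<object IRI> | "plain literal") .
NT_LINE = re.compile(
    r'^<([^>]*)>\s+<([^>]*)>\s+(?:<([^>]*)>|"((?:[^"\\]|\\.)*)")\s*\.\s*$'
)


def parse_ntriples(text):
    """Parse N-Triples text into (subject, predicate, object) tuples.

    Simplified sketch: blank nodes, language tags, and datatypes are
    rejected, and escape sequences inside literals are not decoded.
    """
    triples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        found = NT_LINE.match(line)
        if found is None:
            raise ValueError("Unsupported N-Triples line: " + line)
        subject, predicate, object_iri, literal = found.groups()
        obj = object_iri if object_iri is not None else literal
        triples.append((subject, predicate, obj))
    return triples


NT_DATA = """
<https://en.wikipedia.org/wiki/Religion> <http://id.loc.gov/ontologies/bibframe/Topic> <http://id.loc.gov/authorities/subjects/sh85112549> .
<https://en.wikipedia.org/wiki/Religion> <http://www.w3.org/2000/01/rdf-schema#label> "Religion" .
"""

for triple in parse_ntriples(NT_DATA):
    print(triple)
```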
        

Creating an RDF Graph

To illustrate how to create an RDF graph, we will use the Python programming language and the popular RDFLib library.

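>>> import rdflib
>>> religion = rdflib.Graph()
>>> wikipedia_religion_uri = rdflib.URIRef("https://en.wikipedia.org/wiki/Religion")
>>> # First triple: the resource's human-readable label, tagged as English
>>> religion.add(
...     (wikipedia_religion_uri,
...      rdflib.RDFS.label,
...      rdflib.Literal("Religion", lang="en")))
>>> BF = rdflib.Namespace("http://id.loc.gov/ontologies/bibframe/")
>>> # Bind the "bf" prefix so the serializer writes bf:Topic instead of a generated prefix
>>> religion.bind("bf", BF)
>>> # Second triple: link the resource to a Library of Congress subject authority
>>> religion.add(
...     (wikipedia_religion_uri,
...      BF.Topic,
...      rdflib.URIRef("http://id.loc.gov/authorities/subjects/sh85112549")))
>>> # serialize() returns bytes in RDFLib versions before 6; on 6 and later it returns str, so drop .decode()
>>> print(religion.serialize(format='turtle').decode())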
@prefix bf: <http://id.loc.gov/ontologies/bibframe/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<https://en.wikipedia.org/wiki/Religion> rdfs:label "Religion"@en ;
    bf:Topic <http://id.loc.gov/authorities/subjects/sh85112549> .


Embedding RDF in HTML

RDFa

<div resource="https://en.wikipedia.org/wiki/Religion"
        typeof="http://id.loc.gov/ontologies/bibframe/Topic">
    <span property="http://www.w3.org/2000/01/rdf-schema#label">
        Religion
    </span>
</div>
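
To show that these attributes really do carry triples, the sketch below reads an RDFa fragment with Python's built-in HTML parser. It is an assumption-laden illustration rather than an RDFa processor: the class name SimpleRDFaParser is hypothetical, and it handles only the resource, typeof, and property attributes on a flat fragment like this one.

```python
from html.parser import HTMLParser

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

RDFA_HTML = """
<div resource="https://en.wikipedia.org/wiki/Religion"
        typeof="http://id.loc.gov/ontologies/bibframe/Topic">
    <span property="http://www.w3.org/2000/01/rdf-schema#label">
        Religion
    </span>
</div>
"""


class SimpleRDFaParser(HTMLParser):
    """Collect triples from resource, typeof, and property attributes."""

    def __init__(self):
        super().__init__()
        self.subject = None           # IRI from the enclosing resource attribute
        self.current_property = None  # property of the element being read
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "resource" in attributes:
            self.subject = attributes["resource"]
            if "typeof" in attributes:
                # typeof states the class of the subject
                self.triples.append((self.subject, RDF_TYPE, attributes["typeof"]))
        if "property" in attributes:
            self.current_property = attributes["property"]

    def handle_data(self, data):
        text = data.strip()
        if self.current_property and text:
            # Text inside a property element becomes a literal object
            self.triples.append((self.subject, self.current_property, text))

    def handle_endtag(self, tag):
        self.current_property = None


parser = SimpleRDFaParser()
parser.feed(RDFA_HTML)
for triple in parser.triples:
    print(triple)
```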
        

Microdata

<div itemscope itemid="https://en.wikipedia.org/wiki/Religion"
         itemtype="http://id.loc.gov/ontologies/bibframe/Topic">
    <span itemprop="http://www.w3.org/2000/01/rdf-schema#label">
        Religion
    </span>
</div>
        

Schema.org

Schema.org is an international effort to add structured data and metadata to web resources. Started in 2011, Schema.org is sponsored by Google, Microsoft, Yahoo, Yandex, and others and is available at http://schema.org/. OCLC and others sponsored an extension to Schema.org for bibliographic descriptions specific to libraries, available at http://bib.schema.org/. OCLC has added Schema.org markup to WorldCat records, available as a download option in the record view.

Part of the Turtle graph for Zen and Japanese Buddhism, available from its WorldCat record at http://www.worldcat.org/oclc/779008232:
<http://www.worldcat.org/oclc/779008232>
        a                           schema:Book , schema:CreativeWork ;
        library:oclcnum             "779008232" ;
        library:placeOfPublication  <http://id.loc.gov/vocabulary/countries/nju> , 
                                    <http://experiment.worldcat.org/entity/work/data/347763#Place/princeton_n_j> ;
        schema:about                <http://id.worldcat.org/fast/1352415> , 
                                    <http://id.loc.gov/authorities/subjects/sh2008124283> , 
                                    <http://id.worldcat.org/fast/1204082> , 
                                    <http://dewey.info/class/294.3927/> , 
                                    <http://id.worldcat.org/fast/1184197> , 
                                    <http://experiment.worldcat.org/entity/work/data/347763#Place/japan> , 
                                    <http://id.loc.gov/authorities/classification/BQ9262> ;
        schema:bookFormat           bgn:PrintBook ;
        schema:creator              <http://viaf.org/viaf/46767643> ;
        schema:datePublished        "1970" ;
        schema:exampleOfWork        <http://worldcat.org/entity/work/id/347763> ;
        schema:inLanguage           "en" ;
        schema:isPartOf             <http://experiment.worldcat.org/entity/work/data/347763#Series/bollingen_series> , 
                                    <http://experiment.worldcat.org/entity/work/data/347763#Series/princeton_bollingen_paperbacks_221> ;
        schema:name                 "Zen and Japanese Buddism"@en ;
        schema:productID            "779008232" ;
        schema:publication          <http://www.worldcat.org/title/-/oclc/779008232#PublicationEvent/princeton_n_j_princeton_university_press_1970_1959> ;
        schema:publisher            <http://experiment.worldcat.org/entity/work/data/347763#Agent/princeton_university_press> ;
        schema:workExample          <http://worldcat.org/isbn/9780691098494> , 
                                    <http://worldcat.org/isbn/9780691017709> ;
        wdrs:describedby            <http://www.worldcat.org/title/-/oclc/779008232> .

    

SPARQL

Much of the power and flexibility of RDF graph databases comes from a query and update language called SPARQL, short for SPARQL Protocol and RDF Query Language, that allows you to search and manipulate the subjects, predicates, and objects in your triplestore.

Example 1: Retrieving all triples

One of the simplest SPARQL queries retrieves all triples in a triplestore.

In this SPARQL query, SELECT and WHERE are reserved keywords. The SELECT line names the three variables returned to the calling client, and the WHERE clause holds the triple pattern that results must match. Because all three positions in the pattern are variables, the query returns every triple.

SELECT ?subject ?predicate ?object
WHERE {
	?subject ?predicate ?object .
}
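
What the triplestore does with this pattern can be imitated in a few lines of Python. This is only a conceptual sketch, not how a SPARQL engine is implemented: the function name match is hypothetical, triples are plain tuples of strings, and None plays the role of a SPARQL variable.

```python
def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None matches anything."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


RELIGION = "https://en.wikipedia.org/wiki/Religion"
LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
TOPIC = "http://id.loc.gov/ontologies/bibframe/Topic"

triples = [
    (RELIGION, LABEL, "Religion"),
    (RELIGION, TOPIC, "http://id.loc.gov/authorities/subjects/sh85112549"),
]

# Like ?subject ?predicate ?object: every position is a variable
everything = match(triples)

# Fixing the predicate narrows the result to the label triple
labels = match(triples, predicate=LABEL)

print(len(everything), len(labels))
```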
    

Example 2: Retrieving all subjects of a particular type

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX schema: <http://schema.org/> 

SELECT DISTINCT ?subject 
WHERE {
    ?subject rdf:type schema:Book .
} ORDER BY ?subject
    

This query returns all of the subjects in the triplestore that have a type (class) of http://schema.org/Book. The ORDER BY clause sorts the subjects in ascending alphanumeric order.
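
The same logic can be imitated in plain Python to show what DISTINCT and ORDER BY contribute. This is a conceptual sketch only: subjects_of_type is a hypothetical helper, triples are tuples of strings, and the example.org book is invented to give the sort something to do.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SCHEMA_BOOK = "http://schema.org/Book"
SCHEMA_CREATIVE_WORK = "http://schema.org/CreativeWork"


def subjects_of_type(triples, rdf_class):
    """Return the distinct subjects typed as rdf_class, in ascending order."""
    # The set removes duplicates (DISTINCT); sorted() orders them (ORDER BY)
    return sorted({s for (s, p, o) in triples if p == RDF_TYPE and o == rdf_class})


WORLDCAT_BOOK = "http://www.worldcat.org/oclc/779008232"
INVENTED_BOOK = "http://example.org/book/1"  # hypothetical, for illustration

triples = [
    (WORLDCAT_BOOK, RDF_TYPE, SCHEMA_BOOK),
    (WORLDCAT_BOOK, RDF_TYPE, SCHEMA_CREATIVE_WORK),
    (INVENTED_BOOK, RDF_TYPE, SCHEMA_BOOK),
    (WORLDCAT_BOOK, RDF_TYPE, SCHEMA_BOOK),  # repeated statement
]

books = subjects_of_type(triples, SCHEMA_BOOK)
for subject in books:
    print(subject)
```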


BIBFRAME

Copyright © 2016 by Jeremy Nelson and KnowledgeLinks, content licensed under Creative Commons.