Building Bibliographic RDF Applications and Microservices

Ingester Microservice

For this last activity, we will extend the MARCIngester class to create our own ingester class.

  1. Open your text editor, add a module description and your name, and import the MARCIngester class.
    """Custom MARC21 Ingester Class"""
    __author__ = "Jeremy Nelson"
    
    import rdflib
    from bibcat.ingesters.marc import MARCIngester
        
  2. Create a new SpecialMARCIngester class that extends MARCIngester, with an overridden __init__ method where we set an instance variable.
    
    class SpecialMARCIngester(MARCIngester):
    
        def __init__(self, **kwargs):
            self.legacy_ils_pattern = "http://tiger.coloradocollege.edu/record={}"
            kwargs['custom'] = 'cc-marc.ttl'
            super(SpecialMARCIngester, self).__init__(**kwargs)
    
        
  3. Taking the generate_item_iri function from the Ingesting MARC activity, we will now add it as a method of our SpecialMARCIngester class.
    
        def generate_item_iri(self, record):
            if '907' not in record:
                return
            bib_number = record['907']['a'][1:-1]
            return rdflib.URIRef(self.legacy_ils_pattern.format(bib_number))
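
The slice in generate_item_iri can be tried on its own. The following is a minimal sketch, assuming a plain dict as a stand-in for the MARC record (the real method receives a record object from the MARC reader); the helper name bib_number_from_907 is hypothetical and used only for illustration. It shows that [1:-1] drops the leading period and the final character of the 907 $a value.

```python
# Minimal sketch: a plain dict stands in for a MARC record (an assumption
# made for illustration; the ingester itself receives a real record object).
LEGACY_ILS_PATTERN = "http://tiger.coloradocollege.edu/record={}"


def bib_number_from_907(record):
    """Hypothetical helper: return the 907 $a value without its first and last character."""
    if '907' not in record:
        return None
    # ".b12958736" -> "b1295873": drop the leading period and the final character
    return record['907']['a'][1:-1]


sample_record = {'907': {'a': '.b12958736'}}
bib_number = bib_number_from_907(sample_record)
item_iri = LEGACY_ILS_PATTERN.format(bib_number)
print(item_iri)  # http://tiger.coloradocollege.edu/record=b1295873
```

A record without a 907 field yields None, which mirrors the early return in the method above.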
    
        
  4. Finally, we will override the transform method to generate the item IRI before calling MARCIngester.transform.
    
        def transform(self, record, instance_uri=None, item_uri=None):
            item_iri = self.generate_item_iri(record)
            super(SpecialMARCIngester, self).transform(record=record, 
                                                       instance_uri=instance_uri, 
                                                       item_uri=item_iri) 
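
The override pattern used in steps 2 through 4 can be exercised without bibcat installed. The sketch below assumes a hypothetical BaseIngester class standing in for MARCIngester; it only records the arguments it receives, and it uses a plain dict in place of a MARC record. It illustrates how the subclass injects the 'custom' keyword argument and replaces item_uri before delegating to the parent class.

```python
class BaseIngester:
    """Hypothetical stand-in for MARCIngester; it only records what it is given."""

    def __init__(self, **kwargs):
        self.options = kwargs
        self.calls = []

    def transform(self, record, instance_uri=None, item_uri=None):
        self.calls.append({"record": record,
                           "instance_uri": instance_uri,
                           "item_uri": item_uri})


class SpecialIngester(BaseIngester):
    """Mirrors the structure of SpecialMARCIngester from the steps above."""

    def __init__(self, **kwargs):
        self.legacy_ils_pattern = "http://tiger.coloradocollege.edu/record={}"
        kwargs['custom'] = 'cc-marc.ttl'  # injected before the parent sees kwargs
        super().__init__(**kwargs)

    def generate_item_iri(self, record):
        if '907' not in record:
            return None
        return self.legacy_ils_pattern.format(record['907']['a'][1:-1])

    def transform(self, record, instance_uri=None, item_uri=None):
        # The generated IRI replaces whatever item_uri the caller passed in
        item_iri = self.generate_item_iri(record)
        super().transform(record=record,
                          instance_uri=instance_uri,
                          item_uri=item_iri)


ingester = SpecialIngester()
ingester.transform({'907': {'a': '.b12958736'}})
print(ingester.options)                # {'custom': 'cc-marc.ttl'}
print(ingester.calls[0]['item_uri'])   # http://tiger.coloradocollege.edu/record=b1295873
```

Note that with this design any item_uri supplied by the caller is ignored, and a record without a 907 field passes item_uri=None to the parent class.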
        
  5. Save this Python module in your rdf-app directory as special.py.
  6. Going back to your Python session, import your new module.
    >>> from special import SpecialMARCIngester
    Extract the next record from the MARC reader.
    >>> fourth_record = next(reader)
    Display the MARC record.
    >>> print(fourth_record)
    =LDR  01048nam a22003251a 4500
    =001  395428
    =003  OCoLC
    =005  19970819134740.0
    =008  840816s1964\\\\cauac\\\\b\\\\000\0\eng\c
    =010  \\$a64021712 //r84
    =035  \\$a.b12958736$btbp$cr
    =040  \\$aDLC/ICU$cCGU
    =041  1\$aengger
    =049  \\$aCOCA
    =050  0\$aQA21$b.M413
    =090  \\$aQA21.M413
    =100  1\$aMeschkowski, Herbert.
    =240  10$aDenkweisen grosser Mathematiker.$lEnglish.
    =245  10$aWays of thought of great mathematicians :$ban approach to the history of mathematics /$cTranslated by John Dyer-Bennet.
    =260  \\$aSan Francisco :$bHolden-Day,$c1964.
    =300  \\$aviii, 110 p. :$bill., ports. ;$c23 cm.
    =490  1\$aThe Mathesis series.
    =504  \\$aBibliography: p. 105-108.
    =650  \0$aMathematics$xHistory.
    =830  \0$aMathesis series.
    =902  \\$a150104
    =907  \\$a.b12958736
    =945  \\$aQA21.M413$g1$i33027003548553$j0$ltbp  $h0$or$p$0.00$q $r-$s-$t1$u1$v0$w0$x0$y.i13531906$z970819
    =994  \\$atbp
    =999  \\$b2$c970819$dm$ea$fr$g0
        
        
  7. Create a new instance of the special ingester.
    >>> ingester = SpecialMARCIngester()
  8. Transform the MARC record
    >>> ingester.transform(fourth_record)
  9. Display the Turtle serialization of ingester.graph. The output should look similar to the following:
    
    @prefix bc: <http://knowledgelinks.io/ns/bibcat/> .
    @prefix bf: <http://id.loc.gov/ontologies/bibframe/> .
    @prefix kds: <http://knowledgelinks.io/ns/data-structures/> .
    @prefix loc: <http://id.loc.gov/authorities/> .
    @prefix m21: <http://knowledgelinks.io/ns/marc21/> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix relators: <http://id.loc.gov/vocabulary/relators/> .
    @prefix schema: <http://schema.org/> .
    @prefix skos: <http://www.w3.org/2004/02/skos/core#> .
    @prefix void: <http://rdfs.org/ns/void#> .
    @prefix xml: <http://www.w3.org/XML/1998/namespace> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
    
    <http://tiger.coloradocollege.edu/record=b1295873> a bf:Item ;
        bf:generationProcess [ a bf:GenerationProcess ;
                bf:generationDate "2017-03-06T02:59:42.647374" ;
                rdf:value "Generated by BIBCAT version 1.7.5 from KnowledgeLinks.io"@en ] ;
        bf:itemOf <http://dpla.coloradovirtuallibrary.org/f1d684e4-0218-11e7-aaab-a8667f19014b> .
    
    <http://dpla.coloradovirtuallibrary.org/f1d684e4-0218-11e7-aaab-a8667f19014b> a bf:Instance ;
        bf:copyrightDate "1964." ;
        bf:dimensions "23 cm." ;
        bf:extent [ a bf:Extent ;
                rdf:value "viii, 110 p. :" ] ;
        bf:generationProcess [ a bf:GenerationProcess ;
                bf:generationDate "2017-03-06T02:59:42.594217" ;
                rdf:value "Generated by BIBCAT version 1.7.5 from KnowledgeLinks.io"@en ] ;
        bf:instanceOf [ a bf:Work ;
                bf:originDate "1964" ] ;
        bf:provisionActivity [ a bf:Publication ;
                relators:pbl "Holden-Day," ] ;
        bf:seriesStatement "The Mathesis series." ;
        bf:subject [ a bf:Topic ;
                rdf:value "Mathematics" ] ;
        bf:supplementaryContent [ a bf:SupplementaryContent ;
                rdf:value "Bibliography: p. 105-108." ] ;
        bf:title [ a bf:InstanceTitle ;
                bf:mainTitle "Ways of thought of great mathematicians :" ;
                bf:subtitle "an approach to the history of mathematics /" ] ;
        relators:aut [ a bf:Person ;
                schema:name "Meschkowski, Herbert." ] .