[#42] Let 010 and 016 take precedence over 001
This fixes the NALT URIs.
danmichaelo committed Nov 8, 2017
1 parent 1fdf627 commit 546a68c
Showing 5 changed files with 31 additions and 10 deletions.
8 changes: 6 additions & 2 deletions README.rst
@@ -87,7 +87,9 @@ Pull requests for adding more vocabularies are very welcome!
 URIs can be also be generated on the fly from an URI template specified with option
 ``--uri``. The following template parameters are recognized:
 
-* ``{control_number}`` is the 001 value
+* ``{control_number}`` is the control number from 001, 010 or 016. The current approach
+  is to use 010 or 016 if defined, otherwise 001. If you find examples where this approach
+  fails, please add them to [#42](https://github.com/scriptotek/mc2skos/issues/42).
 * ``{collection}`` is "class", "table" or "scheme"
 * ``{object}`` is a member of the classification scheme (with spaces replaced by
   hyphens) and part of a ``{collection}``, such as a specific class or table.
@@ -123,10 +125,12 @@ the 7XX fields to skos:altLabel.
 ========================================================== =====================================
 MARC21XML                                                  RDF
 ========================================================== =====================================
-``001`` Control Number                                     ``dcterms:identifier``
+``001`` Control Number (see note above on 001, 010 & 016)  ``dcterms:identifier``
 ``005`` Date and time of latest transaction                ``dcterms:modified``
 ``008[0:6]`` Date entered on file                          ``dcterms:created``
 ``008[8]="d" or "e"`` Classification validity              ``owl:deprecated``
+``010`` Control Number (see note above on 001, 010 & 016)  ``dcterms:identifier``
+``016`` Control Number (see note above on 001, 010 & 016)  ``dcterms:identifier``
 ``153 $a``, ``$c``, ``$z`` Classification number           ``skos:notation``
 ``153 $j`` Caption                                         ``skos:prefLabel``
 ``153 $e``, ``$f``, ``$z`` Classification number hierarchy ``skos:broader``
4 changes: 2 additions & 2 deletions examples/nalt-142.ttl → examples/nalt-1396.ttl
@@ -6,9 +6,9 @@
 @prefix xml: <http://www.w3.org/XML/1998/namespace> .
 @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
 
-<http://lod.nal.usda.gov/nalt/142> a skos:Concept ;
+<http://lod.nal.usda.gov/nalt/1396> a skos:Concept ;
     dcterms:created "2016-12-08"^^xsd:date ;
-    dcterms:identifier "142" ;
+    dcterms:identifier "nalt00001396" ;
     dcterms:modified "2016-12-08"^^xsd:date ;
     skos:altLabel "2-oxoisocaproate dehydrogenase"@en,
         "2-oxoisovalerate (lipoate) dehydrogenase"@en,
File renamed without changes.
27 changes: 22 additions & 5 deletions mc2skos/record.py
@@ -137,11 +137,16 @@ def process_formatter(matches):
             start = int(matches.group('start')) if matches.group('start') else None
             end = int(matches.group('end')) if matches.group('end') else None
             value = kwargs[matches.group('param')][start:end]
-            formatter_str = '{0' + matches.group('formatter') + '}' if matches.group('formatter') else '{0}'
-            if 'd' in formatter_str:
-                value = int(value)
-            elif 'f' in formatter_str:
-                value = float(value)
+            if len(value) == 0:
+                # Empty string can be used for the scheme URI.
+                # Trying to convert this to decimal or float will fail!
+                formatter_str = '{0}'
+            else:
+                formatter_str = '{0' + matches.group('formatter') + '}' if matches.group('formatter') else '{0}'
+                if 'd' in formatter_str:
+                    value = int(value)
+                elif 'f' in formatter_str:
+                    value = float(value)

             return formatter_str.format(value)
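
To see the empty-value guard in context, here is a minimal, self-contained sketch of template expansion. The regular expression `TEMPLATE_PARAM` and the wrapper `expand_template` are assumptions made for illustration, since the pattern actually used in mc2skos/record.py is not part of this diff. Only the body of `process_formatter` follows the committed code.

```python
import re

# Hypothetical pattern (not taken from mc2skos): matches {param},
# {param[start:end]} and {param[start:end]:format}.
TEMPLATE_PARAM = re.compile(
    r'\{(?P<param>[a-z_]+)'
    r'(?:\[(?P<start>\d+)?:(?P<end>\d+)?\])?'
    r'(?P<formatter>:[^}]+)?\}'
)


def expand_template(template, **kwargs):
    """Expand every template parameter using the values in kwargs."""

    def process_formatter(matches):
        start = int(matches.group('start')) if matches.group('start') else None
        end = int(matches.group('end')) if matches.group('end') else None
        value = kwargs[matches.group('param')][start:end]
        if len(value) == 0:
            # An empty value cannot be converted to int or float,
            # so fall back to plain string formatting.
            formatter_str = '{0}'
        else:
            formatter_str = '{0' + matches.group('formatter') + '}' if matches.group('formatter') else '{0}'
            if 'd' in formatter_str:
                value = int(value)
            elif 'f' in formatter_str:
                value = float(value)
        return formatter_str.format(value)

    return TEMPLATE_PARAM.sub(process_formatter, template)


# The NALT template from vocabularies.yml drops the "nalt" prefix
# and the leading zeros of the control number.
print(expand_template('http://lod.nal.usda.gov/nalt/{control_number[4:]:d}',
                      control_number='nalt00001396'))
# → http://lod.nal.usda.gov/nalt/1396
```

With an empty control number the same call returns the bare prefix `http://lod.nal.usda.gov/nalt/` instead of raising `ValueError`, which is the case the new branch handles.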

@@ -231,6 +236,18 @@ def parse(self, options):
         # 001
         self.control_number = self.record.text('mx:controlfield[@tag="001"]')

+        # 010 : If present, it takes precedence over 001.
+        # <https://github.com/scriptotek/mc2skos/issues/42>
+        value = self.record.text('mx:datafield[@tag="010"]/mx:subfield[@code="a"]')
+        if value is not None:
+            self.control_number = value
+
+        # 016 : If present, it takes precedence over 001
+        # <https://github.com/scriptotek/mc2skos/issues/42>
+        value = self.record.text('mx:datafield[@tag="016"]/mx:subfield[@code="a"]')
+        if value is not None:
+            self.control_number = value
+
         # 003
         self.control_number_identifier = self.record.text('mx:controlfield[@tag="003"]')
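
The precedence rule can be tried outside mc2skos with the standard library. The sketch below is an illustration under assumptions: `pick_control_number` is a hypothetical helper, it uses `xml.etree.ElementTree` rather than the record wrapper behind `self.record.text`, and the sample record is made up. Because 016 is checked last, it also wins over 010 when both are present.

```python
import xml.etree.ElementTree as ET

# Namespace of MARC21 XML ("MARCXML") records.
NS = {'mx': 'http://www.loc.gov/MARC21/slim'}


def pick_control_number(record):
    """Return 016 $a if present, else 010 $a, else the 001 value."""
    control_number = record.findtext('mx:controlfield[@tag="001"]', namespaces=NS)
    # Later fields overwrite earlier ones, mirroring the order in parse().
    for tag in ('010', '016'):
        value = record.findtext(
            'mx:datafield[@tag="%s"]/mx:subfield[@code="a"]' % tag,
            namespaces=NS)
        if value is not None:
            control_number = value
    return control_number


# Made-up record for illustration: 001 holds "142", 016 holds the NALT identifier.
sample = ET.fromstring(
    '<record xmlns="http://www.loc.gov/MARC21/slim">'
    '<controlfield tag="001">142</controlfield>'
    '<datafield tag="016" ind1=" " ind2=" ">'
    '<subfield code="a">nalt00001396</subfield>'
    '</datafield>'
    '</record>'
)
print(pick_control_number(sample))  # → nalt00001396
```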

2 changes: 1 addition & 1 deletion mc2skos/vocabularies.yml
@@ -10,7 +10,7 @@ subject_schemes:
   a:
     concept: http://id.loc.gov/authorities/subjects/{control_number}
     scheme: http://id.loc.gov/authorities/subjects
-  d: http://lod.nal.usda.gov/nalt/{control_number}
+  d: http://lod.nal.usda.gov/nalt/{control_number[4:]:d}
   usvd:
     concept: http://data.ub.uio.no/usvd/c{control_number[4:]}
     scheme: http://data.ub.uio.no/usvd/