basic_classifications and gnd_subjects both as fixed values and XPaths (with the possibility of several values)

Currently, gnd_subjects and basic_classifications can only be entered as fixed values. Both already accept several values. This should remain possible, with XPaths added as an alternative.

Currently, this is what is possible:

  basic_classifications:
    - id: '17.97'
      url: http://uri.gbv.de/terminology/bk/
      value: Texts by a single author
    - id: '18.10'
      url: http://uri.gbv.de/terminology/bk/
      value: German Literature

For gnd_subjects this is very similar:

  gnd_subjects:
    - id: 8515558
      url: https://d-nb.info/gnd/
      value: poetry
    - id: 4002214-6
      url: https://d-nb.info/gnd/
      value: anthology

My suggestion is to make it like this:

  basic_classifications:
    - id:
         value:
         xpath:
      url:
         value:
         xpath:
      value:
         value:
         xpath:

The gnd_subjects should work analogously. As an example, suppose the gnd_subjects are modelled in the TEI files as follows:

            <keywords scheme="https://d-nb.info/gnd/">
               <term ref="4136947-6" xml:lang="de">Volkserzählung</term>
               <term ref="4050479-7" xml:lang="de">Roman</term>
            </keywords> 

The XPaths would then be used like this:

  gnd_subjects:
    - id:
         value:
         xpath: //keywords[@scheme="https://d-nb.info/gnd/"]/term/@ref
      url:
         value: https://d-nb.info/gnd/
         xpath: 
      value:
         value:
         xpath: //keywords[@scheme="https://d-nb.info/gnd/"]/term

The id XPath would extract a list with two items (["4136947-6", "4050479-7"]) and the value XPath would extract a list with two values (["Volkserzählung", "Roman"]). The ids and values should then be zipped together so that each id is paired with its correct value.
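To make the extract-and-zip step concrete, here is a minimal Python sketch. It is only an illustration, not the project's actual implementation: the function names `extract_ids_and_values` and `zip_subjects` are hypothetical, the `<textClass>` wrapper is added just so the snippet has a root element, and the TEI namespace is left out as in the example above. The standard-library `xml.etree.ElementTree` does not support attribute steps such as `/@ref`, so the sketch selects the `<term>` elements and reads `ref` from each one; a full XPath engine could evaluate the two expressions from the config directly.

```python
import xml.etree.ElementTree as ET

# TEI fragment from the example, wrapped in <textClass> so it has a single root.
TEI_SNIPPET = """
<textClass>
  <keywords scheme="https://d-nb.info/gnd/">
    <term ref="4136947-6" xml:lang="de">Volkserzählung</term>
    <term ref="4050479-7" xml:lang="de">Roman</term>
  </keywords>
</textClass>
"""


def extract_ids_and_values(xml_text, scheme):
    """Return (ids, values) for all <term> elements under the given scheme."""
    root = ET.fromstring(xml_text)
    terms = root.findall(f".//keywords[@scheme='{scheme}']/term")
    ids = [term.get("ref") for term in terms]
    values = [(term.text or "").strip() for term in terms]
    return ids, values


def zip_subjects(ids, values, url):
    """Pair each id with the value at the same position."""
    if len(ids) != len(values):
        # Different lengths mean the pairing would be ambiguous.
        raise ValueError(f"{len(ids)} ids but {len(values)} values")
    return [{"id": i, "url": url, "value": v} for i, v in zip(ids, values)]


scheme = "https://d-nb.info/gnd/"
ids, values = extract_ids_and_values(TEI_SNIPPET, scheme)
subjects = zip_subjects(ids, values, scheme)
```

Raising an error when the two lists differ in length is a design choice of this sketch; it avoids silently pairing an id with the wrong value when a `<term>` lacks a `ref` or its text.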

We don't expect people to provide both value and xpath in the same subfield. For example, we don't expect the id to get both a value and an xpath; it would always be one or the other. However, as in other cases, we can give priority to the fixed value.
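The priority rule could look roughly like the following Python sketch. The name `resolve_subfield` and the `evaluate_xpath` callback are hypothetical; the sketch assumes each subfield arrives as a dict with optional `value` and `xpath` keys, where an empty YAML entry is loaded as `None`.

```python
def resolve_subfield(spec, evaluate_xpath):
    """Return the list of strings for one subfield (id, url or value).

    A fixed value wins over an xpath if both are given.
    """
    spec = spec or {}
    fixed = spec.get("value")
    if fixed not in (None, ""):
        # Fixed values may be parsed as numbers by YAML, so normalise to str.
        return [str(fixed)]
    xpath = spec.get("xpath")
    if xpath:
        return list(evaluate_xpath(xpath))
    return []


# Stand-in for a real XPath evaluation against a TEI file.
def fake_evaluate(xpath):
    return ["4136947-6", "4050479-7"]


both = resolve_subfield({"value": "4002214-6", "xpath": "//term/@ref"}, fake_evaluate)
only_xpath = resolve_subfield({"value": None, "xpath": "//term/@ref"}, fake_evaluate)
```

Here `both` contains only the fixed value, while `only_xpath` contains whatever the XPath evaluation returns.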

What can happen is that people give the url subfield as a fixed value but use xpaths for the value and the id. In any case, we should consider whether we really need the url subfield in the collection.yaml or whether we would prefer to hide it. In general, we don't expect people to enter other values for the urls, neither for basic_classifications (http://uri.gbv.de/terminology/bk/) nor for gnd_subjects (https://d-nb.info/gnd/).
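One way to handle both cases, a fixed url combined with extracted ids and values, and a hidden url subfield, is sketched below in Python. This is an assumption about how it might be done, not a decision: `DEFAULT_URLS` and `build_entries` are hypothetical names, and the defaults are simply the two URLs mentioned above.

```python
# Defaults that would apply if the url subfield were hidden from collection.yaml.
DEFAULT_URLS = {
    "basic_classifications": "http://uri.gbv.de/terminology/bk/",
    "gnd_subjects": "https://d-nb.info/gnd/",
}


def build_entries(field, ids, values, urls=None):
    """Combine ids, values and urls into the list of entries for one field."""
    if len(ids) != len(values):
        raise ValueError(f"{len(ids)} ids but {len(values)} values")
    if not urls:
        # No url configured: fall back to the default for this field.
        urls = [DEFAULT_URLS[field]]
    if len(urls) == 1:
        # A single fixed url is repeated for every extracted id.
        urls = urls * len(ids)
    if len(urls) != len(ids):
        raise ValueError(f"{len(urls)} urls but {len(ids)} ids")
    return [
        {"id": i, "url": u, "value": v}
        for i, u, v in zip(ids, urls, values)
    ]


entries = build_entries(
    "gnd_subjects",
    ids=["4136947-6", "4050479-7"],
    values=["Volkserzählung", "Roman"],
)
```

With this approach the existing fixed-value configuration keeps working, and a configuration that omits url still produces complete entries.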