Adding Basic Classification classes to metadata files
I would like to add Basic Classification (BK) classes to the works, editions and texts (especially texts). The idea is to be able to retrieve the texts later based on the subject they belong to (German Studies, Romance Studies...). This can give a broader overview than languages, for example (especially when small languages are queried).
The documentation for the BK is here: https://wiki.k10plus.de/pages/viewpage.action?pageId=437452809 The BK is also published as Linked Open Data (LOD): http://uri.gbv.de/terminology/bk/
For each object in TGR I would probably add two classes. For example, for texts related to German literature, I would add the following classes:
- 18.10 Deutsche Literatur
- 17.97 Texte eines einzelnen Autors
For Spanish literature, I would add the following classes:
- 18.32 Spanische Literatur
- 17.97 Texte eines einzelnen Autors
Each of these classes has two parts:
- id (18.10, 18.32, 17.97)
- label (Deutsche Literatur, Spanische Literatur, Texte eines einzelnen Autors)
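To make the two-part structure concrete, here is a minimal Python sketch of how a class could be represented and how a string like "18.10 Deutsche Literatur" could be split into its parts. The names `BKClass` and `parse_bk` are hypothetical, and the `uri` property only assumes that a class URI is the LOD base URL plus the id; none of this reflects the actual TG metadata schema.

```python
from dataclasses import dataclass

# Base URL of the BK as LOD, taken from the link above.
BK_BASE = "http://uri.gbv.de/terminology/bk/"


@dataclass(frozen=True)
class BKClass:
    """One BK class with its two parts (hypothetical helper type)."""
    id: str      # kept as a string: as a number, "18.10" would become 18.1
    label: str

    @property
    def uri(self) -> str:
        # Assumption: the class URI is simply the base URL plus the id.
        return BK_BASE + self.id


def parse_bk(text: str) -> BKClass:
    """Split a string such as '18.10 Deutsche Literatur' into id and label."""
    bk_id, label = text.strip().split(" ", 1)
    return BKClass(id=bk_id, label=label.strip())


german = parse_bk("18.10 Deutsche Literatur")
single_author = parse_bk("17.97 Texte eines einzelnen Autors")
print(german.id, "|", german.label)
```

Keeping the id as a string matters for querying later, because trailing zeros (18.10) are significant.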
The id and the label have to be stored in separate fields in the TG metadata files. I suppose the label should also be stored in the metadata files, but another way would be to retrieve the labels from http://uri.gbv.de/terminology/bk/. At the moment these labels are only available in German. A translation into several other languages (especially English and Spanish) could be considered (the person who would ultimately decide about this also works at the SUB). We could also think of writing the labels in both English and German in the metadata files.
How do we want users to use this data? First, users should be able to query the ids of the classes, e.g.:
- give me all objects assigned to class 18.10
This is the most basic use case that I expect to work from the start.
Secondly, users should be able to use certain regular expressions with the ids of the classes. For example, asking for the objects assigned to classes 18.5.* will give you objects in any Slavic language. Another example would be 18.[23], which would give you all objects related to Romance languages. I understand that this is probably not possible at the moment, but it would be important to develop things towards this goal in the near future.
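The first two use cases can be sketched in Python as follows. This is only an illustration of the intended behaviour, not a proposal for the actual search backend: the object names, the in-memory `objects` dictionary and both function names are hypothetical, and 18.50 is used only as an example id from the 18.5x range.

```python
import re

# Hypothetical in-memory index: object name -> BK ids assigned to it.
objects = {
    "text-a": ["18.10", "17.97"],
    "text-b": ["18.32", "17.97"],
    "text-c": ["18.50", "17.97"],
}


def with_class(objects, bk_id):
    """Use case 1: all objects assigned to exactly this class id."""
    return sorted(name for name, ids in objects.items() if bk_id in ids)


def with_class_pattern(objects, pattern):
    """Use case 2: all objects with at least one id matching the pattern.

    re.match anchors the pattern at the start of the id only, so
    '18.[23]' behaves like a prefix query for 18.2x and 18.3x.
    """
    rx = re.compile(pattern)
    return sorted(
        name for name, ids in objects.items()
        if any(rx.match(bk_id) for bk_id in ids)
    )


print(with_class(objects, "18.10"))
print(with_class_pattern(objects, "18.5.*"))
print(with_class_pattern(objects, "18.[23]"))
```

Note that in these patterns the dot is a regex wildcard rather than a literal dot. With real BK ids this makes no practical difference, but a stricter form such as `18\.5\d` would be safer if the syntax is exposed to users.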
Thirdly, users should be able to ask for specific tokens in the class labels, e.g. with the word "Englisch" they will get objects from English literature as well as from North American and other English-speaking literatures. We can limit this to tokens in German only, or we can include tokens in English as well.
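A possible reading of this third use case, sketched in Python: the labels are split into words and compared case-insensitively against the query. I assume prefix matching here, because a query like "Deutsch" would otherwise not find "Deutsche Literatur"; whether the real search should use prefixes, stemming or exact tokens is an open question. The function name `classes_matching` is hypothetical, and the labels are the three from the examples above.

```python
import re

# Labels of the three classes used as examples in this issue.
labels = {
    "18.10": "Deutsche Literatur",
    "18.32": "Spanische Literatur",
    "17.97": "Texte eines einzelnen Autors",
}


def classes_matching(labels, query):
    """Return ids of classes whose label has a word starting with the query.

    Assumption: case-insensitive prefix match on word tokens.
    """
    q = query.casefold()
    hits = []
    for bk_id, label in labels.items():
        tokens = re.findall(r"\w+", label.casefold())
        if any(token.startswith(q) for token in tokens):
            hits.append(bk_id)
    return sorted(hits)


print(classes_matching(labels, "Deutsch"))
print(classes_matching(labels, "literatur"))
```

The ids found this way could then be fed into the id query to retrieve the objects themselves.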
The next step is to display the most frequent BK classes (id + labels) as facets in the left menu. Of course, this only makes sense if we have enough material with this information.
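The facet counts could be computed roughly like this. Again a Python sketch under assumptions: the `objects` and `labels` dictionaries stand in for whatever the index actually provides, and `top_facets` is a hypothetical name.

```python
from collections import Counter

# Hypothetical sample data: labels per class id, and ids per object.
labels = {
    "18.10": "Deutsche Literatur",
    "18.32": "Spanische Literatur",
    "17.97": "Texte eines einzelnen Autors",
}
objects = {
    "text-a": ["18.10", "17.97"],
    "text-b": ["18.32", "17.97"],
    "text-c": ["18.10", "17.97"],
}


def top_facets(objects, labels, n=10):
    """Return the n most frequent classes as (id, label, count) tuples."""
    counts = Counter(bk_id for ids in objects.values() for bk_id in ids)
    return [
        (bk_id, labels.get(bk_id, ""), count)
        for bk_id, count in counts.most_common(n)
    ]


for bk_id, label, count in top_facets(objects, labels, n=2):
    print(f"{bk_id} {label} ({count})")
```

Each facet entry carries both id and label, so the menu can display "18.10 Deutsche Literatur (2)" while the query behind it uses only the id.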
I want to add this information to the ELTeC corpora, so we would need a solution within the next few weeks or months.
Many of these use cases mimic what people can already do in regular library catalogues when searching for primary and secondary literature. That is, what people have learned about how to find things in a library catalogue, they could use directly in the repo. That's the charm :)