Vocabulary management v2
For many of the properties used to describe entities in the MP, we want to use dedicated controlled vocabularies. The data model supports the notion of Vocabularies consisting of Concepts and allows restricting the value ranges of individual dynamic properties to selected Vocabularies.
Until now, however, the vocabularies have been managed by PoolParty Taxonomy Server [TODO: add link!, also add link to Development guidelines!] and manually synchronized with the MP-backend. This seriously hampers the dynamic evolution of vocabularies, which is driven especially by the ingestion process. The issue is detailed in SSHOCMP Specification (v2.0 DRAFT) - Curation Component, in the section "Vocabulary Management"; its main topic is the handling of newly encountered or proposed terms/concepts.
Out of the possible implementation options, we decided to go with the "indirection" mode. There will still be an external component for managing the vocabularies, but the synchronisation between this external component and the MP-backend will be handled by the MP-backend itself, so that both ingestion scripts and human curators can restrict their basic routines to interaction with the marketplace. The MP will accept new concepts and pass them on as candidate concepts to the external vocabulary management component, where they can be handled in a separate vocabulary curation process. The crucial point is to disentangle the item ingestion and item curation processes from the vocabulary curation process, allowing them to run asynchronously, while ensuring that newly created concepts can be used right away to describe MP items (even though they will have to carry a "freshman" earmark).
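The flow described above can be illustrated with a minimal in-memory sketch. It is not the planned implementation: the names `MarketplaceBackend`, `propose_concept`, `sync`, `approve` and the `ExternalVocabularyManager` interface are hypothetical, and the sketch makes no assumption about the actual PoolParty API. It only shows that a proposed concept is usable immediately (flagged as candidate) while being pushed to the external component in a separate step.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Concept:
    code: str
    label: str
    vocabulary: str
    candidate: bool = True  # the "freshman" earmark (hypothetical field name)


class ExternalVocabularyManager(Protocol):
    """Hypothetical interface to the external component; not a real PoolParty API."""

    def submit_candidate(self, concept: Concept) -> None: ...


@dataclass
class MarketplaceBackend:
    concepts: dict[tuple[str, str], Concept] = field(default_factory=dict)
    outbox: deque[Concept] = field(default_factory=deque)

    def propose_concept(self, vocabulary: str, code: str, label: str) -> Concept:
        """Accept a new concept; it is usable right away, flagged as candidate."""
        key = (vocabulary, code)
        if key in self.concepts:
            return self.concepts[key]  # already known, nothing to queue
        concept = Concept(code=code, label=label, vocabulary=vocabulary)
        self.concepts[key] = concept
        self.outbox.append(concept)  # pushed later, independent of ingestion
        return concept

    def sync(self, manager: ExternalVocabularyManager) -> int:
        """Push queued candidates to the external component; return how many."""
        pushed = 0
        while self.outbox:
            manager.submit_candidate(self.outbox[0])
            self.outbox.popleft()  # dequeue only after a successful submit
            pushed += 1
        return pushed

    def approve(self, vocabulary: str, code: str) -> None:
        """Called once vocabulary curation accepts the concept."""
        self.concepts[(vocabulary, code)].candidate = False


class RecordingManager:
    """Stand-in for the external component, used only for illustration."""

    def __init__(self) -> None:
        self.received: list[Concept] = []

    def submit_candidate(self, concept: Concept) -> None:
        self.received.append(concept)
```

An ingestion script would only call `propose_concept`; `sync` and `approve` belong to the separate vocabulary curation process and can run at any later time.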
Implementation-wise this will require:
- probably an adjustment of the data model, enriching the representation of Concepts (closer to the SKOS data model)
- adjusting the API of the SSHOCMP-backend, esp. with respect to candidate concepts (to be detailed)
- implementing communication between the SSHOCMP-backend and the external vocabulary management component (this is currently PoolParty, which already provides the full API needed for managing a vocabulary, including adding "candidate concepts")
- possibly adapting the ingest pipelines to rely on the MP as the authoritative source of vocabularies, or allowing them to push candidate concepts. This may/should actually simplify the ingestion pipelines.
- edit forms that allow creating new ("candidate") concepts.
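As a rough illustration of the first point, a Concept representation closer to SKOS might look like the sketch below. The class and field names are assumptions chosen for illustration; the comments name the SKOS properties they are modelled on. The `candidate` flag is not part of SKOS and stands for the earmark on newly proposed concepts.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SkosLikeConcept:
    uri: str
    in_scheme: str                       # skos:inScheme (the vocabulary)
    pref_label: dict[str, str]           # skos:prefLabel, keyed by language tag
    alt_labels: dict[str, list[str]] = field(default_factory=dict)  # skos:altLabel
    definition: dict[str, str] = field(default_factory=dict)        # skos:definition
    notation: Optional[str] = None       # skos:notation
    broader: list[str] = field(default_factory=list)   # skos:broader (URIs)
    narrower: list[str] = field(default_factory=list)  # skos:narrower (URIs)
    candidate: bool = False              # not SKOS: marks not-yet-curated concepts


def to_skos_dict(concept: SkosLikeConcept) -> dict:
    """Flatten a concept into a plain dict keyed by SKOS property names.

    This is an illustrative serialisation only, not a complete RDF or
    JSON-LD document. Empty optional properties are left out.
    """
    doc: dict = {
        "uri": concept.uri,
        "skos:inScheme": concept.in_scheme,
        "skos:prefLabel": dict(concept.pref_label),
    }
    optional = {
        "skos:altLabel": concept.alt_labels,
        "skos:definition": concept.definition,
        "skos:notation": concept.notation,
        "skos:broader": concept.broader,
        "skos:narrower": concept.narrower,
    }
    for key, value in optional.items():
        if value:
            doc[key] = value
    return doc
```

Keeping the candidate status outside the SKOS properties means the curated part of a concept can be exchanged with the external component unchanged, while the flag stays an MP-internal concern.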
(Note: This is a follow-up to an older issue, "Vocabulary management" #31 (closed), which revolved around individual vocabularies and how to handle and curate them in the initial phase of the development. That issue is now obsoleted by this one.)
(notify: @mkozak, @klaus.illmayer, @ymoranv, @stefan.probst, @lbarbot, @frank.fischer01, @tparkola, @swolarz, @sotiris.karampatakis )