Ingesting linked items
We have started extracting mentions of tools from papers and other texts (see Ingestion of Programming Historian #5 (closed) and DH conferences #8 (closed)). Currently, we use a separate Java application, the Tool Extractor, which matches tool names against the texts and generates a list of mentioned tools per document. These lists can be used as additional data to ingest.
The texts themselves should then become items (InformationObjects) in the Marketplace.
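To make the discussion concrete, here is a minimal sketch of the kind of name matching described above. It is not the Tool Extractor's actual code (which is a Java application); the function name `extract_tool_mentions`, the document ids, and the sample texts and tool list are all assumptions chosen for illustration.

```python
import re
from typing import Dict, Iterable, List


def extract_tool_mentions(
    documents: Dict[str, str], tool_names: Iterable[str]
) -> Dict[str, List[str]]:
    """Return, per document id, the sorted list of tool names found in its text."""
    # One case-insensitive pattern per tool name; the lookarounds avoid
    # matching a name inside a longer word.
    patterns = {
        name: re.compile(r"(?<!\w)" + re.escape(name) + r"(?!\w)", re.IGNORECASE)
        for name in tool_names
    }
    return {
        doc_id: sorted(name for name, pat in patterns.items() if pat.search(text))
        for doc_id, text in documents.items()
    }


# Illustrative input only: ids and texts are made up.
documents = {
    "lesson-1": "We clean the data with OpenRefine and map it in QGIS.",
    "paper-2": "Topic models were trained with MALLET.",
}
mentions = extract_tool_mentions(documents, ["OpenRefine", "QGIS", "MALLET", "Gephi"])
print(mentions)
```

The resulting per-document lists are what would be handed to the ingest as additional data.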
There is also an alternative setup in which the extraction is part of the ingest process. This would require passing the full text of the material into the ingestion process and, optionally, storing it alongside the metadata (which would allow the extraction to be rerun when new tools are imported).
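A rough sketch of why storing the full text matters for this alternative: with the text kept next to the metadata, a later tool import can trigger a rerun without refetching the sources. The `StoredItem` class, its fields, and `rerun_for_new_tools` are hypothetical and do not reflect the Marketplace data model.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class StoredItem:
    """Hypothetical ingested item that keeps its full text next to the metadata."""
    item_id: str
    full_text: str
    mentioned_tools: Set[str] = field(default_factory=set)


def rerun_for_new_tools(items: List[StoredItem], new_tools: List[str]) -> None:
    """Add newly imported tool names to every item whose stored text mentions them."""
    for item in items:
        text = item.full_text.lower()
        for tool in new_tools:
            # Plain case-insensitive substring match, kept simple on purpose.
            if tool.lower() in text:
                item.mentioned_tools.add(tool)


# Illustrative data only.
items = [
    StoredItem("paper-1", "Networks were built in Python and visualised with Gephi.", {"Python"}),
    StoredItem("paper-2", "No software is discussed here."),
]
rerun_for_new_tools(items, ["Gephi"])  # "Gephi" was imported after the first ingest
```

Without the stored text, the same rerun would require fetching and processing every source again.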
Decisions needed:
- Integrate extraction into the ingestion process, or leave it as a separate step?
- How should papers be represented?
- Store the full text?
(notify: @tparkola, @mkozak, @frank.fischer01, @ymoranv, @klaus.illmayer, @lbarbot, @sotiris.karampatakis)