Curating MP actors
A few experiments have already been conducted to curate MP actors. In particular, methods are in place to identify duplicates and merge them (see for example #13) and to identify multi-value actors that need disambiguation (see #4).
These methods are currently summarised in an actor notebook. As @martin.kirnbauer and I are trying to re-open the actor curation activities, we have noticed several points in the notebook that could be checked or improved. @cesare.concordia could you have a look at the following, please, and let us know what you think?
- Would it be possible to automate the merging steps for the more than 2800 duplicates found? Currently the notebook relies on the merge POST, and curators need to add the actor IDs manually to perform each merge. Ideally the merge would be performed automatically from the list of exact matches, without the comparison step currently implemented. This would be needed at least for this initial, large merge operation (and would spare Martin and me from performing 1,400 merge POSTs manually! :))
- We discovered that it is not possible to "delete" actors (cf. #be143): actors not attached to any item cannot be deleted and need to be merged with existing ones instead. Section 3 of the current notebook and its associated methods are therefore no longer very useful.
- For the multi-value actors to disambiguate, would it be possible to get an output with one line per actor ID, instead of a list of items?
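
To make the first and third points more concrete, here is a minimal sketch in plain Python of what we have in mind. It is not the notebook's current code. The field names (`id`, `name`, `item_id`, `actor_id`, `actor_name`) are assumptions for illustration, and `merge_fn` is a hypothetical callback standing in for whatever call the notebook uses to send the merge POST. Keeping the lowest ID as the merge target is also just an assumption; any other rule could be plugged in.

```python
from collections import defaultdict


def plan_merges(actors):
    """Group actors with exactly the same name and plan one merge per group.

    `actors` is an iterable of dicts with hypothetical keys "id" and "name".
    Returns a sorted list of (target_id, [ids_to_merge_into_target]).
    Assumption: the lowest ID in each group is kept as the target.
    """
    groups = defaultdict(list)
    for actor in actors:
        groups[actor["name"]].append(actor["id"])

    plan = []
    for ids in groups.values():
        unique_ids = sorted(set(ids))
        if len(unique_ids) > 1:  # only names shared by several actors
            plan.append((unique_ids[0], unique_ids[1:]))
    return sorted(plan)


def run_merges(plan, merge_fn, dry_run=True):
    """Apply the plan by calling merge_fn(target_id, other_ids) per group.

    `merge_fn` is a hypothetical placeholder for the call that sends the
    merge POST. With dry_run=True nothing is sent; the function only
    returns the list of messages describing what would be merged.
    """
    log = []
    for target_id, other_ids in plan:
        if not dry_run:
            merge_fn(target_id, other_ids)
        verb = "would merge" if dry_run else "merged"
        log.append(f"{verb} {other_ids} into {target_id}")
    return log


def one_row_per_actor(item_rows):
    """Turn an item-based list into one row per actor ID.

    `item_rows` is an iterable of dicts with hypothetical keys "item_id",
    "actor_id" and "actor_name" (one dict per item/actor pair).
    Returns one dict per actor ID, sorted by ID, listing its items.
    """
    by_actor = {}
    for row in item_rows:
        entry = by_actor.setdefault(
            row["actor_id"],
            {"actor_id": row["actor_id"],
             "actor_name": row["actor_name"],
             "items": []},
        )
        entry["items"].append(row["item_id"])
    return [by_actor[actor_id] for actor_id in sorted(by_actor)]
```

A dry run first (`run_merges(plan, merge_fn, dry_run=True)`) would let us review the planned merges before anything is sent, which seems safer for a one-off operation of this size.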
Once these points are implemented and manual correction is possible, a more regular workflow will be set up, following what was foreseen in #10.
Also notifying @klaus.illmayer @matej.durco