Getting rid of mapping types would be a useful thing to do for many reasons:
- no one really seems to care about them
- they complicate the hashing behaviour of `SynonymTerm`, so `SynonymDB` can behave inconsistently and return inconsistent results

Originally, we had them because a term can have multiple disjoint mapping types in a KB.
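One way the hashing concern could show up, as a minimal sketch: the two dataclasses below are hypothetical stand-ins, not the real `SynonymTerm`, and the field names and values are assumptions made for illustration. If mapping types take part in equality and hashing, two terms that differ only in mapping type are treated as distinct entries in a set; if they are excluded, the terms collapse into one.

```python
from dataclasses import dataclass, field
from typing import FrozenSet


@dataclass(frozen=True)
class TermWithMappingTypes:
    """Hypothetical stand-in: mapping_types participates in __eq__ and __hash__."""

    term_norm: str
    ids: FrozenSet[str]
    mapping_types: FrozenSet[str]


@dataclass(frozen=True)
class TermWithoutMappingTypes:
    """Hypothetical stand-in: mapping_types is excluded from __eq__ and __hash__."""

    term_norm: str
    ids: FrozenSet[str]
    mapping_types: FrozenSet[str] = field(default=frozenset(), compare=False)


# Two terms that differ only in their mapping type (made-up values).
a = TermWithMappingTypes("egfr", frozenset({"ID:1"}), frozenset({"exact"}))
b = TermWithMappingTypes("egfr", frozenset({"ID:1"}), frozenset({"related"}))
distinct_with = len({a, b})  # hashed separately, so both survive in the set

c = TermWithoutMappingTypes("egfr", frozenset({"ID:1"}), frozenset({"exact"}))
d = TermWithoutMappingTypes("egfr", frozenset({"ID:1"}), frozenset({"related"}))
distinct_without = len({c, d})  # equal and same hash, so they collapse
```

Whether the real classes behave like either variant depends on how their hashing is defined in the codebase; the sketch only shows why an extra field in the hash can change set and dict membership.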
original comment
The `get_all` call 'smells' wrong to me here for a couple of reasons:
- We're having to iterate over all the synonym terms for every annotation here.
- Within the `get_all` call, we get the synonym term associated with the `term_norm`, but don't capture it here, and then have to recalculate it within `drop_equivalent_id_set_from_synonym_term` when we do `synonym_term = self._syns_database_by_syn[name][synonym]`.

If we do that, then this function doesn't need to loop over all the synonym terms for the parser, and can just be:

    def drop_equivalent_id_set_containing_id_from_all_synonym_terms(
        self, name: ParserName, id_to_drop: Idx
    ) -> Tuple[int, int]:
        """
        remove all EquivalentIdSet's that contain this id from all SynonymTerms in the database

        :param name:
        :param id_to_drop:
        :return:
        """
        terms_modified = 0
        terms_dropped = 0
        syn_term: SynonymTerm
        for syn_term in self.get_associated_syn_terms_for_id(name, id_to_drop):
            new_id_sets = frozenset(  # or set( if we leave it as-is
                equiv_id_set
                for equiv_id_set in syn_term.associated_id_sets
                if id_to_drop not in equiv_id_set.ids
            )
            result = self._modify_or_drop_synonym_term_after_id_set_change(
                new_id_sets, name, syn_term
            )
            if result == DBModificationResult.ID_SET_MODIFIED:
                terms_modified += 1
            elif result == DBModificationResult.SYNONYM_TERM_DROPPED:
                terms_dropped += 1
        return terms_modified, terms_dropped

Although doing that makes me wonder if we actually need both this new `self._synonym_terms_by_id` and the existing `self._syns_by_aggregation_strategy`, since that also goes from an id to some data associated with a set of `SynonymTerm`s (their `term_norm`s). I wonder if we could use a single data structure here, but equally that might trade off a small-ish amount of space for a bit of performance, so I'm not sure how bothered we would be.
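To make the single-structure idea concrete, here is a rough sketch under stated assumptions: `Term` and `TinySynonymDB` are hypothetical, simplified stand-ins (no parser name, no aggregation strategy), and `get_term_norms_for_id` is an illustrative helper rather than an existing method. It keeps one id-to-terms index and derives the term norms from it on demand, instead of maintaining a second id-to-term-norm mapping.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, Dict, FrozenSet, Set


@dataclass(frozen=True)
class Term:
    """Hypothetical, simplified stand-in for SynonymTerm."""

    term_norm: str
    ids: FrozenSet[str]


class TinySynonymDB:
    """Toy database with a single id -> terms index (an assumption, not the real SynonymDB)."""

    def __init__(self) -> None:
        self._terms_by_norm: Dict[str, Term] = {}
        self._synonym_terms_by_id: DefaultDict[str, Set[Term]] = defaultdict(set)

    def add(self, term: Term) -> None:
        # Register the term under its norm and under every id it refers to.
        self._terms_by_norm[term.term_norm] = term
        for idx in term.ids:
            self._synonym_terms_by_id[idx].add(term)

    def get_associated_syn_terms_for_id(self, idx: str) -> Set[Term]:
        # Direct lookup: no scan over every term in the database.
        return set(self._synonym_terms_by_id.get(idx, set()))

    def get_term_norms_for_id(self, idx: str) -> Set[str]:
        # Derived from the same index, so no second id -> term_norm mapping is stored.
        return {term.term_norm for term in self._synonym_terms_by_id.get(idx, set())}


db = TinySynonymDB()
db.add(Term("egfr", frozenset({"ID:1", "ID:2"})))
db.add(Term("erbb1", frozenset({"ID:1"})))
db.add(Term("tp53", frozenset({"ID:3"})))
norms_for_id_1 = db.get_term_norms_for_id("ID:1")
```

The trade-off mentioned above still applies: deriving the term norms on each call costs a little time, while storing them separately costs a little space.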
Is there a problem I'm missing with replacing:
with
?

Original comment from @EFord36