Commit
Merge pull request #6549 from RasaHQ/slow-crf-entity-extractor-training
Fix slow training of CRFEntityExtractor when using Entity Roles and Groups
tabergma committed Sep 3, 2020
2 parents f07b286 + bb97548 commit c7f7d37
Showing 2 changed files with 9 additions and 4 deletions.
1 change: 1 addition & 0 deletions changelog/6549.bugfix.rst
@@ -0,0 +1 @@
+Fix slow training of ``CRFEntityExtractor`` when using Entity Roles and Groups.
12 changes: 8 additions & 4 deletions rasa/nlu/extractors/crf_entity_extractor.py
@@ -406,14 +406,18 @@ def _create_features_for_token(
 # get the features to extract for the token we are currently looking at
 current_feature_idx = pointer_position + half_window_size
 features = configured_features[current_feature_idx]

+prefix = prefixes[current_feature_idx]
+
 # we add the 'entity' feature to include the entity type as features
 # for the role and group CRFs
+# (do not modify features, otherwise we will end up adding 'entity'
+# over and over again, making training very slow)
+additional_features = []
 if include_tag_features:
-    features.append("entity")
-
-prefix = prefixes[current_feature_idx]
+    additional_features.append("entity")
+
-for feature in features:
+for feature in features + additional_features:
     if feature == "pattern":
         # add all regexes extracted from the 'RegexFeaturizer' as a
         # feature: 'pattern_name' is the name of the pattern the user
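
The change above replaces an in-place append on a shared list with a fresh list built per call. The following is a minimal standalone sketch of why that matters, not the Rasa code itself: the function names `collect_mutating` and `collect_non_mutating` and the sample feature names are hypothetical, chosen only for illustration.

```python
from typing import List


def collect_mutating(configured: List[str], include_tag_features: bool) -> List[str]:
    # Old pattern (hypothetical stand-in): append() modifies the caller's
    # list, so every call adds one more "entity" entry to it.
    if include_tag_features:
        configured.append("entity")
    return configured


def collect_non_mutating(configured: List[str], include_tag_features: bool) -> List[str]:
    # New pattern (hypothetical stand-in): keep the extra feature in a
    # separate list and concatenate, leaving `configured` untouched.
    additional_features = []
    if include_tag_features:
        additional_features.append("entity")
    return configured + additional_features


# Illustrative feature names; the real configuration may differ.
shared_buggy = ["low", "title"]
shared_fixed = ["low", "title"]

# Simulate the function being called once per token.
for _ in range(3):
    buggy_result = collect_mutating(shared_buggy, True)
    fixed_result = collect_non_mutating(shared_fixed, True)

print(shared_buggy)  # the shared list has grown by one "entity" per call
print(shared_fixed)  # the shared list is unchanged
```

With the mutating version, the list that is iterated grows with every token processed, which is consistent with the slowdown described in the changelog entry. The concatenation `configured + additional_features` allocates a new list each call, which is a small cost compared with iterating an ever-growing feature list.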