-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create UserModel class #184
Conversation
The model right now estimates the preferances for slots from liked/disliked movies. We should also keep track of explicitly stated preferances, e.g., "I like drama". |
moviebot/user_modeling/user_model.py
Outdated
movie choices. Defaults to None. | ||
""" | ||
self.user_id = user_id | ||
self._movies_choices = defaultdict(str) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self._movies_choices = defaultdict(str) | |
self._movies_choices = defaultdict(list) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See inline comments. I have several questions regarding choices. The main one being: Why keep historical choices instead of the latest one only?
moviebot/user_modeling/user_model.py
Outdated
preference = 0.0 | ||
count_rated = 0 | ||
for movie_id, choices in self._movies_choices.items(): | ||
if movie_id in tag_set: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe fetchall
returns a list of tuples. You might want to extract first element of the tuple to get the list of ids.
Additionally, you might want to convert the list to set()
to get a much faster lookup.
tag_set = {row[0] for row in tag_set}
moviebot/user_modeling/user_model.py
Outdated
movie_id: Id of the movie. | ||
choice: User choice (i.e., accept, reject). | ||
""" | ||
self._movies_choices[movie_id] = self._movies_choices[movie_id] + [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the latest choice be appended to or overwrite the previous preference?
moviebot/user_modeling/user_model.py
Outdated
if movie_id in tag_set: | ||
# TODO: decide how to handle contradictory choices (e.g., the | ||
# same movie was accepted and rejected) | ||
for choice in choices: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have multiple choices for each movie? If a user changes their mind, shouldn't we simply use the latest preference?
Choices include various values for different intents, for example, the choice can Based on your comments, I would make the following suggestion:
Would this make the code and logic more clear? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to get clarity on the purpose of this class. To me, it serves two main roles:
- provide a structured representation of the preferences the user expressed to the recommender model; this could be in the form of tags with weights or could be just a list of NL utterances that expressed a preference; it depends on the type of recommender model which format they can take as input
- provide input to the explainability component that would generate an explainable user model. As with 1) this could be either in the form of raw/annotated NL utterances or tags with weights.
Given the above, this class should have only two public methods: get_utterances_with_preference()
and get_tags_preferences()
. All other methods should be private.
We might want to distinguish between item and slot preferences... Not sure if tags are represented as a slot. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, This will likely need to be iterated on, but it is better to merge and see how we will end up using it.
Co-authored-by: Ivica Kostric <ivica_kostric@yahoo.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comments. How about tests for this class?
moviebot/user_modeling/user_model.py
Outdated
self.slot_preferences: Dict[Dict[str, float]] = defaultdict( | ||
lambda: defaultdict(float) | ||
) | ||
self.slot_preferences_nl: Dict[ | ||
Dict[str, AnnotatedUtterance] | ||
] = defaultdict(lambda: defaultdict(list)) | ||
|
||
# Structured and unstructured item preferences | ||
self.item_preferences: Dict[Dict[str, float]] = defaultdict( | ||
lambda: defaultdict(float) | ||
) | ||
self.item_preferences_nl: Dict[ | ||
Dict[str, AnnotatedUtterance] | ||
] = defaultdict(lambda: defaultdict(list)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In all these cases, is it a dict of dicts, in which the key of the outer dict is missing, or is it just a single dict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the types. The idea is that for slots we have a dict of dicts ({slot:{value1:preference1, value2:preference2}}) and for items we have a single dict ({item: preference}).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make sure to also mention this in the comments above the variable declarations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few comments remain, but LGTM-ed from my side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine, but why not just have the enum values as the preferences?
https://stackoverflow.com/questions/69716107/can-i-use-an-enum-with-floating-point-values-and-comparators-in-python-3-9-and-s
Not sure it needs Python 3.9, but also there is probably no reason for us not to use 3.9 for MovieBot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow-up issue #226
ItemConstraint("reason", Operator.EQ, "watched"), | ||
ItemConstraint("preference", Operator.EQ, preference), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please create an issue for creating enums for the possible contraints that can be expressed ("reason", "perference", etc.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow-up issue #225
moviebot/user_modeling/user_model.py
Outdated
self.slot_preferences: Dict[Dict[str, float]] = defaultdict( | ||
lambda: defaultdict(float) | ||
) | ||
self.slot_preferences_nl: Dict[ | ||
Dict[str, AnnotatedUtterance] | ||
] = defaultdict(lambda: defaultdict(list)) | ||
|
||
# Structured and unstructured item preferences | ||
self.item_preferences: Dict[Dict[str, float]] = defaultdict( | ||
lambda: defaultdict(float) | ||
) | ||
self.item_preferences_nl: Dict[ | ||
Dict[str, AnnotatedUtterance] | ||
] = defaultdict(lambda: defaultdict(list)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make sure to also mention this in the comments above the variable declarations.
This is a first attempt at modelling a user. The model stores the previous choices (e.g., accept, reject, and dont_like). The preference towards a set of movies can be estimated.
Note that the tag preferences are not computationally efficient as of now (e.g., a database call and exploration of the results are performed each time to get the tag preference).