Code for selecting an action based on multimodal inputs. In this case, the inputs are voice and text.
multimodality
multimodal-learning
multimodal-deep-learning
multimodal-data
multimodal-fusion
multimodal-action-recognition
cross-attention
Updated Jun 7, 2021 - Python