How is the Octopus dataset organized and trained on the LLaVA architecture? LLaVA doesn't support in-context learning, and if we merge all subtasks into a multi-turn conversation, another problem arises: LLaVA will input all of the subtasks' image embeddings at once, and this seems hard to solve.
So how do you deal with that? Do you input no images and only use the environment information? Could you provide a demo.json to show how the dataset is organized for the LLaVA architecture? Thanks a lot.
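For reference, here is a rough sketch of what a merged multi-turn entry would look like in the standard LLaVA fine-tuning JSON (field names follow the public LLaVA repo; the subtask prompts, responses, and file names are hypothetical placeholders, not the actual Octopus data):

```json
[
  {
    "id": "octopus_task_000",
    "image": ["subtask_1_frame.png", "subtask_2_frame.png"],
    "conversations": [
      {"from": "human", "value": "<image>\nEnvironment state for subtask 1. What should the agent do next?"},
      {"from": "gpt",   "value": "Plan and executable code for subtask 1."},
      {"from": "human", "value": "<image>\nEnvironment state for subtask 2. What should the agent do next?"},
      {"from": "gpt",   "value": "Plan and executable code for subtask 2."}
    ]
  }
]
```

Since every `<image>` placeholder in the conversation is replaced by its image embeddings when the sample is built, all subtask frames end up in the context at once, which is exactly the issue described above.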
The LLaVA version of Octopus will be released once the official video version of LLaVA is released, as we used some internal code from that project. The release should be soon, but I am not quite sure about the exact date.