Commit
update
jdf-prog committed Apr 13, 2024
1 parent 653557e commit ceb785c
Showing 2 changed files with 6 additions and 0 deletions.
6 changes: 6 additions & 0 deletions _posts/2024-04-12-mantis.markdown
@@ -119,6 +119,12 @@ We report the performance in 3 domains daily life, robotics, and comics. Results

We also evaluate the performance of Mantis-LLaVA-7b on single-image reasoning tasks, including AI2D, GQA, InfoVQA, MME-C(ognition), MME-P(erception), MMMU, OKVQA, and TextVQA. The results are shown in Figure 1. After training on the Mantis-Instruct dataset, Mantis-LLaVA maintains its performance on single-image reasoning tasks compared to its base model, LLaVA-1.5-7b. Surprisingly, performance on AI2D, InfoVQA, MME-C, MMMU, and TextVQA improves significantly.

### Case study
![Figure 3: Mantis case study]({{"/assets/Mantis/images/cases.jpeg" | relative_url }})

We present 2 cases that compare Mantis with LLaVA-1.5, which does not have multi-image reasoning ability, and with Emu-2 and GPT-4V, which do. In both cases, Mantis captures the information from multiple images and generates a more accurate answer.


# Ongoing Work

Mantis is an active work in progress. We have demonstrated that Mantis-bakLLaVA-7b achieves remarkable performance on various benchmarks, including NLVR2, Birds-to-words, Mementos, and Qbench2. However, some limitations remain to be addressed, such as performance drops on single-image reasoning tasks and the model's limited context length. We plan to keep improving performance on single-image reasoning tasks and to explore more efficient ways to handle multiple images. We will also use larger models and more diverse datasets to further improve performance.
Binary file added assets/Mantis/images/cases.jpeg