multimodality

Here are 140 public repositories matching this topic...

The Cradle framework is a first attempt at General Computer Control (GCC). Cradle helps agents ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curation, in a standardized general environment with minimal requirements.

  • Updated Nov 7, 2024
  • Python

This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI".

  • Updated Dec 10, 2024
  • Python
