Tsinghua University

dongyh20/README.md

Hi there 👋

🔭 I’m currently working on visual perception, and my long-term goal is to build general foundation models.

⚡ Recently I've been focusing on vision-language models and unified visual models.

📫 If you are also interested in these topics, feel free to chat with me!

Pinned

  1. Oryx-mllm/Oryx (Public)

    MLLM for On-Demand Spatial-Temporal Understanding at Arbitrary Resolution

    Python · 294 stars · 14 forks

  2. Octopus (Public)

    🐙 Octopus, an embodied vision-language model trained with RLEF that excels at embodied visual planning and programming.

    Python · 274 stars · 19 forks

  3. EvolvingLMMs-Lab/lmms-eval (Public)

    Accelerating the development of large multimodal models (LMMs) with the one-click evaluation module lmms-eval.

    Python · 2.2k stars · 178 forks

  4. Insight-V (Public)

    Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models

    Python · 122 stars · 4 forks

  5. Chain-of-Spot (Public)

    Chain-of-Spot: Interactive Reasoning Improves Large Vision-Language Models

    Python · 89 stars · 6 forks