Tianheng Cheng wondervictor

Hi there 👋

I'm Tianheng Cheng. I completed my Ph.D. at Huazhong University of Science and Technology and am now a researcher at ByteDance Seed, working on cutting-edge large multimodal models and world models.

My lifelong research goal is to enable machines/robots to see, understand, and live like human beings.

My previous work and publications are listed on Google Scholar 📚.

Currently, I'm devoted to research on large multimodal models, foundational vision-language modeling, and image generation. Before that, I mainly focused on fundamental tasks such as object detection and instance segmentation, as well as visual perception for autonomous driving.

Highlights among the pinned works:

🔥 ControlAR (arXiv) explores controllable image generation with autoregressive models and empowers autoregressive models with arbitrary-resolution generation.
🔥 EVF-SAM (arXiv) empowers segment-anything (SAM, SAM-2) with strong text-prompting ability. Try our demo on HuggingFace.
OSP (ECCV 2024) explores a sparse set of points to predict 3D semantic occupancy for autonomous vehicles, which is a brand new formulation!
🔥 YOLO-World (CVPR 2024) for real-time open-vocabulary object detection; Symphonies (CVPR 2024) for camera-based 3D scene completion.
SparseInst (CVPR 2022) aims for real-time instance segmentation with a simple fully convolutional framework! MobileInst (AAAI 2024) further explores temporal consistency and kernel reuse for efficient mobile video instance segmentation.
BoxTeacher (CVPR 2023) bridges the gap between fully supervised and box-supervised instance segmentation. With ~1/10 of the annotation cost, BoxTeacher achieves 93% of the performance of fully supervised methods.
