CVPR2020-Code

A collection of open-source projects for CVPR 2020 papers. Contributions are welcome: open an issue to share a CVPR 2020 open-source project.

CNN

Exploring Self-attention for Image Recognition

Improving Convolutional Networks with Self-Calibrated Convolutions

Rethinking Depthwise Separable Convolutions: How Intra-Kernel Correlations Lead to Improved MobileNets

Image Classification

Compositional Convolutional Neural Networks: A Deep Architecture with Innate Robustness to Partial Occlusion

Spatially Attentive Output Layer for Image Classification

Object Detection

D2Det: Towards High Quality Object Detection and Instance Segmentation

Dynamic Refinement Network for Oriented and Densely Packed Object Detection

Scale-Equalizing Pyramid Convolution for Object Detection

Paper: https://arxiv.org/abs/2005.03101

Code: https://github.com/jshilong/SEPC

Revisiting the Sibling Head in Object Detector

Detection in Crowded Scenes: One Proposal, Multiple Predictions

Instance-aware, Context-focused, and Memory-efficient Weakly Supervised Object Detection

Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection

BiDet: An Efficient Binarized Object Detector

Harmonizing Transferability and Discriminability for Adapting Object Detectors

CentripetalNet: Pursuing High-quality Keypoint Pairs for Object Detection

Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection

EfficientDet: Scalable and Efficient Object Detection

3D Object Detection

Train in Germany, Test in The USA: Making 3D Object Detectors Generalize

MLCVNet: Multi-Level Context VoteNet for 3D Object Detection

3DSSD: Point-based 3D Single Stage Object Detector

Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation

End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection

DSGN: Deep Stereo Geometry Network for 3D Object Detection

LiDAR-based Online 3D Video Object Detection with Graph-based Message Passing and Spatiotemporal Transformer Attention

PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection

Point-GNN: Graph Neural Network for 3D Object Detection in a Point Cloud

Video Object Detection

Memory Enhanced Global-Local Aggregation for Video Object Detection

Paper: https://arxiv.org/abs/2003.12063

Code: https://github.com/Scalsol/mega.pytorch

Object Tracking

SiamCAR: Siamese Fully Convolutional Classification and Regression for Visual Tracking

D3S -- A Discriminative Single Shot Segmentation Tracker

ROAM: Recurrently Optimizing Tracking Model

Siam R-CNN: Visual Tracking by Re-Detection

Cooling-Shrinking Attack: Blinding the Tracker with Imperceptible Noises

High-Performance Long-Term Tracking with Meta-Updater

AutoTrack: Towards High-Performance Visual Tracking for UAV with Automatic Spatio-Temporal Regularization

Probabilistic Regression for Visual Tracking

MAST: A Memory-Augmented Self-supervised Tracker

Siamese Box Adaptive Network for Visual Tracking

Semantic Segmentation

Super-BPD: Super Boundary-to-Pixel Direction for Fast Image Segmentation

Single-Stage Semantic Segmentation from Image Labels

Learning Texture Invariant Representation for Domain Adaptation of Semantic Segmentation

MSeg: A Composite Dataset for Multi-domain Semantic Segmentation

CascadePSP: Toward Class-Agnostic and Very High-Resolution Segmentation via Global and Local Refinement

Unsupervised Intra-domain Adaptation for Semantic Segmentation through Self-Supervision

Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation

Temporally Distributed Networks for Fast Video Segmentation

Context Prior for Scene Segmentation

Strip Pooling: Rethinking Spatial Pooling for Scene Parsing

Cars Can't Fly up in the Sky: Improving Urban-Scene Segmentation via Height-driven Attention Networks

Learning Dynamic Routing for Semantic Segmentation

Instance Segmentation

D2Det: Towards High Quality Object Detection and Instance Segmentation

PolarMask: Single Shot Instance Segmentation with Polar Representation

CenterMask: Real-Time Anchor-Free Instance Segmentation

BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation

Deep Snake for Real-Time Instance Segmentation

Mask Encoding for Single Shot Instance Segmentation

Panoptic Segmentation

Pixel Consensus Voting for Panoptic Segmentation

BANet: Bidirectional Aggregation Network with Occlusion Handling for Panoptic Segmentation

Paper: https://arxiv.org/abs/2003.14031

Code: https://github.com/Mooonside/BANet

Video Object Segmentation

A Transductive Approach for Video Object Segmentation

State-Aware Tracker for Real-Time Video Object Segmentation

Learning Fast and Robust Target Models for Video Object Segmentation

Learning Video Object Segmentation from Unlabeled Videos

Superpixel Segmentation

Superpixel Segmentation with Fully Convolutional Networks

NAS

AOWS: Adaptive and optimal network width search with latency constraints

Densely Connected Search Space for More Flexible Neural Architecture Search

MTL-NAS: Task-Agnostic Neural Architecture Search towards General-Purpose Multi-Task Learning

FBNetV2: Differentiable Neural Architecture Search for Spatial and Channel Dimensions

Neural Architecture Search for Lightweight Non-Local Networks

Rethinking Performance Estimation in Neural Architecture Search

CARS: Continuous Evolution for Efficient Neural Architecture Search

GAN

Semantically Multi-modal Image Synthesis

Unpaired Portrait Drawing Generation via Asymmetric Cycle Mapping

Learning to Cartoonize Using White-box Cartoon Representations

GAN Compression: Efficient Architectures for Interactive Conditional GANs

Watch your Up-Convolution: CNN Based Generative Deep Neural Networks are Failing to Reproduce Spectral Distributions

Re-ID

COCAS: A Large-Scale Clothes Changing Person Dataset for Re-identification

Transferable, Controllable, and Inconspicuous Adversarial Attacks on Person Re-identification With Deep Mis-Ranking

Pose-guided Visible Part Matching for Occluded Person ReID

Weakly supervised discriminative feature learning with state information for person identification

3D Point Clouds (Classification/Segmentation/Registration, etc.)

3D Point Cloud Convolution

Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds

Grid-GCN for Fast and Scalable Point Cloud Learning

FPConv: Learning Local Flattening for Point Convolution

3D Point Cloud Classification

PointAugment: an Auto-Augmentation Framework for Point Cloud Classification

3D Point Cloud Semantic Segmentation

RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds

Weakly Supervised Semantic Point Cloud Segmentation: Towards 10X Fewer Labels

PolarNet: An Improved Grid Representation for Online LiDAR Point Clouds Semantic Segmentation

Learning to Segment 3D Point Clouds in 2D Image Space

3D Point Cloud Instance Segmentation

PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation

3D Point Cloud Registration

D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features

RPM-Net: Robust Point Matching using Learned Features

3D Point Cloud Completion

Cascaded Refinement Network for Point Cloud Completion

3D Point Cloud Object Tracking

P2B: Point-to-Box Network for 3D Object Tracking in Point Clouds

Face

Face Recognition

CurricularFace: Adaptive Curriculum Learning Loss for Deep Face Recognition

Learning Meta Face Recognition in Unseen Domains

Face Detection

Face Anti-Spoofing

Searching Central Difference Convolutional Networks for Face Anti-Spoofing

Facial Expression Recognition

Suppressing Uncertainties for Large-Scale Facial Expression Recognition

Face Frontalization

Rotate-and-Render: Unsupervised Photorealistic Face Rotation from Single-View Images

3D Face Reconstruction

AvatarMe: Realistically Renderable 3D Facial Reconstruction "in-the-wild"

FaceScape: a Large-scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction

Human Pose Estimation (2D/3D)

2D Human Pose Estimation

HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation

The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation

Distribution-Aware Coordinate Representation for Human Pose Estimation

3D Human Pose Estimation

Fusing Wearable IMUs with Multi-View Images for Human Pose Estimation: A Geometric Approach

Bodies at Rest: 3D Human Pose and Shape Estimation from a Pressure Image using Synthetic Data

Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis

Compressed Volumetric Heatmaps for Multi-Person 3D Pose Estimation

VIBE: Video Inference for Human Body Pose and Shape Estimation

Back to the Future: Joint Aware Temporal Deep Learning 3D Human Pose Estimation

Cross-View Tracking for Multi-Human 3D Pose Estimation at over 100 FPS

Human Parsing

Correlating Edge, Pose with Parsing

Scene Text Detection

UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World

ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network

Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection

Scene Text Recognition

SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition

UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World

ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network

Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition

Super-Resolution

Image Super-Resolution

Learning Texture Transformer Network for Image Super-Resolution

Image Super-Resolution with Cross-Scale Non-Local Attention and Exhaustive Self-Exemplars Mining

Structure-Preserving Super Resolution with Gradient Guidance

Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Analysis and a New Strategy

Paper: https://arxiv.org/abs/2004.00448

Code: https://github.com/clovaai/cutblur

Video Super-Resolution

Space-Time-Aware Multi-Resolution Video Enhancement

Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution

Model Compression/Pruning

DMCP: Differentiable Markov Channel Pruning for Neural Networks

Forward and Backward Information Retention for Accurate Binary Neural Networks

Towards Efficient Model Compression via Learned Global Ranking

HRank: Filter Pruning using High-Rank Feature Map

GAN Compression: Efficient Architectures for Interactive Conditional GANs

Group Sparsity: The Hinge Between Filter Pruning and Decomposition for Network Compression

Video Understanding/Action Recognition

Intra- and Inter-Action Understanding via Temporal Action Parsing

3DV: 3D Dynamic Voxel for Action Recognition in Depth Video

FineGym: A Hierarchical Video Dataset for Fine-grained Action Understanding

TEA: Temporal Excitation and Aggregation for Action Recognition

X3D: Expanding Architectures for Efficient Video Recognition

Temporal Pyramid Network for Action Recognition

Skeleton-Based Action Recognition

Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition

Crowd Counting

Depth Estimation

Focus on defocus: bridging the synthetic to real domain gap for depth estimation

Bi3D: Stereo Depth Estimation via Binary Classifications

AANet: Adaptive Aggregation Network for Efficient Stereo Matching

Towards Better Generalization: Joint Depth-Pose Learning without PoseNet

Monocular Depth Estimation

On the uncertainty of self-supervised monocular depth estimation

3D Packing for Self-Supervised Monocular Depth Estimation

Domain Decluttering: Simplifying Images to Mitigate Synthetic-Real Domain Shift and Improve Depth Estimation

6D Object Pose Estimation

MoreFusion: Multi-object Reasoning for 6D Pose Estimation from Volumetric Fusion

EPOS: Estimating 6D Pose of Objects with Symmetries

Homepage: http://cmp.felk.cvut.cz/epos

Paper: https://arxiv.org/abs/2004.00605

G2L-Net: Global to Local Network for Real-time 6D Pose Estimation with Embedding Vector Features

Hand Pose Estimation

HOPE-Net: A Graph-based Model for Hand-Object Pose Estimation

Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data

Salient Object Detection

JL-DCF: Joint Learning and Densely-Cooperative Fusion Framework for RGB-D Salient Object Detection

UC-Net: Uncertainty Inspired RGB-D Saliency Detection via Conditional Variational Autoencoders

Denoising

A Physics-based Noise Formation Model for Extreme Low-light Raw Denoising

CycleISP: Real Image Restoration via Improved Data Synthesis

Deraining

Multi-Scale Progressive Fusion Network for Single Image Deraining

Deblurring

Video Deblurring

Cascaded Deep Video Deblurring Using Temporal Sharpness Prior

Dehazing

Multi-Scale Boosted Dehazing Network with Dense Feature Fusion

Feature Point Detection and Description

ASLFeat: Learning Local Features of Accurate Shape and Localization

Visual Question Answering (VQA)

VC R-CNN: Visual Commonsense R-CNN

Video Question Answering (VideoQA)

Hierarchical Conditional Relation Networks for Video Question Answering

Vision-and-Language Navigation

Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training

Video Compression

Learning for Video Compression with Hierarchical Quality and Recurrent Enhancement

Video Frame Interpolation

Space-Time-Aware Multi-Resolution Video Enhancement

Scene-Adaptive Video Frame Interpolation via Meta-Learning

Softmax Splatting for Video Frame Interpolation

Style Transfer

Diversified Arbitrary Style Transfer via Deep Feature Perturbation

Collaborative Distillation for Ultra-Resolution Universal Style Transfer

Lane Detection

Inter-Region Affinity Distillation for Road Marking Segmentation

Human-Object Interaction (HOI) Detection

Detailed 2D-3D Joint Representation for Human-Object Interaction

Cascaded Human-Object Interaction Recognition

VSGNet: Spatial Attention Network for Detecting Human Object Interactions Using Graph Convolutions

Pedestrian Trajectory Prediction

Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction

Motion Prediction

Collaborative Motion Prediction via Neural Motion Message Passing

MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird's Eye View Maps

Virtual Try-On

Towards Photo-Realistic Virtual Try-On by Adaptively Generating↔Preserving Image Content

HDR

Single-Image HDR Reconstruction by Learning to Reverse the Camera Pipeline

Adversarial Examples

Towards Large yet Imperceptible Adversarial Image Perturbations with Perceptual Color Distance

Depth Completion

Uncertainty-Aware CNNs for Depth Completion: Uncertainty from Beginning to End

Paper: https://arxiv.org/abs/2006.03349

Code: https://github.com/abdo-eldesokey/pncnn

Semantic Scene Completion

3D Sketch-aware Semantic Scene Completion via Semi-supervised Structure Prior

Datasets

Open Compound Domain Adaptation

Intra- and Inter-Action Understanding via Temporal Action Parsing

Dynamic Refinement Network for Oriented and Densely Packed Object Detection

COCAS: A Large-Scale Clothes Changing Person Dataset for Re-identification

KeypointNet: A Large-scale 3D Keypoint Dataset Aggregated from Numerous Human Annotations

MSeg: A Composite Dataset for Multi-domain Semantic Segmentation

AvatarMe: Realistically Renderable 3D Facial Reconstruction "in-the-wild"

Learning to Autofocus

FaceScape: a Large-scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction

Bodies at Rest: 3D Human Pose and Shape Estimation from a Pressure Image using Synthetic Data

FineGym: A Hierarchical Video Dataset for Fine-grained Action Understanding

A Local-to-Global Approach to Multi-modal Movie Scene Segmentation

Deep Homography Estimation for Dynamic Scenes

Assessing Image Quality Issues for Real-World Problems

UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World

PANDA: A Gigapixel-level Human-centric Video Dataset

IntrA: 3D Intracranial Aneurysm Dataset for Deep Learning

Cross-View Tracking for Multi-Human 3D Pose Estimation at over 100 FPS

Others

Open Compound Domain Adaptation

Differentiable Volumetric Rendering: Learning Implicit 3D Representations without 3D Supervision

QEBA: Query-Efficient Boundary-Based Blackbox Attack

Equalization Loss for Long-Tailed Object Recognition

Instance-aware Image Colorization

Contextual Residual Aggregation for Ultra High-Resolution Image Inpainting

Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching

Epipolar Transformers

Bringing Old Photos Back to Life

MaskFlownet: Asymmetric Feature Matching with Learnable Occlusion Mask

Self-Supervised Viewpoint Learning from Image Collections

Towards Discriminability and Diversity: Batch Nuclear-norm Maximization under Label Insufficient Situations

Towards Learning Structure via Consensus for Face Segmentation and Parsing

Plug-and-Play Algorithms for Large-scale Snapshot Compressive Imaging

Lightweight Photometric Stereo for Facial Details Recovery

Footprints and Free Space from a Single Color Image

Self-Supervised Monocular Scene Flow Estimation

Quasi-Newton Solver for Robust Non-Rigid Registration

A Local-to-Global Approach to Multi-modal Movie Scene Segmentation

DeepFLASH: An Efficient Network for Learning-based Medical Image Registration

Self-Supervised Scene De-occlusion

Polarized Reflection Removal with Perfect Alignment in the Wild

Background Matting: The World is Your Green Screen

What Deep CNNs Benefit from Global Covariance Pooling: An Optimization Perspective

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Video Object Grounding using Semantic Roles in Language Description

Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives

SDFDiff: Differentiable Rendering of Signed Distance Fields for 3D Shape Optimization

On Translation Invariance in CNNs: Convolutional Layers can Exploit Absolute Spatial Location

GhostNet: More Features from Cheap Operations

AdderNet: Do We Really Need Multiplications in Deep Learning?

Deep Image Harmonization via Domain Verification

Blurry Video Frame Interpolation

Extremely Dense Point Correspondences using a Learned Feature Descriptor

Filter Grafting for Deep Neural Networks

Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation

Detecting Attended Visual Targets in Video

Deep Image Spatial Transformation for Person Image Generation

Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications

https://github.com/charlesCXK/3D-SketchAware-SSC

https://github.com/Anonymous20192020/Anonymous_CVPR5767

https://github.com/avirambh/ScopeFlow

https://github.com/csbhr/CDVD-TSP

https://github.com/ymcidence/TBH

https://github.com/yaoyao-liu/mnemonics

https://github.com/meder411/Tangent-Images

https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch

https://github.com/sjmoran/deep_local_parametric_filters

https://github.com/bermanmaxim/AOWS

https://github.com/dc3ea9f/look-into-object

Acceptance Unconfirmed

FADNet: A Fast and Accurate Network for Disparity Estimation

https://github.com/rFID-submit/RandomFID: acceptance unconfirmed

https://github.com/JackSyu/AE-MSR: acceptance unconfirmed

https://github.com/fastconvnets/cvpr2020: acceptance unconfirmed

https://github.com/aimagelab/meshed-memory-transformer: acceptance unconfirmed

https://github.com/TWSFar/CRGNet: acceptance unconfirmed

https://github.com/CVPR-2020/CDARTS: acceptance unconfirmed

https://github.com/anucvml/ddn-cvprw2020: acceptance unconfirmed

https://github.com/dl-model-recommend/model-trust: acceptance unconfirmed

https://github.com/apratimbhattacharyya18/CVPR-2020-Corr-Prior: acceptance unconfirmed

https://github.com/onetcvpr/O-Net: acceptance unconfirmed

https://github.com/502463708/Microcalcification_Detection: acceptance unconfirmed

https://github.com/anonymous-for-review/cvpr-2020-deep-smoke-machine: acceptance unconfirmed

https://github.com/anonymous-for-review/cvpr-2020-smoke-recognition-dataset: acceptance unconfirmed

https://github.com/cvpr-nonrigid/dataset: acceptance unconfirmed

https://github.com/theFool32/PPBA: acceptance unconfirmed

https://github.com/Realtime-Action-Recognition/Realtime-Action-Recognition