Papers

Group Normalization -Kaiming He, et al, arxiv2018
Graph Convolutional Network -Xiaolong Wang, Yufei Ye, Abhinav Gupta, CVPR2018
DetNAS: Backbone Search for Object Detection
Mixup

network

CabViT: Cross Attention among Blocks for Vision Transformer -Intellifusion, arxiv2022, code
EfficientFormerV2: Rethinking Vision Transformers for MobileNet Size and Speed -Snap, arxiv2022, code
UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning -ICLR2022, code
UniNet: Unified Architecture Search with Convolution, Transformer, and MLP -sensetime, ECCV2022, code
EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications -arxiv2022, code
EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers -ECCV2022, code
Next-ViT: Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial Scenarios -bytedance, arxiv2022, code
TRT-ViT: TensorRT-oriented Vision Transformer -bytedance, arxiv2022
EfficientFormer: Vision Transformers at MobileNet Speed -snap, arxiv2022, code
UNeXt: MLP-based Rapid Medical Image Segmentation Network -arxiv2022, code
TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation -tencent, CVPR2022, code
MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer -apple, ICLR2022, code
PP-LCNet: A Lightweight CPU Convolutional Neural Network -baidu, arxiv2022
MetaFormer Is Actually What You Need for Vision -Yanshuicheng, CVPR2022
TinyNet: Model Rubik's Cube: Twisting Resolution, Depth and Width for TinyNets -huawei, NeurIPS2020
GhostNet: More Features from Cheap Operations -huawei, CVPR2020
EfficientNet
SqueezeNet
MobileNets -google, arxiv2017
MobileNet-V2 -google, CVPR2018 caffe-code
MobileNetV3
NASNet-A: Learning Transferable Architectures for Scalable Image Recognition -google brain, CoRR2017
ShuffleNet -megvii, CoRR2017
ShuffleNetV2
ThunderNet
DarkNet/Tiny YOLOv3/Tiny YOLOv2/Yolo-Nano/SlimYOLO/YOLO-LITE/Gaussian YOLOv3
LightweightNet: Toward fast and lightweight convolutional neural networks via architecture distillation -XuTingbin, PR2019
MobileFaceNets
EXTD: Extremely Tiny Face Detector via Iterative Filter Reuse
Drop an octave: Reducing spatial redundancy in convolutional neural networks with octave convolution
HetConv: Heterogeneous Kernel-Based Convolutions for Deep CNNs
Joint Architecture and Knowledge Distillation in Convolutional Neural Network for Offline Handwritten Chinese Text Recognition -dujun, arxiv2019
Compressing CNN-DBLSTM models for OCR with teacher-student learning and Tucker decomposition -huoqiang, PR2019
VoVNet: An Energy and GPU-Computation Efficient Backbone Network for Real-Time Object Detection -CVPRW2019, http://openaccess.thecvf.com/content_CVPRW_2019/papers/CEFRL/Lee_An_Energy_and_GPU-Computation_Efficient_Backbone_Network_for_Real-Time_Object_CVPRW_2019_paper.pdf

model compression

teacher-student/mutual-learning/Self-Distillation
low-rank/SVD-decomposition/Tucker-decomposition/CP-decomposition

InformationExtraction

database

EPHOIE - visual information extraction (VIE) in educational documents
[PubLayNet] - pretrain
[RVL-CDIP][IIT-CDIP]- document classification
[FUNSD]
[CORD]- receipt semantic entity extraction
[DocVQA]

knowledge distillation

Decoupled Knowledge Distillation -megvii, CVPR2022, code
Efficient Knowledge Distillation for RNN-Transducer Models -google/facebook, ICASSP2021
Investigation of Sequence-level Knowledge Distillation Methods for CTC Acoustic Models -NICT japan, ICASSP2019
Guiding CTC Posterior Spike Timings for Improved Posterior Fusion and Knowledge Distillation -IBM, Interspeech2019
Explaining sequence-level knowledge distillation as data-augmentation for neural machine translation -arxiv2019
Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion -microsoft, Interspeech2019
Knowledge Distillation for Sequence Model -AISpeech, Interspeech2018
Improved knowledge distillation from bi-directional to uni-directional LSTM CTC for end-to-end speech recognition -IBM, SLT2018
An Investigation of a Knowledge Distillation Method for CTC Acoustic Models -NICT japan, ICASSP2018
Sequence-Level Knowledge Distillation -Yoon Kim, EMNLP2016

Document Enhancement

Document Rectification

MataDoc: Margin and Text Aware Document Dewarping for Arbitrary Boundary -baidu, arxiv2023
Deep Unrestricted Document Image Rectification -arxiv2023, code
End-to-End Piece-Wise Unwarping of Document Images -amazon, ICCV2021, code
Geometric Representation Learning for Document Image Rectification -ECCV2022, code
Marior: Margin Removal and Iterative Content Rectification for Document Dewarping in the Wild -MM2022, jinlianwen, code
Fourier Document Restoration for Robust Document Dewarping and Recognition -CVPR2022, bai song database
Revisiting Document Image Dewarping by Grid Regularization -alibaba, CVPR2022, code
Learning From Documents in the Wild to Improve Document Unwarping -snap, SIGGRAPH2022, code
DocScanner: Robust Document Image Rectification with Progressive Learning -arxiv2021
DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction -MM2021, code
Document Dewarping with Control Points -ICDAR2021, code&dataset
Document Rectification and Illumination Correction using a Patch-based CNN -SIGGRAPH2019, code
Learning to Calibrate Straight Lines for Fisheye Image Rectification -CVPR2019

image alignment/registration

DocAligner: Annotating Real-world Photographic Document Images by Simply Taking Pictures -jinlianwen, arxiv2023, code
Inv3D: a high-resolution 3D invoice dataset for template-guided single-image document unwarping -IJDAR2023, code

Inpainting

Inpaint Anything: Segment Anything Meets Image Inpainting -arxiv2023, code
LaMa: Resolution-Robust Large Mask Inpainting With Fourier Convolutions -samsung, WACV2022, code
RePaint: Inpainting Using Denoising Diffusion Probabilistic Models -CVPR2022, code
MAT: Mask-Aware Transformer for Large Hole Image Inpainting -adobe, CVPR2022, code
Incremental Transformer Structure Enhanced Image Inpainting With Masking Positional Encoding -CVPR2022, code
Aggregated Contextual Transformations for High-Resolution Image Inpainting -arxiv2021, code
Free-Form Image Inpainting with Gated Convolution -bytedance, ICCV2019, code

Graph

Joint stroke classification and text line grouping in online handwritten documents with edge pooling attention networks -PR2021
A Comprehensive Survey on Graph Neural Networks -TNN2020
Contextual Stroke Classification in Online Handwritten Documents with Edge Graph Attention Networks -SNCS2020
DeepGCNs: Can GCNs Go as Deep as CNNs? -ICCV2019
Heterogeneous graph attention network -WWW2019
Contextual Stroke Classification in Online Handwritten Documents with Graph Attention Networks -ICDAR2019
Graph Convolutional Networks for Text Classification -AAAI2019
Graph Attention Networks -ICLR2018
Semi-Supervised Classification with Graph Convolutional Networks -ICLR2017
