This repository collects study notes from reading speech-related papers.
-
2020.12.22 One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech
-
2020.12.22 Streaming Automatic Speech Recognition with the Transformer Model
-
2020.12.31 wav2vec: Unsupervised Pre-training for Speech Recognition
-
2020.12.31 Voice Separation with an Unknown Number of Multiple Speakers
-
2021.01.07 Data-driven Harmonic Filters for Audio Representation Learning
-
2021.01.07 SpeechMix - Augmenting Deep Sound Recognition using Hidden Space Interpolations
-
2021.01.14 Automatic voice onset time estimation from reassignment spectra
-
2021.01.21 Conformer: Convolution-augmented Transformer for Speech Recognition
-
2021.01.21 Embeddings for Multi-Modality
-
2021.01.28 Jasper: An End-to-End Convolutional Neural Acoustic Model
-
2021.02.03 Contrastive Predictive Coding of Audio with an Adversary
-
2021.03.18 Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition
-
2021.04.08 Cross Modal Audio Search and Retrieval with Joint Embeddings based on Text and Audio
-
2021.04.22 Coordinate Attention for Efficient Mobile Network Design
-
2021.06.03 VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
-
2021.06.14 GPT Understands, Too
-
2021.07.02 RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions
-
2021.07.22 Fast Text-Only Domain Adaptation of RNN-Transducer Prediction Network
-
2021.08.19 Luna: Linear Unified Nested Attention