Motion meets Attention: Video Motion Prompts (2024.07.03)
Qixiang Chen, Lei Wang, Piotr Koniusz, Tom Gedeon
Towards a Personal Health Large Language Model (2024.06.10)
J. Cosentino, Anastasiya Belyaeva, Xin Liu, N. Furlotte, Zhun Yang, etc
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning (2024.06.10)
Joongwon Kim, Bhargavi Paranjape, Tushar Khot, Hanna Hajishirzi
Towards Lifelong Learning of Large Language Models: A Survey (2024.06.10)
Junhao Zheng, Shengjie Qiu, Chengming Shi, Qianli Ma
Towards Semantic Equivalence of Tokenization in Multimodal LLM (2024.06.07)
Shengqiong Wu, Hao Fei, Xiangtai Li, Jiayi Ji, Hanwang Zhang, etc
LLMs Meet Multimodal Generation and Editing: A Survey (2024.05.29)
Yin-Yin He, Zhaoyang Liu, Jingye Chen, Zeyue Tian, Hongyu Liu, etc . - 【arXiv.org】
Tool Learning with Large Language Models: A Survey (2024.05.28)
Changle Qu, Sunhao Dai, Xiaochi Wei, Hengyi Cai, Shuaiqiang Wang, etc . - 【arXiv.org】
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models (2024.05.16)
Xianzheng Ma, Yash Bhalgat, Brandon Smart, Shuai Chen, Xinghui Li, etc . - 【arXiv.org】
Uncertainty Estimation and Quantification for LLMs: A Simple Supervised Approach (2024.04.24)
Linyu Liu, Yu Pan, Xiaocheng Li, Guanting Chen . - 【arXiv.org】
A Survey on the Memory Mechanism of Large Language Model based Agents (2024.04.21)
Zeyu Zhang, Xiaohe Bo, Chen Ma, Rui Li, Xu Chen, etc . - 【arXiv.org】
The Landscape of Emerging AI Agent Architectures for Reasoning, Planning, and Tool Calling: A Survey (2024.04.17)
Tula Masterman, Sandi Besen, Mason Sawtell, Alex Chao . - 【arXiv.org】
CausalBench: A Comprehensive Benchmark for Causal Learning Capability of Large Language Models (2024.04.09)
Yu Zhou, Xingyu Wu, Beichen Huang, Jibin Wu, Liang Feng, etc . - 【arXiv.org】
AI-Tutoring in Software Engineering Education: Experiences with Large Language Models in Programming Assessments (2024.04.03)
Eduard Frankford, Clemens Sauerwein, Patrick Bassner, Stephan Krusche, Ruth Breu . - 【2024 IEEE/ACM 46th International Conference on Software Engineering: Software Engineering Education and Training (ICSE-SEET)】
A Survey on Large Language Model-Based Game Agents (2024.04.02)
Sihao Hu, Tiansheng Huang, Fatih Ilhan, S. Tekin, Gaowen Liu, etc . - 【arXiv.org】
Large Language Models for Education: A Survey and Outlook (2024.03.26)
Shen Wang, Tianlong Xu, Hang Li, Chaoli Zhang, Joleen Liang, etc . - 【arXiv.org】
The Ethics of ChatGPT in Medicine and Healthcare: A Systematic Review on Large Language Models (LLMs) (2024.03.21)
Joschka Haltaufderheide, R. Ranisch
Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey (2024.03.21)
Zeyu Han, Chao Gao, Jinyang Liu, Jeff Zhang, Sai Qian Zhang
ChatGPT Alternative Solutions: Large Language Models Survey (2024.03.16)
H. Alipour, Nick Pendar, Kohinoor Roy . - 【Networks, Blockchain and Internet of Things】
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training (2024.03.14)
Brandon McKinzie, Zhe Gan, J. Fauconnier, Sam Dodge, Bowen Zhang, etc
Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey (2024.03.14)
Xiaoyu Liu, Paiheng Xu, Junda Wu, Jiaxin Yuan, Yifan Yang, etc
Model Parallelism on Distributed Infrastructure: A Literature Review from Theory to LLM Case-Studies (2024.03.06)
Felix Brakel, Uraz Odyurt, A. Varbanescu
Benchmarking the Text-to-SQL Capability of Large Language Models: A Comprehensive Evaluation (2024.03.05)
Bin Zhang, Yuxiao Ye, Guoqing Du, Xiaoru Hu, Zhishuai Li, etc
A Comprehensive Survey on Process-Oriented Automatic Text Summarization with Exploration of LLM-Based Methods (2024.03.05)
Hanlei Jin, Yang Zhang, Dan Meng, Jun Wang, Jinghua Tan
Large Language Models for Data Annotation: A Survey (2024.02.21)
Zhen Tan, Alimohammad Beigi, Song Wang, Ruocheng Guo, Amrita Bhattacharjee, etc
A Survey on Knowledge Distillation of Large Language Models (2024.02.20)
Xiaohan Xu, Ming Li, Chongyang Tao, Tao Shen, Reynold Cheng, etc
Investigating Cultural Alignment of Large Language Models (2024.02.20)
Badr AlKhamissi, Muhammad N. ElNokrashy, Mai AlKhamissi, Mona Diab
MM-LLMs: Recent Advances in MultiModal Large Language Models (2024.01.24)
Duzhen Zhang, Yahan Yu, Chenxing Li, Jiahua Dong, Dan Su, etc . - 【arXiv.org】
Machine Translation with Large Language Models: Prompt Engineering for Persian, English, and Russian Directions (2024.01.16)
Nooshin Pourkamali, Shler Ebrahim Sharifi . - 【arXiv.org】
Large Language Models Empowered Agent-based Modeling and Simulation: A Survey and Perspectives (2023.12.19)
Chen Gao, Xiaochong Lan, Nian Li, Yuan Yuan, Jingtao Ding, etc . - 【arXiv.org】
Retrieval-Augmented Generation for Large Language Models: A Survey (2023.12.18)
Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, etc
ChatGPT v.s. Media Bias: A Comparative Study of GPT-3.5 and Fine-tuned Language Models (2023.10.23)
Zehao Wen, Rabih Younes . - 【Applied and Computational Engineering】
Zachary Levonian, Chenglu Li, Wangda Zhu, Anoushka Gade, Owen Henkel, etc . - 【arXiv.org】
A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future (2023.09.27)
Zheng Chu, Jingchang Chen, Qianglong Chen, Weijiang Yu, Tao He, etc . - 【arXiv.org】
The Rise and Potential of Large Language Model Based Agents: A Survey (2023.09.14)
Zhiheng Xi, Wenxiang Chen, Xin Guo, Wei He, Yiwen Ding, etc . - 【arXiv.org】
Textbooks Are All You Need II: phi-1.5 technical report (2023.09.11)
Yuan-Fang Li, Sébastien Bubeck, Ronen Eldan, Allison Del Giorno, Suriya Gunasekar, etc . - 【arXiv.org】
Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models (2023.09.03)
Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, etc . - 【arXiv.org】
Ziyu Guo, Renrui Zhang, Xiangyang Zhu, Yiwen Tang, Xianzheng Ma, etc
Large language models in medicine: the potentials and pitfalls (2023.08.31)
J. Omiye, Haiwen Gui, Shawheen J. Rezaei, James Zou, Roxana Daneshjou
Large Graph Models: A Perspective (2023.08.28)
Ziwei Zhang, Haoyang Li, Zeyang Zhang, Yi Qin, Xin Wang, etc . - 【arXiv.org】
A Survey on Large Language Model based Autonomous Agents (2023.08.22)
Lei Wang, Chengbang Ma, Xueyang Feng, Zeyu Zhang, Hao-ran Yang, etc . - 【arXiv.org】
Instruction Tuning for Large Language Models: A Survey (2023.08.21)
Shengyu Zhang, Linfeng Dong, Xiaoya Li, Sen Zhang, Xiaofei Sun, etc . - 【arXiv.org】
Scientific discovery in the age of artificial intelligence (2023.08.01)
Hanchen Wang, Tianfan Fu, Yuanqi Du, Wenhao Gao, Kexin Huang, etc . - 【Nature】
Foundational Models Defining a New Era in Vision: A Survey and Outlook (2023.07.25)
Muhammad Awais, Muzammal Naseer, Salman Siddique Khan, R. Anwer, Hisham Cholakkal, etc . - 【arXiv.org】
Challenges and Applications of Large Language Models (2023.07.19)
Jean Kaddour, Joshua Harris, Maximilian Mozes, Herbie Bradley, Roberta Raileanu, etc . - 【arXiv.org】
Opening up ChatGPT: Tracking openness, transparency, and accountability in instruction-tuned text generators (2023.07.08)
Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse . - 【International Conference on Conversational User Interfaces】
A Survey on Evaluation of Large Language Models (2023.07.06)
Yu-Chu Chang, Xu Wang, Jindong Wang, Yuanyi Wu, Kaijie Zhu, etc . - 【arXiv.org】
A Survey on Multimodal Large Language Models (2023.06.23)
Shukang Yin, Chaoyou Fu, Sirui Zhao, Ke Li, Xing Sun, etc . - 【arXiv.org】
Opportunities and Challenges for ChatGPT and Large Language Models in Biomedicine and Health (2023.06.15)
Shubo Tian, Qiao Jin, Lana Yeganova, Po-Ting Lai, Qingqing Zhu, etc . - 【arXiv.org】
A Survey of Vision-Language Pre-training from the Lens of Multimodal Machine Translation (2023.06.12)
Jeremy Gwinnup, Kevin Duh . - 【arXiv.org】
How Can Recommender Systems Benefit from Large Language Models: A Survey (2023.06.09)
Jianghao Lin, Xinyi Dai, Yunjia Xi, Weiwen Liu, Bo Chen, etc . - 【arXiv.org】
Harnessing Large Language Models in Nursing Care Planning: Opportunities, Challenges, and Ethical Considerations (2023.06.01)
A. Nashwan, Ahmad A. Abujaber . - 【Cureus】
Structural Ambiguity and its Disambiguation in Language Model Based Parsers: the Case of Dutch Clause Relativization (2023.05.24)
Gijs Wijnholds, Michael Moortgat
Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective (2023.05.24)
Guhao Feng, Yuntian Gu, Bohang Zhang, Haotian Ye, Di He, etc
Unit-based Speech-to-Speech Translation Without Parallel Data (2023.05.24)
Anuj Diwan, Anirudh Srinivasan, David F. Harwath, Eunsol Choi
A Neural Space-Time Representation for Text-to-Image Personalization (2023.05.24)
Yuval Alaluf, Elad Richardson, Gal Metzer, Daniel Cohen-Or
Visual Programming for Text-to-Image Generation and Evaluation (2023.05.24)
Jaemin Cho, Abhay Zala, Mohit Bansal
Towards Foundation Models for Relational Databases [Vision Paper] (2023.05.24)
Liane Vogel, Benjamin Hilprecht, Carsten Binnig
ViTMatte: Boosting Image Matting with Pretrained Plain Vision Transformers (2023.05.24)
Jingfeng Yao, Xinggang Wang, Shusheng Yang, Baoyuan Wang
Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model (2023.05.24)
Zirui Liu, Guanchu Wang, Shaochen Zhong, Zhaozhuo Xu, Daochen Zha, etc
LMs with a Voice: Spoken Language Modeling beyond Speech Tokens (2023.05.24)
Eliya Nachmani, Alon Levkovitch, Julian Salazar, Chulayutsh Asawaroengchai, Soroosh Mariooryad, etc
DiffBlender: Scalable and Composable Multimodal Text-to-Image Diffusion Models (2023.05.24)
Sungnyun Kim, Junsoo Lee, Kibeom Hong, Daesik Kim, Namhyuk Ahn
Pre-training Multi-party Dialogue Models with Latent Discourse Inference (2023.05.24)
Yiyang Li, Xinting Huang, Wei Bi, Hai Zhao
Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator (2023.05.24)
Ziwei He, Meng Yang, Minwei Feng, Jingcheng Yin, Xinbing Wang, etc
CSTS: Conditional Semantic Textual Similarity (2023.05.24)
Ameet Deshpande, Carlos E. Jimenez, Howard Chen, Vishvak S. Murahari, Victoria Graf, etc
STAR: Boosting Low-Resource Event Extraction by Structure-to-Text Data Generation with Large Language Models (2023.05.24)
Mingyu Derek Ma, Xiaoxuan Wang, Po-Nien Kung, P. Jeffrey Brantingham, Nanyun Peng, etc
Contrastive Learning of Sentence Embeddings from Scratch (2023.05.24)
Junlei Zhang, Zhenzhong Lan, Junxian He
Meta-Learning Online Adaptation of Language Models (2023.05.24)
Nathan J. Hu, Eric Mitchell, Christopher D. Manning, Chelsea Finn
Who Wrote this Code? Watermarking for Code Generation (2023.05.24)
Taehyun Lee, Seokhee Hong, Jaewoo Ahn, Ilgee Hong, Hwaran Lee, etc
Reasoning over Hierarchical Question Decomposition Tree for Explainable Question Answering (2023.05.24)
Jiajie Zhang, Shulin Cao, Tingjia Zhang, Xin Lv, Jiaxin Shi, etc
Understanding Arithmetic Reasoning in Language Models using Causal Mediation Analysis (2023.05.24)
Alessandro Stolfo, Yonatan Belinkov, Mrinmaya Sachan
Ranger: A Toolkit for Effect-Size Based Multi-Task Evaluation (2023.05.24)
Mete Sertkan, Sophia Althammer, Sebastian Hofstätter
Ghostbuster: Detecting Text Ghostwritten by Large Language Models (2023.05.24)
Vivek Verma, Eve Fleisig, Nicholas Tomlin, Dan Klein
Generating Faithful Synthetic Data with Large Language Models: A Case Study in Computational Social Science (2023.05.24)
Veniamin Veselovsky, Manoel Horta Ribeiro, Akhil Arora, Martin Josifoski, Ashton Anderson, etc
Active Learning for Natural Language Generation (2023.05.24)
Yotam Perlitz, Ariel Gera, Michal Shmueli-Scheuer, Dafna Sheinwald, Noam Slonim, etc
SmartTrim: Adaptive Tokens and Parameters Pruning for Efficient Vision-Language Models (2023.05.24)
Zekun Wang, Jingchang Chen, Wangchunshu Zhou, Ming Liu, Bing Qin
How to Distill your BERT: An Empirical Study on the Impact of Weight Initialisation and Distillation Objectives (2023.05.24)
Xinpeng Wang, Leonie Weissweiler, Hinrich Schütze, Barbara Plank
ChatAgri: Exploring Potentials of ChatGPT on Cross-linguistic Agricultural Text Classification (2023.05.24)
Biao Zhao, Weiqiang Jin, Javier Del Ser, Guang Yang
Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models (2023.05.24)
Gen Luo, Yiyi Zhou, Tianhe Ren, Shengxin Chen, Xiaoshuai Sun, etc
Unlocking Temporal Question Answering for Large Language Models Using Code Execution (2023.05.24)
Xingxuan Li, Liying Cheng, Qingyu Tan, Hwee Tou Ng, Shafiq Joty, etc
Bactrian-X: A Multilingual Replicable Instruction-Following Model with Low-Rank Adaptation (2023.05.24)
Haonan Li, Fajri Koto, Minghao Wu, Alham Fikri Aji, Timothy Baldwin
Injecting Knowledge into Biomedical Pre-trained Models via Polymorphism and Synonymous Substitution (2023.05.24)
Hongbo Zhang, Xiang Wan, Benyou Wang
The Art of SOCRATIC QUESTIONING: Zero-shot Multimodal Reasoning with Recursive Thinking and Self-Questioning (2023.05.24)
Jingyuan Qi, Zhiyang Xu, Ying Shen, Minqian Liu, Di Jin, etc
Reasoning with Language Model is Planning with World Model (2023.05.24)
Shibo Hao, Yi Gu, Haodi Ma, Joshua Jiahua Hong, Zhen Wang, etc
MuLER: Detailed and Scalable Reference-based Evaluation (2023.05.24)
Taelin Karidi, Leshem Choshen, Gal Patel, Omri Abend
Large Language Models are Effective Table-to-Text Generators, Evaluators, and Feedback Providers (2023.05.24)
Yilun Zhao, Haowei Zhang, Shengyun Si, Linyong Nan, Xiangru Tang, etc
Improving Factuality of Abstractive Summarization without Sacrificing Summary Quality (2023.05.24)
Tanay Dixit, Fei Wang, Muhao Chen
MMNet: Multi-Mask Network for Referring Image Segmentation (2023.05.24)
Yichen Yan, Xingjian He, Wenxuan Wan, Jing Liu
Tricking LLMs into Disobedience: Understanding, Analyzing, and Preventing Jailbreaks (2023.05.24)
Abhinav Rao, Sachin Vashistha, Atharva Naik, Somak Aditya, Monojit Choudhury
Editing Commonsense Knowledge in GPT (2023.05.24)
Anshita Gupta, Debanjan Mondal, Akshay Krishna Sheshadri, Wenlong Zhao, Xiang Lorraine Li, etc
Cross-lingual Data Augmentation for Document-grounded Dialog Systems in Low Resource Languages (2023.05.24)
Qi Gou, Zehua Xia, Wen-Hau Du
Trade-Offs Between Fairness and Privacy in Language Modeling (2023.05.24)
Cleo Matzken, Steffen Eger, Ivan Habernal
Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Planning (2023.05.24)
L. Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati
M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection (2023.05.24)
Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, etc
PIVOINE: Instruction Tuning for Open-world Information Extraction (2023.05.24)
Keming Lu, Xiaoman Pan, Kaiqiang Song, Hongming Zhang, Dong Yu, etc
Text encoders are performance bottlenecks in contrastive vision-language models (2023.05.24)
Amita Kamath, Jack Hessel, Kai-Wei Chang
HARD: Hard Augmentations for Robust Distillation (2023.05.24)
Arne F. Nix, Max F. Burg, Fabian H Sinz
Privacy Implications of Retrieval-Based Language Models (2023.05.24)
Yangsibo Huang, Samyak Gupta, Zexuan Zhong, Kai Li, Danqi Chen
Interpretable by Design Visual Question Answering (2023.05.24)
Xingyu Fu, Ben Zhou, Sihao Chen, Mark Yatskar, D. Roth
Leveraging GPT-4 for Automatic Translation Post-Editing (2023.05.24)
Vikas Raunak, Amr Sharaf, Hany Hassan Awadallah, Arul Menezes
CAR: Conceptualization-Augmented Reasoner for Zero-Shot Commonsense Question Answering (2023.05.24)
Weiqi Wang, Tianqing Fang, Wenxuan Ding, Baixuan Xu, Xin Liu, etc
Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and Efficient Pre-LN Transformers (2023.05.24)
Zixuan Jiang, Jiaqi Gu, Hanqing Zhu, D. Pan
Towards Few-shot Entity Recognition in Document Images: A Graph Neural Network Approach Robust to Image Manipulation (2023.05.24)
Prashant Krishnan, Zilong Wang, Yangkun Wang, Jingbo Shang
Debiasing Made State-of-the-art: Revisiting the Simple Seed-based Weak Supervision for Text Classification (2023.05.24)
Chengyu Dong, Zihan Wang, Jingbo Shang
Text Conditional Alt-Text Generation for Twitter Images (2023.05.24)
Nikita Srivatsan, Sofia Samaniego, Omar Florez, Taylor Berg-Kirkpatrick
A Controllable QA-based Framework for Decontextualization (2023.05.24)
Benjamin Newman, Luca Soldaini, Raymond Fok, Arman Cohan, Kyle Lo
SSD-2: Scaling and Inference-time Fusion of Diffusion Language Models (2023.05.24)
Xiaochuang Han, Sachin Kumar, Yulia Tsvetkov, Marjan Ghazvininejad
UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning (2023.05.24)
Ahmed Masry, Parsa Kavehzadeh, Xuan Long Do, Enamul Hoque, Shafiq Joty
ChatFace: Chat-Guided Real Face Editing via Diffusion Latent Space Manipulation (2023.05.24)
Dongxu Yue, Qin Guo, Munan Ning, Jiaxi Cui, Yuesheng Zhu, etc
Trusting Your Evidence: Hallucinate Less with Context-aware Decoding (2023.05.24)
Weijia Shi, Xiaochuang Han, M. Lewis, Yulia Tsvetkov, Luke Zettlemoyer, etc
GlobalBench: A Benchmark for Global Progress in Natural Language Processing (2023.05.24)
Y. Song, Catherine Cui, Simran Khanuja, Pengfei Liu, Fahim Faisal, etc
The student becomes the master: Matching GPT3 on Scientific Factual Error Correction (2023.05.24)
Dhananjay Ashok, Atharva Kulkarni, Hai Pham, Barnabás Póczos
PruMUX: Augmenting Data Multiplexing with Model Compression (2023.05.24)
Yushan Su, Vishvak S. Murahari, Karthik Narasimhan, Kai Li
Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts (2023.05.24)
Sheng Shen, Le Hou, Yanqi Zhou, Nan Du, Shayne Longpre, etc
SELFOOD: Self-Supervised Out-Of-Distribution Detection via Learning to Rank (2023.05.24)
Dheeraj Mekala, Adithya Samavedhi, Chengyu Dong, Jingbo Shang
Fusion-in-T5: Unifying Document Ranking Signals for Improved Information Retrieval (2023.05.24)
S. Yu, Chenghao Fan, Chenyan Xiong, David Jin, Zhiyuan Liu, etc
Emergent inabilities? Inverse scaling over the course of pretraining (2023.05.24)
James A. Michaelov, B. Bergen
Optimal Linear Subspace Search: Learning to Construct Fast and High-Quality Schedulers for Diffusion Models (2023.05.24)
Zhongjie Duan, Chengyu Wang, Cen Chen, Jun Huang, Weining Qian
Ishani Mondal, Michelle Yuan, N Anandhavelu, Aparna Garimella, Francis Ferraro, etc
A Joint Time-frequency Domain Transformer for Multivariate Time Series Forecasting (2023.05.24)
Yushu Chen, Shengzhuo Liu, Jinzhe Yang, Hao Jing, Wenlai Zhao, etc
Meta-review Generation with Checklist-guided Iterative Introspection (2023.05.24)
Qi Zeng, Mankeerat S. Sidhu, Hou Pong Chan, Lu Wang, Heng Ji
Reinforcement Learning finetuned Vision-Code Transformer for UI-to-Code Generation (2023.05.24)
Davit Soselia, Khalid Saifullah, Tianyi Zhou
KNN-LM Does Not Improve Open-ended Text Generation (2023.05.24)
Shufan Wang, Yixiao Song, Andrew Drozdov, Aparna Garimella, Varun Manjunatha, etc
Language Models with Rationality (2023.05.23)
Nora Kassner, Oyvind Tafjord, Ashish Sabharwal, Kyle Richardson, Hinrich Schütze, etc
Margarita Bugueno, Gerard de Melo
A Trip Towards Fairness: Bias and De-Biasing in Large Language Models (2023.05.23)
Leonardo Ranaldi, Elena Sofia Ruzzetti, Davide Venditti, Dario Onorati, Fabio Massimo Zanzotto
Question Answering as Programming for Solving Time-Sensitive Questions (2023.05.23)
Xinyu Zhu, Cheng Yang, Bei Chen, Siheng Li, Jian-Guang Lou, etc
PaD: Program-aided Distillation Specializes Large Models in Reasoning (2023.05.23)
Xuekai Zhu, Biqing Qi, Kaiyan Zhang, Xingwei Long, Bowen Zhou
Aligning Large Language Models through Synthetic Feedback (2023.05.23)
Sungdong Kim, Sanghwan Bae, Jamin Shin, Soyoung Kang, Donghyun Kwak, etc
LogicLLM: Exploring Self-supervised Logic-enhanced Training for Large Language Models (2023.05.23)
Fangkai Jiao, Zhiyang Teng, Shafiq Joty, Bosheng Ding, Aixin Sun, etc
ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer (2023.05.22)
Huadai Liu, Rongjie Huang, Xuan Lin, Wenqiang Xu, Maozong Zheng, etc . - 【arXiv.org】
DADA: Dialect Adaptation via Dynamic Aggregation of Linguistic Rules (2023.05.22)
Yanchen Liu, William Held, Diyi Yang
Knowledge-Retrieval Task-Oriented Dialog Systems with Semi-Supervision (2023.05.22)
Yucheng Cai, Hong Liu, Zhijian Ou, Y. Huang, Junlan Feng
Sentence Representations via Gaussian Embedding (2023.05.22)
Shohei Yoda, Hayato Tsukagoshi, Ryohei Sasano, Koichi Takeda
MacLaSa: Multi-Aspect Controllable Text Generation via Efficient Sampling from Compact Latent Space (2023.05.22)
Hanxing Ding, Liang Pang, Z. Wei, Huawei Shen, Xueqi Cheng, etc
Keeping Up with the Language Models: Robustness-Bias Interplay in NLI Data and Models (2023.05.22)
Ioana Baldini, Chhavi Yadav, Payel Das, K. Varshney
To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis (2023.05.22)
Fuzhao Xue, Yao Fu, Wangchunshu Zhou, Zangwei Zheng, Yang You
Multi-Task Instruction Tuning of LLaMa for Specific Scenarios: A Preliminary Study on Writing Assistance (2023.05.22)
Yue Zhang, Leyang Cui, Deng Cai, Xinting Huang, Tao Fang, etc
InheritSumm: A General, Versatile and Compact Summarizer by Distilling from GPT (2023.05.22)
Yichong Xu, Ruochen Xu, Dan Iter, Yang Liu, Shuo Wang, etc
Making Language Models Better Tool Learners with Execution Feedback (2023.05.22)
Shuofei Qiao, Honghao Gui, Huajun Chen, Ningyu Zhang
GPT-SW3: An Autoregressive Language Model for the Nordic Languages (2023.05.22)
Ariel Ekgren, Amaru Cuba Gyllensten, F. Stollenwerk, Joey Ohman, Tim Isbister, etc
ExplainCPE: A Free-text Explanation Benchmark of Chinese Pharmacist Examination (2023.05.22)
Dongfang Li, Jindi Yu, Baotian Hu, Zhenran Xu, Min Zhang
Infor-Coef: Information Bottleneck-based Dynamic Token Downsampling for Compact and Efficient language model (2023.05.21)
Wenxin Tan
Contrastive Learning with Logic-driven Data Augmentation for Logical Reasoning over Text (2023.05.21)
Qiming Bao, Alex Yuxuan Peng, Zhenyun Deng, Wanjun Zhong, Neset Tan, etc
Retrieving Texts based on Abstract Descriptions (2023.05.21)
Shauli Ravfogel, Valentina Pyatkin, Amir D. N. Cohen, Avshalom Manevich, Yoav Goldberg
Pruning Pre-trained Language Models with Principled Importance and Self-regularization (2023.05.21)
Siyu Ren, Kenny Q. Zhu
Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers (2023.05.21)
Linyuan Gong, Chenyan Xiong, Xiaodong Liu, Payal Bajaj, Yiqing Xie, etc
Pointwise Mutual Information Based Metric and Decoding Strategy for Faithful Generation in Document Grounded Dialogs (2023.05.20)
Yatin Nandwani, Vineet Kumar, Dinesh Raghu, Sachindra Joshi, L. Lastras
Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization (2023.05.19)
Mengqi Huang, Zhendong Mao, Zhuowei Chen, Yongdong Zhang . - 【arXiv.org】
Late-Constraint Diffusion Guidance for Controllable Image Synthesis (2023.05.19)
Chang Liu, Dong Liu . - 【arXiv.org】
Self-QA: Unsupervised Knowledge Guided Language Model Alignment (2023.05.19)
Xuanyu Zhang, Qing Yang
Self-Agreement: A Framework for Fine-tuning Language Models to Find Agreement among Diverse Opinions (2023.05.19)
Shiyao Ding, Takayuki Ito . - 【arXiv.org】
BOLT: Fast Energy-based Controlled Text Generation with Tunable Biases (2023.05.19)
Xin Liu, Muhammad Khalifa, Lu Wang
STOAT: Structured Data to Analytical Text With Controls (2023.05.19)
Deepanway Ghosal, Preksha Nema, A. Raghuveer . - 【arXiv.org】
Decouple knowledge from paramters for plug-and-play language modeling (2023.05.19)
Xin Cheng, Yankai Lin, Xiuying Chen, Dongyan Zhao, Rui Yan . - 【arXiv.org】
Enhancing Personalized Dialogue Generation with Contrastive Latent Variables: Combining Sparse and Dense Persona (2023.05.19)
Yihong Tang, Bo Wang, Miao Fang, Dongming Zhao, Kun Huang, etc . - 【arXiv.org】
LLM Itself Can Read and Generate CXR Images (2023.05.19)
Suhyeon Lee, Won Jun Kim, Jong-Chul Ye . - 【arXiv.org】
Post Hoc Explanations of Language Models Can Improve Language Models (2023.05.19)
Satyapriya Krishna, Jiaqi Ma, Dylan Slack, Asma Ghandeharioun, etc . - 【arXiv.org】
Federated Foundation Models: Privacy-Preserving and Collaborative Learning for Large Models (2023.05.19)
Sixing Yu, J. P. Muñoz, A. Jannesari . - 【arXiv.org】
Do Models Really Learn to Follow Instructions? An Empirical Study of Instruction Tuning (2023.05.19)
Po-Nien Kung, Nanyun Peng . - 【arXiv.org】
Democratized Diffusion Language Model (2023.05.18)
Nikita Balagansky, Daniil Gavrilov . - 【arXiv.org】
VideoFactory: Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation (2023.05.18)
Wenjing Wang, Huan Yang, Zixi Tuo, Huiguo He, Junchen Zhu, etc . - 【arXiv.org】
LDM3D: Latent Diffusion Model for 3D (2023.05.18)
Gabriela Ben Melech Stan, Diana Wofk, Scottie Fox, Alex Redden, Will Saxton, etc . - 【arXiv.org】
Catch-Up Distillation: You Only Need to Train Once for Accelerating Sampling (2023.05.18)
Shitong Shao, Xu Dai, Shouyi Yin, Lujun Li, Huanran Chen, etc . - 【arXiv.org】
Ahead-of-Time P-Tuning (2023.05.18)
Daniil Gavrilov, Nikita Balagansky . - 【arXiv.org】
Zero-Day Backdoor Attack against Text-to-Image Diffusion Models via Personalization (2023.05.18)
Yihao Huang, Qing Guo, Felix Juefei-Xu . - 【arXiv.org】
SimOAP: Improve Coherence and Consistency in Persona-based Dialogue Generation via Over-sampling and Post-evaluation (2023.05.18)
Junkai Zhou, Liang Pang, Huawei Shen, Xueqi Cheng . - 【arXiv.org】
How does the task complexity of masked pretraining objectives affect downstream performance? (2023.05.18)
Atsuki Yamaguchi, Hiroaki Ozaki, Terufumi Morishita, Gaku Morio, Yasuhiro Sogawa . - 【arXiv.org】
Ditto: A Simple and Efficient Approach to Improve Sentence Embeddings (2023.05.18)
Qian Chen, Wen Wang, Qinglin Zhang, Siqi Zheng, Chong Deng, etc . - 【arXiv.org】
ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval (2023.05.18)
Yue Yu, Yuchen Zhuang, Rongzhi Zhang, Yu Meng, Jiaming Shen, etc . - 【arXiv.org】
LIMA: Less Is More for Alignment (2023.05.18)
Chunting Zhou, Pengfei Liu, Puxin Xu, Srini Iyer, Jiao Sun, etc . - 【arXiv.org】
The Web Can Be Your Oyster for Improving Large Language Models (2023.05.18)
Junyi Li, Tianyi Tang, Wayne Xin Zhao, Jingyuan Wang, J. Nie, etc . - 【arXiv.org】
TOME: A Two-stage Approach for Model-based Retrieval (2023.05.18)
Ruiyang Ren, Wayne Xin Zhao, J. Liu, Huaqin Wu, Ji-rong Wen, etc . - 【arXiv.org】
When Gradient Descent Meets Derivative-Free Optimization: A Match Made in Black-Box Scenario (2023.05.17)
Chengcheng Han, Liqing Cui, Renyu Zhu, J. Wang, Nuo Chen, etc . - 【arXiv.org】
SLiC-HF: Sequence Likelihood Calibration with Human Feedback (2023.05.17)
Yao Zhao, Rishabh Joshi, Tianqi Liu, Misha Khalman, Mohammad Saleh, etc . - 【arXiv.org】
LeTI: Learning to Generate from Textual Interactions (2023.05.17)
Xingyao Wang, Hao Peng, Reyhaneh Jabbarvand, Heng Ji . - 【arXiv.org】
M3KE: A Massive Multi-Level Multi-Subject Knowledge Evaluation Benchmark for Chinese Large Language Models (2023.05.17)
Chuang Liu, Renren Jin, Yuqi Ren, Linhao Yu, Tianyu Dong, etc . - 【arXiv.org】
Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling (2023.05.17)
Weijia Xu, Andrzej Banburski-Fahey, N. Jojic . - 【arXiv.org】
Prompt Engineering for Healthcare: Methodologies and Applications (2023.04.28)
Jiaqi Wang, Enze Shi, Sigang Yu, Zihao Wu, Chong Ma, etc . - 【arXiv.org】
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond (2023.04.26)
Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, etc
A Survey of Large Language Models (2023.03.31)
On the Creativity of Large Language Models (2023.03.27)
Giorgio Franceschelli, Mirco Musolesi . - 【arXiv.org】
Augmented Language Models: a Survey (2023.02.15)
Grégoire Mialon, Roberto Dessì, M. Lomeli, Christoforos Nalmpantis, Ramakanth Pasunuru, etc . - 【arXiv.org】
A Survey for In-context Learning (2022.12.31)
Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng, Zhiyong Wu, etc . - 【arXiv.org】
Towards Reasoning in Large Language Models: A Survey (2022.12.20)
Jie Huang, K. Chang . - 【arXiv.org】
Reasoning with Language Model Prompting: A Survey (2022.12.19)
Shuofei Qiao, Yixin Ou, Ningyu Zhang, Xiang Chen, Yunzhi Yao, etc . - 【arXiv.org】
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing (2021.07.28)
Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, etc . - 【ACM Computing Surveys】
Aligning Large Language Models with Human: A Survey
Yufei Wang, Wanjun Zhong, Liangyou Li, Fei Mi, Xingshan Zeng, etc . - 【arXiv.org】