- [In-Depth Understanding of the Emergent Abilities of Language Models]
- [Dissecting and Tracing the Origins of GPT-3.5's Capabilities]
- [Towards Complex Reasoning: the Polaris of Large Language Models]
- [GPT3] Language Models are Few-Shot Learners, NeurIPS 2020
- Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity, ACL 2022
- Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers
- Transformers learn in-context by gradient descent
- Large Language Models Are Implicitly Topic Models: Explaining and Finding Good Demonstrations for In-Context Learning
- [InstructGPT] Training language models to follow instructions with human feedback, NeurIPS 2022
- [T0] Multitask Prompted Training Enables Zero-Shot Task Generalization, ICLR 2022
- [Flan-T5/PaLM] Scaling Instruction-Finetuned Language Models
- [Flan2022] The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
- InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning, EMNLP 2022
- Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes, ACL 2023 Findings
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models, NeurIPS 2022
- Synthetic Prompting: Generating Chain-of-Thought Demonstrations for Large Language Models
- Self-Consistency Improves Chain of Thought Reasoning in Language Models, ICLR 2023 (a minimal decoding sketch follows this list)
- Multimodal Chain-of-Thought Reasoning in Language Models
- Complexity-Based Prompting for Multi-Step Reasoning
- Generated Knowledge Prompting for Commonsense Reasoning, ACL 2022
- Generating Training Data with Language Models: Towards Zero-Shot Language Understanding, NeurIPS 2022
- Self-Instruct: Aligning Language Models with Self-Generated Instructions
- Generate rather than Retrieve: Large Language Models are Strong Context Generators, ICLR 2023
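
Several entries above (chain-of-thought prompting, self-consistency, complexity-based prompting) build on the same decoding recipe: prompt the model with a worked reasoning example, sample several reasoning paths at non-zero temperature, and majority-vote over the extracted final answers. The sketch below illustrates that recipe in plain Python; it is a minimal illustration, not code from any of the papers. `sample_reasoning_path` is a hypothetical stand-in for an LLM API call, and the prompt, question, and canned completions are made-up examples.

```python
import random
import re
from collections import Counter

# One worked example ("chain of thought") in the prompt; illustrative only.
COT_PROMPT = """Q: A farm has 3 pens with 4 sheep each. 2 sheep are sold. How many sheep remain?
A: There are 3 * 4 = 12 sheep in total. After selling 2, 12 - 2 = 10 remain. The answer is 10.

Q: {question}
A:"""


def sample_reasoning_path(prompt: str, temperature: float = 0.7) -> str:
    """Hypothetical stand-in for a sampled LLM completion.

    A real implementation would call a language model with temperature > 0
    so that each call can return a different reasoning path; the canned
    outputs below only mimic that behaviour.
    """
    return random.choice([
        "The cafeteria starts with 23 apples, uses 20, so 3 are left; buying 6 more gives 3 + 6 = 9. The answer is 9.",
        "23 - 20 = 3 apples remain, then 3 + 6 = 9 after the purchase. The answer is 9.",
        "23 + 6 = 29 apples, minus the 20 used leaves 29 - 20 = 9. The answer is 9.",
        "23 - 20 = 3, then 3 + 6 = 8. The answer is 8.",  # a faulty path that the vote can overrule
    ])


def extract_answer(completion: str) -> str | None:
    """Pull the final answer out of a 'The answer is X.' style conclusion."""
    match = re.search(r"[Tt]he answer is\s*([^.\n]+)", completion)
    return match.group(1).strip() if match else None


def self_consistent_answer(question: str, num_samples: int = 10) -> str:
    """Sample several chain-of-thought paths and majority-vote the final answers."""
    prompt = COT_PROMPT.format(question=question)
    answers = [extract_answer(sample_reasoning_path(prompt)) for _ in range(num_samples)]
    votes = Counter(a for a in answers if a is not None)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    q = "The cafeteria had 23 apples. They used 20 for lunch and bought 6 more. How many do they have?"
    print(self_consistent_answer(q))  # majority answer, e.g. "9"
```

With `num_samples = 1` this reduces to plain chain-of-thought decoding; the self-consistency paper reports that voting over many sampled paths is substantially more accurate on multi-step arithmetic and commonsense questions.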