[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
Updated Oct 20, 2024 - Jupyter Notebook
DSIR large-scale data selection framework for language model training
A Survey on Data Selection for Language Models
⛔ [DEPRECATED] Adapt Transformer-based language models to new text domains
InstructionGPT-4
[ACL 2023] The code for our ACL'23 paper Cold-Start Data Selection for Few-shot Language Model Fine-tuning: A Prompt-Based Uncertainty Propagation Approach
Enhancing Efficiency in Multidevice Federated Learning through Data Selection
This is an official repository for "Performance Scaling via Optimal Transport: Enabling Data Selection from Partially Revealed Sources" (NeurIPS 2023).
Enhanced spatio-temporal electric load forecasts with less data using active deep learning
Keras sentence classification
Repository for the experiments in my paper accepted to the CLIN Journal: "Selecting Parallel In-domain Sentences for Neural Machine Translation Using Monolingual Texts"
Dynamic Transfer Learning for Low-Resource Neural Machine Translation
Code for NeurIPS 2023 Paper (Imitation Learning from Imperfection: Theoretical Justifications and Algorithms)
A Python package for studying neural learning
This repository contains the data and code for the paper "Self-training with Two-phase Self-augmentation for Few-shot Dialogue Generation" (EMNLP2022-Findings).
CORE: Mitigating Catastrophic Forgetting in Continual Learning through Cognitive Replay (CogSci 2024 Oral)
Code for Generative Deduplication For Social Media Data Selection (Findings of EMNLP 2024)
Quilt: Robust Data Segment Selection against Concept Drifts (AAAI 2024)
A project to select only part of a PDF file. It's useful when you want to extract information with a Python library such as fitz.