Members: Anmol Agarwal, Ajinkya Deshpande, Shashank Shet, Arun Iyer, Suresh Parthasarathy
Affiliation: Microsoft Research India
EDIT: The current submission is ranked 2nd across all entries when evaluated on a combination of the hidden (secret) eval tasks and the public HELM tasks.
Our datasets were derived from CNN-DM, MMLU, BigBench, TruthfulQA, BBQ, ARC (by AllenAI), GSM-8k, and MathQA. We include not only data from the tasks mentioned in the sample_conf file, but also other diverse tasks from BigBench. A sketch of how such a mixture can be assembled follows.
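The sketch below is a minimal, hypothetical illustration of combining two of these sources into a single prompt/completion dataset with the HuggingFace `datasets` library; it is not our exact pipeline, and the `to_prompt_*` adapters are illustrative helpers we name here for clarity.

```python
# Minimal sketch: merge GSM-8K and CNN/DailyMail into one prompt/completion
# dataset. The `to_prompt_*` helpers are hypothetical adapters, not part of
# our released code.
from datasets import load_dataset, concatenate_datasets

def to_prompt_gsm8k(ex):
    # GSM-8K rows have "question" and "answer" fields.
    return {"prompt": ex["question"], "completion": ex["answer"]}

def to_prompt_cnndm(ex):
    # CNN/DailyMail rows have "article" and "highlights" fields.
    return {"prompt": "Summarize:\n" + ex["article"], "completion": ex["highlights"]}

gsm8k = load_dataset("gsm8k", "main", split="train").map(
    to_prompt_gsm8k, remove_columns=["question", "answer"])
cnndm = load_dataset("cnn_dailymail", "3.0.0", split="train").map(
    to_prompt_cnndm, remove_columns=["article", "highlights", "id"])

# Both datasets now share the same (prompt, completion) schema, so they
# can be concatenated and shuffled into one training mixture.
combined = concatenate_datasets([gsm8k, cnndm]).shuffle(seed=42)
```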
To improve the robustness and fairness of our models, we also include special queries in our dataset that have been perturbed in the same way HELM perturbs its inputs when measuring robustness and fairness. Our initial findings suggested that models like Mistral are very sensitive to the order in which multiple-choice options are presented, so we also shuffled the options in each query to encourage option-permutation invariance, which led to slight gains in performance. A sketch of this augmentation is shown below.
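The following is a minimal sketch of the option-shuffling augmentation, assuming a query format of a question stem followed by lettered options; the exact prompt format we trained on may differ.

```python
# Hedged sketch of option shuffling for permutation invariance. The
# stem-plus-lettered-options format is an assumption about the query layout.
import random
import string

def shuffle_options(stem, options, answer_idx, rng=random):
    """Return a re-lettered copy of a multiple-choice query with the
    options in random order, plus the new index of the gold answer."""
    order = list(range(len(options)))
    rng.shuffle(order)
    new_options = [options[i] for i in order]
    new_answer_idx = order.index(answer_idx)
    lines = [stem] + [
        f"{string.ascii_uppercase[i]}. {opt}"
        for i, opt in enumerate(new_options)
    ]
    return "\n".join(lines), new_answer_idx

# Example: the gold answer "Paris" keeps its label consistent after shuffling.
query, gold = shuffle_options(
    "What is the capital of France?",
    ["London", "Paris", "Berlin", "Madrid"],
    answer_idx=1,
)
```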
We also use an ensemble of models. For each query, we use regular expressions to classify which sort of task it belongs to (a fact/knowledge-based task or a reasoning-based task) and route it to the corresponding model; a sketch follows.
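Below is a minimal sketch of such a regex router, assuming the two-way split described above; the patterns and model names are illustrative, not the ones used in our actual submission.

```python
# Hedged sketch of regex-based query routing. Patterns and model names
# are illustrative assumptions, not our shipped configuration.
import re

REASONING_PATTERNS = re.compile(
    r"\b(how many|calculate|compute|solve|step by step|\d+\s*[+\-*/]\s*\d+)\b",
    re.IGNORECASE,
)

def route_query(query: str) -> str:
    """Send arithmetic/reasoning-looking queries to the reasoning model,
    and everything else to the knowledge model."""
    if REASONING_PATTERNS.search(query):
        return "reasoning_model"
    return "knowledge_model"

assert route_query("Calculate 12 * 7 step by step.") == "reasoning_model"
assert route_query("Who wrote Pride and Prejudice?") == "knowledge_model"
```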
For any queries, please contact us at t-agarwalan@microsoft.com, anmolagarwal4453@gmail.com, or ariy@microsoft.com, or feel free to open an issue. We operate from the GMT+5:30 timezone, so it may take us a day to respond.
The dataset used by our winning submission is available at: https://huggingface.co/datasets/ajdesh2000/pegasus_combined_general_train_dataset