Automated-Essay-Scoring-2.0

Overview

GitHub repo of the 1,495th-place solution for the Kaggle competition Automated Essay Scoring 2.0, hosted by the Learning Agency Lab.

Final performance:

0.821 private LB
0.808 public LB

Usage

This project was run on Kaggle, using T4 GPUs for accelerated training and inference. To replicate the setup:

Set Up Kaggle Environment:
- Navigate to Kaggle and create an account if you don’t have one.
- Create a new notebook and select the T4 x 2 GPU accelerator.
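Import Data and Code:
- Import the competition dataset.
- Upload code from this repository to your Kaggle notebook.
Run Each Models-Preload Python File:
- Run deberta-models-preload.py, longformer-models-preload.py, and xlnet-models-preload.py.
Run the Inference Python File:
- Import the post-run files from the previous step into the inference Kaggle notebook.
- Run inference.py.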

Final Approach

Fine-Tuning LLMs: Using Hugging Face AutoModelForSequenceClassification (num_labels=1), three pre-trained LLMs were fine-tuned on the competition dataset.
- Longformer - 2048 max embeddings
- DeBERTa V3 small - 1024 max embeddings
- XLNet - 1024 max embeddings
Ensemble Averaging: Predictions from the three fine-tuned models were averaged.
Optuna Hyperparameter Tuning: Optuna was used to optimize the thresholds that convert the model predictions into final score classes. (source)
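
The averaging and thresholding steps above can be sketched as follows. This is a minimal illustration, not the repository's actual code. It assumes essay scores run from 1 to 6, that each model produces one continuous prediction per essay, and that quadratic weighted kappa is the tuning objective. All function names, sample predictions, and search ranges are hypothetical. Optuna is imported lazily, so everything except tune_thresholds runs with the standard library alone.

```python
from bisect import bisect_right
from collections import Counter


def ensemble_average(*model_preds):
    """Equal-weight average of per-essay predictions from several models."""
    return [sum(p) / len(p) for p in zip(*model_preds)]


def apply_thresholds(preds, thresholds, min_score=1):
    """Map continuous predictions to integer scores using sorted cut points.

    With 5 cut points, a prediction below the first becomes min_score and a
    prediction at or above the last becomes min_score + 5.
    """
    return [min_score + bisect_right(thresholds, p) for p in preds]


def quadratic_weighted_kappa(y_true, y_pred, min_score=1, max_score=6):
    """Agreement between two integer ratings, penalising large misses more."""
    n = max_score - min_score + 1
    observed = Counter(zip(y_true, y_pred))
    hist_true, hist_pred = Counter(y_true), Counter(y_pred)
    total = len(y_true)
    num = den = 0.0
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            weight = (i - j) ** 2 / (n - 1) ** 2
            num += weight * observed[(i, j)]
            den += weight * hist_true[i] * hist_pred[j] / total
    return 1.0 - num / den if den else 1.0


def tune_thresholds(preds, y_true, n_trials=200, seed=0):
    """Search for cut points that maximise kappa (assumed objective)."""
    import optuna  # third-party dependency, only needed for tuning

    def objective(trial):
        # Each cut point gets its own non-overlapping range around k + 0.5,
        # so the suggested thresholds are always sorted.
        cuts = [trial.suggest_float(f"t{k}", k + 0.1, k + 0.9) for k in range(1, 6)]
        return quadratic_weighted_kappa(y_true, apply_thresholds(preds, cuts))

    sampler = optuna.samplers.TPESampler(seed=seed)
    study = optuna.create_study(direction="maximize", sampler=sampler)
    study.optimize(objective, n_trials=n_trials)
    return [study.best_params[f"t{k}"] for k in range(1, 6)]


# Made-up predictions for four essays from three hypothetical models.
longformer = [1.2, 2.6, 3.4, 5.8]
deberta = [1.6, 2.0, 3.9, 5.5]
xlnet = [1.4, 2.7, 3.5, 5.6]

averaged = ensemble_average(longformer, deberta, xlnet)
scores = apply_thresholds(averaged, [1.5, 2.5, 3.5, 4.5, 5.5])
```

In practice the thresholds would be tuned on held-out predictions, for example tune_thresholds(validation_preds, validation_scores), and then applied unchanged to the averaged test predictions.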

Approaches Not Included in Final Submission

Early submissions scored as low as LB 0.59 and gradually improved over the competition. The following approaches did not improve performance:

Embedding extraction with MLP Classifier
Adding PERSUADE Corpus 2.0 data to the training set
Stratified K-fold training
Fine-tuning LLMs with 512 max embeddings
Ensembling 6+ models
AutoModelForSequenceClassification (num_labels=6)
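
To make the difference between the num_labels=1 and num_labels=6 setups concrete, here is a hedged sketch of how labels and outputs differ between the two heads. It is not taken from the repository. It assumes scores run from 1 to 6, and the helper names are hypothetical. With num_labels=1, Hugging Face sequence-classification models compute a mean-squared-error regression loss on float labels. With num_labels=6, they expect integer class ids starting at 0. The transformers import is deferred inside load_scorer so the label helpers can be tried without it.

```python
def encode_labels(scores, num_labels):
    """Prepare training labels for the chosen head.

    num_labels=1: float regression targets (the raw score).
    num_labels=6: zero-based class ids (score - 1).
    """
    if num_labels == 1:
        return [float(s) for s in scores]
    return [int(s) - 1 for s in scores]


def decode_outputs(logit_rows, num_labels):
    """Turn per-essay logits (a list of lists) back into scores.

    A regression head yields one continuous value per essay, which still
    needs rounding or thresholding. A classification head yields one logit
    per class, and the arg-max class id is shifted back to the 1-6 range.
    """
    if num_labels == 1:
        return [row[0] for row in logit_rows]
    return [max(range(len(row)), key=row.__getitem__) + 1 for row in logit_rows]


def load_scorer(checkpoint, num_labels=1):
    """Load a tokenizer and a sequence-classification model.

    The checkpoint name is supplied by the caller; this README does not
    state which checkpoints were used.
    """
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=num_labels
    )
    return tokenizer, model
```

One reason a regression head can suit this task is that essay scores are ordered, so a miss by one point is penalised less than a miss by three, whereas plain classification treats all wrong classes alike.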

Contact

For any questions or comments, please connect with me on LinkedIn.

nolan-clark/Automated-Essay-Scoring-2.0