This repository mitigates gender bias in text classification models by removing stereotypical examples (paper). The classifiers come from 🤗 Hugging Face (link).
The healthy data diet approach can be summarized in three steps:
- Generating the counterfactual examples
- Finding the important examples for fairness using the GE score
- Adding a pruned version of the original dataset to the important counterfactual examples
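
The three steps above can be sketched roughly as follows. This is an illustrative outline only, not the code from the notebook: the word list `GENDER_PAIRS`, the helper names `counterfactual` and `build_training_set`, the "higher score means more important" convention, and the random pruning of the original data are all assumptions made for the example. The GE score itself is not computed here; the sketch takes one precomputed score per example as input.

```python
import random
import re

# Hypothetical, deliberately small swap list (an assumption for this sketch).
# Ambiguous pronouns such as "her"/"his"/"him" are left out on purpose.
GENDER_PAIRS = [
    ("he", "she"), ("man", "woman"), ("men", "women"),
    ("father", "mother"), ("son", "daughter"), ("boy", "girl"),
]
SWAP = {}
for a, b in GENDER_PAIRS:
    SWAP[a] = b
    SWAP[b] = a

# Match any listed word as a whole word, ignoring case.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(w) for w in SWAP) + r")\b", re.IGNORECASE
)


def counterfactual(text):
    """Step 1: swap gendered words to produce a counterfactual sentence."""
    def repl(match):
        word = match.group(0)
        swapped = SWAP[word.lower()]
        # Preserve a leading capital letter.
        return swapped.capitalize() if word[0].isupper() else swapped
    return _PATTERN.sub(repl, text)


def build_training_set(original, scores, n_counterfactual, n_original, seed=0):
    """Steps 2 and 3, with placeholder scores.

    original: list of (text, label) pairs.
    scores:   one precomputed importance score per example (stand-in for
              the GE score; higher is assumed to mean more important).
    """
    # Step 2: keep the counterfactuals of the highest-scoring examples.
    ranked = sorted(range(len(original)), key=lambda i: scores[i], reverse=True)
    important = [
        (counterfactual(original[i][0]), original[i][1])
        for i in ranked[:n_counterfactual]
    ]
    # Step 3: add a pruned copy of the original data. Random pruning is
    # an assumption here, used only to keep the sketch short.
    pruned = random.Random(seed).sample(original, n_original)
    return pruned + important


data = [
    ("The man said she was late.", 0),
    ("The theme of the talk was clear.", 1),
    ("Her father is a nurse.", 0),
]
train = build_training_set(data, scores=[0.9, 0.1, 0.5],
                           n_counterfactual=2, n_original=2)
```

With the toy scores above, the first and third sentences are selected for counterfactual augmentation, and two randomly chosen original examples are kept alongside them.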
Follow the instructions in the healthy_data_diet_AAAI23.ipynb file to run the experiments in the paper.
@article{zayed2022deep,
title={Deep Learning on a Healthy Data Diet: Finding Important Examples for Fairness},
author={Zayed, Abdelrahman and Parthasarathi, Prasanna and Mordido, Goncalo and Palangi, Hamid and Shabanian, Samira and Chandar, Sarath},
journal={arXiv preprint arXiv:2211.11109},
year={2022}
}