Indonesian Social Media Text Toxicity Classification

In summary, our contributions are:
- Created an Indonesian social media post toxicity dataset with four labels: pornography, racism, radicalism, and hate speech
- Performed exploratory data analysis, data preprocessing, and modeling for the toxic content classification task
- Compared the performance of various machine learning models on this task

├── LICENSE
├── README.md <- The top-level README for developers using this project.
├── data
│ ├── processed <- The final, canonical data sets for modeling.
│ └── raw <- The original, immutable data dump.
└── notebooks <- Jupyter notebooks. Naming convention is a number (for ordering) and a short description.

- Ahmad Izzan
- Christian Wibisono
- Ilham Firdausi Putra