Anti-Profanity

A FastText classifier that detects profanity, for filtering user text input in games, forums, and other applications.
Built with Facebook's FastText vectorizer and text classifier, following Swastik Gupta's tutorial "Profanity Detection with FastText".

What's in the Repo

preprocess.py - Formats train.csv into processed-train.txt using a CSV reader and NLTK's lemmatizer and stop-word list
train.py - Trains the FastText classifier on processed-train.txt, then quantizes, saves, and tests the model
test.py - Loads the model and tests it with words
train.csv - The training data from Kaggle's Toxic Comment Classification Challenge
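
To make the preprocessing step concrete, here is a minimal sketch of turning Kaggle-style CSV rows into FastText training lines. It is not the code in preprocess.py: the label names `__label__toxic` and `__label__clean`, the rule that any flagged column marks a row as toxic, and the tiny `STOP_WORDS` set (a stand-in for NLTK's stop-word list, with lemmatization left out) are all assumptions made for illustration.

```python
import csv
import io
import re

# Tiny stand-in for NLTK's English stop-word list (assumption for this sketch).
STOP_WORDS = {"a", "an", "and", "are", "for", "is", "the", "to", "you"}

# Label columns of the Kaggle Toxic Comment Classification Challenge CSV.
TOXIC_COLUMNS = ("toxic", "severe_toxic", "obscene",
                 "threat", "insult", "identity_hate")


def clean_text(text):
    """Lowercase, keep runs of letters/apostrophes, and drop stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return " ".join(w for w in words if w not in STOP_WORDS)


def to_fasttext_line(row):
    """Turn one CSV row (a dict) into a '__label__X text' line."""
    # Assumed rule: a row is toxic if any label column is set to "1".
    is_toxic = any(row.get(col) == "1" for col in TOXIC_COLUMNS)
    label = "__label__toxic" if is_toxic else "__label__clean"  # hypothetical names
    return f"{label} {clean_text(row['comment_text'])}"


def convert(csv_file, out_file):
    """Stream rows from an open CSV file into FastText training lines."""
    for row in csv.DictReader(csv_file):
        out_file.write(to_fasttext_line(row) + "\n")


# Two made-up rows in the Kaggle column layout, processed in memory.
sample = io.StringIO(
    "id,comment_text,toxic,severe_toxic,obscene,threat,insult,identity_hate\n"
    '1,"You are the worst, idiot!",1,0,0,0,1,0\n'
    '2,"Thanks for the help.",0,0,0,0,0,0\n'
)
output = io.StringIO()
convert(sample, output)
print(output.getvalue(), end="")
```

With real files, `convert` would be called with train.csv opened for reading and processed-train.txt opened for writing.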

Usage?

Install the dependencies:
pip install fasttext nltk
And then run:
python preprocess.py
to process the original training data. (The output, processed-train.txt, is already provided, since processing takes a while. The training data was last downloaded on August 14, 2024.)
Finally, run:
python train.py
to train the model and quantize it. This should take 5 to 10 minutes, depending on your hardware.
You should now have a model! Congrats!
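
For reference, the train, quantize, save, and test sequence can be sketched with the FastText Python API roughly as follows. This is not the contents of train.py: the default hyperparameters, the output name `profanity.ftz`, and the use of test.txt as a labeled evaluation file are assumptions for illustration.

```python
def train_and_quantize(train_path="processed-train.txt",
                       test_path="test.txt",          # assumed labeled test file
                       model_path="profanity.ftz"):   # hypothetical output name
    """Train a supervised model, quantize it, save it, and evaluate it."""
    # Imported inside the function so the module can be loaded
    # (and summarize used) even where fasttext is not installed.
    import fasttext

    # Train with FastText's default hyperparameters.
    model = fasttext.train_supervised(input=train_path)
    # Compress the model; retrain=True fine-tunes it after quantization.
    model.quantize(input=train_path, retrain=True)
    model.save_model(model_path)
    # Returns a tuple: (number of samples, precision, recall).
    return model.test(test_path)


def summarize(result):
    """Format the (samples, precision, recall) tuple for printing."""
    n, precision, recall = result
    return f"samples={n} precision={precision:.3f} recall={recall:.3f}"
```

Running `print(summarize(train_and_quantize()))` from the repo directory would train the model and print its scores. A saved model can later be loaded with `fasttext.load_model` and queried with `model.predict`.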

Credits

Facebook - Thanks for your amazing FastText library, highly appreciated
Kaggle - Thanks for your work on putting all toxic comments in one place
NLTK - Thanks for your library to remove useless noise in the English Language
Swastik Gupta - For putting the idea together, you really should be the one to release this
My savior, Jesus Christ - The sole reason my life has a purpose. If you want to see his work, just look at everything that is good.

This was created by PonderForge. If you use this code, give credit where credit is due.
Psalm 111:2 "Great are the works of the LORD; they are pondered by all who delight in them."
