ML moderation bot for Reddit.
Uses machine learning to flag comments that are possibly toxic.
Based on the Python chatbot by Tech with Tim.
- Requires Python 3.6. Virtual environment recommended.
- Use pip3 install -r requirements.txt to install dependencies.
- Configure the praw.ini file, located at env/lib/python3.6/site-packages/praw, with your bot tokens. See the PRAW documentation.
- Edit settings.json to match your PRAW configuration. Copy the sample configuration from the sample-config folder.
- Create the intents.json file in the training folder. You can copy the sample configuration from the sample-config folder or modify the categories and labels.
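
As a rough guide to what the praw.ini step involves, the sketch below parses a sample site section with Python's configparser and reports any missing keys. The section name moderation_bot and every value are placeholders, and the key list assumes a script-type Reddit app using password login; it is not this project's actual configuration.

```python
import configparser

# Sample praw.ini content. The section name "moderation_bot" and every value
# are placeholders (assumptions), not taken from this project.
SAMPLE_PRAW_INI = """
[moderation_bot]
client_id=YOUR_CLIENT_ID
client_secret=YOUR_CLIENT_SECRET
username=YOUR_BOT_USERNAME
password=YOUR_BOT_PASSWORD
user_agent=script:moderation-bot:v0.1 (by u/YOUR_USERNAME)
"""

# Keys a password-flow script app typically needs (assumed setup).
REQUIRED_KEYS = ("client_id", "client_secret", "username", "password", "user_agent")


def missing_keys(ini_text, site_name):
    """Return the required keys that are absent from the given site section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section(site_name):
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS if not parser.has_option(site_name, key)]


print(missing_keys(SAMPLE_PRAW_INI, "moderation_bot"))
```

With a section like this in place, PRAW can load it by site name, for example praw.Reddit("moderation_bot").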
Use this method if no model has been trained yet. It gathers the latest comments and prints them. As you sort the comments, the script automatically populates the intents.json file for you.
- Run the collecter.py script.
- Comments will be collected automatically and will await user input.
- Type:
  - a: If the comment is acceptable.
  - n: If the comment is neutral.
  - w: If the comment is considered to be a warning.
- Type any other character to skip the entry.
- Press Enter to submit.
- Rebuild the model as needed.
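
The steps above could be sketched roughly as follows. This is not the code in collecter.py. The tag names, the helper names, and the intents.json layout (a top-level "intents" list of objects with "tag" and "patterns") are assumptions based on the format used in the Tech with Tim chatbot tutorial. The reddit argument is expected to be a praw.Reddit instance.

```python
import json

# Key-to-tag mapping from the prompt described above; the tag names
# themselves ("acceptable", "neutral", "warning") are assumptions.
KEY_TO_TAG = {"a": "acceptable", "n": "neutral", "w": "warning"}


def latest_comment_bodies(reddit, subreddit_name, limit=25):
    """Return the text of the newest comments in a subreddit.

    `reddit` is expected to be a praw.Reddit instance.
    """
    return [c.body for c in reddit.subreddit(subreddit_name).comments(limit=limit)]


def add_labeled_comment(intents, comment, key):
    """Append `comment` to the matching tag's patterns; return False if skipped."""
    tag = KEY_TO_TAG.get(key.strip().lower())
    if tag is None:
        return False  # any other character skips the entry
    for intent in intents["intents"]:
        if intent["tag"] == tag:
            intent["patterns"].append(comment)
            return True
    # First comment for this tag: create the entry.
    intents["intents"].append({"tag": tag, "patterns": [comment]})
    return True


def label_interactively(comments, intents, ask=input):
    """Prompt for each comment and return how many were labeled."""
    labeled = 0
    for comment in comments:
        print(comment)
        if add_labeled_comment(intents, comment, ask("[a/n/w, other = skip]: ")):
            labeled += 1
    return labeled


def save_intents(intents, path):
    """Write the intents dictionary to disk as indented JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(intents, handle, indent=2)
```

Passing the prompt function as ask keeps the loop testable without a terminal; in normal use the default input is enough.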
This method gathers the latest comments and prints them, along with the category the current model assigns to each one. As you sort the comments, the script automatically populates the intents.json file for you.
- Run the collecter_trainer.py script.
- Comments will be collected automatically and will await user input.
- Type:
  - a: If the comment is acceptable.
  - n: If the comment is neutral.
  - w: If the comment is considered to be a warning.
- Type any other character to skip the entry.
- Press Enter to submit.
- Rebuild the model as needed.
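
One way to picture this workflow is to compare the model's guess with the reviewer's answer and track how often they agree, which may help decide when a rebuild is worthwhile. The sketch below is illustrative only: predict is a hypothetical callable that maps comment text to a tag, and the tag names are assumptions.

```python
# Tag names are assumptions; the keys come from the prompt described above.
KEY_TO_TAG = {"a": "acceptable", "n": "neutral", "w": "warning"}


def review_prediction(comment, predict, key):
    """Compare the model's guess with the reviewer's answer.

    `predict` is a hypothetical callable mapping comment text to a tag.
    Returns (final_tag, model_was_right), or (None, None) when skipped.
    """
    guess = predict(comment)
    print(f"{comment!r} -> model thinks: {guess}")
    chosen = KEY_TO_TAG.get(key.strip().lower())
    if chosen is None:
        return None, None
    return chosen, chosen == guess


def agreement_rate(results):
    """Fraction of non-skipped reviews where the model matched the reviewer."""
    judged = [right for tag, right in results if tag is not None]
    return sum(judged) / len(judged) if judged else 0.0
```

A low agreement rate over a batch of reviews suggests the newly labeled comments would change the model noticeably once it is rebuilt.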
Manual entry method.
- Run the self_assign.py script.
- Enter a comment; it will be sanitized and printed back.
- Type:
  - a: If the comment is acceptable.
  - n: If the comment is neutral.
  - w: If the comment is considered to be a warning.
- Type any other character to skip the entry.
- Press Enter to submit.
- Comments are stripped of all punctuation and potentially offending characters.
- Training data and models have been removed for public release.
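
The sanitizing step mentioned in the notes above might look something like the sketch below. The exact rules the scripts apply are not documented here, so lowercasing, dropping non-ASCII characters, and collapsing whitespace are all assumptions; only the removal of punctuation comes from the notes.

```python
import re
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCTUATION_TABLE = str.maketrans("", "", string.punctuation)


def sanitize(comment):
    """Lowercase, drop punctuation and non-ASCII characters, collapse whitespace."""
    text = comment.lower().translate(_PUNCTUATION_TABLE)
    # Assumption: treat anything outside ASCII as a potentially offending character.
    text = text.encode("ascii", errors="ignore").decode("ascii")
    return re.sub(r"\s+", " ", text).strip()


print(sanitize("You're   the WORST!!! \u2620"))  # prints: youre the worst
```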