From f976f874e880816d666647982e0e737ca890aaae Mon Sep 17 00:00:00 2001
From: Keshav Arora <119474193+CoderOMaster@users.noreply.github.com>
Date: Thu, 1 Feb 2024 18:36:48 +0530
Subject: [PATCH] Create README.md

---
 Toxic Comment Analysis/README.md | 66 ++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)
 create mode 100644 Toxic Comment Analysis/README.md

diff --git a/Toxic Comment Analysis/README.md b/Toxic Comment Analysis/README.md
new file mode 100644
index 000000000..4985e49d5
--- /dev/null
+++ b/Toxic Comment Analysis/README.md
@@ -0,0 +1,66 @@
+# TOXIC COMMENT ANALYSIS
+
+## GOAL
+Develop a machine learning model that classifies a comment as toxic or non-toxic.
+
+## DATASET
+The dataset is available on Kaggle: https://www.kaggle.com/datasets/devkhant24/toxic-comment
+
+## MODELS USED
+- Naive Bayes
+- Random Forest
+- CatBoost
+- Decision Tree
+- Bidirectional LSTM
+- RNN
+- Logistic Regression
+
+## LIBRARIES
+- Pandas
+- NumPy
+- TensorFlow
+- Seaborn
+- Matplotlib
+- Scikit-Learn
+- os
+- re
+- math
+- Beautiful Soup
+- NLTK
+- spaCy
+
+## IMPLEMENTATION
+1. Loaded the dataset.
+2. Converted the data into a standard CSV file and renamed the columns for convenience.
+3. Cleaned and preprocessed the comments to remove emojis, symbols, links, etc.
+4. Labelled a comment as toxic when its anger-intensity score is greater than 0.55.
+5. Tokenized the comments to convert them into sequences.
+6. Trained models with the algorithms listed above.
+
+Minimal code sketches illustrating steps 3-6 appear in the CODE SKETCHES section at the end of this README.
+
+## MODELS AND ACCURACIES
+
+| Model               | Accuracy |
+| ------------------- | :------: |
+| Naive Bayes         |   0.77   |
+| Random Forest       |   0.76   |
+| CatBoost            |   0.74   |
+| Logistic Regression |   0.77   |
+| Decision Tree       |   0.73   |
+| RNN                 |   0.69   |
+| Bidirectional LSTM  |   0.68   |
+
+## VISUALISATION
+
+![Alt Text](./Images/1.png)
+
+![Alt Text](./Images/2.png)
+
+![Alt Text](./Images/3.png)
+
+## CONCLUSION
+
+Naive Bayes and Logistic Regression achieve the best accuracy (0.77 each) at detecting whether a comment is toxic.
+
+## NAME
+
+Keshav Arora
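+
+## CODE SKETCHES
+
+The snippets below are minimal sketches of the pipeline, not the project's actual code. This first one covers steps 3 and 4: cleaning the raw text and deriving the toxic label from the 0.55 intensity threshold. The file name `toxic_comments.csv` and the column names `comment` and `intensity` are assumptions about the dataset's schema.
+
+```python
+import re
+
+import pandas as pd
+from bs4 import BeautifulSoup
+
+
+def clean_text(text: str) -> str:
+    text = BeautifulSoup(text, "html.parser").get_text()  # strip HTML remnants
+    text = re.sub(r"https?://\S+|www\.\S+", " ", text)    # strip links
+    text = re.sub(r"[^a-zA-Z\s]", " ", text)              # strip emojis, symbols, digits
+    return re.sub(r"\s+", " ", text).strip().lower()      # collapse whitespace, lowercase
+
+
+df = pd.read_csv("toxic_comments.csv")                    # assumed file name
+df["comment"] = df["comment"].astype(str).apply(clean_text)
+df["toxic"] = (df["intensity"] > 0.55).astype(int)        # step 4: 1 = toxic
+```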
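+
+This sketch covers step 5 and the Bidirectional LSTM from step 6, continuing from the `df` above. The vocabulary size, sequence length, layer widths, and epoch count are illustrative guesses rather than the values used in this project.
+
+```python
+from tensorflow.keras.layers import LSTM, Bidirectional, Dense, Embedding
+from tensorflow.keras.models import Sequential
+from tensorflow.keras.preprocessing.sequence import pad_sequences
+from tensorflow.keras.preprocessing.text import Tokenizer
+
+# Step 5: tokenize the comments and pad them to fixed-length sequences.
+tokenizer = Tokenizer(num_words=20000, oov_token="<OOV>")
+tokenizer.fit_on_texts(df["comment"])
+padded = pad_sequences(tokenizer.texts_to_sequences(df["comment"]), maxlen=100)
+
+# Step 6 (deep-learning branch): a small Bidirectional LSTM classifier.
+model = Sequential([
+    Embedding(input_dim=20000, output_dim=64),
+    Bidirectional(LSTM(64)),
+    Dense(1, activation="sigmoid"),  # probability that the comment is toxic
+])
+model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
+model.fit(padded, df["toxic"].values, epochs=3, validation_split=0.2)
+```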
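+
+The last sketch covers the classical models from step 6, shown here for the two best scorers from the table above. Feeding them TF-IDF features is an assumption; the hyperparameters are scikit-learn defaults.
+
+```python
+from sklearn.feature_extraction.text import TfidfVectorizer
+from sklearn.linear_model import LogisticRegression
+from sklearn.metrics import accuracy_score
+from sklearn.model_selection import train_test_split
+from sklearn.naive_bayes import MultinomialNB
+
+X_train, X_test, y_train, y_test = train_test_split(
+    df["comment"], df["toxic"], test_size=0.2, random_state=42
+)
+
+# Vectorize the cleaned comments with TF-IDF (fit on the training split only).
+vec = TfidfVectorizer(max_features=20000)
+X_train_tfidf = vec.fit_transform(X_train)
+X_test_tfidf = vec.transform(X_test)
+
+for name, clf in [("Naive Bayes", MultinomialNB()),
+                  ("Logistic Regression", LogisticRegression(max_iter=1000))]:
+    clf.fit(X_train_tfidf, y_train)
+    print(name, accuracy_score(y_test, clf.predict(X_test_tfidf)))
+```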