Ankit152/MalwareDetection

Malware Detection 🔍

A machine learning approach to detecting malware on a system.

Malware 👾

Malware is malicious software that is intentionally developed to infiltrate or damage a computer system without the owner's consent. This includes, among others, viruses, worms, and Trojan horses.

Shorthand for malicious software, malware typically consists of code developed by cyberattackers, designed to cause extensive damage to data and systems or to gain unauthorized access to a network. Once installed on a system, malware can cause a wide range of problems, from stealing personal information to destroying critical data. The impact of malware on individuals and organizations can be devastating. For individuals, malware can result in identity theft, financial loss, and loss of privacy. For organizations, malware can cause significant financial and reputational damage, as well as loss of sensitive data.

Malware Detection and Removal 🕵️

Malware detection refers to the process of detecting the presence of malware on a host system or of distinguishing whether a specific program is malicious or benign.

Here I have trained several machine learning algorithms to check which one performs best at classifying whether a specific program is malicious or not.
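A minimal sketch of what such a comparison could look like, assuming scikit-learn. The data here is synthetic (`make_classification`), not the project's dataset, and the hyperparameters, the label convention, and the `compare_models` helper are illustrative assumptions, so the printed accuracies will not match the figures reported below.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real feature table (assumption: binary labels,
# e.g. 1 = legitimate, 0 = malware).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, random_state=42
)

# Hold out a stratified test set so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# The five model families named in this README, with illustrative settings.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
    "MLP Classifier": MLPClassifier(max_iter=1000, random_state=42),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}


def compare_models(models, X_train, y_train, X_test, y_test):
    """Fit each model and return a dict of test-set accuracies (hypothetical helper)."""
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


scores = compare_models(models, X_train_s, y_train, X_test_s, y_test)

# Print the models from best to worst test accuracy.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.4f}")
```

Every model is scored on the same held-out split, which keeps the accuracy numbers directly comparable.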

Countplot of Legitimate 📊

Logistic Regression Performance

The Logistic Regression model gave an accuracy of 98%, which is a strong baseline. This is my first-cut solution for detecting malware.

Gradient Boosting Performance

The Gradient Boosting model performed better than Logistic Regression and gave an accuracy of 98.88%.

MLP Classifier Performance

The MLP Classifier model gave an accuracy of 99%, which is better than both Logistic Regression and Gradient Boosting.

Decision Tree Performance

The Decision Tree model gave an accuracy of 99.04%, which is slightly better than the MLP Classifier but not a drastic increase in performance.

Random Forest Performance 💯

The Random Forest model performed the best! It gave an accuracy of 99.54%. It is my final solution.

The best model and the scaler are saved in the asset folder. Do give them a try.
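As a rough illustration of how a saved model and scaler can be reused, here is a sketch that assumes both were serialized with Python's `pickle` module. The README does not state the serialization format or file names, so `model.pkl`, `scaler.pkl`, the temporary directory standing in for the asset folder, and the synthetic training data are all hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

# Train a small stand-in model on synthetic data (not the project's dataset).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
scaler = StandardScaler().fit(X)
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(scaler.transform(X), y)

# Temporary directory standing in for the asset folder; file names are hypothetical.
asset_dir = Path(tempfile.mkdtemp())
model_path = asset_dir / "model.pkl"
scaler_path = asset_dir / "scaler.pkl"

# Save both objects: the scaler must travel with the model so that new
# samples are transformed exactly as the training data was.
with open(model_path, "wb") as f:
    pickle.dump(model, f)
with open(scaler_path, "wb") as f:
    pickle.dump(scaler, f)

# Later (or in another process): load both and classify new samples.
with open(model_path, "rb") as f:
    loaded_model = pickle.load(f)
with open(scaler_path, "rb") as f:
    loaded_scaler = pickle.load(f)

sample = X[:5]  # pretend these are feature vectors of unseen programs
predictions = loaded_model.predict(loaded_scaler.transform(sample))
print(predictions)
```

Pickled files should only be loaded from a trusted source and with a scikit-learn version compatible with the one that wrote them.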

License

This project is licensed under the MIT License.