Predicting toxicity of molecules. Project for the course "Data Mining 2"

vasatodorovic/ToxicityOfMolecules

Predicting Toxicity of Molecules Affecting CRY1 Protein Function

This repository contains the code and materials for a research project focused on predicting the toxicity of molecules influencing the function of the CRY1 protein.

Overview

In this project, we trained and compared several machine learning models for predicting the toxicity of molecules. The main techniques used are:

  • Feature Selection: We applied feature selection to identify the molecular features most relevant to toxicity prediction, which improves both model performance and interpretability. The methods applied were Variance Threshold, Select K Best, and Recursive Feature Elimination (RFE).

  • SMOTE (Synthetic Minority Over-sampling Technique): To handle class imbalance in the dataset, we used SMOTE to generate synthetic samples of the minority class, so that the models are trained on a more balanced set of examples.

  • Decision Trees: Decision tree models were used to capture non-linear relationships in the data and to give an interpretable view of which features drive the toxicity predictions.

  • Support Vector Machines (SVM): SVMs were used for their ability to handle complex classification tasks by finding the decision boundary that separates the classes with the largest margin.

  • Ensemble Methods: Ensemble techniques, such as Bagging, AdaBoost and Random Forest, were implemented to combine the strengths of multiple models and enhance overall predictive performance.
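
To make the SMOTE step above more concrete, here is a minimal sketch of its core idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is not the project's code. The function name `smote_sketch`, its parameters, and the toy feature values are hypothetical and chosen only for illustration; a real workflow would more likely rely on a library implementation such as `SMOTE` from imbalanced-learn.

```python
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Hypothetical helper for illustration only, not the project's implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the starting point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance to it.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        # Choose one of the k nearest neighbours at random.
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy two-feature "minority class" (made-up numbers, not project data).
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
new_points = smote_sketch(minority, n_new=6)
print(len(new_points), "synthetic samples generated")
```

Because every synthetic point is an interpolation between two existing minority samples, the new points stay inside the region the minority class already occupies. Oversampling of this kind is normally applied to the training split only, so that the test set contains real molecules exclusively.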