Pull Request for ML-Crate 💡
Issue Title: Predicting Molecular Properties #636
Closes: #636
Describe the add-ons or changes you've made 📃
Give a clear description of what you have added or modified
Modifications Made:
Added interactive 3D visualizations of molecular structures using Plotly, color-coded by atom type, and implemented animations to highlight atoms and bonds.
Analyzed and visualized the distribution of unique atom counts with histograms. Conducted extensive EDA, visualizing scalar coupling constants, Mulliken charges, dipole moments, and potential energy.
Created new features based on atomic interactions, standardized molecular data, and trained multiple regression models to predict scalar coupling constants, evaluating performance with MAE, MSE, and R².
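As an illustration of the feature-engineering and standardization steps listed above, here is a minimal sketch. It assumes that one interaction feature is the Euclidean distance between the two atoms of a coupled pair; the PR does not state which features were actually created. The helper names (`pair_distance`, `standardize`) and the toy coordinates are hypothetical, not taken from the notebook.

```python
import math
from statistics import mean, pstdev


def pair_distance(a, b):
    """Euclidean distance between two atoms given as (x, y, z) tuples."""
    return math.dist(a, b)


def standardize(values):
    """Z-score standardization; returns all zeros when the spread is zero."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical toy molecule: atom index -> (x, y, z) coordinates.
coords = {
    0: (0.0, 0.0, 0.0),
    1: (1.0, 0.0, 0.0),
    2: (0.0, 2.0, 0.0),
}

# Hypothetical coupled atom pairs, in the spirit of (atom_index_0, atom_index_1).
pairs = [(0, 1), (0, 2), (1, 2)]

# One distance feature per pair, then scaled to zero mean and unit variance.
distances = [pair_distance(coords[i], coords[j]) for i, j in pairs]
scaled = standardize(distances)
```

In a real pipeline the scaling statistics would typically be computed on the training split only and reused for the test split.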
Type of change ☑️
What sort of change have you made:
How Has This Been Tested? ⚙️
Describe how it has been tested
Describe how you have verified the changes made
The changes were tested and verified as follows. Interactive 3D visualizations were rendered and cross-checked against known molecular structures.
Histograms of unique atom counts were generated and validated by comparing them with raw data. EDA visualizations were inspected for accuracy and consistency.
Newly engineered features were analyzed for relevance and contribution to model performance.
Finally, the regression models (Linear Regression, Random Forest Regressor, K-Nearest Neighbors Regressor, Support Vector Regressor, Decision Tree Regressor, and a simple feed-forward neural network) were evaluated against benchmarks, and the entire workflow was validated for reproducibility in a clean environment.
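For reference, the three metrics used to compare the models (MAE, MSE, and R²) can be computed as in the sketch below. This is a plain-Python illustration with made-up numbers, not the notebook's evaluation code, which may rely on a library implementation instead.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical targets and predictions, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]

scores = {
    "MAE": mae(y_true, y_pred),
    "MSE": mse(y_true, y_pred),
    "R2": r2(y_true, y_pred),
}
```

Lower MAE and MSE indicate smaller errors, while an R² closer to 1 means the model explains more of the variance in the scalar coupling constant.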
Checklist: ☑️