This repository tracks a review of topics ranging from basic data analysis to advanced machine learning skills.
This repository depends on conda, so install conda first; either Anaconda or Miniconda works. Then run the commands below to set up the conda environment:
~ git clone git@github.com:classtag/machine-learning-notebook.git
~ cd machine-learning-notebook
~ conda env create -f environment.yml
~ conda activate machine-learning-notebook
~ ./run.sh
- NumPy for linear algebra
- Pandas for data analysis
- Matplotlib for data visualization
- Seaborn for easier data visualization
- Linear regression: derivation of the algorithm's principles
- Gradient descent strategy
- Logistic regression
- Decision tree algorithm
- Ensemble algorithm and random forest
- Bayesian methods
- Support vector machine
- K-Means clustering
- DBSCAN clustering
- Dimensionality reduction: principal component analysis (PCA)
- Neural network
- XGBoost
- Word2Vec
- Implement logistic regression and gradient descent strategies in Python
- Abnormal transaction data detection
- Build decision tree model with scikit-learn
- Predict survival on the Titanic
- News classification task
- Tuning SVM
- Practice clustering algorithms
- Use the Gensim library to build a word-vector model from Chinese wiki Baidu data
- Scikit-learn modeling and evaluation
- Analyze Kobe's career
- Time series analysis
- Maximize profits from loan applications
- User churn early warning
- EDA of a football match dataset
- EDA of the FAO dataset
- HTTP log clustering analysis
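
As a small illustration of two of the topics listed above (linear regression and the gradient descent strategy), here is a minimal sketch in plain Python. It is not taken from the notebooks in this repository: the function name `fit_line_gradient_descent`, the toy data, and the `learning_rate` and `epochs` values are assumptions chosen only for this example.

```python
def fit_line_gradient_descent(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by minimizing mean squared error with batch gradient descent.

    Hypothetical helper for illustration; not part of this repository's code.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Residuals of the current fit on every sample.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(error^2) with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (an assumption for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line_gradient_descent(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}")
```

On this noise-free toy data the fitted slope and intercept should approach 2 and 1. With real data, the learning rate usually needs tuning, and features are often scaled first so the updates stay stable.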