This project is based on the classic Titanic machine learning (ML) competition on Kaggle, a common first step for getting into ML competitions and becoming familiar with prediction systems.
The approach shown here uses classic ML techniques to build a model that predicts which passengers survived the Titanic shipwreck. The code is provided as a Python file.
The process involves:
* Feature engineering
* Data analysis
* Age prediction
* Fare estimation
* Data cleaning
* Data normalization and scaling
* Automatic classification
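As an illustration of the normalization and scaling step listed above, here is a minimal sketch using only the Python standard library. The helper names `min_max_scale` and `standardize` and the fare values are made up for this example; the project itself may use a different scaler or library.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Made-up fares, for illustration only (not taken from the dataset).
fares = [7.25, 71.28, 8.05, 53.10, 13.00]
scaled_fares = min_max_scale(fares)
standardized_fares = standardize(fares)
```

Scaling of this kind keeps features with large ranges, such as the fare, from dominating features with small ranges, such as the passenger class, in distance-based classifiers.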
The training set is composed of 891 rows, one per passenger, each including: PassengerId, Survived, Pclass, Name, Sex, Age, SibSp, Parch, Ticket, Fare, Cabin and Embarked.
The test set is composed of 418 rows, one per passenger, each including: PassengerId, Pclass, Name, Sex, Age, SibSp, Parch, Ticket, Fare, Cabin and Embarked. It does not contain the Survived value, which is what the model has to predict.
The feature correlation map shows whether a statistical association exists between two variables. The closer the value is to 1 or -1, the stronger the correlation. For instance, the Survived feature is highly correlated with Pclass and Fare.
Now, the Survived feature is highly correlated with Pclass and Sex.
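To make the values in a correlation map concrete, the sketch below computes a Pearson correlation coefficient by hand. It assumes the map uses Pearson correlation, which the text does not state. The `pearson` helper and the toy values are made up for illustration and are not taken from the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance up to a constant factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Toy values, for illustration only.
survived = [0, 1, 1, 1, 0, 0, 1, 0]
pclass = [3, 1, 3, 1, 3, 3, 1, 2]

r = pearson(survived, pclass)
print(f"Survived vs Pclass: {r:.2f}")
```

A negative coefficient here means that, in the toy data, higher class numbers (3rd class) tend to go with non-survival. The sign gives the direction of the association and the magnitude gives its strength.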
Different classification methods were applied to the training set.
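A comparison of this kind can be sketched with scikit-learn as follows. The three classifiers, the 5-fold cross-validation, and the synthetic data from `make_classification` are all assumptions chosen for illustration; they are not necessarily the methods or data used in this project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the prepared training features and labels.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# Hypothetical candidate models; swap in whichever classifiers are being compared.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "k_nearest_neighbors": KNeighborsClassifier(n_neighbors=5),
}

# Mean accuracy over 5 cross-validation folds for each candidate.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in candidates.items()
}

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Cross-validation gives each method a score averaged over several train/validation splits, which is less sensitive to one lucky split than a single hold-out evaluation.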