samples from three Gaussian distributions for each dataset described below:
Dataset1:
Dataset2:
Use the first 80% of the samples in each class for training and the remaining 20% for testing, then use a Bayesian classifier to classify both the training and the test sets.
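The split and classification step can be sketched as follows. This is a minimal illustration, not the project's actual code: the class means, covariances, sample count, and random seed are hypothetical placeholders (the real dataset parameters are not reproduced here), and the helper names `split_per_class`, `fit_gaussian_bayes`, and `predict` are made up for this example. Each class is modelled by its own Gaussian, with priors taken from the training class frequencies.

```python
import numpy as np

def split_per_class(X, train_frac=0.8):
    """Return the first train_frac of the rows for training and the rest for testing."""
    n_train = int(round(train_frac * len(X)))
    return X[:n_train], X[n_train:]

def fit_gaussian_bayes(train_sets):
    """Estimate prior, mean and covariance of each class from its training rows."""
    n_total = sum(len(X) for X in train_sets)
    return [
        {
            "prior": len(X) / n_total,
            "mean": X.mean(axis=0),
            "cov": np.cov(X, rowvar=False),
        }
        for X in train_sets
    ]

def log_posterior(X, p):
    """Unnormalised log posterior ln p(x | class) + ln P(class) for each row of X."""
    d = X.shape[1]
    diff = X - p["mean"]
    inv = np.linalg.inv(p["cov"])
    maha = np.einsum("ij,jk,ik->i", diff, inv, diff)  # squared Mahalanobis distance
    _, logdet = np.linalg.slogdet(p["cov"])
    return -0.5 * (maha + logdet + d * np.log(2 * np.pi)) + np.log(p["prior"])

def predict(X, params):
    """Assign each row of X to the class with the largest log posterior."""
    scores = np.column_stack([log_posterior(X, p) for p in params])
    return scores.argmax(axis=1)

def accuracy(sets, params):
    """Fraction of correctly classified rows; sets[k] holds the rows of class k."""
    X = np.vstack(sets)
    y = np.concatenate([np.full(len(S), k) for k, S in enumerate(sets)])
    return float((predict(X, params) == y).mean())

# Hypothetical parameters, chosen only so the example runs.
rng = np.random.default_rng(0)
means = [np.array([0.0, 0.0]), np.array([6.0, 0.0]), np.array([3.0, 6.0])]
covs = [np.eye(2) * 1.0, np.eye(2) * 2.0, np.eye(2) * 0.5]
data = [rng.multivariate_normal(m, c, size=500) for m, c in zip(means, covs)]

splits = [split_per_class(X) for X in data]
train_sets = [tr for tr, _ in splits]
test_sets = [te for _, te in splits]

params = fit_gaussian_bayes(train_sets)
train_acc = accuracy(train_sets, params)
test_acc = accuracy(test_sets, params)
print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")
```

Working with log posteriors avoids numerical underflow for points far from a class mean, and the shared evidence term p(x) is dropped because it does not change which class scores highest.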
The two datasets have equal means but different covariances, which is why their cross-sections differ. In Dataset1, the covariance of each class is diagonal with equal elements on the main diagonal, so the cross-sections of the classes are circular. Because the value on the main diagonal differs from class to class, the cross-sections differ in size. As the three-dimensional plot shows, class2, which has a higher variance, is wider and flatter, while class3, which has the lowest variance, is narrower with a taller peak. This is an effect of variance rather than kurtosis, since every Gaussian has the same kurtosis. In Dataset2, the covariances are not diagonal, so the cross-sections of the classes are ellipses of different sizes and orientations, with the major axis of each ellipse lying along the direction of largest variance for that class.
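The link between a covariance matrix and the shape of its cross-section can be checked numerically: the eigenvectors of the covariance give the axis directions of the contour, and the square roots of the eigenvalues give the relative axis lengths. The sketch below uses two hypothetical 2x2 covariances (not the project's values), and `contour_axes` is a helper name invented for this example.

```python
import numpy as np

def contour_axes(cov):
    """Return (semi-axis lengths in ascending order, major-axis angle in degrees)."""
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    semi_axes = np.sqrt(eigvals)            # 1-sigma contour semi-axes
    major = eigvecs[:, -1]                  # eigenvector of the largest eigenvalue
    angle = np.degrees(np.arctan2(major[1], major[0])) % 180.0
    return semi_axes, angle

# Hypothetical covariances: one isotropic (Dataset1-like), one full (Dataset2-like).
iso_cov = np.array([[2.0, 0.0], [0.0, 2.0]])
full_cov = np.array([[2.0, 1.0], [1.0, 2.0]])

iso_axes, _ = contour_axes(iso_cov)  # orientation is arbitrary for a circle
full_axes, full_angle = contour_axes(full_cov)

print("isotropic semi-axes:", iso_axes)
print("full semi-axes:", full_axes, "major-axis angle:", full_angle)
```

For the isotropic matrix both semi-axes are equal, which corresponds to a circular cross-section. For the full matrix the axes differ and the major axis is rotated away from the coordinate axes, which corresponds to a tilted ellipse.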
In Bayesian classification with one Gaussian distribution, we assumed that all classes share the same covariance matrix, so the cross-sections of the classes had the same shape and size and the decision boundary was linear. In this project the covariances are unequal, so the boundary separating the classes becomes non-linear (quadratic in the features).
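The reason for the linear versus non-linear boundary can be seen from the Gaussian discriminant function. The notation below is an assumption introduced for this sketch and does not come from the project itself: $\mu_i$, $\Sigma_i$, and $P(\omega_i)$ denote the mean, covariance, and prior of class $i$.

```latex
% Discriminant of class i (terms common to all classes dropped):
g_i(x) = -\tfrac{1}{2}(x-\mu_i)^{\top}\Sigma_i^{-1}(x-\mu_i)
         - \tfrac{1}{2}\ln\lvert\Sigma_i\rvert + \ln P(\omega_i)

% Boundary between classes i and j: g_i(x) = g_j(x).
% Expanding the quadratic form gives
g_i(x) = -\tfrac{1}{2}x^{\top}\Sigma_i^{-1}x + \mu_i^{\top}\Sigma_i^{-1}x
         - \tfrac{1}{2}\mu_i^{\top}\Sigma_i^{-1}\mu_i
         - \tfrac{1}{2}\ln\lvert\Sigma_i\rvert + \ln P(\omega_i)

% Equal covariances (\Sigma_i = \Sigma_j = \Sigma): the x^T \Sigma^{-1} x terms cancel,
g_i(x) - g_j(x) = (\mu_i-\mu_j)^{\top}\Sigma^{-1}x
         - \tfrac{1}{2}\left(\mu_i^{\top}\Sigma^{-1}\mu_i - \mu_j^{\top}\Sigma^{-1}\mu_j\right)
         + \ln\frac{P(\omega_i)}{P(\omega_j)}
% which is linear in x.

% Unequal covariances: the term
-\tfrac{1}{2}x^{\top}\left(\Sigma_i^{-1}-\Sigma_j^{-1}\right)x
% survives, so the boundary is a quadratic curve (conic section) in x.
```

With isotropic covariances that differ only in scale, as in Dataset1, the quadratic term reduces to a multiple of $x^{\top}x$, so each pairwise boundary is a circle.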