delete machine learning
jkrumbiegel committed Jan 15, 2025
1 parent 894bda2 commit fa65c1a
Showing 1 changed file with 0 additions and 79 deletions.
79 changes: 0 additions & 79 deletions docs/src/generated/penguins.jl
@@ -226,82 +226,3 @@ draw(plt; axis = axis)
# Note that static 3D plots can be misleading, as they only show one projection
# of 3D data. They are mostly useful when shown interactively.
#
# ## Machine Learning
#
# Finally, let us use Machine Learning techniques to build an automated penguin classifier!
#
# We would like to investigate whether it is possible to predict the species of a penguin
# based on its bill size. To do so, we will use a standard classifier technique
# called [Support-Vector Machine](https://en.wikipedia.org/wiki/Support-vector_machine).
#
# The strategy is quite simple. We split the data into training and testing
# subsets. We then train our classifier on the training subset and use it to
# make predictions on the whole dataset. We then add the resulting columns
# to the dataset and visually inspect how well the classifier performed on both
# training and testing data.

using LIBSVM, Random

## use approximately 80% of penguins for training
Random.seed!(1234) # for reproducibility
N = nrow(penguins)
train = fill(false, N)
perm = randperm(N)
train_idxs = perm[1:floor(Int, 0.8N)]
train[train_idxs] .= true
nothing # hide
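# As a standalone sanity check of the masking pattern above (using a
# hypothetical `N = 100` rather than the actual number of penguin rows),
# exactly `floor(Int, 0.8N)` rows end up marked for training:

```julia
using Random

Random.seed!(1234)  # hypothetical seed, for reproducibility of this sketch only
N = 100             # hypothetical row count, not the real dataset size
train = fill(false, N)
perm = randperm(N)
train[perm[1:floor(Int, 0.8N)]] .= true
sum(train)          # 80 rows selected for training, 20 left for testing
```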

## fit model on training data and make predictions on the whole dataset
X = hcat(penguins.bill_length_mm, penguins.bill_depth_mm)
y = penguins.species
model = SVC() # Support-Vector Machine Classifier
fit!(model, X[train, :], y[train])
ŷ = predict(model, X)

## incorporate relevant information in the dataset
penguins.train = train
penguins.predicted_species = ŷ
nothing # hide

# Now, we have all the columns we need to evaluate how well our classifier performed.

axis = (width = 225, height = 225)
dataset = :train => renamer(true => "training", false => "testing") => "Dataset"
accuracy = (:species, :predicted_species) => isequal => "accuracy"
plt = data(penguins) *
expectation() *
mapping(:species, accuracy) *
mapping(col = dataset)
draw(plt; axis = axis)
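# What `expectation()` is averaging here is the elementwise `isequal`
# comparison, whose mean is exactly the classification accuracy. A minimal
# illustration in plain Julia, with made-up labels (not the real penguin data):

```julia
using Statistics

# hypothetical true and predicted labels, for illustration only
species           = ["Adelie", "Adelie", "Gentoo", "Chinstrap"]
predicted_species = ["Adelie", "Gentoo", "Gentoo", "Chinstrap"]

# the mean of the rowwise comparisons is the accuracy: 3 of 4 correct
accuracy = mean(isequal.(species, predicted_species))  # 0.75
```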

# That is a bit hard to read, as all values are very close to `1`.
# Let us visualize the error rate instead.

error_rate = (:species, :predicted_species) => !isequal => "error rate"
plt = data(penguins) *
expectation() *
mapping(:species, error_rate) *
mapping(col = dataset)
draw(plt; axis = axis)

# So our classifier is mostly doing quite well, but there are some mistakes,
# especially among `Chinstrap` penguins. Using the `species` and
# `predicted_species` mappings *at the same time* on different attributes, we
# can see which penguins are problematic.

prediction = :predicted_species => "predicted species"
datalayer = mapping(color = prediction, row = :species, col = dataset)
plt = penguin_bill * datalayer
draw(plt; axis = axis)

# Um, some of the penguins are indeed being misclassified... Let us try to understand why
# by adding an extra layer, which describes the density of the distributions of the three
# species.

pdflayer = density() * visual(Contour, colormap=Reverse(:grays)) * mapping(group = :species)
layers = pdflayer + datalayer
plt = penguin_bill * layers
draw(plt; axis = axis)

# We can conclude that the classifier is doing a reasonable job:
# it is mostly making mistakes on outlier penguins.
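# Beyond the visual inspection above, a confusion matrix gives a numeric
# summary of which species get mistaken for which. A minimal sketch in plain
# Julia (the labels below are made up for illustration, and `confusion` is a
# hypothetical helper, not part of this tutorial or any library used here):

```julia
# count occurrences of each (actual, predicted) label pair
function confusion(actual, predicted)
    counts = Dict{Tuple{String, String}, Int}()
    for (a, p) in zip(actual, predicted)
        counts[(a, p)] = get(counts, (a, p), 0) + 1
    end
    return counts
end

cm = confusion(["Adelie", "Chinstrap", "Chinstrap", "Gentoo"],
               ["Adelie", "Gentoo",    "Chinstrap", "Gentoo"])
cm[("Chinstrap", "Gentoo")]  # 1, the misclassified Chinstrap
```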
