4 Plotlycloth Walkthrough (experimental 🛠)
Plotlycloth is a Clojure API for creating Plotly.js plots through layered pipelines. It is part of the Hanamicloth library.
Here, we provide a walkthrough of the API.
🛠 This part of Hanamicloth is still at an experimental stage. Some of the details will change soon. Feedback and comments will help.
Soon, we will provide more in-depth explanations in additional chapters.
+4.1 Setup
For this tutorial, we require:
- the plotlycloth API namespace
- Tablecloth for dataset processing
- the datetime namespace of dtype-next
- the print namespace of tech.ml.dataset for customized dataset printing
- Kindly (to specify how certain values should be visualized)
- the datasets defined in the Datasets chapter

(ns hanamicloth-book.plotlycloth-walkthrough
  (:require [scicloj.hanamicloth.v1.plotlycloth :as ploclo]
            [tablecloth.api :as tc]
            [tech.v3.datatype.datetime :as datetime]
            [tech.v3.dataset.print :as print]
            [scicloj.kindly.v4.kind :as kind]
            [clojure.string :as str]
            [scicloj.kindly.v4.api :as kindly]
            [hanamicloth-book.datasets :as datasets]))
4.2 Basic usage
Plotlycloth plots are created by passing datasets to a pipeline of layer functions.
Additional parameters to the functions are passed as maps. By convention, the map keys begin with = (e.g., :=color).
For example, let us plot a scatterplot (a layer of points) of 10 random items from the Iris dataset.
(-> datasets/iris
    (tc/random 10 {:seed 1})
    (ploclo/layer-point
     {:=x :sepal-width
      :=y :sepal-length
      :=color :species
      :=mark-size 20
      :=mark-opacity 0.6}))
4.3 Templates and parameters
(💡 You do not need to understand these details for basic usage.)
Technically, the parameter maps contain Hanami substitution keys, which means they are processed by a simple set of rules, but you do not need to understand what this means yet.
The layer functions return a Hanami template. Let us print the resulting structure of the previous plot.
(def example1
  (-> datasets/iris
      (tc/random 10 {:seed 1})
      (ploclo/layer-point
       {:=x :sepal-width
        :=y :sepal-length
        :=color :species
        :=mark-size 20
        :=mark-opacity 0.6})))
(kind/pprint example1)
{:data :=traces,
 :layout :=layout,
 :aerial.hanami.templates/defaults
 {:=x0 :com.rpl.specter.impl/NONE,
  :=y-type
  #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
  :=x0-after-stat :=x0,
  :=layers
  [{:y :=y-after-stat,
    :trace-base {:mode :=mode, :type :=type, :opacity :=mark-opacity},
    :color-type :=color-type,
    :group :=group,
    :color :=color,
    :mark :=mark,
    :x-title :=x-title,
    :name :=name,
    :y1 :=y1-after-stat,
    :size :=size,
    :size-type :=size-type,
    :aerial.hanami.templates/defaults
    {:=x0 :com.rpl.specter.impl/NONE,
     :=y-type
     #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
     :=x0-after-stat :=x0,
     :=layers [],
     :=x1 :com.rpl.specter.impl/NONE,
     :=title :com.rpl.specter.impl/NONE,
     :=y1 :com.rpl.specter.impl/NONE,
     :=y-type-after-stat
     #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
     :=height :com.rpl.specter.impl/NONE,
     :=name :com.rpl.specter.impl/NONE,
     :=mark-opacity 0.6,
     :=inferred-group
     #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
     :=mode
     #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
     :=y-title :com.rpl.specter.impl/NONE,
     :=size :com.rpl.specter.impl/NONE,
     :=group :=inferred-group,
     :=y0 :com.rpl.specter.impl/NONE,
     :=mark-size 20,
     :=size-type
     #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
     :=color :species,
     :=mark-color :com.rpl.specter.impl/NONE,
     :=y1-after-stat :=y1,
     :=x :sepal-width,
     :=x-after-stat :=x,
     :=yaxis-gridcolor "rgb(255,255,255)",
     :=type
     #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
     :=x-type-after-stat
     #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
     :=traces
     #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
     :=x-type
     #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
     :=histogram-nbins 10,
     :=stat :com.rpl.specter.impl/NONE,
     :=width :com.rpl.specter.impl/NONE,
     :=color-type
     #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
     :=xaxis-gridcolor "rgb(255,255,255)",
     :=mark :point,
     :=dataset-after-stat
     #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
     :=x-title :com.rpl.specter.impl/NONE,
     :=layout
     #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
     :=y :sepal-length,
     :=x1-after-stat :=x1,
     :=dataset #<WrappedValue@28277720: datasets/iris [10 6]:

| :rownames | :sepal-length | :sepal-width | :petal-length | :petal-width | :species   |
|----------:|--------------:|-------------:|--------------:|-------------:|------------|
|        27 |           5.0 |          3.4 |           1.6 |          0.4 | setosa     |
|        97 |           5.7 |          2.9 |           4.2 |          1.3 | versicolor |
|       127 |           6.2 |          2.8 |           4.8 |          1.8 | virginica  |
|        92 |           6.1 |          3.0 |           4.6 |          1.4 | versicolor |
|         7 |           4.6 |          3.4 |           1.4 |          0.3 | setosa     |
|        95 |           5.6 |          2.7 |           4.2 |          1.3 | versicolor |
|       125 |           6.7 |          3.3 |           5.7 |          2.1 | virginica  |
|        61 |           5.0 |          2.0 |           3.5 |          1.0 | versicolor |
|        73 |           6.3 |          2.5 |           4.9 |          1.5 | versicolor |
|        42 |           4.5 |          2.3 |           1.3 |          0.3 | setosa     |
>,
     :=background "rgb(229,229,229)",
     :=y0-after-stat :=y0,
     :=y-after-stat :=y,
     :=predictors [:=x],
     :=marker-size-key
     #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211]},
    :y0 :=y0-after-stat,
    :inferred-group :=inferred-group,
    :marker-override
    {:color :=mark-color, :=marker-size-key :=mark-size},
    :x :=x-after-stat,
    :x1 :=x1-after-stat,
    :x0 :=x0-after-stat,
    :y-title :=y-title,
    :dataset :=dataset-after-stat}],
  :=x1 :com.rpl.specter.impl/NONE,
  :=title :com.rpl.specter.impl/NONE,
  :=y1 :com.rpl.specter.impl/NONE,
  :=y-type-after-stat
  #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
  :=height :com.rpl.specter.impl/NONE,
  :=name :com.rpl.specter.impl/NONE,
  :=mark-opacity :com.rpl.specter.impl/NONE,
  :=inferred-group
  #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
  :=mode
  #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
  :=y-title :com.rpl.specter.impl/NONE,
  :=size :com.rpl.specter.impl/NONE,
  :=group :=inferred-group,
  :=y0 :com.rpl.specter.impl/NONE,
  :=mark-size :com.rpl.specter.impl/NONE,
  :=size-type
  #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
  :=color :com.rpl.specter.impl/NONE,
  :=mark-color :com.rpl.specter.impl/NONE,
  :=y1-after-stat :=y1,
  :=x :x,
  :=x-after-stat :=x,
  :=yaxis-gridcolor "rgb(255,255,255)",
  :=type
  #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
  :=x-type-after-stat
  #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
  :=traces
  #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
  :=x-type
  #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
  :=histogram-nbins 10,
  :=stat :com.rpl.specter.impl/NONE,
  :=width :com.rpl.specter.impl/NONE,
  :=color-type
  #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
  :=xaxis-gridcolor "rgb(255,255,255)",
  :=mark :point,
  :=dataset-after-stat
  #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
  :=x-title :com.rpl.specter.impl/NONE,
  :=layout
  #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211],
  :=y :y,
  :=x1-after-stat :=x1,
  :=dataset #<WrappedValue@28277720: datasets/iris [10 6]:

| :rownames | :sepal-length | :sepal-width | :petal-length | :petal-width | :species   |
|----------:|--------------:|-------------:|--------------:|-------------:|------------|
|        27 |           5.0 |          3.4 |           1.6 |          0.4 | setosa     |
|        97 |           5.7 |          2.9 |           4.2 |          1.3 | versicolor |
|       127 |           6.2 |          2.8 |           4.8 |          1.8 | virginica  |
|        92 |           6.1 |          3.0 |           4.6 |          1.4 | versicolor |
|         7 |           4.6 |          3.4 |           1.4 |          0.3 | setosa     |
|        95 |           5.6 |          2.7 |           4.2 |          1.3 | versicolor |
|       125 |           6.7 |          3.3 |           5.7 |          2.1 | virginica  |
|        61 |           5.0 |          2.0 |           3.5 |          1.0 | versicolor |
|        73 |           6.3 |          2.5 |           4.9 |          1.5 | versicolor |
|        42 |           4.5 |          2.3 |           1.3 |          0.3 | setosa     |
>,
  :=background "rgb(229,229,229)",
  :=y0-after-stat :=y0,
  :=y-after-stat :=y,
  :=predictors [:=x],
  :=marker-size-key
  #function[scicloj.hanamicloth.v1.dag/fn-with-deps-keys/fn--48211]},
 :kindly/f #'scicloj.hanamicloth.v1.plotlycloth/plotly-xform}
This template has all the necessary knowledge, including the substitution keys, to turn into a plot. This happens when your visual tool (e.g., Clay) displays the plot. The tool knows what to do thanks to the Kindly metadata and a special function attached to the plot.
(meta example1)
#:kindly{:kind :kind/fn}
(:kindly/f example1)
#'scicloj.hanamicloth.v1.plotlycloth/plotly-xform
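To give a rough feel for what substitution means, here is a toy sketch in plain Clojure. It is not Hanami's implementation: the substitute function, toy-template, and toy-substitutions are all made up for illustration, and the sketch ignores the function values and NONE markers that appear in the printed template above.

```clojure
(require '[clojure.walk :as walk])

(defn substitute
  "Toy substitution (hypothetical helper, not part of Hanami):
  replaces every keyword found in `substitutions` by its value,
  repeating until no substitution key remains.
  Assumes the substitution map contains no cycles."
  [template substitutions]
  (walk/postwalk
   (fn [form]
     (if (and (keyword? form) (contains? substitutions form))
       (substitute (get substitutions form) substitutions)
       form))
   template))

;; A tiny template whose values are substitution keys.
(def toy-template
  {:x :=x-after-stat
   :marker {:size :=mark-size}})

;; Some keys resolve to other keys, which resolve to actual values.
(def toy-substitutions
  {:=x-after-stat :=x
   :=x :sepal-width
   :=mark-size 20})

(substitute toy-template toy-substitutions)
;; => {:x :sepal-width, :marker {:size 20}}
```

Note how :=x-after-stat resolves through :=x to :sepal-width, mirroring the chained defaults in the printed template.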
4.4 Realizing the plot
If you wish to see the resulting plot specification before displaying it as a plot, you can use the plot function. In this case, it generates a Plotly.js plot:
(-> example1
    ploclo/plot
    kind/pprint)
{:data
 [{:mode :markers,
   :type :scatter,
   :opacity 0.6,
   :name "setosa",
   :x [3.4 3.4 2.3],
   :y [5.0 4.6 4.5],
   :marker {:color "#1B9E77", :size 20}}
  {:mode :markers,
   :type :scatter,
   :opacity 0.6,
   :name "versicolor",
   :x [2.9 3.0 2.7 2.0 2.5],
   :y [5.7 6.1 5.6 5.0 6.3],
   :marker {:color "#D95F02", :size 20}}
  {:mode :markers,
   :type :scatter,
   :opacity 0.6,
   :name "virginica",
   :x [2.8 3.3],
   :y [6.2 6.7],
   :marker {:color "#7570B3", :size 20}}],
 :layout
 {:width nil,
  :height nil,
  :plot_bgcolor "rgb(229,229,229)",
  :xaxis {:gridcolor "rgb(255,255,255)", :title :sepal-width},
  :yaxis {:gridcolor "rgb(255,255,255)", :title :sepal-length},
  :title nil}}
It is annotated as kind/plotly, so that visual tools know how to render it.
(-> example1
    ploclo/plot
    meta)
#:kindly{:kind :kind/plotly}
This can be useful if you wish to process the actual Plotly.js spec rather than use Plotlycloth’s API. Let us change the background colour, for example:
(-> example1
    ploclo/plot
    (assoc-in [:layout :plot_bgcolor] "#eeeedd"))
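Since the realized specification is a plain Clojure map, the usual core functions for nested data apply to it. The following sketch uses a small hand-written map as a stand-in for a realized spec (toy-spec is hypothetical and was not produced by Plotlycloth):

```clojure
;; A hand-written stand-in for a realized Plotly.js spec.
(def toy-spec
  {:data [{:type :scatter :x [1 2 3] :y [4 5 6]}]
   :layout {:plot_bgcolor "rgb(229,229,229)"
            :title nil}})

(def edited-spec
  (-> toy-spec
      ;; replace a nested value
      (assoc-in [:layout :plot_bgcolor] "#eeeedd")
      (assoc-in [:layout :title] "Edited by hand")
      ;; transform the y values of the first trace
      (update-in [:data 0 :y] #(mapv inc %))))

(get-in edited-spec [:layout :plot_bgcolor]) ;; => "#eeeedd"
(get-in edited-spec [:data 0 :y])            ;; => [5 6 7]
```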
4.5 Field type inference
Plotlycloth infers the type of relevant fields from the data.
The example above was colored the way it was because the :species column is nominal, so its values were assigned distinct colours.
In the following example, the coloring is by a quantitative column, so a color gradient is used:
(-> datasets/mtcars
    (ploclo/layer-point
     {:=x :mpg
      :=y :disp
      :=color :cyl
      :=mark-size 20}))
We can override the inferred types and thus affect the generated plot:
(-> datasets/mtcars
    (ploclo/layer-point
     {:=x :mpg
      :=y :disp
      :=color :cyl
      :=color-type :nominal
      :=mark-size 20}))
4.6 More examples
4.6.1 Boxplot
(-> datasets/mtcars
    (ploclo/layer-boxplot
     {:=x :cyl
      :=y :disp}))
4.6.2 Segment plot
(-> datasets/iris
    (ploclo/layer-segment
     {:=x0 :sepal-width
      :=y0 :sepal-length
      :=x1 :petal-width
      :=y1 :petal-length
      :=mark-opacity 0.4
      :=mark-size 3
      :=color :species}))
4.7 Varying color and size
(-> {:ABCD (range 1 11)
     :EFGH [5 2.5 5 7.5 5 2.5 7.5 4.5 5.5 5]
     :IJKL [:A :A :A :A :A :B :B :B :B :B]
     :MNOP [:C :D :C :D :C :D :C :D :C :D]}
    tc/dataset
    (ploclo/base {:=title "IJKLMNOP"})
    (ploclo/layer-point {:=x :ABCD
                         :=y :EFGH
                         :=color :IJKL
                         :=size :MNOP
                         :=name "QRST1"})
    (ploclo/layer-line
     {:=title "IJKL MNOP"
      :=x :ABCD
      :=y :ABCD
      :=name "QRST2"
      :=mark-color "magenta"
      :=mark-size 20
      :=mark-opacity 0.2})
    ploclo/plot)
4.8 Time series
Date and time fields are handled appropriately. Let us, for example, draw the time series of unemployment counts.
(-> datasets/economics-long
    (tc/select-rows #(-> % :variable (= "unemploy")))
    (ploclo/layer-line
     {:=x :date
      :=y :value
      :=mark-color "purple"}))
4.9 Multiple layers
We can draw more than one layer:
(-> datasets/economics-long
    (tc/select-rows #(-> % :variable (= "unemploy")))
    (ploclo/layer-point {:=x :date
                         :=y :value
                         :=mark-color "green"
                         :=mark-size 20
                         :=mark-opacity 0.5})
    (ploclo/layer-line {:=x :date
                        :=y :value
                        :=mark-color "purple"}))
We can also use the base function for the common parameters across layers:
(-> datasets/economics-long
    (tc/select-rows #(-> % :variable (= "unemploy")))
    (ploclo/base {:=x :date
                  :=y :value})
    (ploclo/layer-point {:=mark-color "green"
                         :=mark-size 20
                         :=mark-opacity 0.5})
    (ploclo/layer-line {:=mark-color "purple"}))
4.10 Updating data
We can use the update-data function to vary the dataset along a plotting pipeline, affecting the layers that follow.
This functionality is inspired by ggbuilder and metamorph.
Here, for example, we draw a line, then sample 5 data rows, and draw them as points:
(-> datasets/economics-long
    (tc/select-rows #(-> % :variable (= "unemploy")))
    (ploclo/base {:=x :date
                  :=y :value})
    (ploclo/layer-line {:=mark-color "purple"})
    (ploclo/update-data tc/random 5)
    (ploclo/layer-point {:=mark-color "green"
                         :=mark-size 15
                         :=mark-opacity 0.5}))
4.11 Smoothing
layer-smooth is a layer that applies some statistical processing to the dataset to model it as a smooth shape. It is inspired by ggplot’s geom_smooth.
At the moment, it can only be used to model :=y by linear regression. Soon we will add more ways of modelling the data.
(-> datasets/iris
    (ploclo/base {:=title "dummy"
                  :=mark-color "green"
                  :=x :sepal-width
                  :=y :sepal-length})
    ploclo/layer-point
    (ploclo/layer-smooth {:=mark-color "orange"})
    ploclo/plot)
By default, the regression is computed with only one predictor variable, which is :=x. But this can be overridden using the :=predictors key. We may compute a regression with more than one predictor.
(-> datasets/iris
    (ploclo/base {:=x :sepal-width
                  :=y :sepal-length})
    ploclo/layer-point
    (ploclo/layer-smooth {:=predictors [:petal-width
                                        :petal-length]
                          :=mark-opacity 0.5})
    ploclo/plot)
4.12 Grouping
The regression computed by ploclo/layer-smooth is affected by the inferred grouping of the data.
For example, here we receive three regression lines, one for each species.
(-> datasets/iris
    (ploclo/base {:=title "dummy"
                  :=color :species
                  :=x :sepal-width
                  :=y :sepal-length})
    ploclo/layer-point
    ploclo/layer-smooth)
This happened because the :=color field was :species, which is of :nominal type.
But we may override this using the :=group key. For example, let us avoid grouping:
(-> datasets/iris
    (ploclo/base {:=title "dummy"
                  :=color :species
                  :=group []
                  :=x :sepal-width
                  :=y :sepal-length})
    ploclo/layer-point
    ploclo/layer-smooth)
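The effect of grouping on a regression can be sketched in plain Clojure. This is only an illustration of the idea under simple assumptions (one predictor, ordinary least squares). It is not how Plotlycloth computes its smooth layer, and linear-fit, fit-by-group, and toy-rows are hypothetical names with made-up data.

```clojure
(defn linear-fit
  "Ordinary least squares for y = intercept + slope * x,
  given a sequence of [x y] pairs with at least two distinct x values."
  [points]
  (let [n (double (count points))
        mean-x (/ (reduce + (map first points)) n)
        mean-y (/ (reduce + (map second points)) n)
        sxy (reduce + (map (fn [[x y]] (* (- x mean-x) (- y mean-y))) points))
        sxx (reduce + (map (fn [[x _]] (* (- x mean-x) (- x mean-x))) points))
        slope (/ sxy sxx)]
    {:slope slope
     :intercept (- mean-y (* slope mean-x))}))

(defn fit-by-group
  "Fits one line per distinct value of group-key."
  [rows group-key]
  (into {}
        (map (fn [[group group-rows]]
               [group (linear-fit (map (juxt :x :y) group-rows))]))
        (group-by group-key rows)))

;; Made-up rows: group :a rises, group :b falls.
(def toy-rows
  [{:g :a :x 1 :y 2} {:g :a :x 2 :y 4} {:g :a :x 3 :y 6}
   {:g :b :x 1 :y 5} {:g :b :x 2 :y 4} {:g :b :x 3 :y 3}])

;; One regression line per group:
(fit-by-group toy-rows :g)
;; => {:a {:slope 2.0, :intercept 0.0}, :b {:slope -1.0, :intercept 6.0}}

;; Ignoring the groups (analogous to an empty :=group) gives a single line:
(linear-fit (map (juxt :x :y) toy-rows))
;; => {:slope 0.5, :intercept 3.0}
```

The single pooled line differs from both per-group lines, which is why the choice of grouping changes the plot.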
4.13 Example: out-of-sample predictions
Here is a slightly more elaborate example inspired by the London Clojurians talk mentioned in the preface.
Assume we wish to predict the unemployment rate for 96 months. Let us add those months to our dataset, and mark them as Future (considering the original data as Past):
(-> datasets/economics-long
    (tc/select-rows #(-> % :variable (= "unemploy")))
    (tc/add-column :relative-time "Past")
    (tc/concat (tc/dataset {:date (-> datasets/economics-long
                                      :date
                                      last
                                      (datetime/plus-temporal-amount (range 96) :days))
                            :relative-time "Future"}))
    (print/print-range 6))
ggplot2/economics_long [670 6]:

| :rownames | :date | :variable | :value | :value01 | :relative-time |
|---|---|---|---|---|---|
| 2297 | 1967-07-01 | unemploy | 2944.0 | 0.02044683 | Past |
| 2298 | 1967-08-01 | unemploy | 2945.0 | 0.02052578 | Past |
| 2299 | 1967-09-01 | unemploy | 2958.0 | 0.02155206 | Past |
| … | … | … | … | … | … |
|  | 2015-07-02 |  |  |  | Future |
|  | 2015-07-03 |  |  |  | Future |
|  | 2015-07-04 |  |  |  | Future |
|  | 2015-07-05 |  |  |  | Future |

Let us represent our dates as numbers, so that we can use them in linear regression:
(-> datasets/economics-long
    (tc/select-rows #(-> % :variable (= "unemploy")))
    (tc/add-column :relative-time "Past")
    (tc/concat (tc/dataset {:date (-> datasets/economics-long
                                      :date
                                      last
                                      (datetime/plus-temporal-amount (range 96) :months))
                            :relative-time "Future"}))
    (tc/add-column :year #(datetime/long-temporal-field :years (:date %)))
    (tc/add-column :month #(datetime/long-temporal-field :months (:date %)))
    (tc/map-columns :yearmonth [:year :month] (fn [y m] (+ m (* 12 y))))
    (print/print-range 6))
ggplot2/economics_long [670 9]:

| :rownames | :date | :variable | :value | :value01 | :relative-time | :year | :month | :yearmonth |
|---|---|---|---|---|---|---|---|---|
| 2297 | 1967-07-01 | unemploy | 2944.0 | 0.02044683 | Past | 1967 | 7 | 23611 |
| 2298 | 1967-08-01 | unemploy | 2945.0 | 0.02052578 | Past | 1967 | 8 | 23612 |
| 2299 | 1967-09-01 | unemploy | 2958.0 | 0.02155206 | Past | 1967 | 9 | 23613 |
| … | … | … | … | … | … | … | … | … |
|  | 2022-12-01 |  |  |  | Future | 2022 | 12 | 24276 |
|  | 2023-01-01 |  |  |  | Future | 2023 | 1 | 24277 |
|  | 2023-02-01 |  |  |  | Future | 2023 | 2 | 24278 |
|  | 2023-03-01 |  |  |  | Future | 2023 | 3 | 24279 |

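The :yearmonth values in the table above follow directly from the arithmetic used in the map-columns step. As a quick check in plain Clojure (year-month->index is a hypothetical name for the same anonymous function):

```clojure
(defn year-month->index
  "Same arithmetic as the map-columns step: months counted from year 0."
  [y m]
  (+ m (* 12 y)))

(year-month->index 1967 7) ;; => 23611
(year-month->index 2023 3) ;; => 24279

;; Consecutive months differ by exactly 1, also across a year boundary:
(- (year-month->index 2023 1)
   (year-month->index 2022 12))
;; => 1
```

This evenly spaced numbering is what makes the field usable as a linear regression predictor.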
Let us use the same regression line for the Past and Future groups. To do this, we avoid grouping by assigning [] to :=group. The line is affected only by the past, since in the Future, :=y is missing. We use the numerical field :yearmonth as the regression predictor, but for plotting, we still use the :temporal field :date.
(-> datasets/economics-long
    (tc/select-rows #(-> % :variable (= "unemploy")))
    (tc/add-column :relative-time "Past")
    (tc/concat (tc/dataset {:date (-> datasets/economics-long
                                      :date
                                      last
                                      (datetime/plus-temporal-amount (range 96) :months))
                            :relative-time "Future"}))
    (tc/add-column :year #(datetime/long-temporal-field :years (:date %)))
    (tc/add-column :month #(datetime/long-temporal-field :months (:date %)))
    (tc/map-columns :yearmonth [:year :month] (fn [y m] (+ m (* 12 y))))
    (ploclo/base {:=x :date
                  :=y :value})
    (ploclo/layer-smooth {:=color :relative-time
                          :=mark-size 20
                          :=group []
                          :=predictors [:yearmonth]})
    ;; Keep only the past for the following layer:
    (ploclo/update-data (fn [dataset]
                          (-> dataset
                              (tc/select-rows (fn [row]
                                                (-> row :relative-time (= "Past")))))))
    (ploclo/layer-line {:=mark-color "purple"
                        :=mark-size 3}))
4.14 Histograms
Histograms can also be represented as layers with statistical processing:
(-> datasets/iris
    (ploclo/layer-histogram {:=x :sepal-width}))
(-> datasets/iris
    (ploclo/layer-histogram {:=x :sepal-width
                             :=histogram-nbins 30}))
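The statistical processing behind a histogram is essentially binning and counting. The sketch below shows one simple way to do that with equal-width bins in plain Clojure. It is an assumption-laden illustration (bin-counts is a hypothetical helper, and Plotlycloth may choose bin edges differently), meant only to show what a parameter like :=histogram-nbins controls.

```clojure
(defn bin-counts
  "Counts values in `nbins` equal-width bins spanning [min, max].
  The maximum value is placed in the last bin.
  Assumes at least two distinct values."
  [values nbins]
  (let [lo (apply min values)
        hi (apply max values)
        width (/ (- hi lo) (double nbins))]
    (reduce (fn [counts v]
              (let [idx (min (dec nbins)
                             (long (/ (- v lo) width)))]
                (update counts idx inc)))
            (vec (repeat nbins 0))
            values)))

;; Made-up values, 4 bins of width 1.0: [1,2) [2,3) [3,4) [4,5]
(bin-counts [1 2 2 3 4 4 4 5] 4)
;; => [1 2 1 4]
```

With more bins, each bin covers a narrower range, so the same data yields a finer-grained shape.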