<!doctype html>
<html>
<head>
<title>Math for ML</title>
<script src="lib/loader.js"></script>
</head>
<body onload="init()">
<div id="bgpage" class="background">
<canvas id="canvas"></canvas>
</div>
<div class="header">
<div id="follow" style="display:none">follow</div>
</div>
<div id="footer" class="footer">
<div id="navigator" class="large"></div>
</div>
<div id="slides" class="slides">
<div class="slide title"
group="title"
style="background-color:black; padding:5vw"
onshow="d0('navigator','footer');v0('bgpage')"
onhide="d1('navigator','footer');v1('bgpage')">
<div style="height: 15%"></div>
<h1 style="color:#ffa540">Welcome to Math for Machine Learning!</h1>
<h1 style="color:#ffffff">Linear Algebra, Session
7 — Loose Ends</h1>
<h1 style="color:#808080">Mesch</h1>
<h2 style="position:absolute; bottom:5%; padding-left:0.1vw; color:red">X/Google, 2021</h2>
</div>
<div class="slide noprint banner" group="intro">
<h1>Tensors, Machine Learning, and Quantum Computing</h1>
<div>
<em>We have left some loose ends dangling.</em>
</div>
</div>
<div class="slide build-focus-visible" group="intro">
<h1>Loose Ends — Overview</h1>
<div>Plan for today. We very lightly touch upon some left-over topics
that give an outlook on where to go from here:</div>
<ol>
<li step="1,6">Coordinate system invariance of Scalars
<li step="2,6">Affine spaces, another look at Arrows
<li step="3,6">Outlook: Manifolds and differential geometry
<li step="4,6">Tensor-like data structures in software
<li step="5,6">Where this leaves Linear Algebra in Deep Learning
</ol>
</div>
<div class="slide build-focus-visible">
<h1>Invariance of Scalars</h1>
<div class="notes"></div>
<div>Properties of vectors are defined independently of the coordinate
system.</div>
<ul>
<li step="1,6">Coordinates follow from algebraic axioms of the vector
space.
<li step="2,6">Scalar product <em>can</em> be computed from coordinates
of vectors and the metric tensor:
$$\xvec{x}\cdot\xvec{y} = x^i y^j g_{ij}$$
but has the same value in every basis.
<li step="3,6">Metric properties <em>norm</em> and <em>angle</em> are
defined by scalar product.
<li step="4,6">The same holds for scalar properties of operators. E.g., the trace is
computed from the coordinates of an operator, but retains its value
under a coordinate transformation $T$:
$$\mathrm{tr}(A) = A_i^i = T_i^{j^\prime} T^i_{k^\prime}
A_{j^\prime}^{k^\prime} = \delta^{j^\prime}_{k^\prime}
A_{j^\prime}^{k^\prime} = A_{j^\prime}^{j^\prime}$$
<li step="5,6">A scalar, like a vector and a tensor, is not just a
certain set of numbers, but the numbers have a specific behavior
under coordinate transformation.
</ul>
</div>
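<div class="slide">
<h1>Invariance of Scalars: a numeric sketch</h1>
<div>The invariance claims above can be checked numerically. This is a
minimal sketch in plain Python, not part of the original session; the helper
names, the matrix $T$ and the example vectors are illustrative assumptions.
It assumes that the columns of $T$ hold the new basis vectors expressed in
the old basis, so vector coordinates transform with $T^{-1}$, the metric
with $T^\top g\,T$, and an operator with $T^{-1} A\,T$.</div>
<pre>
```python
# Illustrative helpers on small nested-list matrices (no external libraries).

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(a, x):
    """Apply matrix a to the coordinate list x."""
    return [sum(a[i][k] * x[k] for k in range(len(x))) for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def trace(a):
    """Sum of the diagonal coordinates A^i_i."""
    return sum(a[i][i] for i in range(len(a)))

def scalar_product(x, y, g):
    """Scalar product x^i y^j g_ij computed from coordinates and metric."""
    n = len(x)
    return sum(x[i] * y[j] * g[i][j] for i in range(n) for j in range(n))

# Assumed example data: an orthonormal old basis (metric = identity)
# and a basis change T whose inverse is written out by hand.
T = [[2.0, 1.0], [1.0, 1.0]]
T_inv = [[1.0, -1.0], [-1.0, 2.0]]
g = [[1.0, 0.0], [0.0, 1.0]]
x = [3.0, 4.0]
y = [1.0, 2.0]
A = [[1.0, 2.0], [3.0, 4.0]]

# Coordinates in the new basis.
x_new = matvec(T_inv, x)
y_new = matvec(T_inv, y)
g_new = matmul(transpose(T), matmul(g, T))
A_new = matmul(T_inv, matmul(A, T))

# Scalars agree in both bases.
print(scalar_product(x, y, g), scalar_product(x_new, y_new, g_new))
print(trace(A), trace(A_new))

# Summing products of coordinates without the metric is NOT invariant.
naive_new = sum(a * b for a, b in zip(x_new, y_new))
print(naive_new)
```
</pre>
<div>The coordinates of the vectors, the metric and the operator all
change, while the scalar product and the trace keep their values. The last
line shows that dropping the metric from the formula loses the invariance
in a non-orthonormal basis.</div>
</div>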
<div class="slide build-focus">
<h1>Arrows revisited: Affine Spaces and Euclidean Spaces</h1>
<div class="notes"></div>
<div>Vectors and their coordinates are related but not the same. Points
and the vectors that describe them aren't, either.</div>
<table style="margin:0">
<tr>
<td style="width: 50vw">
<ul>
<li step="1">We use vectors to describe points in a plane.
<li step="2">But that's relative to a fixed origin point.
<li step="3,4">What if we want to change the origin?
<li step="4">E.g. working with a coordinate frame at the surface of
Earth, which rotates?
<li step="5">The underlying algebraic structure is an <b>Affine
Space</b>.
</ul>
<td>
<div class="img">
<img class="img" src="img/affine1.png" step="1">
<img class="img" src="img/affine2.png" step="2">
<img class="img" src="img/affine3.png" step="3">
<img class="img" src="img/affine4.png" step="4">
</div>
</tr>
</table>
</div>
<div class="slide build-focus-visible continued">
<h1>Affine Spaces and Euclidean Spaces</h1>
<div class="notes"></div>
<div>What is an <b>Affine Space</b>?</div>
<ul>
<li step="1">A set $A$ whose elements are called Points.
<li step="2">An associated Vector Space $V$.
<li step="3">An operation <b>Point-Point=Vector</b> that maps any pair
of points to a vector.
$$\forall P,Q \in A \, \exists \xvec{v} \in V :\, Q - P = \xvec{v} $$
<li step="4">An operation <b>Point+Vector=Point</b> that moves a point
by a vector to another point.
$$\forall P \in A \, \forall \xvec{v} \in V \, \exists Q \in A :\, P
+ \xvec{v} = Q$$
</ul>
</div>
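<div class="slide">
<h1>Affine Spaces: a type sketch</h1>
<div>The two operations above map naturally onto two distinct types. This
is a minimal sketch with hypothetical class names <code>Point</code> and
<code>Vector</code>, restricted to two dimensions. Storing numbers inside a
point silently picks a frame; that is an implementation shortcut of the
sketch, not part of the definition.</div>
<pre>
```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vector:
    """Element of the associated vector space V."""
    x: float
    y: float

    def __add__(self, other):
        # Vector + Vector = Vector
        if isinstance(other, Vector):
            return Vector(self.x + other.x, self.y + other.y)
        return NotImplemented

@dataclass(frozen=True)
class Point:
    """Element of the affine space A."""
    x: float
    y: float

    def __sub__(self, other):
        # Point - Point = Vector
        if isinstance(other, Point):
            return Vector(self.x - other.x, self.y - other.y)
        return NotImplemented

    def __add__(self, other):
        # Point + Vector = Point; Point + Point is deliberately undefined
        if isinstance(other, Vector):
            return Point(self.x + other.x, self.y + other.y)
        return NotImplemented

P = Point(1.0, 2.0)
Q = Point(4.0, 6.0)
v = Q - P          # a Vector
print(v)
print(P + v == Q)  # moving P by v arrives at Q
```
</pre>
<div>Adding two points raises a <code>TypeError</code>, which mirrors the
fact that an affine space has no addition of points.</div>
</div>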
<div class="slide build-focus-visible continued">
<h1>Affine Spaces and Euclidean Spaces</h1>
<div class="notes"></div>
<div><b>Basis</b> of an <b>Affine Space</b></div>
<ul>
<li step="1">A <b>Basis</b> in such a space is a pair of an origin point and
a basis in the vector space.
$$\{O, \xvec{e}_1, \ldots, \xvec{e}_n\}$$
<li step="2"><b>Coordinates</b> of a Point $P$ in that basis are the
coordinates $p^i$ of the vector $$\xvec{p} = P - O$$
<li step="3">A <b>Basis Transformation</b> is a pair: a transformation of
the vector basis, plus a translation of the origin point by a vector $\xvec{t}$.
$$\xvec{e}_{i^\prime} = T^{i}_{i^\prime} \xvec{e}_i$$
$$O^\prime = O + \xvec{t}$$
</ul>
</div>
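<div class="slide">
<h1>Affine Basis Transformation: a numeric sketch</h1>
<div>Writing the same point in both frames, $P = O + p^i \xvec{e}_i =
O^\prime + p^{i^\prime} \xvec{e}_{i^\prime}$, and inserting the two
transformation rules above gives $p^i = t^i + T^i_{i^\prime}
p^{i^\prime}$, where $t^i$ are the coordinates of $\xvec{t}$ in the old
basis. The sketch below inverts this relation. The function names and the
example numbers are illustrative assumptions, and the inverse of $T$ is
written out by hand.</div>
<pre>
```python
def matvec(m, v):
    """Apply matrix m (nested lists) to the coordinate list v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def to_new_frame(p, T_inv, t):
    """Point coordinates in the new frame: p' = T^{-1} (p - t)."""
    shifted = [p[i] - t[i] for i in range(len(p))]
    return matvec(T_inv, shifted)

def to_old_frame(p_new, T, t):
    """Point coordinates back in the old frame: p = t + T p'."""
    mapped = matvec(T, p_new)
    return [mapped[i] + t[i] for i in range(len(t))]

# Assumed example: columns of T are the new basis vectors in the old basis.
T = [[2.0, 1.0], [1.0, 1.0]]
T_inv = [[1.0, -1.0], [-1.0, 2.0]]
t = [1.0, 1.0]            # translation of the origin, in old coordinates

p = [4.0, 3.0]            # old coordinates of a point P
q = [5.0, 5.0]            # old coordinates of a point Q

p_new = to_new_frame(p, T_inv, t)
q_new = to_new_frame(q, T_inv, t)
print(p_new, to_old_frame(p_new, T, t))

# The difference of two points is a vector: the translation t drops out
# and only T^{-1} acts on it.
diff_new = [q_new[i] - p_new[i] for i in range(2)]
diff_old = [q[i] - p[i] for i in range(2)]
print(diff_new, matvec(T_inv, diff_old))
```
</pre>
<div>Point coordinates feel both the change of basis and the shift of the
origin, while the coordinates of Point-Point differences transform like
ordinary vector coordinates.</div>
</div>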
<div class="slide build-focus-visible continued">
<h1>Affine Spaces and Euclidean Spaces</h1>
<div class="notes"></div>
<div>Vectors <em>are</em> operators on the Affine Space:</div>
<ul>
<li step="1">Every vector from $V$ defines an Operator on the Affine
Space $A$, a <b>Translation</b>.
<li step="2">We have seen before that Operators on a Vector Space form
another Vector Space on their own.
<li step="3">Similarly here, except that these are operators on a space that is not
a vector space.
<li step="4">"are" above means "are isomorphic to".
<li step="5">Recall the example
of <a href="slides5.html#slide=10&step=1">Arrows as vectors</a>.
</ul>
</div>
<div class="slide build-focus-visible continued">
<h1>Affine Spaces and Euclidean Spaces</h1>
<div class="notes"></div>
<div>Euclidean Space</div>
<ul>
<li step="1,3">Remember, a Euclidean Vector Space
is a vector space over $\mathbb{R}$ with a Scalar Product.
<li step="2,3">An Affine Space associated with a Euclidean Vector Space
is a <b>Euclidean Space</b>.
</ul>
</div>
<div class="slide build-focus-visible">
<h1>Polar Coordinates Revisited: Manifolds</h1>
<div class="notes"></div>
<div>Polar coordinates are not really the coordinates we have defined
from linear independence. They can be better understood as coordinates
of points.</div>
<ul>
<li step="1">Consider polar coordinates in a 2D Euclidean space.
<li step="2">There are directions in which only one of the coordinates changes.
<li step="3">The vectors in these directions form a basis of the associated
vector space.
<li step="4">But this basis is different in every point!
<li step="5">An interesting question to ask: how are the bases at nearby
points related?
<li step="6">This is the topic of <b>Differential Geometry</b>, aka
Riemannian Geometry, aka Vector and Tensor Calculus.
<li step="7">Leads to the concept of a <b>Manifold</b>.
</ul>
</div>
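<div class="slide">
<h1>Polar Coordinates: a point-dependent basis</h1>
<div>The statement that the basis differs from point to point can be made
concrete. This sketch assumes the usual parametrization $x = r\cos\varphi$,
$y = r\sin\varphi$ and takes as basis vectors the derivatives of the
position with respect to each coordinate, which are the directions in which
only that coordinate changes. The function name
<code>polar_basis</code> is illustrative, and the vectors are not
normalized.</div>
<pre>
```python
import math

def polar_basis(r, phi):
    """Cartesian components of the coordinate basis at the point (r, phi).

    e_r   = d(x, y)/d(r)   = (cos phi, sin phi)
    e_phi = d(x, y)/d(phi) = (-r sin phi, r cos phi)
    """
    e_r = (math.cos(phi), math.sin(phi))
    e_phi = (-r * math.sin(phi), r * math.cos(phi))
    return e_r, e_phi

# Two different points of the plane, given by their polar coordinates.
e_r_a, e_phi_a = polar_basis(1.0, 0.0)
e_r_b, e_phi_b = polar_basis(2.0, math.pi / 2)

print("at (1, 0):    ", e_r_a, e_phi_a)
print("at (2, pi/2): ", e_r_b, e_phi_b)
```
</pre>
<div>At the first point the basis coincides with the Cartesian one; at the
second point it is rotated by a quarter turn and the angular basis vector
is twice as long.</div>
</div>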
<div class="slide build-focus-visible continued">
<h1>Manifolds</h1>
<div class="notes"></div>
<ul>
<li step="1">Manifolds are "warped" subspaces that are embedded in Affine
Spaces.
<li step="2">For example, the <em>surface</em> of a sphere is a
manifold embedded in 3D Euclidean space. It's parametrized by two
coordinates, <em>longitude</em> and <em>latitude</em>.
<li step="3,4">It turns out that the examples from which deep learning
models learn are drawn from a manifold in the feature space, not the
whole feature space.
<li step="4">An embedding inside a model provides coordinates on such a
manifold.
<li step="5">Also, during training, the ML model "mostly" traverses a
manifold in its parameter space, the "top subspace"
(<a href="https://arxiv.org/abs/1812.04754">reference</a>).
</ul>
</div>
<div class="slide build-focus-visible">
<h1>Tensor-like data structures in software</h1>
<div>Some situations in software design offer a choice between Cartesian and
tensor product.</div>
<ul>
<li step="1">The <b>set of files</b> belonging to a user can be modeled as a
vector (in one-hot-encoding).
<li step="2"><b>Permissions</b> (view, comment, edit) can be modeled
as vectors too.
<li step="3">Requisite <b>operations</b> TBD, but conceivably there are even
operations of addition and of scalar multiplication by -1, to deny a
permission to a group.
<li step="4"><b>Grant of permission</b>: should this be the Cartesian
product of the vector of files with the vector of permissions, or
the tensor product?
<li step="5">In <b>Google Drive</b>, sharing a file with users is such
a tensor product: each file is shared with separate permission.
<li step="6">However, in the <b>Google Drive API</b>, granting access to
an app is the Cartesian product: the requesting app is granted the same
access permission (called <em>scope</em>) to all files in
Drive. (Arguably a poor choice.)
</ul>
</div>
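<div class="slide">
<h1>Cartesian vs. tensor product: a data structure sketch</h1>
<div>The design choice above can be sketched with plain Python containers.
Everything here is hypothetical: the file names, the function names and the
0/1 encoding are illustrative assumptions and do not reflect how Google
Drive or its API stores permissions.</div>
<pre>
```python
files = ["a.txt", "b.txt", "c.txt"]
permissions = ["view", "comment", "edit"]

# Cartesian-product style: one set of files paired with one set of
# permissions. Every listed file gets the same permissions.
cartesian_grant = (frozenset(files), frozenset({"view"}))

# Tensor-product style: one coefficient per (file, permission) pair.
tensor_grant = {(f, p): 0 for f in files for p in permissions}
tensor_grant[("a.txt", "edit")] = 1
tensor_grant[("b.txt", "view")] = 1

def allowed_cartesian(grant, f, p):
    granted_files, granted_permissions = grant
    return f in granted_files and p in granted_permissions

def allowed_tensor(grant, f, p):
    return grant.get((f, p), 0) == 1

def outer(file_vec, perm_vec):
    """Embed a Cartesian pair as a product tensor (outer product)."""
    return {(f, p): file_vec[f] * perm_vec[p]
            for f in file_vec for p in perm_vec}

# The Cartesian grant, rewritten as a tensor: all files times "view".
embedded = outer({f: 1 for f in files},
                 {"view": 1, "comment": 0, "edit": 0})

print(len(files) + len(permissions))  # independent entries, Cartesian
print(len(tensor_grant))              # independent entries, tensor
```
</pre>
<div>The Cartesian structure has one entry per file plus one per
permission, the tensor structure has one per combination. Every Cartesian
grant embeds as a product tensor, but a per-file grant such as
<code>tensor_grant</code> generally cannot be written as a single
product.</div>
</div>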
<div class="slide build-focus-visible">
<h1>LA, Deep Learning, and Software Design</h1>
<div>So where does this leave Linear Algebra in Deep Learning?</div>
<ul>
<li step="1"><b>Multidimensional arrays</b> are an important mechanism for
massively parallel and distributed computing.
<li step="2"><b>Matrix multiplication and scalar products</b> appear
in TensorFlow computations.
<li step="3">More germane concepts of Linear Algebra generalize to
<b>Manifolds</b>, and then are useful to better understand and
optimize DL models.
<li step="4">E.g. understand <b>training, optimization, embeddings</b>
in terms of manifolds in feature space or parameter space.
<li step="5">Impose <em>regularization</em> in terms of tensor product
decompositions of weight matrices.
<li step="6">Understand embeddings in terms of matrix decompositions.
<li step="7">Other concepts from LA may inform or inspire software
design. E.g. <b>tensor product-like data structure composition</b>;
<b>reversible computing</b> inspired by QC.
</ul>
</div>
<div class="slide" group="intro">
<div>Handoff to Tai-Danae.</div>
</div>
<div class="slide title"
style="background-color:black; padding:5vw"
onshow="d0('navigator','footer');v0('bgpage')"
onhide="d1('navigator','footer');v1('bgpage')">
<div style="height: 15%"></div>
<h2 style="font-weight:normal"><a style="text-decoration:none;
color:#6666ff"
class="slides-sync"
href="slides.html#slide=2">Overview</a></h2>
<h2 style="font-weight:normal"><a style="text-decoration:none;
color:#6666ff"
class="slides-sync"
href="slides5.html">Session 5 — Intro to Tensors</a></h2>
<h2 style="font-weight:normal"><a style="text-decoration:none;
color:#6666ff"
class="slides-sync"
href="slides6.html">Session 6 — Tensors and Quantum Computing</a></h2>
<h2 style="font-weight:normal"><a style="text-decoration:none;
color:#6666ff"
class="slides-sync"
href="slides7.html">Session 7 — Loose Ends</a></h2>
</div>
<div class="slide" group="intro"></div>
</div>
</body>
</html>