Merge pull request #396 from fangzhouli/patch-1
Update 07.3-concepts.Rmd
christophM committed May 22, 2024
2 parents df9c1d7 + 63dac24 commit acb54b5
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion manuscript/07.3-concepts.Rmd
@@ -48,7 +48,7 @@ $$S_{C,k,l}(x)=\nabla h_{l,k}(\hat{f}_l(x))\cdot v_l^C$$
where $\hat{f}_l$ maps the input $x$ to the activation vector of the layer $l$ and $h_{l,k}$ maps the activation vector to the logit output of class $k$.

Mathematically, the sign of $S_{C,k,l}(x)$ only depends on the angle between the gradient of $h_{l,k}(\hat{f}_l(x))$ and $v_l^C$.
-If the angle is greater than 90 degrees, $S_{C,k,l}(x)$ will be positive, and if the angle is less than 90 degrees, $S_{C,k,l}(x)$ will be negative.
+If the angle is less than 90 degrees, $S_{C,k,l}(x)$ will be positive, and if the angle is greater than 90 degrees, $S_{C,k,l}(x)$ will be negative.
Since the gradient $\nabla h_{l,k}$ points to the direction that maximizes the output the most rapidly, conceptual sensitivity $S_{C,k,l}$, intuitively, indicates whether $v_l^C$ points to the similar direction that maximizes $h_{l,k}$.
Thus, $S_{C,k,l}(x)>0$ can be interpreted as concept $C$ encouraging the model to classify $x$ into class $k$.
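
The sign rule fixed by this change can be checked numerically. The sketch below is only an illustration, not code from the manuscript: it treats the gradient $\nabla h_{l,k}(\hat{f}_l(x))$ and the concept vector $v_l^C$ as plain two-dimensional lists with made-up values, and the helper names (`dot`, `angle_degrees`, `conceptual_sensitivity`) are hypothetical. Since the dot product equals the product of the two norms and the cosine of the angle between the vectors, its sign is the sign of that cosine.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def angle_degrees(u, v):
    """Angle between u and v in degrees, from cos(theta) = u.v / (|u| |v|)."""
    cos_theta = dot(u, v) / (math.hypot(*u) * math.hypot(*v))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


def conceptual_sensitivity(gradient, cav):
    """S = gradient . cav, following the formula in the diff context."""
    return dot(gradient, cav)


# Made-up values, for illustration only (not taken from any real model).
gradient = [1.0, 2.0]       # stands in for the gradient of h_{l,k}
cav_aligned = [2.0, 1.0]    # angle to the gradient below 90 degrees
cav_opposed = [-2.0, 0.5]   # angle to the gradient above 90 degrees

for name, cav in [("aligned", cav_aligned), ("opposed", cav_opposed)]:
    s = conceptual_sensitivity(gradient, cav)
    theta = angle_degrees(gradient, cav)
    print(f"{name}: angle = {theta:.1f} degrees, S = {s:+.1f}")
```

With these values, the vector at an acute angle to the gradient gives a positive sensitivity and the one at an obtuse angle gives a negative sensitivity, which matches the corrected sentence.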
