Neural Networks
The architecture of a very simple feedforward neural net can be seen in this graphic:
Each layer of a neural net implements the `ParametricTensorFunction` protocol. The `NeuralNetLayer` base class can be used as an input layer. It has no parameters; the input is used directly as the preactivations (denoted with the letter "a" in the graphic).
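As a rough, self-contained sketch of this design (the protocol and class names here are hypothetical stand-ins, not the actual requirements of `ParametricTensorFunction`):

```swift
// Illustrative sketch, not the library's API: a layer exposes its
// parameters and maps an input to an output.
protocol SketchParametricFunction {
    var parameters: [[Double]] { get }
    func output(input: [Double]) -> [Double]
}

// A plain input layer: no parameters, the preactivations "a"
// are just the input itself.
class SketchInputLayer: SketchParametricFunction {
    var parameters: [[Double]] { return [] }
    func output(input: [Double]) -> [Double] {
        return input  // a = x
    }
}
```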
These preactivations are fed into the `activationFunction` ("σ" in the graphic), a property of the layer. Currently available activation functions are the sigmoid function, (leaky) rectified linear units, and the softplus function. The output of the activation function ("h") is stored as `currentActivation`; it will be needed when calculating the gradient. The activations are the output of this layer. For this layer, the gradient with respect to the input is calculated by multiplying the gradient of the activations with respect to the preactivations by the gradient with respect to the output (i.e. the activations). This last gradient must in turn be provided by the following layer.
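The following self-contained sketch shows the three activation functions named above, their derivatives, and the chain-rule step just described. All function names are illustrative, not the library's API:

```swift
import Foundation

// Sketches of the three available activation functions and their derivatives.
func sigmoid(_ a: Double) -> Double { return 1 / (1 + exp(-a)) }
func sigmoidDerivative(_ a: Double) -> Double {
    let s = sigmoid(a)
    return s * (1 - s)  // σ'(a) = σ(a)(1 − σ(a))
}

func leakyReLU(_ a: Double, slope: Double = 0.01) -> Double {
    return a > 0 ? a : slope * a  // slope = 0 gives a plain ReLU
}
func leakyReLUDerivative(_ a: Double, slope: Double = 0.01) -> Double {
    return a > 0 ? 1 : slope
}

func softplus(_ a: Double) -> Double { return log(1 + exp(a)) }
func softplusDerivative(_ a: Double) -> Double {
    return 1 / (1 + exp(-a))  // the derivative of softplus is the sigmoid
}

// Chain rule for a pure activation layer: the gradient with respect to the
// input (= the preactivations) is the elementwise product of σ'(a) and the
// gradient with respect to the output, which the following layer provides.
func activationInputGradient(preactivations: [Double], outputGradient: [Double]) -> [Double] {
    return zip(preactivations, outputGradient).map { sigmoidDerivative($0.0) * $0.1 }
}
```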
`FeedforwardLayer` is a subclass of `NeuralNetLayer`. It has two parameters: a matrix of weights ("W" in the graphic) and a bias vector ("b"). Here the preactivations are calculated by multiplying the input with the weights and adding the bias.
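A minimal sketch of this forward step, assuming the weight matrix is stored row by row (the helper name is hypothetical, not the library's API):

```swift
// Feedforward preactivations a = W·x + b.
func feedforwardPreactivations(weights: [[Double]], bias: [Double], input: [Double]) -> [Double] {
    return zip(weights, bias).map { row, b in
        // Dot product of one weight row with the input, plus the bias entry.
        zip(row, input).reduce(b) { $0 + $1.0 * $1.1 }
    }
}
```

Each output element is one weight row dotted with the input, shifted by the corresponding bias entry; the result is the preactivation vector that is then passed to the activation function.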
This additional step must be taken into account when calculating the gradient with respect to the input: the gradient with respect to the preactivations of this layer has to be multiplied by the weights. In addition, the gradients with respect to both parameters must be calculated. The gradient with respect to the bias is simply the preactivation gradient. The weight gradient is calculated by multiplying the preactivation gradient with the input of this layer, which is the stored activation of the previous layer.
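Under the same assumptions as above, these three gradients could be computed like this (again a sketch, not the library's implementation):

```swift
// Gradients of a feedforward layer, given the preactivation gradient dL/da
// and the layer input x (the stored activation of the previous layer):
//   bias gradient:   dL/db = dL/da
//   weight gradient: dL/dW = (dL/da) xᵀ   (outer product)
//   input gradient:  dL/dx = Wᵀ (dL/da)
func feedforwardGradients(weights: [[Double]], input: [Double], preactivationGradient g: [Double])
    -> (bias: [Double], weight: [[Double]], input: [Double]) {
    let biasGradient = g
    // Row i of the weight gradient is the input scaled by g[i].
    let weightGradient = g.map { gi in input.map { gi * $0 } }
    // Wᵀ·g: for each input index j, dot column j of W with g.
    let inputGradient = input.indices.map { j in
        zip(weights, g).reduce(0.0) { $0 + $1.0[j] * $1.1 }
    }
    return (biasGradient, weightGradient, inputGradient)
}
```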