Extent of purpose for quantization-aware-training for weights in aihwkit? #528
-
Hello! I come from an Edge AI/TinyML background, where quantization-aware training (QAT) is often used to gain network robustness when running integer-based nets. I have found that aihwkit supports ADC and DAC conversions through the `rpu_config`, such as:
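For example, something along these lines (illustrative values only; in aihwkit the DAC/ADC resolutions are exposed on the forward-pass parameters of the `rpu_config`):

```python
from aihwkit.nn import AnalogLinear
from aihwkit.simulator.configs import InferenceRPUConfig

rpu_config = InferenceRPUConfig()
rpu_config.forward.inp_res = 1 / 256.0  # DAC: quantization of the input signal
rpu_config.forward.out_res = 1 / 256.0  # ADC: quantization of the output signal

layer = AnalogLinear(128, 64, rpu_config=rpu_config)
```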
This could be seen as a form of "quantization"; however, the quantization here affects the input/output signals themselves, not the weights, which in my opinion limits the effectiveness of QAT through this functionality, as we do not get the weight robustness. In "Hardware-aware training for large-scale and diverse deep learning inference workloads using in-memory computing-based accelerators", …
However, when looking at a paper such as "Fully hardware-implemented memristor convolutional neural network", it seems like discrete steps were used in programming the crossbar array, so in my mind there would be a benefit to exploring QAT for training this kind of AIMC network. My questions are: Is there any other information/research in this area? Is there a purpose for QAT in this framework? Bonus/curious question: if implementing QAT for aihwkit, is it reasonable to extend AnalogLayer/AnalogLayerBase to use set_weights/get_weights, such as:
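Roughly along these lines (just a sketch; `QATAnalogLinear` and `fake_quantize` are illustrative names, and the exact `set_weights` signature may differ between aihwkit versions):

```python
import torch
from torch import Tensor
from aihwkit.nn import AnalogLinear


def fake_quantize(weight: Tensor, n_bits: int = 4) -> Tensor:
    """Simple uniform symmetric quantization (stand-in for a real QAT scheme)."""
    scale = weight.abs().max().clamp(min=1e-8) / (2 ** (n_bits - 1) - 1)
    return torch.round(weight / scale) * scale


class QATAnalogLinear(AnalogLinear):
    """Illustrative layer that quantizes the weights before they are programmed."""

    def set_weights(self, weight: Tensor, bias: Tensor = None, **kwargs):
        return super().set_weights(fake_quantize(weight), bias, **kwargs)
```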
Or are there inherent noises/imperfections added in AnalogLayer/AnalogLayerBase, similar to how it's done for AnalogSGD?
-
Hi @arseniivanov,

many thanks for raising this interesting point. In the first paper, hardware-aware training is done with an analog representation of the weights in mind, that is, the weight is directly encoded into the conductance values, with only one pair of resistive elements (one for the positive and one for the negative part of the weight). Thus, it makes no sense to use QAT methods, since there is no quantization of the weights in this case. There are only noise and limited weight ranges, since writing and reading out the resistive values is subject to noise and non-idealities, as described in the first paper. Therefore this is very different from the QAT situation in digital, where weights are quantized to a fixed set of discrete levels.

That being said, a digitally quantized model might be useful for the quantized AIMC approach in the second paper as a starting point for hardware-aware training, where it could then be made noise robust.

In terms of training quantized weights with the AIHWKIT, one could do it in different ways. One would be to build the same forward pass as expected for the particular hardware (e.g. the second paper), something like:

```python
def forward(self, x_input: Tensor) -> Tensor:
    # [..] init y_q tensor with the correct output size
    for significance_factor, analog_tile in zip(self.sig_factors, self.analog_tile_array):
        y_q.add_(significance_factor * analog_tile.forward(x_input))
    return y_q
```

where the analog tiles are instantiated in the init, one per bit of the weight. Then one might be able to use noise-aware training with this model. However, this might be somewhat complicated, and one might instead want to train directly with QAT and noise. That can actually be done by using the `DISCRETIZE_ADD_NORMAL` or `DOREFA` weight modifier types. However, the support for this kind of quantization-aware training is limited, and other packages for QAT might yield better results. You can take a look at the tutorial we recently wrote to see how one can use hardware-aware training with the AIHWKIT, see here. Finally, a third way would be to use a state-of-the-art QAT network obtained from some other specialized QAT package and then convert the resulting model to analog using the `convert_to_analog` utility.
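For concreteness, the second and third options might look roughly like this (a sketch only; import paths, enum names, and parameter semantics can vary between aihwkit versions, so treat the values as illustrative):

```python
from torch.nn import Linear
from aihwkit.nn.conversion import convert_to_analog
from aihwkit.simulator.configs import InferenceRPUConfig, WeightModifierType

# Option 2: quantize (and perturb) the weights on the fly during hardware-aware training.
rpu_config = InferenceRPUConfig()
rpu_config.modifier.type = WeightModifierType.DISCRETIZE_ADD_NORMAL
rpu_config.modifier.res = 1 / 32.0   # weight discretization resolution (illustrative)
rpu_config.modifier.std_dev = 0.05   # additional Gaussian weight noise (illustrative)

# Option 3: take a model trained with an external QAT package and map it to analog.
digital_model = Linear(128, 64)      # stand-in for the externally QAT-trained network
analog_model = convert_to_analog(digital_model, rpu_config)
```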