We've created DACT, a new algorithm for achieving adaptive computation time that, unlike existing approaches, is fully differentiable and can work in conjunction with complex models. DACT replaces hard limits and piecewise functions with inductive biases to allow the network to choose during evaluation the amount of computation needed for the current input. The resulting models learn the tradeoff between precision and complexity and actively adapt their architectures accordingly. Our paper shows that, when applied to the widely known MAC architecture and the visual reasoning task, DACT can improve interpretability and make the model more robust to relevant hyperparameter changes, all while increasing the performance-to-computation ratio.
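To make the idea of replacing hard limits with a smooth halting signal more concrete, here is a minimal sketch in plain Python. It is an illustration under assumptions, not the code in this repository: the function name `dact_accumulate`, the zero initial answer, and the exact update rule are chosen for the example, and the paper remains the reference for the actual formulation.

```python
def dact_accumulate(step_outputs, halting_values):
    """Blend per-step outputs into one answer without any hard stopping rule.

    step_outputs:   list of per-step answer vectors (lists of floats).
    halting_values: list of floats in [0, 1], one per step. A value near 0
                    means "later steps should no longer change the answer".
    Returns the accumulated answer and a ponder cost (sum of running products).
    """
    accumulated = [0.0] * len(step_outputs[0])  # assumed initial answer
    remaining = 1.0      # running product of halting values seen so far
    ponder_cost = 0.0    # grows while the model keeps "wanting" more steps

    for y_n, h_n in zip(step_outputs, halting_values):
        # The new step only contributes in proportion to `remaining`,
        # so the update is smooth and differentiable in the halting values.
        accumulated = [
            remaining * y + (1.0 - remaining) * a
            for y, a in zip(y_n, accumulated)
        ]
        remaining *= h_n
        ponder_cost += remaining

    return accumulated, ponder_cost
```

In this sketch, once the running product reaches zero, further steps leave the answer untouched, which is what would allow evaluation to stop early. Scaling the returned ponder cost and adding it to the task loss is one way to express the precision versus computation tradeoff.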
Forked from ceyzaguirre4/mac-network-pytorch, which is based on Memory, Attention and Composition (MAC) Network for CLEVR from Compositional Attention Networks for Machine Reasoning.
Run the following command to install missing dependencies:
pip install -r requirements.txt
The dependencies include comet_ml for metric reporting, but all experiments can be run without installing it by commenting out the relevant lines.
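As an alternative to commenting lines out by hand, an optional import guard is a common pattern. The sketch below is hypothetical and not part of this repository: `NullExperiment` and `make_experiment` are names invented for the example, and it assumes the training code only calls `log_metric` and `log_parameters` on the experiment object.

```python
try:
    from comet_ml import Experiment  # optional dependency
except ImportError:
    Experiment = None


class NullExperiment:
    """Stand-in that silently ignores logging calls when comet_ml is unused."""

    def log_metric(self, name, value, step=None):
        pass

    def log_parameters(self, parameters):
        pass


def make_experiment(enabled=True, **kwargs):
    """Return a comet_ml Experiment when possible, otherwise a no-op object."""
    if not enabled or Experiment is None:
        return NullExperiment()
    return Experiment(**kwargs)
```

With a guard like this, calling `make_experiment(enabled=False)` would let the rest of a training script run unchanged while reporting nothing.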
python3 preprocess.py [CLEVR directory]
python3 image_feature.py [CLEVR directory]
For GQA we use the object-based features extracted with Faster R-CNN, available on the official website. Download them and extract the hdf5 files to [GQA directory]/features/objects.
# `all` to use all questions (as in the paper), `balanced` to use only the official balanced subset.
python3 preprocess_GQA.py [GQA directory] [all | balanced]
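Before running the GQA preprocessing, it can help to confirm that the feature files ended up in the expected place. The helper below is a small sketch, not something shipped with this repository: `find_object_features` is a hypothetical name, and the accepted file extensions (`.h5`, `.hdf5`) are an assumption about how the downloaded files are named.

```python
from pathlib import Path


def find_object_features(gqa_dir):
    """Return the sorted hdf5 files under <gqa_dir>/features/objects."""
    features_dir = Path(gqa_dir) / "features" / "objects"
    if not features_dir.is_dir():
        raise FileNotFoundError(f"missing directory: {features_dir}")

    # Assumed extensions; adjust if the downloaded files use another suffix.
    files = sorted(
        p for p in features_dir.iterdir() if p.suffix in {".h5", ".hdf5"}
    )
    if not files:
        raise FileNotFoundError(f"no hdf5 files found in {features_dir}")
    return files
```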
Training the model with all-default hyper-parameters trains a 12-step MAC without gating or self-attention for 10 epochs on CLEVR.
python3 train.py DATALOADER.FEATURES_PATH [CLEVR directory]
Alternatively, pass the desired configuration file as a parameter.
python3 train.py --config-file=[path to config file] DATALOADER.FEATURES_PATH [CLEVR directory]
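The trailing `DATALOADER.FEATURES_PATH [CLEVR directory]` in the commands above reads as a KEY VALUE override applied on top of the defaults or the configuration file. The sketch below shows how such dotted-key overrides can be merged into a nested configuration. It is an assumption about the general mechanism rather than this repository's code: `apply_overrides` is a hypothetical helper, and `BATCH_SIZE` is an invented key used only to show that untouched values survive.

```python
def apply_overrides(config, overrides):
    """Apply ["A.B", value, "C.D", value, ...] pairs to a nested dict."""
    if len(overrides) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")

    for dotted_key, value in zip(overrides[0::2], overrides[1::2]):
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            # Walk down the hierarchy, creating sections that do not exist.
            node = node.setdefault(name, {})
        node[leaf] = value
    return config


# Example: defaults first, then the command-line pair on top.
config = {"DATALOADER": {"FEATURES_PATH": "data/", "BATCH_SIZE": 64}}
apply_overrides(config, ["DATALOADER.FEATURES_PATH", "/datasets/CLEVR"])
```

Under this reading, any other option could be overridden the same way by appending further KEY VALUE pairs to the command.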
Configuration files to replicate the results from the paper are provided in the configs/ directory.
All (CLEVR) adaptive models require a trained 12-step MAC from which to load pre-trained weights.
For instance, to train DACT with a ponder cost of 5e-3, run:
# pretrain 12-step MAC
python3 train.py --config-file=configs/CLEVR/MAC/mac12.yaml DATALOADER.FEATURES_PATH [CLEVR directory]
# OR use default params
# python3 train.py DATALOADER.FEATURES_PATH [CLEVR directory]
# train DACT
python3 train.py --config-file=configs/CLEVR/DACT/ours_0005.yaml DATALOADER.FEATURES_PATH [CLEVR directory]
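The two commands above have to run in order, pretraining first. A small driver script is one way to chain them; the sketch below is hypothetical (`build_train_command` and `run_stages` are not part of this repository) and assumes the second configuration file locates the pretrained weights on its own. It defaults to a dry run that only prints the commands.

```python
import subprocess


def build_train_command(features_path, config_file=None, mode=None):
    """Assemble a train.py invocation using the flags shown in this README."""
    command = ["python3", "train.py"]
    if mode is not None:
        command.append(f"--mode={mode}")
    if config_file is not None:
        command.append(f"--config-file={config_file}")
    command += ["DATALOADER.FEATURES_PATH", features_path]
    return command


def run_stages(config_files, features_path, mode=None, dry_run=True):
    """Run (or just print) one training command per config file, in order."""
    commands = [
        build_train_command(features_path, config_file, mode)
        for config_file in config_files
    ]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops the pipeline if a stage fails.
            subprocess.run(command, check=True)
    return commands


stages = run_stages(
    ["configs/CLEVR/MAC/mac12.yaml", "configs/CLEVR/DACT/ours_0005.yaml"],
    "/datasets/CLEVR",  # placeholder for [CLEVR directory]
)
```

Passing `dry_run=False` would execute the stages for real, and `mode="gqa"` would add the `--mode=gqa` flag to every stage.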
The same is valid for all gated MACs; to train the gated variants provided in the configuration files, run:
# pretrain 3-step MAC
python3 train.py --config-file=configs/CLEVR/MAC/mac3.yaml DATALOADER.FEATURES_PATH [CLEVR directory]
# train gated variant from pretrained weights
python3 train.py --config-file=configs/CLEVR/MAC/mac3+gate.yaml DATALOADER.FEATURES_PATH [CLEVR directory]
Training on the GQA dataset is done through the --mode argument.
For instance, training the model with all-default hyper-parameters in --mode=gqa trains a 4-step MAC without gating or self-attention for 5 epochs:
python3 train.py --mode=gqa DATALOADER.FEATURES_PATH [GQA directory]
All (GQA) adaptive models require a trained 4-step MAC from which to load pre-trained weights.
For instance, to train DACT with a ponder cost of 5e-3, run:
# pretrain 4-step MAC
python3 train.py --mode=gqa --config-file=configs/GQA/MAC/mac12.yaml DATALOADER.FEATURES_PATH [GQA directory]
# OR use default params
# python3 train.py --mode=gqa DATALOADER.FEATURES_PATH [GQA directory]
# train DACT
python3 train.py --mode=gqa --config-file=configs/GQA/DACT/ours_0005.yaml DATALOADER.FEATURES_PATH [GQA directory]
@article{Eyzaguirre2020DifferentiableAC,
title={Differentiable Adaptive Computation Time for Visual Reasoning},
author={Cristobal Eyzaguirre and A. Soto},
journal={2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2020},
pages={12814-12822}
}