kalelpark/Visual_Attention
Visualize Anything

This repository helps those who want to study Neural Architecture Search.
Wongi Park

Environment Setting

  • Conda environment: Ubuntu 18.04, CUDA 10.1 (or 10.2), PyTorch==1.13.0, Torchvision==0.6.0, Python 3.8–3.11, timm, receptivefield
# Create Environment
conda create -n vslt python=3.8
conda activate vslt

# Install pytorch, torchvision, cudatoolkit
conda install pytorch==1.13.0 torchvision==0.6.0 cudatoolkit=10.1 -c pytorch  # or cudatoolkit=10.2
pip install timm
pip install "receptivefield>=0.5.0"

Q1. Why do papers visualize properties such as attention maps or amplitude spectra?

A1. These visualizations aid in interpreting the model's behavior, analyzing its strengths and weaknesses, and guiding further improvements and research in the field.
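As a minimal sketch of what such a visualization computes (NumPy only; the token count, dimensions, and function names below are illustrative assumptions, not code from this repository), a self-attention map is the row-stochastic matrix softmax(QKᵀ/√d), and a single row of it can be reshaped into a patch-grid heatmap:

```python
import numpy as np

def attention_map(q, k):
    """Scaled dot-product attention weights: softmax(Q K^T / sqrt(d))."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                   # (tokens, tokens)
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
tokens, dim = 197, 64   # e.g. 196 patches + 1 [CLS] token for a 224x224 ViT-Base
q = rng.standard_normal((tokens, dim))
k = rng.standard_normal((tokens, dim))

attn = attention_map(q, k)
# Row 0 shows how the [CLS] token attends to every patch; reshaping the
# patch part to a 14x14 grid lets you overlay it on the input image.
cls_to_patches = attn[0, 1:].reshape(14, 14)
```

Each row of `attn` sums to 1, which is what makes a single row interpretable as a distribution of attention over the input patches.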

Q2. Why do we visualize token interactions?

A2. This provides insights into how attention is propagated and shared across the input patches. Analyzing token interactions can help understand the flow of information and the dependencies learned by the self-attention mechanism.
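A common technique for tracing this propagation across layers is attention rollout (Abnar & Zuidema, 2020), which folds the residual connection into each layer's head-averaged attention matrix and chain-multiplies the results. A hedged NumPy sketch, with random row-stochastic matrices standing in for real attention weights:

```python
import numpy as np

def attention_rollout(attn_per_layer):
    """Chain-multiply per-layer attention, folding in the residual as identity."""
    n = attn_per_layer[0].shape[-1]
    rollout = np.eye(n)
    for a in attn_per_layer:
        a_hat = 0.5 * a + 0.5 * np.eye(n)            # account for the skip connection
        a_hat /= a_hat.sum(axis=-1, keepdims=True)   # keep rows normalized
        rollout = a_hat @ rollout
    return rollout

rng = np.random.default_rng(0)
layers, tokens = 4, 10
# Random row-stochastic matrices as stand-ins for head-averaged attention.
attn_per_layer = []
for _ in range(layers):
    a = rng.random((tokens, tokens))
    attn_per_layer.append(a / a.sum(axis=-1, keepdims=True))

rollout = attention_rollout(attn_per_layer)
```

Because a product of row-stochastic matrices stays row-stochastic, each row of `rollout` can be read as how much every input token contributes to one output token's final representation.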

Supporting papers

(1) DINOv2 (Paper / Code)
(2) How Do Vision Transformers Work? (Paper / Code)
(3) More ConvNets in the 2020s: Scaling up Kernels Beyond 51x51 using Sparsity (Paper / Code)
(4) An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (Paper / Code)
(5) Deformable ConvNets v2: More Deformable, Better Results (Paper / Code)

How to cite

@misc{park2023visualize,
    title        = {Visualize Anything},
    author       = {Wongi Park},
    howpublished = {GitHub},
    url          = {https://github.com/kalelpark/visualizing},
    year         = {2023},
}