Model Inspection: Computing Model FLOPs #3830
-
I am currently trying to evaluate the FLOPs of a model architecture that uses token compression methods such as pruning and merging. My model architecture repo can be found here, the token compression attention module here, and the token merging/pruning methods here. Is anyone else using a similar setup? I get errors in the attention layers that contain merging/pruning functions. In particular, my pruning method selects the top_k token embeddings based on an importance score, and when computing FLOPs I get empty arrays when trying to compute the indices of the top_k token embeddings, as detailed in the trace below:
Any advice from the Flax dev community is appreciated.

Update: I am able to evaluate the FLOPs of my CompressedMultiHeadDotProductAttention just fine from a Colab notebook where I inspect only this layer and not the overall model architecture. I am still debugging potential causes of the error; I thought it may be due to my module setup for the overall architecture, but this is still TBC.
Replies: 1 comment
-
#1854 resolves my issue for now. When required, I can also compute layer-specific FLOPs with this API.