Usage of attention_mask_l in vlfuse_helper #11803

Open
HandsLing opened this issue Jun 20, 2024 · 3 comments

@HandsLing

I have a question: in the forward function of the BiMultiHeadAttention class in vlfuse_helper.py, why is attention_mask_l used differently from the implementation in https://github.com/IDEA-Research/GroundingDINO?
```python
if attention_mask_l is not None:
    assert attention_mask_l.dim() == 2
    attention_mask = attention_mask_l.unsqueeze(1).unsqueeze(1)
    attention_mask = attention_mask.expand(bsz, 1, tgt_len, src_len)
    attention_mask = attention_mask.masked_fill(attention_mask == 0, -9e15)

    if attention_mask.size() != (bsz, 1, tgt_len, src_len):
        raise ValueError(
            f'Attention mask should be of size {(bsz, 1, tgt_len, src_len)}')
    attn_weights = attn_weights.view(bsz, self.num_heads, tgt_len,
                                     src_len) + attention_mask
    attn_weights = attn_weights.view(bsz * self.num_heads, tgt_len, src_len)
```
This is the code in question. When I tested it, I found that attention_mask_l is actually an all-False tensor, so the operation `attention_mask = attention_mask.masked_fill(attention_mask == 0, -9e15)` effectively turns attention_mask into an all-True tensor, which is then added to attn_weights. That amounts to adding 1 to every position of attn_weights. Is this the effect you originally intended? Why is it done this way?
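
For reference, a minimal standalone sketch of the behaviour described above (the shapes and the all-False mask are made up for illustration, not taken from the actual model): because masked_fill preserves the bool dtype, the -9e15 fill value is coerced to True, and the subsequent addition only shifts every logit by 1.

```python
import torch

bsz, num_heads, tgt_len, src_len = 1, 2, 3, 4
attn_weights = torch.zeros(bsz * num_heads, tgt_len, src_len)

# attention_mask_l as observed at inference time: an all-False bool tensor.
attention_mask_l = torch.zeros(bsz, src_len, dtype=torch.bool)

attention_mask = attention_mask_l.unsqueeze(1).unsqueeze(1)
attention_mask = attention_mask.expand(bsz, 1, tgt_len, src_len)

# masked_fill keeps the input dtype, so on a bool tensor the fill value
# -9e15 is coerced to True instead of becoming a large negative logit.
attention_mask = attention_mask.masked_fill(attention_mask == 0, -9e15)
print(attention_mask.dtype)     # torch.bool
print(attention_mask.unique())  # tensor([True])

# bool is promoted to float when added to attn_weights, so every logit is
# shifted by +1, which the softmax over src_len later cancels out.
attn_weights = attn_weights.view(bsz, num_heads, tgt_len, src_len) + attention_mask
print(attn_weights.unique())    # tensor([1.])

# For comparison: with a float 0/1 mask the same code would write -9e15
# into the masked positions, which is presumably the intended effect.
float_mask = attention_mask_l.float()
print(float_mask.masked_fill(float_mask == 0, -9e15).unique())  # tensor([-9.0000e+15])
```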

@talebolano

same question

@talebolano

@HandsLing I tried setting attention_mask_l directly to None, and it had no effect at all on the final output. I suspect this was simply written wrong from the start.

@HandsLing
Author

@talebolano When I ran inference with 1 added directly to attn_weights, the final result was also the same.
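
For what it's worth, a quick numerical check (shapes made up) of why both observations line up: softmax is invariant to adding the same constant to every logit along the normalized dimension, so "+1 everywhere" and "no mask at all" produce identical attention weights.

```python
import torch

logits = torch.randn(2, 5)
# Softmax over the text dimension is unchanged by a constant shift of all logits.
print(torch.allclose(logits.softmax(dim=-1), (logits + 1).softmax(dim=-1)))  # True
```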
