This repository was archived by the owner on Jan 1, 2025. It is now read-only.
If no pixel in the input image belongs to the q-th class, then the mask generated for masked attention has `attn_mask[b, q, :] = True` for the entire row. `nn.MultiheadAttention` converts those `True` entries to `float('-inf')`, so when the attention map is computed with `Softmax(·, dim=-1)`, every logit in that row is `-inf`: the softmax normalizes by a sum of zeros, and the resulting 0/0 division produces NaN. : (

This problem came up when I applied masked attention to my semantic segmentation task. : (
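A minimal sketch reproducing the failure, with hypothetical shapes (2 queries, sequence length 4): when a whole row of the mask is `True`, every masked logit becomes `-inf` and the softmax over that row is 0/0, i.e. NaN. A common workaround (assumed here, not from this issue) is to un-mask rows that are fully masked before passing the mask to attention.

```python
import torch
import torch.nn.functional as F

# attn_mask: True means "do not attend". Row 0 is fully masked,
# which is exactly the "no pixel belongs to the q-th class" case.
attn_mask = torch.zeros(2, 4, dtype=torch.bool)
attn_mask[0] = True

scores = torch.randn(2, 4)
# nn.MultiheadAttention does the equivalent of this fill internally.
masked = scores.masked_fill(attn_mask, float('-inf'))
attn = F.softmax(masked, dim=-1)
print(torch.isnan(attn[0]).all())   # the fully masked row is NaN
print(torch.isnan(attn[1]).any())   # the normal row is fine

# Hypothetical workaround: disable masking for rows that mask everything,
# so those queries fall back to unrestricted attention instead of NaN.
safe_mask = attn_mask & ~attn_mask.all(dim=-1, keepdim=True)
safe = F.softmax(scores.masked_fill(safe_mask, float('-inf')), dim=-1)
print(torch.isnan(safe).any())      # no NaNs remain
```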