Open
Description
Line 286 in b1dd65b
mask = mask.unsqueeze(1).repeat(1, self.num_heads, 1, 1)  # BxNxQ_LENxK_LEN
The mask does not seem to be correct. The query, key, and value are permuted, so the mask should be permuted as well.
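One way to check this is to compare the mask layout against the layout of the attention scores after the permute. The sketch below does not reproduce the repository code: the sizes, the `split_heads` helper, and the "True means attend" mask convention are all assumptions made for illustration. It shows that `unsqueeze(1).repeat(...)` lines up when the scores are laid out as B x N x Q_LEN x K_LEN, but that a mask repeated in the wrong order is misaligned when the heads are folded into the batch axis.

```python
import torch

# Hypothetical sizes: batch, heads, query length, key length, per-head dim.
B, N, Q_LEN, K_LEN, D = 2, 3, 4, 5, 8

torch.manual_seed(0)
q = torch.randn(B, Q_LEN, N * D)
k = torch.randn(B, K_LEN, N * D)

# Assumed convention: True = position may be attended to.
mask = torch.ones(B, Q_LEN, K_LEN, dtype=torch.bool)
mask[1, :, 3:] = False  # second example has only 3 valid keys


def split_heads(x):
    # (B, L, N*D) -> (B, L, N, D) -> (B, N, L, D)
    b, length, _ = x.shape
    return x.view(b, length, N, D).permute(0, 2, 1, 3)


# Scores laid out as B x N x Q_LEN x K_LEN.
scores = split_heads(q) @ split_heads(k).transpose(-1, -2)

# Layout A: the head axis sits at dim 1, so inserting it into the mask lines up.
mask_a = mask.unsqueeze(1).repeat(1, N, 1, 1)
masked_a = scores.masked_fill(~mask_a, float("-inf"))

# Layout B: heads folded into the batch axis in head-major order,
# i.e. row index = head * B + batch.
scores_b = scores.permute(1, 0, 2, 3).reshape(N * B, Q_LEN, K_LEN)
mask_b_ok = mask.repeat(N, 1, 1)                # head-major, matches scores_b
mask_b_bad = mask.repeat_interleave(N, dim=0)   # batch-major, misaligned
masked_b = scores_b.masked_fill(~mask_b_ok, float("-inf"))

# Unfold layout B back to B x N x Q_LEN x K_LEN for comparison with layout A.
masked_b_unfolded = masked_b.reshape(N, B, Q_LEN, K_LEN).permute(1, 0, 2, 3)
```

If the code at the referenced line produces scores in layout A, the mask as written would align. If it folds heads into the batch axis, or permutes the query and key axes in some other order, the mask has to follow the same ordering.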