why squeeze here?

Hi, I think there is a bug here:

https://github.com/Stonesjtu/Pytorch-NCE/blob/862afc666445dca4ce9d24a3eb1e073255edb92e/nce.py#L198

For an RNN model whose last layer before the softmax has shape [B * N * D] with `N>1` time steps, I believe the `squeeze` has no effect. Maybe it is meant for batch size `B=1`? If that is the case, `squeeze(0)` might be a better choice.
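A quick way to check this, as a minimal sketch: the sizes `B`, `N`, `D` and the zero tensors below are placeholders I made up for illustration, not values taken from `nce.py`.

```python
import torch

B, N, D = 4, 5, 8  # illustrative sizes, not from nce.py

# With N > 1 and B > 1 there is no size-1 dimension, so squeeze() is a no-op.
multi_step = torch.zeros(B, N, D)
shape_multi_step = tuple(multi_step.squeeze().shape)

# With B = 1, squeeze(0) removes only the batch dimension.
single_batch = torch.zeros(1, N, D)
shape_squeeze0 = tuple(single_batch.squeeze(0).shape)

# With B = 1 and N = 1, an argument-less squeeze() drops both dimensions,
# while squeeze(0) drops only the first one.
both_one = torch.zeros(1, 1, D)
shape_squeeze_all = tuple(both_one.squeeze().shape)
shape_squeeze_dim0 = tuple(both_one.squeeze(0).shape)
```

Passing an explicit dimension makes it clear which axis is expected to have size 1, which is why `squeeze(0)` seems safer to me than a bare `squeeze()`.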

I am using your code to predict only the last state (in other words, `N=1`). The `squeeze` here gives `model_loss.shape = (B, 1)` and `noise_loss.shape = (B,)`, so adding them broadcasts to a total `loss.shape = (B, B)`, which should be `(B, 1)` I think.
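The broadcasting behind this can be reproduced in isolation. This is only a sketch: the tensors are zero-filled stand-ins with the shapes reported above, and the `unsqueeze(1)` at the end is one possible way to align them, not necessarily the fix the repository should adopt.

```python
import torch

B = 4  # illustrative batch size

# Stand-ins with the shapes observed when N = 1.
model_loss = torch.zeros(B, 1)
noise_loss = torch.zeros(B)

# (B, 1) + (B,) broadcasts to (B, B) instead of raising an error.
loss = model_loss + noise_loss
loss_shape = tuple(loss.shape)

# Aligning the shapes explicitly keeps one loss value per sample.
aligned = model_loss + noise_loss.unsqueeze(1)
aligned_shape = tuple(aligned.shape)
```

Because the addition succeeds silently, the only symptom is a loss with `B * B` entries instead of `B`.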


why squeeze here? #7
