how to perform inference without any dataloader

Thanks for the great work!

I want to run inference on a clip tensor of shape BxCxTxHxW by calling `output = model(clip_tensor)`. What is the right way to do this with the MeMViT model, and what input size does it expect?

When I pass a tensor of shape (1, 3, 16, 224, 224) to the MViT model created with `configs/AVA/MeMViT_16_K400.yaml`, I get this error:

```text
  File "...\MeMViT\memvit\models\video_model_builder.py", line 1081, in forward
    x = torch.cat((cls_tokens, x), dim=1)
  File "...\MeMViT\debug.py", line 49, in <module>
    pred = model(input,)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 96 but got size 8 for tensor number 1 in the list.
```
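
One possible reading of the traceback, sketched as pure shape arithmetic (no PyTorch needed). It assumes the forward pass follows the PySlowFast convention of taking a list of pathway tensors and doing `x = x[0]`, then a Conv3d patch embedding followed by `flatten(2).transpose(1, 2)`. The embedding width (96), kernel, stride, and padding below are illustrative assumptions, not values read from the MeMViT config, and the helper names are hypothetical.

```python
# Shape-only trace; every layer parameter here is an assumption for illustration.
EMBED_DIM = 96
KERNEL, STRIDE, PAD = (3, 7, 7), (2, 4, 4), (1, 3, 3)


def conv3d_shape(shape, out_ch, kernel, stride, pad):
    """Shape after a 3D convolution; the last three dims are (T, H, W).

    Works for batched (B, C, T, H, W) and unbatched (C, T, H, W) shapes.
    """
    lead, spatial = shape[:-4], shape[-3:]
    out = tuple((s + 2 * p - k) // st + 1
                for s, k, st, p in zip(spatial, kernel, stride, pad))
    return tuple(lead) + (out_ch,) + out


def flatten2_transpose12(shape):
    """Shape after x.flatten(2).transpose(1, 2)."""
    tail = 1
    for s in shape[2:]:
        tail *= s
    return (shape[0], tail, shape[1])


def cat_mismatch(a, b, dim):
    """First dimension other than `dim` where shapes a and b differ, else None."""
    for i, (x, y) in enumerate(zip(a, b)):
        if i != dim and x != y:
            return (i, x, y)
    return None


def trace(clip_shape, wrapped_in_list):
    # Assumed forward: `x = x[0]` picks the first pathway from a list.
    # On a bare tensor the same indexing drops the batch dimension instead.
    x = clip_shape if wrapped_in_list else clip_shape[1:]
    x = conv3d_shape(x, EMBED_DIM, KERNEL, STRIDE, PAD)
    tokens = flatten2_transpose12(x)
    cls = (tokens[0], 1, EMBED_DIM)  # cls token expanded to the batch size
    return tokens, cat_mismatch(cls, tokens, dim=1)


print(trace((1, 3, 16, 224, 224), wrapped_in_list=True))
# -> ((1, 25088, 96), None)
print(trace((1, 3, 16, 224, 224), wrapped_in_list=False))
# -> ((96, 3136, 8), (2, 96, 8))
```

Under these assumptions, the bare-tensor case gives a mismatch of 96 versus 8, the same numbers as in the error above, so `model([clip_tensor])` may be worth trying. The memory mechanism and the AVA detection head may require further inputs that this sketch does not cover.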
