Hello there.
I tried to use m2_bert_80M_2k to generate embeddings for text strings with lengths around 500 tokens following your example on Huggingface](https://huggingface.co/togethercomputer/m2-bert-80M-2k-retrieval). However, the outputs = model(**input_ids) line tooks over 15s on average, slower than expected. Could you please help me find the issue here?
I also tested your example for 12 tokens. The model forwarding process is still slow (over 5s for 12tokens & padding="longest", over 16s for 12tokens & padding="max_length"(=2048).


Thanks in advance!
Hello there.
I tried to use m2_bert_80M_2k to generate embeddings for text strings with lengths around 500 tokens following your example on Huggingface](https://huggingface.co/togethercomputer/m2-bert-80M-2k-retrieval). However, the
outputs = model(**input_ids)line tooks over 15s on average, slower than expected. Could you please help me find the issue here?I also tested your example for 12 tokens. The model forwarding process is still slow (over 5s for 12tokens & padding="longest", over 16s for 12tokens & padding="max_length"(=2048).


Thanks in advance!