[Feature]: Batch embedding and reranking #1258

### 🚀 The feature, motivation and pitch

Would it be possible to add support for embedding generation and reranking in Aphrodite? Right now, using RAG setups often means juggling both vLLM and Aphrodite, but it’d be great if Aphrodite could handle the whole process.

Something like vLLM’s pooling models (https://docs.vllm.ai/en/v0.6.5/models/pooling_models.html) could work—letting us generate embeddings efficiently and improve retrieval without switching tools. This would help speed things up and simplify deployment.
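To make the request concrete, here is a minimal sketch of the embed-then-rerank flow a RAG setup needs. Everything in it is a hypothetical stand-in: `embed`, `embed_batch`, and `rerank` are illustrative names, not vLLM or Aphrodite APIs, and the "embedding" is a toy bag-of-words count rather than a real pooling model. A real reranker would typically score query/document pairs with a cross-encoder; cosine similarity is swapped in here only to keep the example self-contained.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a pooling model: bag-of-words term counts (assumption)."""
    return Counter(text.lower().split())


def embed_batch(texts: list[str]) -> list[Counter]:
    """Embed several texts in one call, as a batch endpoint would (hypothetical)."""
    return [embed(t) for t in texts]


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(query: str, documents: list[str], top_k: int = 2) -> list[tuple[str, float]]:
    """Score every document against the query and return the top_k best matches."""
    query_vec = embed(query)
    doc_vecs = embed_batch(documents)
    scored = [(doc, cosine(query_vec, vec)) for doc, vec in zip(documents, doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


docs = [
    "Aphrodite serves large language models",
    "Embeddings map text to vectors for retrieval",
    "Bananas are rich in potassium",
]
results = rerank("how do embeddings help retrieval", docs)
for doc, score in results:
    print(f"{score:.3f}  {doc}")
```

The point of the sketch is that both steps (batch embedding and scoring candidates against a query) currently have to run on a separate server from generation; having them in the same engine would remove that extra hop.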

Curious if this would be doable. Appreciate your thoughts!

### Alternatives

_No response_

### Additional context

_No response_
