feat: unified GenAI routing client + fix Bedrock & Vertex AI compilation bugs (#81) #82
atharva-nagane wants to merge 1 commit into `c2siorg:main` (status: Open).
## Summary

Closes #81.

This PR does three things that are tightly coupled:

1. **Fixes two providers that cannot compile.** `aws_bedrock.rs` and `gcp_vertex_ai.rs` both reference `CloudError::ProviderError(String)`, a variant that does not exist in `errors.rs`.
2. **Rescues the orphaned Vertex AI implementation.** `gcp_vertex_ai.rs` existed on disk but was never declared in the module tree, making it completely unreachable. It is now wired in and its existing tests run.
3. **Introduces `UnifiedLlmClient`**, a provider-agnostic routing layer that sits on top of the three `LlmProvider` implementations, enabling applications to target multiple clouds without any provider-specific code.
## What's fixed in existing providers

### `aws_bedrock.rs`

- `build_messages`: `CloudError::ProviderError(e)` → `CloudError::Provider { http_status: 0, message: e, retryable: false }`
- `.send()` errors: `CloudError::ProviderError(e)` → `CloudError::Provider { .. }`
- `CloudError::ProviderError(e)` → `CloudError::Provider { retryable: true }`
- `CloudError::ProviderError(e)` → `CloudError::Serialization { source: e }`
- `generate_with_tools`: one `ToolConfiguration` per tool → a single `ToolConfiguration` with all tools (AWS Bedrock API requirement)

### `gcp_vertex_ai.rs`

- `CloudError::ProviderError(e)` uses migrated to existing variants
- Module now declared in `mod.rs`, `main.rs`, and `tests/mod.rs`
- Embedding endpoint: `:predictBatch` (wrong) → `publishers/google/models/text-embedding-004:predict`
- Embedding request body: `requests[]` → `instances[]` with `task_type`
- Embedding response parsing: `predictions[].embeddings.values`
- Streaming: `data:` line parser; emits `DeltaText`, `Usage`, `Done` events
- Tool calls: `name: "tool_called"` → `functionCall.name` and `functionCall.args` from response parts
- `Part` struct: missing `functionCall` field → `function_call: Option<FunctionCall>` with the correct serde rename
- `CloudError::ProviderError` → `CloudError::Auth { message }`
- `CloudError::ProviderError` → `CloudError::Network { source }`
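To illustrate the migration, here is a minimal sketch of the structured variants and the kind of mapping applied in `build_messages`. The enum below is a simplified stand-in, not the crate's actual `errors.rs`; only the field names follow the list above.

```rust
// Simplified stand-in for the crate's structured error type (an assumption,
// reconstructed from the variant shapes listed above).
#[allow(dead_code)]
#[derive(Debug)]
pub enum CloudError {
    Provider { http_status: u16, message: String, retryable: bool },
    Serialization { source: String },
    Auth { message: String },
    Network { source: String },
}

// Example of the mapping applied in `build_messages`: a plain error string
// becomes a non-retryable Provider error with no HTTP status attached.
fn map_build_error(e: String) -> CloudError {
    CloudError::Provider { http_status: 0, message: e, retryable: false }
}
```

The structured variants carry enough context (`http_status`, `retryable`) for the routing layer to decide later whether a fallback attempt is worthwhile, which a bare `ProviderError(String)` could not express.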
## New: `rustcloud/src/genai/`

## Architecture

All three backends implement the same `LlmProvider` trait. The unified client also implements `LlmProvider`, meaning callers never need to change their call site when switching between a specific provider and the multi-cloud client.
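The drop-in property can be sketched as follows. The trait signature and type bodies here are assumptions based only on the names in this PR, not the crate's actual API:

```rust
// Hypothetical minimal form of the shared trait.
trait LlmProvider {
    fn generate(&self, prompt: &str) -> Result<String, String>;
}

// A concrete backend (body is a placeholder).
struct BedrockProvider;
impl LlmProvider for BedrockProvider {
    fn generate(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("bedrock:{prompt}"))
    }
}

// The unified client holds trait objects and routes between them; here it
// simply delegates to the first provider to illustrate the trait relationship.
struct UnifiedLlmClient {
    providers: Vec<Box<dyn LlmProvider>>,
}
impl LlmProvider for UnifiedLlmClient {
    fn generate(&self, prompt: &str) -> Result<String, String> {
        self.providers
            .first()
            .ok_or_else(|| "no providers registered".to_string())?
            .generate(prompt)
    }
}

// Call sites are written once against the trait and accept either type.
fn summarize<P: LlmProvider>(llm: &P, text: &str) -> Result<String, String> {
    llm.generate(text)
}
```

Because `summarize` is generic over `LlmProvider`, swapping a `BedrockProvider` for a `UnifiedLlmClient` requires no call-site changes.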
### Builder pattern

### Fallback routing

### Routing strategy rules

#### `ModelBased` inference table

| Model pattern | Provider |
| --- | --- |
| `anthropic.*`, `amazon.*`, `meta.*`, `cohere.*`, `mistral.*` | `"aws"` |
| `gemini*`, `text-embedding-*`, `textembedding-*` | `"gcp"` |
| `gpt-*`, `o1*`, `o3*`, `ModelRef::Deployment(_)` | `"azure"` |

#### Fallback transient error detection

Fallback skips a provider and tries the next one when it returns a transient error. Hard errors (`Auth`, `Unsupported`, non-retryable `Provider`) also skip to the next provider, so that a misconfigured backend does not block healthy ones.
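The inference table and error rules above can be sketched as the two helpers named under `src/genai/routing.rs`. The function bodies below are reconstructions from this description, not the actual implementation, and the `CloudError` enum is a simplified stand-in:

```rust
// Infer the target provider from the model name, mirroring the table above.
fn infer_provider(model: &str) -> Option<&'static str> {
    const AWS: &[&str] = &["anthropic.", "amazon.", "meta.", "cohere.", "mistral."];
    const GCP: &[&str] = &["gemini", "text-embedding-", "textembedding-"];
    const AZURE: &[&str] = &["gpt-", "o1", "o3"];
    if AWS.iter().any(|p| model.starts_with(p)) {
        Some("aws")
    } else if GCP.iter().any(|p| model.starts_with(p)) {
        Some("gcp")
    } else if AZURE.iter().any(|p| model.starts_with(p)) {
        Some("azure")
    } else {
        None
    }
}

// Simplified stand-in for the crate's error type.
#[allow(dead_code)]
enum CloudError {
    Provider { retryable: bool },
    Network,
    Auth,
    Unsupported,
}

// Transient errors are worth retrying on the next provider. Hard errors
// (Auth, Unsupported, non-retryable Provider) are not transient, but the
// fallback loop still moves past them, as described above.
fn is_transient(err: &CloudError) -> bool {
    matches!(err, CloudError::Network | CloudError::Provider { retryable: true })
}
```

(`ModelRef::Deployment(_)` routing to `"azure"` is omitted here since it is a pattern match on a reference type rather than a string prefix.)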
## Design decisions

### `UnifiedLlmClient` implements `LlmProvider`

Callers use identical code whether they hold a `BedrockProvider` or a `UnifiedLlmClient`. This ensures zero lock-in at the call site.

### Builder validates at `build()`

A missing default provider or an empty registry returns `Err(String)` immediately, rather than panicking during a request.
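A minimal sketch of the `build()`-time validation described above, with assumed names and a deliberately simplified registry (the real builder presumably registers provider instances, not strings):

```rust
// Hypothetical builder; fields are placeholders for the real registry.
struct UnifiedLlmClientBuilder {
    providers: Vec<String>,           // stand-in for registered providers
    default_provider: Option<String>,
}

struct UnifiedLlmClient {
    default_provider: String,
}

impl UnifiedLlmClientBuilder {
    // Misconfiguration surfaces here as Err(String), at construction time,
    // instead of as a panic in the middle of a request.
    fn build(self) -> Result<UnifiedLlmClient, String> {
        if self.providers.is_empty() {
            return Err("no providers registered".to_string());
        }
        let default_provider = self
            .default_provider
            .ok_or_else(|| "no default provider set".to_string())?;
        Ok(UnifiedLlmClient { default_provider })
    }
}
```

Failing fast in `build()` keeps the request path infallible with respect to configuration: by the time `generate` is called, the registry is known to be non-empty and the default is known to exist.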
### No new Cargo dependencies

The routing layer uses only existing crate dependencies, so there is no binary size increase.
### Vertex AI streaming uses `?alt=sse`

The `streamGenerateContent` endpoint with `?alt=sse` returns standard Server-Sent Events. The implementation collects the full body and then emits typed `LlmStreamEvent` variants, consistent with the Bedrock stream design.
## Files changed

- `src/aws/aws_apis/artificial_intelligence/aws_bedrock.rs`
- `src/gcp/gcp_apis/artificial_intelligence/gcp_vertex_ai.rs`
- `src/gcp/gcp_apis/artificial_intelligence/mod.rs`: adds `pub mod gcp_vertex_ai`
- `src/main.rs`: wires `gcp_vertex_ai` + the new `genai` module
- `src/tests/mod.rs`: adds `gcp_vertex_ai_operations` + `unified_genai_operations`
- `src/genai/mod.rs`
- `src/genai/routing.rs`: `RoutingStrategy`, `infer_provider()`, `is_transient()`
- `src/genai/client.rs`: `UnifiedLlmClient` + builder
- `src/tests/unified_genai_operations.rs`
- `examples/unified_genai.md`
- `README.md`

## Testing
### Unit tests (no credentials required)

### Integration tests (`#[ignore]`)

Run with:
## Related

- Fixes compilation of #61 (AWS Bedrock)
- Fixes and wires #53 (GCP Vertex AI)
- Builds on #64 (Azure OpenAI)

Pattern inspired by googleapis/google-cloud-rust: a centralised credential layer with shared error/retry primitives consumed by all service clients.