
feat: unified GenAI routing client + fix Bedrock & Vertex AI compilation bugs (#81)#82

atharva-nagane wants to merge 1 commit into c2siorg:main from atharva-nagane:feat/unified-genai-routing

Conversation

@atharva-nagane
Contributor

Summary

Closes #81

This PR does three things that are tightly coupled:

  1. Fixes two providers that cannot compile: aws_bedrock.rs and
    gcp_vertex_ai.rs both reference CloudError::ProviderError(String),
    a variant that does not exist in errors.rs.

  2. Rescues the orphaned Vertex AI implementation: gcp_vertex_ai.rs
    existed on disk but was never declared in the module tree, making it
    completely unreachable. It is now wired in, and its existing tests run.

  3. Introduces UnifiedLlmClient — a provider-agnostic routing layer that
    sits on top of the three LlmProvider implementations, enabling
    applications to target multiple clouds without any provider-specific code.


What's fixed in existing providers

aws_bedrock.rs

| Location | Old (broken) | Fixed |
| --- | --- | --- |
| `build_messages` | `CloudError::ProviderError(e)` | `CloudError::Provider { http_status: 0, message: e, retryable: false }` |
| `.send()` errors | `CloudError::ProviderError(e)` | `CloudError::Provider { .. }` |
| Stream error events | `CloudError::ProviderError(e)` | `CloudError::Provider { retryable: true }` |
| JSON deserialization | `CloudError::ProviderError(e)` | `CloudError::Serialization { source: e }` |
| `generate_with_tools` | One `ToolConfiguration` per tool | Single `ToolConfiguration` with all tools (AWS Bedrock API requirement) |
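The shape of that change can be sketched as follows; the `CloudError` enum here is a minimal stand-in for the real one in errors.rs (the field names follow the table above, everything else is illustrative):

```rust
// Minimal stand-in for the crate's CloudError; the real errors.rs
// defines the authoritative variants and field types.
#[allow(dead_code)]
#[derive(Debug)]
enum CloudError {
    Provider { http_status: u16, message: String, retryable: bool },
    Serialization { source: String },
}

// Old (does not compile): CloudError::ProviderError(e)
// New: construct the named-field variant instead.
fn wrap_send_error(e: String) -> CloudError {
    CloudError::Provider { http_status: 0, message: e, retryable: false }
}

fn main() {
    let err = wrap_send_error("connection reset".to_string());
    // The fallback router can later inspect `retryable` on this variant.
    if let CloudError::Provider { retryable, .. } = &err {
        assert!(!*retryable);
    }
}
```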

gcp_vertex_ai.rs

| Issue | Old | Fixed |
| --- | --- | --- |
| All error variants | `CloudError::ProviderError(e)` | Correct named-field variants |
| Module wiring | Not declared anywhere | Added to `mod.rs`, `main.rs`, `tests/mod.rs` |
| Embed endpoint | `:predictBatch` (wrong) | `publishers/google/models/text-embedding-004:predict` |
| Embed body format | Legacy `requests[]` | Correct `instances[]` with `task_type` |
| Embed response type | Wrong struct fields | `predictions[].embeddings.values` |
| Stream parsing | Raw bytes as `DeltaText` | SSE `data:` line parser; emits `DeltaText`, `Usage`, `Done` events |
| Tool call extraction | Hardcoded `name: "tool_called"` | Parses `functionCall.name` and `functionCall.args` from response parts |
| `Part` struct | No `functionCall` field | Added `function_call: Option<FunctionCall>` with correct serde rename |
| Auth errors | `CloudError::ProviderError` | `CloudError::Auth { message }` |
| Network errors | `CloudError::ProviderError` | `CloudError::Network { source }` |
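The corrected embed endpoint can be sketched as a URL builder; the host and path follow Google's published Vertex AI predict-endpoint format, and the argument values below are placeholders:

```rust
// Sketch of the corrected predict URL; the old code called a
// nonexistent `:predictBatch` verb on the model resource.
fn embed_url(project: &str, location: &str, model: &str) -> String {
    format!(
        "https://{location}-aiplatform.googleapis.com/v1/projects/{project}\
         /locations/{location}/publishers/google/models/{model}:predict"
    )
}

fn main() {
    let url = embed_url("my-project", "us-central1", "text-embedding-004");
    assert!(url.ends_with("publishers/google/models/text-embedding-004:predict"));
}
```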

New: rustcloud/src/genai/

Architecture

                    +----------------------+
                    |   UnifiedLlmClient   |
                    +----------+-----------+
                               |
                +--------------+--------------+
                |              |              |
        +-------v------+ +-----v------+ +-----v------+
        | Bedrock      | | VertexAI   | | AzureOpenAI|
        | Provider     | | Provider   | | Provider   |
        | (AWS)        | | (GCP)      | | (Azure)    |
        +-------+------+ +-----+------+ +-----+------+
                \              |              /
                 \             |             /
                   +------------v-------------+
                   |     RoutingStrategy      |
                   |  Explicit | ModelBased   |
                   |  Fallback                |
                   +--------------------------+

All three backends implement the same LlmProvider trait.

The unified client also implements LlmProvider, meaning callers never
need to change their call site when switching between a specific provider
and the multi-cloud client.
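A simplified, non-async sketch of that property, with a hypothetical `EchoProvider` standing in for the real backends (the actual `LlmProvider` trait is async and has more methods; every name here except the pattern itself is illustrative):

```rust
// Illustrative reduction of the LlmProvider idea: both a concrete
// provider and the routing client implement the same trait, so call
// sites are identical for either.
trait LlmProviderLike {
    fn generate(&self, prompt: &str) -> Result<String, String>;
}

struct EchoProvider; // stand-in for BedrockProvider / VertexAI / AzureOpenAIProvider

impl LlmProviderLike for EchoProvider {
    fn generate(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("echo: {prompt}"))
    }
}

// The unified client holds boxed providers and implements the trait itself.
struct UnifiedClientSketch {
    providers: Vec<Box<dyn LlmProviderLike>>,
}

impl LlmProviderLike for UnifiedClientSketch {
    fn generate(&self, prompt: &str) -> Result<String, String> {
        self.providers
            .first()
            .ok_or_else(|| "no providers registered".to_string())?
            .generate(prompt)
    }
}

fn main() {
    let unified = UnifiedClientSketch { providers: vec![Box::new(EchoProvider)] };
    // Same call site works for a concrete provider or the unified client.
    assert_eq!(EchoProvider.generate("hi").unwrap(), "echo: hi");
    assert_eq!(unified.generate("hi").unwrap(), "echo: hi");
}
```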


Builder pattern

```rust
// ModelBased: infer provider from model ID prefix
let client = UnifiedLlmClient::builder()
    .register("aws",   Box::new(BedrockProvider::new().await))
    .register("gcp",   Box::new(VertexAI::new("my-project", "us-central1")))
    .register("azure", Box::new(AzureOpenAIProvider::new()))
    .routing(RoutingStrategy::ModelBased)
    .build()?;

// anthropic.* → aws
let bedrock = client.generate(
    LlmRequest {
        model: ModelRef::Provider("anthropic.claude-3-5-haiku-20241022-v1:0".to_string()),
        ..
    }
).await?;

// gemini* → gcp
let vertex = client.generate(
    LlmRequest {
        model: ModelRef::Provider("gemini-1.5-flash".to_string()),
        ..
    }
).await?;

// gpt-* / Deployment(_) → azure
let azure = client.generate(
    LlmRequest {
        model: ModelRef::Deployment("gpt-4o".to_string()),
        ..
    }
).await?;
```

Fallback routing

```rust
// Fallback: resilient multi-cloud — skips rate-limited/network-failed providers
let client = UnifiedLlmClient::builder()
    .register("aws",   Box::new(BedrockProvider::new().await))
    .register("gcp",   Box::new(VertexAI::new("my-project", "us-central1")))
    .register("azure", Box::new(AzureOpenAIProvider::new()))
    .routing(RoutingStrategy::Fallback)
    .build()?;
```

Routing strategy rules

| Strategy | `generate` / `stream` / `generate_with_tools` | `embed` |
| --- | --- | --- |
| `Explicit` | default provider | default provider |
| `ModelBased` | inferred from model ID, fallback to default | default provider |
| `Fallback` | providers in registration order, skip on transient errors | providers in order |

ModelBased inference table

| Model prefix / type | Routed to |
| --- | --- |
| `anthropic.*`, `amazon.*`, `meta.*`, `cohere.*`, `mistral.*` | `"aws"` |
| `gemini*`, `text-embedding-*`, `textembedding-*` | `"gcp"` |
| `gpt-*`, `o1*`, `o3*`, `ModelRef::Deployment(_)` | `"azure"` |
| No match | default provider |
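The inference table can be sketched as a prefix matcher; this is a hypothetical reduction of `infer_provider()`, reproducing only the prefixes listed above:

```rust
// Sketch of model-based routing; the prefix lists mirror the
// inference table in this PR description.
enum ModelRef {
    Provider(String),   // raw model ID
    Deployment(String), // Azure OpenAI deployment name
}

fn infer_provider(model: &ModelRef) -> Option<&'static str> {
    let id = match model {
        ModelRef::Deployment(_) => return Some("azure"),
        ModelRef::Provider(id) => id.as_str(),
    };
    const AWS: [&str; 5] = ["anthropic.", "amazon.", "meta.", "cohere.", "mistral."];
    const GCP: [&str; 3] = ["gemini", "text-embedding-", "textembedding-"];
    const AZURE: [&str; 3] = ["gpt-", "o1", "o3"];
    if AWS.iter().any(|p| id.starts_with(p)) {
        Some("aws")
    } else if GCP.iter().any(|p| id.starts_with(p)) {
        Some("gcp")
    } else if AZURE.iter().any(|p| id.starts_with(p)) {
        Some("azure")
    } else {
        None // caller falls back to the default provider
    }
}

fn main() {
    assert_eq!(infer_provider(&ModelRef::Provider("gemini-1.5-flash".into())), Some("gcp"));
    assert_eq!(infer_provider(&ModelRef::Deployment("gpt-4o".into())), Some("azure"));
    assert_eq!(infer_provider(&ModelRef::Provider("unknown-model".into())), None);
}
```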

Fallback transient error detection

Fallback skips a provider and tries the next when it returns:

CloudError::RateLimit { .. }
CloudError::Network { .. }
CloudError::Provider { retryable: true, .. }

Hard errors (Auth, Unsupported, non-retryable Provider) also skip to
the next provider so that a misconfigured backend does not block healthy ones.
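The transient-error rule can be sketched as a predicate; the `CloudError` below is a minimal stand-in for the crate's real enum, with field types chosen for illustration:

```rust
// Sketch of transient-error classification used by Fallback routing.
#[allow(dead_code)]
#[derive(Debug)]
enum CloudError {
    RateLimit { retry_after_secs: Option<u64> },
    Network { message: String },
    Provider { http_status: u16, message: String, retryable: bool },
    Auth { message: String },
}

// Matches exactly the three cases listed above.
fn is_transient(err: &CloudError) -> bool {
    matches!(
        err,
        CloudError::RateLimit { .. }
            | CloudError::Network { .. }
            | CloudError::Provider { retryable: true, .. }
    )
}

fn main() {
    assert!(is_transient(&CloudError::RateLimit { retry_after_secs: None }));
    assert!(is_transient(&CloudError::Provider {
        http_status: 503,
        message: String::new(),
        retryable: true,
    }));
    // Auth is a hard error, but per the note above Fallback still skips it.
    assert!(!is_transient(&CloudError::Auth { message: "bad key".into() }));
}
```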


Design decisions

UnifiedLlmClient implements LlmProvider

Callers use identical code whether they hold a BedrockProvider
or a UnifiedLlmClient.

This ensures zero lock-in at the call site.


Builder validates at build()

Missing default provider or empty registry returns Err(String)
immediately rather than panicking during a request.
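A minimal sketch of those build()-time checks, with a unit type standing in for the boxed providers (everything except the two error conditions is hypothetical):

```rust
use std::collections::HashMap;

// Sketch of build()-time validation: an empty registry or an unknown
// default provider yields Err(String) instead of panicking at request time.
struct BuilderSketch {
    providers: HashMap<String, ()>, // () stands in for Box<dyn LlmProvider>
    default: Option<String>,
}

impl BuilderSketch {
    fn build(self) -> Result<String, String> {
        if self.providers.is_empty() {
            return Err("no providers registered".to_string());
        }
        match self.default {
            Some(name) if self.providers.contains_key(&name) => Ok(name),
            Some(name) => Err(format!("default provider '{name}' is not registered")),
            // In the real builder the first registered provider becomes the
            // default; a HashMap sketch has no order, so pick any key here.
            None => Ok(self.providers.keys().next().unwrap().clone()),
        }
    }
}

fn main() {
    let empty = BuilderSketch { providers: HashMap::new(), default: None };
    assert!(empty.build().is_err());

    let mut providers = HashMap::new();
    providers.insert("aws".to_string(), ());
    let bad = BuilderSketch { providers: providers.clone(), default: Some("gcp".into()) };
    assert!(bad.build().is_err());

    let ok = BuilderSketch { providers, default: Some("aws".into()) };
    assert_eq!(ok.build().unwrap(), "aws");
}
```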


No new Cargo dependencies

The routing layer uses existing crate dependencies:

async-trait
futures
tokio

No binary size increase.


Vertex AI streaming uses ?alt=sse

The streamGenerateContent endpoint with ?alt=sse
returns standard Server-Sent Events.

The implementation collects the full response body and then emits typed
LlmStreamEvent variants, consistent with the Bedrock stream design.
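The `data:`-line parsing step can be sketched as follows; this extracts the JSON payload strings that the real implementation would then deserialize into LlmStreamEvent variants:

```rust
// Sketch of SSE parsing over the collected ?alt=sse body: keep only
// `data:` lines, trim the payloads, drop comments and other fields.
fn sse_data_payloads(body: &str) -> Vec<&str> {
    body.lines()
        .filter_map(|line| line.strip_prefix("data:").map(str::trim))
        .filter(|payload| !payload.is_empty())
        .collect()
}

fn main() {
    let body = "data: {\"text\":\"Hel\"}\n\ndata: {\"text\":\"lo\"}\n";
    assert_eq!(sse_data_payloads(body), vec!["{\"text\":\"Hel\"}", "{\"text\":\"lo\"}"]);
}
```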


Files changed

| File | Change |
| --- | --- |
| `src/aws/aws_apis/artificial_intelligence/aws_bedrock.rs` | Fixed `CloudError` variants, fixed `ToolConfiguration` builder loop |
| `src/gcp/gcp_apis/artificial_intelligence/gcp_vertex_ai.rs` | Fixed all bugs, full rewrite |
| `src/gcp/gcp_apis/artificial_intelligence/mod.rs` | Added `pub mod gcp_vertex_ai` |
| `src/main.rs` | Wired `gcp_vertex_ai` + new `genai` module |
| `src/tests/mod.rs` | Added `gcp_vertex_ai_operations` + `unified_genai_operations` |
| `src/genai/mod.rs` | NEW — module docs + re-exports |
| `src/genai/routing.rs` | NEW — `RoutingStrategy`, `infer_provider()`, `is_transient()` |
| `src/genai/client.rs` | NEW — `UnifiedLlmClient` + builder |
| `src/tests/unified_genai_operations.rs` | NEW — 14 tests |
| `examples/unified_genai.md` | NEW — usage guide |
| `README.md` | Added Vertex AI + multi-cloud section |

Testing

Unit tests (no credentials required)

test_builder_no_providers_returns_error
test_builder_unknown_default_returns_error
test_builder_first_provider_becomes_default
test_explicit_routing_uses_default
test_model_based_routes_anthropic_to_aws
test_model_based_routes_gemini_to_gcp
test_model_based_routes_gpt_to_azure
test_model_based_falls_back_to_default_on_unknown_model
test_fallback_skips_rate_limited_provider
test_fallback_returns_last_error_when_all_fail
test_fallback_embed_skips_rate_limited_provider
test_stream_collects_events
test_generate_with_tools_explicit_routing

Integration tests (#[ignore])

Run with:

cargo test -- --include-ignored

integration_aws_bedrock_via_unified_client
integration_vertex_ai_via_unified_client
integration_azure_openai_via_unified_client
integration_fallback_across_all_providers

Related

Fixes compilation of #61 (AWS Bedrock)
Fixes and wires #53 (GCP Vertex AI)
Builds on #64 (Azure OpenAI)

Pattern inspired by googleapis/google-cloud-rust:

Centralised credential layer with shared error / retry primitives
consumed by all service clients.

atharva-nagane force-pushed the feat/unified-genai-routing branch from 50a8276 to 01094b5 on March 17, 2026.