Terraform-based GCP deployment package for Osirus AI.
This stack deploys a production-style, serverless-first GCP architecture:
- Global HTTP(S) load balancing routes traffic to Cloud Run services (app, api).
- API runs with a sidecar (searxng) on Cloud Run.
- Data plane services:
  - Cloud SQL (MySQL)
  - Memorystore (Redis)
  - Cloud Storage assets bucket
- Optional domain + managed certificate for HTTPS.
- Cloud Run Job executes database migrations.
- Secret Manager + IAM secure runtime secrets and access.
aws_attached mode is also supported, where Cloud Run/LB stays on GCP while data services come from AWS stack outputs.
         Custom Domain (osirus.ai)
                     |
                     v
   Global External HTTP(S) Load Balancer
                     |
      +------------------------------+
      |                              |
      v                              v
Cloud Run: app            Cloud Run: api (+ searxng sidecar)
                                     |
                                     +--> Secret Manager (runtime/API keys)
                                     +--> Cloud SQL (MySQL)
                                     +--> Memorystore (Redis)
                                     +--> Cloud Storage (assets)
                                     +--> OpenSearch endpoint (configured)
Ops path:
- Cloud Run Job: <stack>-api-migrations -> Cloud SQL
- IAM roles/policies control service-to-service access
- Cloud Logging/Monitoring via Cloud Run + GCP platform services
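The migration step in the ops path can be sketched as a small shell helper. This is a minimal sketch, not the contents of gcp.sh: the function names are hypothetical, the job name follows the `<stack>-api-migrations` pattern shown above, and it assumes the job already exists in the target project and region. The command is printed rather than executed so it can be reviewed first.

```shell
#!/bin/sh
# Build the migration job name from the stack name, following the
# <stack>-api-migrations pattern from the ops path.
migrations_job_name() {
  printf '%s-api-migrations\n' "$1"
}

# Print (do not run) a gcloud command that executes the job and waits for
# it to finish. Hypothetical helper; the real wrapper may differ.
migrations_command() (
  stack="$1"; region="$2"; project="$3"
  printf 'gcloud run jobs execute %s --region=%s --project=%s --wait\n' \
    "$(migrations_job_name "$stack")" "$region" "$project"
)

# Example: falls back to placeholder values when the variables are unset.
migrations_command "${STACK_NAME:-osirus-ai}" "${REGION:-us-central1}" "${GOOGLE_PROJECT_ID:-<project-id>}"
```

Printing the command keeps the sketch safe to run anywhere; piping the output to `sh` would execute it against a real project.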
In standalone mode, osirus.ai traffic reaches the GCP load balancer and is routed to the api Cloud Run service. The API then connects to Cloud SQL (database), Memorystore (Redis), Cloud Storage (assets/files), and the OpenSearch endpoint configured through runtime environment variables and secrets.
In aws_attached mode, routing and runtime remain on GCP (Load Balancer + Cloud Run), but gcp.sh resolves AWS CloudFormation outputs and injects those endpoints into Terraform variables, so the same API container connects to AWS-hosted DB/Redis/S3/OpenSearch instead of GCP-managed equivalents.
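The output-injection step described above might look roughly like the following. This is a sketch under assumptions, not the actual gcp.sh logic: the CloudFormation output keys (`DatabaseEndpoint`, `RedisEndpoint`, `AssetsBucketName`, `OpenSearchEndpoint`) and the Terraform variable names are hypothetical. It relies on Terraform reading input variables from `TF_VAR_<name>` environment variables.

```shell
#!/bin/sh
# Read "OutputKey<TAB>OutputValue" lines on stdin and print TF_VAR_ assignments
# for the keys we recognise. Key and variable names are illustrative only.
outputs_to_tf_vars() {
  while IFS="$(printf '\t')" read -r key value; do
    case "$key" in
      DatabaseEndpoint)   name=db_host ;;
      RedisEndpoint)      name=redis_host ;;
      AssetsBucketName)   name=assets_bucket ;;
      OpenSearchEndpoint) name=opensearch_endpoint ;;
      *) continue ;;  # ignore outputs Terraform does not need
    esac
    printf 'TF_VAR_%s=%s\n' "$name" "$value"
  done
}

# With real credentials, the input could come from the AWS CLI:
#   aws cloudformation describe-stacks --stack-name "$AWS_STACK_NAME" \
#     --region "$AWS_REGION" \
#     --query 'Stacks[0].Outputs[].[OutputKey,OutputValue]' --output text
# Here sample data stands in for that call.
printf 'DatabaseEndpoint\tdb.example.internal\nRedisEndpoint\tredis.example.internal\n' \
  | outputs_to_tf_vars
```

Keeping the mapping in one `case` statement makes it easy to see which AWS outputs replace which GCP-managed endpoints.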
- terraform/: Terraform modules, providers, and examples
- gcp.sh: clean wrapper for init/plan/apply/destroy/bootstrap/migrations
- Create local tfvars from examples:
    cp terraform/terraform.tfvars.example terraform/terraform.tfvars
    cp terraform/terraform.launch.tfvars.example terraform/terraform.launch.tfvars
- Edit local tfvars with your project/images/secrets.
- Run deploy workflow:
    ./gcp.sh init
    ./gcp.sh plan standalone
    ./gcp.sh up standalone
  aws_attached mode:
    AWS_STACK_NAME=<aws-stack-name> AWS_REGION=us-east-1 ./gcp.sh plan aws_attached
    AWS_STACK_NAME=<aws-stack-name> AWS_REGION=us-east-1 ./gcp.sh up aws_attached
  Bootstrap role and migrations:
    GOOGLE_PROJECT_ID=<project-id> STACK_NAME=osirus-ai ./gcp.sh bootstrap-role
    GOOGLE_PROJECT_ID=<project-id> REGION=us-central1 STACK_NAME=osirus-ai ./gcp.sh migrations
- Keep terraform/*.tfvars local and untracked.
- Never commit real API keys, tokens, or service account secrets.
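One way to enforce the "local and untracked" rule is a small check that flags real tfvars files before they are committed. The helper below is a hypothetical sketch, not part of this package; it only inspects file paths and does not scan file contents for secrets.

```shell
#!/bin/sh
# Read file paths on stdin and print any real tfvars files under terraform/.
# Templates ending in .tfvars.example do not match and are safe to commit.
find_tracked_tfvars() {
  while IFS= read -r path; do
    case "$path" in
      terraform/*.tfvars) printf '%s\n' "$path" ;;
    esac
  done
}

# Inside a git checkout (for example from a pre-commit hook) the input
# could be the staged file list:
#   git diff --cached --name-only | find_tracked_tfvars
# Sample input stands in for that here.
printf 'terraform/terraform.tfvars\nterraform/terraform.tfvars.example\nREADME.md\n' \
  | find_tracked_tfvars
```

A hook could fail the commit whenever this prints anything, which catches a tfvars file that was staged by mistake.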
User UI
|
| 1) Prompt / task request
v
Osirus API (Cloud Run)
|
| 2) Prefetch context (RAG/search grounding)
v
Vertex AI Search (Vertex Search)
|
| 3) Ranked docs/snippets returned
v
Osirus API (context assembly + prompt building)
|
| 4) Model invocation
v
Vertex AI Model Endpoint
|\
| \-- Gemini (text/multimodal)
| \-- Nano Banana (configured model route)
|
| 5) Model output
v
Osirus API (post-processing, policy/formatting)
|
| 6) API response / stream
v
User UI
Notes:
- The API orchestrates both retrieval (Vertex Search) and generation (Vertex AI).
- Provider/model choice is controlled by API routing/config (for example Gemini vs Nano Banana).
- The same response path returns to the UI regardless of chosen model.
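The six-step flow above can be outlined with stubbed functions. This is only an illustration of the orchestration order: every function name is hypothetical, the retrieval and model calls are stand-ins rather than real Vertex AI requests, and shell is used purely to match the other commands in this README, since the API's implementation language is not stated here.

```shell
#!/bin/sh
# Steps 2-3: stubbed retrieval; a real service would query Vertex AI Search.
retrieve_context() { printf 'snippet about: %s\n' "$1"; }

# Context assembly and prompt building before step 4.
build_prompt() { printf 'Context: %s | Request: %s\n' "$1" "$2"; }

# Steps 4-5: stubbed model call; only the route differs between models.
invoke_model() {
  case "$1" in
    gemini|nano-banana) printf '[%s] %s\n' "$1" "$2" ;;
    *) echo "unknown model route: $1" >&2; return 1 ;;
  esac
}

# Steps 1-6 in order; runs in a subshell so variables stay local.
handle_request() (
  model="$1"; request="$2"
  context="$(retrieve_context "$request")"
  prompt="$(build_prompt "$context" "$request")"
  output="$(invoke_model "$model" "$prompt")" || exit 1
  printf '%s\n' "$output"  # step 6: same response path for every model
)

handle_request gemini "summarize the release notes"
```

Because routing is confined to `invoke_model`, switching between Gemini and Nano Banana changes one argument and leaves retrieval and response handling untouched.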