Reduce cold-start overhead and simplify single-service deployment #593

Merged: jirhiker merged 2 commits into staging from deployment-optimizations on Mar 10, 2026

@jirhiker commented on Mar 10, 2026

Summary

This PR keeps the app on a single GAE service while reducing startup cost and cleaning up deploy/dev ergonomics.

Main changes:

  • keeps API, /ogcapi, and /admin in one service
  • lazy-loads admin on first /admin access instead of importing/mounting it at app startup
  • keeps GCS imports lazy on asset paths
  • removes the leftover split-service config and entrypoints
  • adds startup/request timing logs for cold vs warm behavior
  • hardens GAE deploy cleanup and keeps warmup/business-hours scaling support
  • adds a follow-up Alembic migration to rebuild the water elevation materialized view for DBs that already recorded the earlier revision but still have the old schema

Why

The previous refactor proved that cold-start cost was dominated by import-time work. Splitting admin and pygeoapi into separate services reduced that cost, but it also added routing, Docker, and deploy complexity that is not worth it for current usage.

This branch takes the lower-complexity version:

  • stay single-service
  • keep the low-risk lazy-loading wins
  • move admin off the main startup path
  • keep GAE warmup/scaling improvements
  • preserve local Docker usability

Key changes

Startup and runtime

  • core/factory.py
    • uses shared app factory
    • mounts pygeoapi in the main app
    • loads session middleware only when SESSION_SECRET_KEY is present
    • enables lazy admin initialization
  • core/initializers.py
    • splits API route registration from admin setup
    • mounts admin only on first /admin request
    • returns a clear 503 for /admin when SESSION_SECRET_KEY is not configured
  • core/app.py
    • adds /_ah/warmup
    • logs instance startup complete
    • logs per-request timing with request_kind=cold|warm
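
For reviewers who want a concrete picture of the request_kind=cold|warm logging, here is a minimal sketch written as a plain ASGI middleware. It is not the code in core/app.py: the class name, logger name, and log format are assumptions, and it treats the first HTTP request an instance handles (which may be the /_ah/warmup call) as the cold one.

```python
import logging
import time

logger = logging.getLogger("request_timing")  # hypothetical logger name


class RequestTimingMiddleware:
    """Pure-ASGI middleware that tags the first request on an instance as cold."""

    def __init__(self, app):
        self.app = app
        self._seen_first_request = False  # flips after the first HTTP request

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Lifespan and websocket traffic passes through untimed.
            await self.app(scope, receive, send)
            return

        request_kind = "warm" if self._seen_first_request else "cold"
        self._seen_first_request = True
        started = time.perf_counter()
        try:
            await self.app(scope, receive, send)
        finally:
            # Log even when the wrapped app raises, so slow failures are visible.
            elapsed_ms = (time.perf_counter() - started) * 1000
            logger.info(
                "request path=%s request_kind=%s duration_ms=%.1f",
                scope.get("path"),
                request_kind,
                elapsed_ms,
            )
```

Keeping the flag on the middleware instance means each gunicorn worker reports its own first request as cold, which lines up with the single-worker setup used here.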

Lazy imports / import-path cleanup

  • api/asset.py
    • lazy-imports GCS helpers at use sites
  • services/gcs_helper.py
    • moves Google client imports inside functions
  • services/asset_helper.py
    • avoids runtime import of google.cloud.storage for typing only
  • services/env.py
    • new lightweight env helper module
  • db/engine.py
  • alembic/env.py
    • now import get_bool_env from the lightweight env module instead of the heavier utility module
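
The three import patterns listed above can be sketched roughly as follows. This is an illustration only: the signature and accepted values of get_bool_env are assumptions, and get_bucket is a hypothetical helper name, not a function from services/gcs_helper.py.

```python
import os
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by type checkers only; nothing is imported at runtime.
    from google.cloud.storage import Bucket


def get_bool_env(name: str, default: bool = False) -> bool:
    """Read a boolean flag from the environment (assumed signature)."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def get_bucket(bucket_name: str) -> "Bucket":
    """Hypothetical helper: import the Google client only when it is needed."""
    from google.cloud import storage  # deferred: cost is paid on first call

    return storage.Client().bucket(bucket_name)
```

With this shape, importing the module costs only the stdlib imports, and the Google client is loaded the first time an asset path actually calls get_bucket.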

Deployment / App Engine

  • .github/app.template.yaml
    • single-service App Engine config
    • gunicorn -w 1
    • warmup enabled
    • min/max scaling rendered from workflow
  • .github/workflows/CD_staging.yml
  • .github/workflows/CD_production.yml
    • render/deploy one service again
    • run migrations before deploy
    • refresh materialized views
    • safely delete only old non-serving versions
    • safely handle first/single-version deploy cases
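
The version-cleanup rules above (never touch a serving version, do nothing on a first or single-version deploy) can be expressed as a small selection function. This is a sketch in Python for illustration; the workflows themselves may implement it differently, and the Version fields and the keep_recent parameter are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    version_id: str
    traffic_split: float  # share of traffic currently served, 0.0 to 1.0
    created: str  # ISO-8601 timestamp, so string order matches time order


def versions_to_delete(versions: list[Version], keep_recent: int = 1) -> list[str]:
    """Return IDs of versions that are safe to delete.

    Never returns a version that serves traffic, and returns nothing when
    only one version exists (for example on a first deploy).
    """
    if len(versions) <= 1:
        return []
    idle = [v for v in versions if v.traffic_split == 0]
    idle.sort(key=lambda v: v.created, reverse=True)  # newest first
    # Keep the newest idle versions as rollback targets; delete the rest.
    return [v.version_id for v in idle[keep_recent:]]
```

Keeping one idle version around is a design choice made for this sketch so that a rollback target survives each deploy; setting keep_recent=0 deletes every non-serving version.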

Local Docker dev

  • docker-compose.yml
    • back to db + app
    • app serves API, admin, and OGC on :8000
  • entrypoint.sh
    • parameterized startup remains
    • RUN_MIGRATIONS defaults to true

Materialized view fix

  • alembic/versions/o8b9c0d1e2f3_rebuild_water_elevation_materialized_view.py
    • rebuilds ogc_water_elevation_wells with the expected feet-normalized schema
  • tests/integration/test_alembic_migrations.py
    • verifies the actual materialized view columns after migration

Notes

  • SESSION_SECRET_KEY is no longer required just to boot the main API locally.
    • It is only required if you want to use /admin.
    • If it is missing, /admin returns a clear 503 instead of breaking main app startup.
  • The branch does not introduce any new required deploy secrets.
  • App Engine runtime service account behavior stays aligned with the prior template:
    • service_account: "${CLOUD_SQL_USER}.gserviceaccount.com"
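
The /admin behavior described in these notes can be sketched as a plain ASGI wrapper. This is a simplified illustration under assumptions, not the code in core/initializers.py: LazyAdminGate and build_admin are hypothetical names, and the real implementation mounts admin on the app rather than wrapping it.

```python
import os


class LazyAdminGate:
    """ASGI wrapper that builds the admin app on the first /admin request."""

    def __init__(self, app, build_admin):
        self.app = app  # main API app, always available
        self.build_admin = build_admin  # hypothetical factory doing the heavy imports
        self._admin = None

    async def __call__(self, scope, receive, send):
        path = scope.get("path", "")
        is_admin = scope["type"] == "http" and (
            path == "/admin" or path.startswith("/admin/")
        )
        if not is_admin:
            await self.app(scope, receive, send)
            return

        if not os.environ.get("SESSION_SECRET_KEY"):
            # Admin needs sessions; fail this route only, not app startup.
            body = b"Admin is unavailable: SESSION_SECRET_KEY is not configured."
            await send({
                "type": "http.response.start",
                "status": 503,
                "headers": [
                    (b"content-type", b"text/plain; charset=utf-8"),
                    (b"content-length", str(len(body)).encode()),
                ],
            })
            await send({"type": "http.response.body", "body": body})
            return

        if self._admin is None:
            self._admin = self.build_admin()  # import cost is paid here, once
        await self._admin(scope, receive, send)
```

Because build_admin is only called inside the /admin branch, requests to the API and /ogcapi never pay for the admin imports.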

Validation

Ran:

  • pytest tests/test_lazy_admin.py tests/test_request_timing.py tests/test_ogc.py tests/test_admin_views.py tests/integration/test_alembic_migrations.py tests/test_cli_commands.py -q

Result:

  • 55 passed, 3 skipped

Also verified:

  • App Engine template/workflow YAML parsing
  • import main works with SESSION_SECRET_KEY unset
  • targeted compile checks on touched Python files

Follow-up

The largest remaining cold-start cost is now pygeoapi import/setup, not admin. If more startup reduction is needed later, that is the next area to target.

@jirhiker jirhiker changed the base branch from main to staging March 10, 2026 15:18
@jirhiker jirhiker changed the title from "Fix unused imports and reinforce cleanup instructions" to "Reduce cold-start overhead and simplify single-service deployment" Mar 10, 2026
@jirhiker jirhiker merged commit d79d761 into staging Mar 10, 2026
@chatgpt-codex-connector
💡 Codex Review

gcs_remove(asset.uri, bucket)

P1: Delete GCS objects by storage_path, not URI

remove_asset passes asset.uri to gcs_remove, but gcs_remove calls bucket.blob(...), which expects a blob/object name (the storage_path), not a full URL. For assets created through the upload flow, this targets a non-existent object key and makes /asset/{asset_id}/remove fail to delete the real file (or return an error) in production cleanup scenarios.
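
A minimal sketch of the suggested fix, assuming the asset model exposes both uri and storage_path as the review implies. The Asset dataclass here is a stand-in for the real model, and the function bodies are illustrative rather than copied from the repository.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    """Stand-in for the real asset model."""

    uri: str  # full URL handed to clients
    storage_path: str  # object name inside the bucket


def gcs_remove(blob_name: str, bucket) -> None:
    """Delete one object; bucket.blob() expects an object name, not a URL."""
    bucket.blob(blob_name).delete()


def remove_asset(asset: Asset, bucket) -> None:
    # Pass the object name so the delete targets the real key.
    gcs_remove(asset.storage_path, bucket)
```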


raise PydanticStyleException(status_code=HTTP_409_CONFLICT, detail=[detail])

P2: Guard detail before raising custom asset DB error

database_error_handler unconditionally raises PydanticStyleException(detail=[detail]), but detail is only assigned inside one specific Postgres message match. Any other ProgrammingError path will hit an UnboundLocalError, masking the real database failure and returning an unexpected 500 instead of a controlled API error.
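
One possible shape for the guard is sketched below. The exception classes are stand-ins so the example is self-contained, KNOWN_MESSAGE is a placeholder for whatever Postgres message the real handler matches, and re-raising the original error on the unmatched path is an assumption about the desired behavior.

```python
HTTP_409_CONFLICT = 409
KNOWN_MESSAGE = "example known postgres message"  # placeholder, not the real text


class ProgrammingError(Exception):
    """Stand-in for the database driver's ProgrammingError."""


class PydanticStyleException(Exception):
    """Stand-in for the project's custom API error."""

    def __init__(self, status_code, detail):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def database_error_handler(error: ProgrammingError) -> None:
    detail = None  # always bound, so the unmatched path cannot hit UnboundLocalError
    if KNOWN_MESSAGE in str(error):
        detail = {"msg": str(error), "type": "database_conflict"}

    if detail is None:
        # Unrecognized failure: surface the original error instead of masking it.
        raise error
    raise PydanticStyleException(status_code=HTTP_409_CONFLICT, detail=[detail]) from error
```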
