Skip to content

Add shared monitor dispatcher scheduler#185

Merged
Pigbibi merged 1 commit into
mainfrom
feature/monitor-dispatcher-scheduler
Jun 19, 2026
Merged

Add shared monitor dispatcher scheduler#185
Pigbibi merged 1 commit into
mainfrom
feature/monitor-dispatcher-scheduler

Conversation

@Pigbibi

@Pigbibi Pigbibi commented Jun 19, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • add /monitor-dispatch to fan out due probe and dry-run checks from one shared scheduler
  • keep the formal strategy scheduler on /run only
  • sync monitor target config and delete legacy probe/precheck scheduler jobs

Tests

  • PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 uv run --with pytest --with requests --with google-auth python -m pytest tests/test_monitor_dispatcher.py tests/test_request_handling.py::RequestHandlingTests::test_cloud_run_route_contracts_are_registered tests/test_request_handling.py::RequestHandlingTests::test_handle_monitor_dispatch_post_dispatches_due_targets
  • bash tests/test_sync_cloud_run_env_workflow.sh
  • python3 -m py_compile application/monitor_dispatcher.py main.py

@Pigbibi Pigbibi merged commit bfabb1e into main Jun 19, 2026
1 check passed
@Pigbibi Pigbibi deleted the feature/monitor-dispatcher-scheduler branch June 19, 2026 10:43

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 107ca2e323

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

"strategy_profile": runtime_target.get("strategy_profile") or source.get("STRATEGY_PROFILE"),
"account_scope": runtime_target.get("account_scope") or source.get("ACCOUNT_REGION") or source.get("account_scope"),
"runtime_target_enabled": source.get("runtime_target_enabled", source.get("RUNTIME_TARGET_ENABLED", True)),
"scheduler": runtime_target.get("scheduler") if isinstance(runtime_target.get("scheduler"), dict) else {},

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Populate fallback monitor schedules

When CLOUD_RUN_SERVICE_TARGETS_JSON contains the minimal target entries supported elsewhere (for example {"service":"svc-a"}), or when the generated RUNTIME_TARGET_JSON has no scheduler, this writes {} into MONITOR_DISPATCH_TARGETS_JSON. The new dispatcher only calls /probe and /dry-run when probe_time/precheck_time are present, and this workflow later deletes the legacy probe/precheck jobs, so default-configured services silently stop receiving monitor checks despite the CLOUD_SCHEDULER_PROBE_TIME/CLOUD_SCHEDULER_PRECHECK_TIME defaults. Please include the resolved fallback schedule in each monitor target before removing the old jobs.

Useful? React with 👍 / 👎.

Comment thread main.py
result = dispatch_due_monitors(load_monitor_targets())
except Exception as exc:
return _handle_route_runtime_error(exc, route_label="POST /monitor-dispatch")
return json.dumps(result, ensure_ascii=False), 200, {"Content-Type": "application/json"}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Propagate dispatch failures to Cloud Scheduler

If any downstream monitor request times out, returns 403/500, or otherwise fails, dispatch_due_monitors reports {"ok": false} but this route still returns HTTP 200. In that scenario the shared Cloud Scheduler job treats the run as successful and won't retry or surface the dispatcher failure, even though no target-side notification may have happened (for example on IAM/network failures). Return a non-2xx status when result["ok"] is false so the scheduler can detect the failed dispatch.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant