feat(clusters): add real-time cluster log streaming#182

Merged
alek-thunder merged 6 commits into main from feat/cluster-logs
Feb 9, 2026

@srnbckr (Contributor) commented Feb 6, 2026

Description

This PR adds the ability to stream cluster deployment and deletion events from the API.

  • Add exls clusters logs <CLUSTER_NAME_OR_ID> command to stream Kubernetes events in real time via the GET /cluster/{cluster_id}/logs NDJSON endpoint, giving users visibility into what is happening during cluster provisioning and helping them spot issues early
  • Add --follow / -f flag on exls clusters deploy to automatically tail logs after deployment starts
  • Add --json flag on logs command for raw NDJSON output

Notes for Reviewers

The generated SDK cannot handle NDJSON streams, so this introduces StreamingGetRequestCommand in the shared HTTP infrastructure using requests with stream=True + iter_lines(). Malformed lines and mid-stream connection drops are handled gracefully.
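As a rough illustration of the approach described above (not the actual StreamingGetRequestCommand code), NDJSON streaming with requests could look like the sketch below. The function names iter_ndjson and stream_events, the Bearer token header, and the timeout values are assumptions made for this example; only requests.get(..., stream=True) and Response.iter_lines() come from the description.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Union


def iter_ndjson(lines: Iterable[Union[bytes, str]]) -> Iterator[Dict[str, Any]]:
    """Yield one dict per well-formed NDJSON line; skip blank and malformed lines."""
    for raw in lines:
        if not raw:
            continue  # iter_lines() can yield empty keep-alive lines
        text = raw.decode("utf-8", errors="replace") if isinstance(raw, bytes) else raw
        try:
            obj = json.loads(text)
        except json.JSONDecodeError:
            continue  # tolerate a malformed line instead of aborting the stream
        if isinstance(obj, dict):
            yield obj


def stream_events(url: str, token: str) -> Iterator[Dict[str, Any]]:
    """Hypothetical helper: stream events from an NDJSON endpoint until it ends."""
    import requests  # third-party dependency, imported lazily

    headers = {"Authorization": f"Bearer {token}"}  # auth scheme is an assumption
    # timeout=(connect, read): no read timeout, since the stream may idle
    with requests.get(url, headers=headers, stream=True, timeout=(5, None)) as resp:
        resp.raise_for_status()
        try:
            yield from iter_ndjson(resp.iter_lines())
        except (requests.exceptions.ChunkedEncodingError,
                requests.exceptions.ConnectionError):
            return  # treat a mid-stream connection drop as end of stream
```

Keeping the line parsing separate from the HTTP call means the malformed-line handling can be unit-tested with plain lists of bytes, without a network connection.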

Copilot AI left a comment

Pull request overview

This PR adds real-time cluster log streaming capabilities to the CLI, enabling users to monitor Kubernetes events during cluster deployment and troubleshoot issues more effectively. The implementation introduces a new streaming infrastructure to handle NDJSON responses from the backend API.

Changes:

  • Added StreamingGetRequestCommand base class in shared HTTP infrastructure to handle NDJSON streaming with graceful error handling for malformed lines and connection drops
  • Added exls clusters logs command with --json flag for raw NDJSON output
  • Added --follow/-f flag to exls clusters deploy command to automatically tail logs after deployment starts

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| exls/shared/adapters/http/commands.py | Adds StreamingGetRequestCommand base class for NDJSON streaming with error handling |
| exls/clusters/core/domain.py | Adds ClusterEvent and ClusterEventInvolvedObject domain models |
| exls/clusters/core/service.py | Adds stream_cluster_logs service method with error handling decorator |
| exls/clusters/core/ports/operations.py | Adds stream_logs port method to operations interface |
| exls/clusters/adapters/adapter.py | Implements stream_logs in adapter layer |
| exls/clusters/adapters/gateway/gateway.py | Adds abstract stream_logs method to gateway interface |
| exls/clusters/adapters/gateway/sdk/sdk.py | Implements streaming in SDK gateway with proper resource cleanup |
| exls/clusters/adapters/gateway/sdk/commands.py | Adds StreamClusterLogsSdkCommand using streaming infrastructure |
| exls/clusters/adapters/bundle.py | Updates bundle to pass base_url and access_token to gateway and provides log renderer |
| exls/clusters/adapters/ui/display/log_renderer.py | Adds ClusterLogRenderer for formatted console output |
| exls/clusters/app.py | Adds logs command and --follow flag with shared streaming helper |
| tests/unit/shared/test_streaming_command.py | Comprehensive tests for streaming command error handling and edge cases |
| tests/unit/clusters/test_log_renderer.py | Tests for log rendering including multiline message handling |
| tests/unit/clusters/test_cluster_event_domain.py | Tests for domain model validation and serialization |
| tests/unit/clusters/test_clusters_app.py | Tests for iterator cleanup on normal exit and keyboard interrupt |

@alek-thunder (Contributor) commented

Nice feature!

Perfect structural implementation of StreamingGetRequestCommand and the gateway; it fits perfectly into our adapter layer.

I’ve refactored the display logic to better align with the architecture:

  1. The logic for formatting log events (colors, timestamps) was removed from the cluster-specific renderer. Instead, I extended the shared text rendering with a text formatter attribute that allows custom text rendering for objects. We use a new CLUSTER_LOG_TEXT_VIEW in the cluster display adapter to customize the rendering. Domain-specific rendering configuration should live in the domain-specific display adapters, while the general logic that applies rendering is shared.
  2. I added a display_stream method to our Shared Kernel (IOFacade). This centralizes the loop handling, Ctrl+C interruption, and resource cleanup, so any future streaming commands (e.g., node stats) can reuse this implementation without duplicating code. I removed this logic from the CLI layer, which should focus solely on orchestration, not on stream iteration and interruption handling.
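For readers unfamiliar with the pattern in point 2, a minimal sketch of a centralized stream display helper is shown below. It is not the actual IOFacade.display_stream; the signature, the render callback, and the boolean return value are assumptions made for illustration.

```python
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def display_stream(items: Iterator[T], render: Callable[[T], None]) -> bool:
    """Render items until the stream ends or the user presses Ctrl+C.

    Returns True if the stream ended on its own, False if it was interrupted.
    """
    try:
        for item in items:
            render(item)
        return True
    except KeyboardInterrupt:
        return False  # Ctrl+C is an expected way to stop tailing, not an error
    finally:
        # Generators (and many stream wrappers) expose close(); calling it
        # runs their cleanup code, e.g. releasing the HTTP connection.
        close = getattr(items, "close", None)
        if callable(close):
            close()
```

With a helper like this, a CLI command only needs to obtain the iterator and hand it over together with a renderer, which is what keeps iteration and interruption handling out of the CLI layer.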

This separation keeps our domain logic pure and our shared display logic consistent across the domains.

@alek-thunder alek-thunder merged commit 21570b6 into main Feb 9, 2026
2 checks passed
@alek-thunder alek-thunder deleted the feat/cluster-logs branch February 9, 2026 09:15