Observability Engine Foundation (Phase A)#2

Open
ashum9 wants to merge 9 commits into mofa-org:engine from ashum9:observability

Conversation

ashum9 commented Jun 24, 2026

Implements the Phase A observability foundation for the MoFA Engine.

Changes:

  1. mofa-observability: new crate added to workspace
  2. events.rs: 11 structured event types across request, model lifecycle,
    memory, and preflight domains
  3. collector.rs: async metrics collector with counters, histograms, and
    gauges calibrated to Phase 0 benchmarking data
  4. OpenTelemetry trace_id, span_id, and span_end_timestamp added to
    EventEnvelope.

All 26 unit tests passing. Next: Prometheus renderer.

Copilot AI review requested due to automatic review settings June 24, 2026 08:55

Copilot AI left a comment

Pull request overview

Adds a new mofa-observability workspace crate to establish the Phase A observability foundation, defining a structured event schema and an in-memory async metrics collector intended to be scraped by a future Prometheus renderer.

Changes:

  • Introduces mofa-observability crate with public events and collector modules.
  • Defines an EventEnvelope plus structured engine event payload types for requests, model lifecycle, memory, preflight, and infra.
  • Implements a Tokio-driven metrics collector with counters, histograms, gauges, and a bounded event channel.

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| mofa-observability/src/lib.rs | Exposes the new crate’s events and collector modules. |
| mofa-observability/src/events.rs | Adds structured event types and a standard envelope for observability data. |
| mofa-observability/src/collector.rs | Adds an async metrics collector, metric families, and an event channel abstraction. |
| mofa-observability/Cargo.toml | Declares the new crate and its serde/tokio dependencies via workspace. |
| Cargo.toml | Registers mofa-observability as a workspace member. |
| Cargo.lock | Adds the new crate entry to the lockfile. |

Comment on lines +195 to +210
/// Emitted after the router selects a model.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct RoutingDecision {
    /// Requested capability.
    pub capability: Capability,
    /// How many models were eligible.
    pub candidates_count: u32,
    /// The model chosen.
    pub selected_model: String,
    /// The backend hosting that model.
    pub selected_backend: String,
    /// Whether this was a fallback selection.
    pub is_fallback: bool,
    /// Why this model was chosen (e.g., "local_first", "only_candidate", "fallback_after_failure").
    pub reason: String,
}
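The `reason` field above is a free-form `String`, while the design principles in events.rs say that reasons use bounded enums. A minimal sketch of what a bounded version could look like follows. `RoutingReason` and `as_str` are hypothetical names, not part of the PR, and the variants only mirror the three examples given in the doc comment; the real set of reasons may differ.

```rust
use std::fmt;

/// Hypothetical bounded replacement for `reason: String` (not in the PR).
/// Variants mirror the three examples listed in the field's doc comment.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum RoutingReason {
    LocalFirst,
    OnlyCandidate,
    FallbackAfterFailure,
}

impl RoutingReason {
    /// Stable snake_case name, usable as a metric label value.
    pub fn as_str(self) -> &'static str {
        match self {
            RoutingReason::LocalFirst => "local_first",
            RoutingReason::OnlyCandidate => "only_candidate",
            RoutingReason::FallbackAfterFailure => "fallback_after_failure",
        }
    }
}

impl fmt::Display for RoutingReason {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}
```

With a closed set of variants, the number of distinct `reason` label values is fixed at compile time, which keeps metric cardinality predictable.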
Comment on lines +18 to +43
/// What the caller asked for. Bounded enum — no free-form strings.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
pub enum Capability {
    Chat,
    Tts,
    Asr,
    Vlm,
    ImageGen,
    VideoGen,
    Embedding,
}

impl std::fmt::Display for Capability {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Capability::Chat => write!(f, "chat"),
            Capability::Tts => write!(f, "tts"),
            Capability::Asr => write!(f, "asr"),
            Capability::Vlm => write!(f, "vlm"),
            Capability::ImageGen => write!(f, "image_gen"),
            Capability::VideoGen => write!(f, "video_gen"),
            Capability::Embedding => write!(f, "embedding"),
        }
    }
}
Comment on lines +98 to +104
/// OpenTelemetry trace ID (128-bit hex string).
pub trace_id: Option<String>,
/// OpenTelemetry span ID (64-bit hex string).
pub span_id: Option<String>,
/// OpenTelemetry span end timestamp (Unix ms).
pub span_end_timestamp: Option<u64>,
/// The event payload.
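Because `trace_id` and `span_id` are plain `Option<String>` fields, nothing in the type prevents a malformed ID from being stored. The sketch below shows one way such strings could be checked, assuming the W3C Trace Context convention: 32 lowercase hex characters for a trace ID, 16 for a span ID, and all-zero values treated as invalid. The function names are hypothetical and do not appear in the PR.

```rust
/// True if `s` is exactly `len` lowercase hex characters and not all zeros.
fn is_valid_hex_id(s: &str, len: usize) -> bool {
    s.len() == len
        && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
        && s.bytes().any(|b| b != b'0')
}

/// 128-bit trace ID: 32 hex characters (hypothetical helper).
pub fn is_valid_trace_id(s: &str) -> bool {
    is_valid_hex_id(s, 32)
}

/// 64-bit span ID: 16 hex characters (hypothetical helper).
pub fn is_valid_span_id(s: &str) -> bool {
    is_valid_hex_id(s, 16)
}
```

A check like this could run where the envelope is constructed, so that invalid IDs are replaced with `None` rather than propagated to a trace backend.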
Comment thread mofa-observability/src/collector.rs Outdated
Comment on lines +6 to +11
//! Design:
//! - Single writer (collector task), multiple readers (Prometheus scrapes).
//! - No blocking I/O. No disk. Just in-memory atomics behind a RwLock.
//! - Bounded event channel with drop-oldest backpressure.
//!
//! Reference: observability_plan.md §4
Comment on lines +444 to +458
impl EventSender {
    /// Send an event. If the channel is full, the event is dropped
    /// and the dropped counter is incremented.
    pub fn send(&self, event: EventEnvelope) {
        if self.tx.try_send(event).is_err() {
            self.dropped
                .fetch_add(1, std::sync::atomic::Ordering::Relaxed);
        }
    }

    /// How many events have been dropped due to backpressure.
    pub fn dropped_count(&self) -> u64 {
        self.dropped.load(std::sync::atomic::Ordering::Relaxed)
    }
}
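As written, a failed `try_send` discards the incoming event, so the newest event is lost while older ones stay queued. That is drop-newest behavior, not the drop-oldest policy named in the collector's design notes. The sketch below reproduces the same pattern with the standard library's bounded `sync_channel` so the effect can be observed without a Tokio runtime. `DemoSender`, `channel`, and `demo` are hypothetical names, and the PR's actual channel type is an assumption here.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::sync::Arc;

/// Stand-in for the PR's `EventSender`, using `u32` as the event type.
pub struct DemoSender {
    tx: SyncSender<u32>,
    dropped: Arc<AtomicU64>,
}

impl DemoSender {
    /// Same shape as the PR's `send`: on failure, count the event as dropped.
    pub fn send(&self, event: u32) {
        if self.tx.try_send(event).is_err() {
            self.dropped.fetch_add(1, Ordering::Relaxed);
        }
    }

    pub fn dropped_count(&self) -> u64 {
        self.dropped.load(Ordering::Relaxed)
    }
}

/// Builds a bounded channel holding at most `capacity` queued events.
pub fn channel(capacity: usize) -> (DemoSender, Receiver<u32>) {
    let (tx, rx) = sync_channel(capacity);
    let sender = DemoSender {
        tx,
        dropped: Arc::new(AtomicU64::new(0)),
    };
    (sender, rx)
}

/// Sends events 1..=5 into a channel of capacity 3 with no reader running,
/// then returns what was actually queued and how many sends were dropped.
pub fn demo() -> (Vec<u32>, u64) {
    let (tx, rx) = channel(3);
    for event in 1..=5 {
        tx.send(event);
    }
    let dropped = tx.dropped_count();
    drop(tx); // close the channel so the iterator below terminates
    (rx.iter().collect(), dropped)
}
```

In this sketch the three oldest events survive and the two most recent are counted as dropped. A true drop-oldest policy would need the producer to evict from the head of the queue, which a plain bounded mpsc sender cannot do on its own.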
Comment on lines +983 to +984
// Simulate 100 events across all 7 capabilities × 2 statuses = 14 series max
let capabilities = [
Comment on lines +61 to +80
impl HistogramValue {
    fn new(boundaries: &[f64]) -> Self {
        let buckets = boundaries.iter().map(|&b| (b, 0u64)).collect();
        Self {
            buckets,
            sum: 0.0,
            count: 0,
        }
    }

    fn observe(&mut self, value: f64) {
        self.sum += value;
        self.count += 1;
        for bucket in &mut self.buckets {
            if value <= bucket.0 {
                bucket.1 += 1;
            }
        }
    }
}
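Note that `observe` increments every bucket whose upper bound is at or above the value, so the stored counts are cumulative, matching Prometheus `le` bucket semantics. No explicit `+Inf` bucket is stored, so `count` would have to stand in for it at render time. The sketch below copies the logic into a self-contained type to make that concrete. `Histogram`, `cumulative_counts`, and `sum` are hypothetical names added for illustration.

```rust
/// Stand-in for the PR's `HistogramValue`, with the same fields and `observe` logic.
pub struct Histogram {
    buckets: Vec<(f64, u64)>, // (upper bound, cumulative count)
    sum: f64,
    count: u64,
}

impl Histogram {
    pub fn new(boundaries: &[f64]) -> Self {
        Histogram {
            buckets: boundaries.iter().map(|&b| (b, 0u64)).collect(),
            sum: 0.0,
            count: 0,
        }
    }

    pub fn observe(&mut self, value: f64) {
        self.sum += value;
        self.count += 1;
        // Every bucket with bound >= value is incremented, so counts are cumulative.
        for bucket in &mut self.buckets {
            if value <= bucket.0 {
                bucket.1 += 1;
            }
        }
    }

    /// Per-bound cumulative counts, followed by the total as the implicit +Inf bucket.
    pub fn cumulative_counts(&self) -> Vec<u64> {
        let mut out: Vec<u64> = self.buckets.iter().map(|b| b.1).collect();
        out.push(self.count);
        out
    }

    pub fn sum(&self) -> f64 {
        self.sum
    }
}
```

For bounds `[0.1, 0.5, 1.0]` and observations `0.05, 0.3, 0.3, 2.0`, the cumulative counts come out as 1, 3, 3, with a total of 4 for the implicit `+Inf` bucket. The value 2.0 exceeds every bound and is visible only in `count` and `sum`.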
Comment on lines +29 to +33
pub fn add(mut self, key: impl Into<String>, value: impl Into<String>) -> Self {
    self.0.push((key.into(), value.into()));
    self.0.sort_by(|a, b| a.0.cmp(&b.0));
    self
}
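Sorting after each push keeps the label pairs in a canonical key order, so two label sets built in different orders compare and hash as equal. The sketch below illustrates why that matters if the label set is used as a map key for metric series. That usage, along with the `Labels` definition, `new`, and `distinct_series`, is an assumption made for illustration and is not taken from the PR.

```rust
use std::collections::HashMap;

/// Stand-in for the PR's label set: a list of (key, value) pairs kept sorted by key.
#[derive(Debug, Clone, PartialEq, Eq, Hash, Default)]
pub struct Labels(Vec<(String, String)>);

impl Labels {
    pub fn new() -> Self {
        Labels(Vec::new())
    }

    /// Same body as the PR's `add`: push, then re-sort by key.
    pub fn add(mut self, key: impl Into<String>, value: impl Into<String>) -> Self {
        self.0.push((key.into(), value.into()));
        self.0.sort_by(|a, b| a.0.cmp(&b.0));
        self
    }
}

/// Increments a counter under two label sets built in opposite order
/// and returns how many distinct series the map ends up with.
pub fn distinct_series() -> usize {
    let mut counters: HashMap<Labels, u64> = HashMap::new();
    let a = Labels::new().add("capability", "chat").add("status", "ok");
    let b = Labels::new().add("status", "ok").add("capability", "chat");
    *counters.entry(a).or_insert(0) += 1;
    *counters.entry(b).or_insert(0) += 1;
    counters.len()
}
```

Both label sets land on the same entry, so only one series exists. Re-sorting on every `add` repeats work, but for the handful of labels a metric typically carries the cost is small; sorting once after the last `add` would be an alternative.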
Comment on lines +6 to +12
//! Design principles:
//! - Zero side effects: events are data, not actions.
//! - Privacy: no prompt text, file contents, API keys, or user-identifying info. Ever.
//! - Bounded enums: capability, reason, source, status use enums, not free-form strings.
//!
//! Reference: observability_plan.md §3

Comment on lines +9 to +12
//! - Bounded event channel with drop-oldest backpressure.
//!
//! Reference: observability_plan.md §4
