Skip to content

Add facility for extensible relay services with direct async calls#716

Open
williampMSFT wants to merge 4 commits intoOpenDevicePartnership:v0.2.0from
williampMSFT:user/williamp/es-tad-tests
Open

Add facility for extensible relay services with direct async calls#716
williampMSFT wants to merge 4 commits intoOpenDevicePartnership:v0.2.0from
williampMSFT:user/williamp/es-tad-tests

Conversation

@williampMSFT
Copy link
Contributor

This change adds a facility to allow relay services to be user-extensible by hoisting the knowledge of which types are relayable out of the relay service (e.g. eSPI service) into the application layer. The application layer can pass in a list of (name, address, relay-handler) tuples and use those to instantiate a relay service.

This enables OEMs to add their own services and messages that can share the same message bus.

This requires a few things:

  • Eliminating all static state from within the eSPI service (since type sizes are no longer known by the eSPI service)
  • Implementing a trait for each relayable service to do conversion between wire formats and function calls

The time-alarm service has been ported to leverage this new facility as an example; the other relayable services (battery, debug, thermal) will be migrated in a future change.

Because we're moving from the comms system to direct async calls, we incidentally also get rid of the requirement imposed by the comms system that services have lifetime 'static. This incidentally also allows making services that leverage this new facility generic over the lifetime of the hardware that they manage, which enables some integration testing scenarios. To demonstrate this capability, a couple simple tests were added to the time-alarm service.

This change adds a facility to allow relay services to be user-extensible by hoisting the knowledge
of which types are relayable out of the relay service (e.g. eSPI service) into the application layer.
The application layer can pass in a list of (name, address, relay-handler) tuples and use those to
instantiate a relay service.

This enables OEMs to add their own services and messages that can share the same message bus.

This requires a few things:
- Eliminating all static state from within the eSPI service (since type sizes are no longer known by the eSPI service)
- Implementing a trait for each relayable service to do conversion between wire formats and function calls

The time-alarm service has been ported to leverage this new facility as an example; the other relayable services
(battery, debug, thermal) will be migrated in a future change.

Because we're moving from the comms system to direct async calls, we incidentally also get rid of the requirement
imposed by the comms system that services have lifetime 'static.  This incidentally also allows making services
that leverage this new facility generic over the lifetime of the hardware that they manage, which enables some
integration testing scenarios.  To demonstrate this capability, a couple simple tests were added to the time-alarm service.
Copilot AI review requested due to automatic review settings February 12, 2026 23:13
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a new, application-extensible MCTP relay mechanism that moves “relayable type knowledge” out of relay services (e.g., eSPI/UART) and into the application layer via a generated RelayHandler. As part of the transition away from the legacy comms system, the time-alarm service is migrated to direct async calls + relay traits, with basic tokio-based tests added as an example.

Changes:

  • Add v2 relay traits + impl_odp_mctp_relay_handler! macro to generate extensible relay handlers.
  • Update eSPI (and UART) relay integration to support both legacy comms-based routing and new direct async relay routing.
  • Port time-alarm service to the new model (non-'static capable service), add tests, and update the RT685 example to demonstrate direct calls + relay handler usage.

Reviewed changes

Copilot reviewed 18 out of 21 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
embedded-service/src/relay/mod.rs Adds relay v2 traits and the new relay handler macro.
embedded-service/src/lib.rs Adds hidden _macro_internal re-exports for macro expansion.
espi-service/src/espi_service.rs Refactors eSPI service to be generic over a relay handler and supports legacy/new routing.
espi-service/src/task.rs Adjusts task entry to run the new service loop.
espi-service/src/mctp.rs Splits legacy relay types into a deprecated compatibility module.
uart-service/src/mctp.rs Mirrors the legacy relay compatibility module approach.
time-alarm-service/src/lib.rs Migrates time-alarm service to direct-call API + implements relay handler traits.
time-alarm-service/src/task.rs Consolidates worker tasks into a single run_service loop.
time-alarm-service/src/timer.rs Removes 'static storage requirements by threading a hardware lifetime.
time-alarm-service/tests/* Adds tokio tests and std-backed mocks to validate non-'static usage.
time-alarm-service-messages/src/* Adds error conversion + adjusts timestamp parsing error plumbing.
examples/rt685s-evk/src/bin/time_alarm.rs Updates example to spawn the unified task and demonstrate direct calls + relay handler generation.
Cargo.toml / embedded-service/Cargo.toml Adds paste dependency (workspace + crate).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot AI review requested due to automatic review settings February 12, 2026 23:47
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 18 out of 24 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@williampMSFT williampMSFT marked this pull request as ready for review February 12, 2026 23:57
@williampMSFT williampMSFT requested review from a team as code owners February 12, 2026 23:57
Copy link
Contributor

@kurtjd kurtjd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! My only minor concern is leaving the deprecated/legacy code in here. Could it make sense instead to start a separate branch where service owners can update their services, then when the migration is done, merge the branch into v0.2.0? Though I don't feel strongly so if you've considered this then no big deal.

@williampMSFT
Copy link
Contributor Author

Looks good! My only minor concern is leaving the deprecated/legacy code in here. Could it make sense instead to start a separate branch where service owners can update their services, then when the migration is done, merge the branch into v0.2.0? Though I don't feel strongly so if you've considered this then no big deal.

I thought about it, but I think there's enough upside to just going into v0.2 that a separate feature branch might not be worth it. If we do this piecemeal migration instead of a branch, we get a few nontrivial benefits:

  • Existing services can migrate at their own pace
  • We can bring up new services on the new system in parallel with migration of existing services
  • Services that do migrate can drop the 'static requirement imposed by the comms service and start writing and leveraging 'integration tests' today, without waiting for every other service to migrate first
  • Early migrators don't have to pay the cost of keeping two branches in sync while slower migrators are in the middle of migration (or worse, fail to keep the branches in sync and then have a nasty merge when we're ready to RI)

I think the main downside is the potential that some new services could fall into the trap of using the 'old' facility during bringup, but I think there are a few mitigations for that:

  • the #[deprecated] warnings should help steer folks in the right direction (CI should block their checkin if they try to bring stuff up using the old facility)
  • the new trait requirements for services are nearly a superset of the trait requirements for the old facility, so if someone does fall into this trap, a lot of their work should still be valuable,
  • since the alternative is that they use the old facility and then we have to convert that service too, even if they do miss the deprecation warnings until later in development and need to migrate, I think we're still better off because we caught it early

That said, if I've missed something, I'm open to other options :)

@kurtjd
Copy link
Contributor

kurtjd commented Feb 13, 2026

Looks good! My only minor concern is leaving the deprecated/legacy code in here. Could it make sense instead to start a separate branch where service owners can update their services, then when the migration is done, merge the branch into v0.2.0? Though I don't feel strongly so if you've considered this then no big deal.

I thought about it, but I think there's enough upside to just going into v0.2 that a separate feature branch might not be worth it. If we do this piecemeal migration instead of a branch, we get a few nontrivial benefits:

  • Existing services can migrate at their own pace
  • We can bring up new services on the new system in parallel with migration of existing services
  • Services that do migrate can drop the 'static requirement imposed by the comms service and start writing and leveraging 'integration tests' today, without waiting for every other service to migrate first
  • Early migrators don't have to pay the cost of keeping two branches in sync while slower migrators are in the middle of migration (or worse, fail to keep the branches in sync and then have a nasty merge when we're ready to RI)

I think the main downside is the potential that some new services could fall into the trap of using the 'old' facility during bringup, but I think there are a few mitigations for that:

  • the #[deprecated] warnings should help steer folks in the right direction (CI should block their checkin if they try to bring stuff up using the old facility)
  • the new trait requirements for services are nearly a superset of the trait requirements for the old facility, so if someone does fall into this trap, a lot of their work should still be valuable,
  • since the alternative is that they use the old facility and then we have to convert that service too, even if they do miss the deprecation warnings until later in development and need to migrate, I think we're still better off because we caught it early

That said, if I've missed something, I'm open to other options :)

Makes sense to me, I appreciate the explanation :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants