Enable Remote Access for Embedded/no_std Environments #39

@lxsaah

Description

The current remote access implementation (AimX v1 protocol) is fully functional but locked to std environments (Tokio runtime, Unix sockets). This issue tracks the architectural changes needed to enable remote access on embedded MCU targets (Embassy runtime, no_std), allowing developers to introspect and debug AimDB instances running on resource-constrained devices.


Current State

What Works (std only)

✅ Full AimX v1 protocol implementation
✅ Unix domain socket transport (Tokio + tokio::net::UnixListener)
✅ NDJSON protocol parsing with serde_json
✅ Record introspection (record.list, record.get)
✅ Real-time subscriptions with backpressure
✅ Read-only and read-write security policies
✅ Connection handling with event funnel pattern

Hard Dependencies on std

The remote access subsystem (aimdb-core/src/remote/) is currently gated behind #[cfg(feature = "std")] due to:

1. Tokio-specific APIs

  • tokio::net::UnixListener / UnixStream (supervisor.rs, handler.rs)
  • tokio::sync::mpsc channels for event funneling
  • tokio::sync::oneshot for subscription cancellation
  • tokio::spawn() for connection handlers
  • tokio::select! macro for multiplexing I/O

2. Standard Library Types

  • std::collections::HashMap for subscription tracking
  • std::path::PathBuf for socket paths
  • std::string::String everywhere (in no_std only available through alloc, which needs a heap allocator)
  • std::vec::Vec for dynamic collections
  • std::sync::Arc for shared ownership

3. Serialization Stack

  • serde_json for NDJSON protocol (requires heap allocation)
  • thiserror for error handling (std-only in the 1.x releases)

4. I/O Model

  • Unix domain sockets (OS-level IPC, not available on bare metal)
  • Assumes filesystem for socket files

5. Memory Allocation

  • Unbounded channels (mpsc::unbounded_channel)
  • Dynamic string formatting in errors
  • JSON serialization (allocates strings)

Embedded Constraints

Target Environment

  • MCU Class: ARM Cortex-M4/M7, RISC-V, ESP32
  • RAM: 64-512 KB (vs. megabytes on edge/cloud)
  • Flash: 512 KB - 2 MB
  • No OS: Bare metal or RTOS only
  • No filesystem: No concept of file paths or socket files
  • Limited heap: Prefer stack allocation or fixed-size buffers

Transport Requirements

Since Unix sockets don't exist on MCUs, we need alternatives:

  1. Serial (UART/USB-CDC) - Most common debug interface
  2. TCP over Ethernet/WiFi - For networked MCUs (ESP32, STM32 with PHY)
  3. USB bulk endpoints - High-speed debug channel
  4. Shared memory - For coprocessor scenarios (rare)

Proposed Architecture

Phase 1: Core Abstractions (Foundation)

1.1 Transport Abstraction Layer

Create a trait to abstract over different transports:

/// Transport-agnostic connection interface
pub trait RemoteTransport: Send + 'static {
    /// Read one NDJSON line into `buf` and return its length in bytes
    async fn read_line(&mut self, buf: &mut [u8]) -> Result<usize, TransportError>;

    /// Write one NDJSON line
    async fn write_line(&mut self, line: &[u8]) -> Result<(), TransportError>;
    
    /// Flush buffered writes
    async fn flush(&mut self) -> Result<(), TransportError>;
}

/// Error type for transport operations
#[derive(Debug, Clone)]
pub enum TransportError {
    IoError,
    BufferFull,
    Disconnected,
}

Implementations:

  • UdsTransport (std): Wraps tokio::net::UnixStream
  • SerialTransport (embedded): Wraps the embassy_usb CDC-ACM Sender/Receiver pair or a UART driver
  • TcpTransport (embedded): Wraps embassy_net::tcp::TcpSocket
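
A transport-agnostic handler can be exercised on the host before any hardware transport exists. The sketch below is one way to do that, not part of the proposal: LoopbackTransport and block_on are hypothetical helpers, and the trait is repeated without its Send + 'static bounds to keep the example self-contained.

```rust
use core::future::Future;
use core::pin::pin;
use core::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

#[derive(Debug, Clone, PartialEq)]
pub enum TransportError {
    IoError,
    BufferFull,
    Disconnected,
}

// Same methods as the proposed trait; bounds omitted for brevity.
trait RemoteTransport {
    async fn read_line(&mut self, buf: &mut [u8]) -> Result<usize, TransportError>;
    async fn write_line(&mut self, line: &[u8]) -> Result<(), TransportError>;
    async fn flush(&mut self) -> Result<(), TransportError>;
}

/// Hypothetical in-memory transport: every written line can be read back.
struct LoopbackTransport<const N: usize> {
    data: [u8; N],
    len: usize,
}

impl<const N: usize> LoopbackTransport<N> {
    fn new() -> Self {
        Self { data: [0; N], len: 0 }
    }
}

impl<const N: usize> RemoteTransport for LoopbackTransport<N> {
    async fn read_line(&mut self, buf: &mut [u8]) -> Result<usize, TransportError> {
        // No complete line buffered means there is nothing left to read.
        let end = self.data[..self.len]
            .iter()
            .position(|&b| b == b'\n')
            .ok_or(TransportError::Disconnected)?;
        if end > buf.len() {
            return Err(TransportError::BufferFull);
        }
        buf[..end].copy_from_slice(&self.data[..end]);
        // Drop the consumed line together with its terminator.
        self.data.copy_within(end + 1..self.len, 0);
        self.len -= end + 1;
        Ok(end)
    }

    async fn write_line(&mut self, line: &[u8]) -> Result<(), TransportError> {
        let needed = line.len() + 1;
        if self.len + needed > N {
            return Err(TransportError::BufferFull);
        }
        self.data[self.len..self.len + line.len()].copy_from_slice(line);
        self.data[self.len + line.len()] = b'\n';
        self.len += needed;
        Ok(())
    }

    async fn flush(&mut self) -> Result<(), TransportError> {
        Ok(())
    }
}

/// Minimal executor that polls in a loop; enough for futures that never really wait.
fn block_on<F: Future>(fut: F) -> F::Output {
    fn noop(_: *const ()) {}
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(core::ptr::null(), &VTABLE)
    }
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    // Safety: the vtable functions never dereference the null data pointer.
    let waker = unsafe { Waker::from_raw(RawWaker::new(core::ptr::null(), &VTABLE)) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        core::hint::spin_loop();
    }
}
```

Writing a request with block_on(t.write_line(..)) and reading it back with block_on(t.read_line(..)) gives a round trip that needs neither Tokio nor Embassy.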

1.2 Replace Standard Collections

Use the heapless crate for fixed-capacity collections that store their elements inline and need no heap:

// Before (std)
use std::collections::HashMap;
let subscriptions: HashMap<String, SubscriptionHandle> = HashMap::new();

// After (no_std)
use heapless::{FnvIndexMap, String};
const MAX_SUBSCRIPTIONS: usize = 8;
let subscriptions: FnvIndexMap<String<64>, SubscriptionHandle, MAX_SUBSCRIPTIONS> 
    = FnvIndexMap::new();

Changes needed:

  • HashMap<String, _> → FnvIndexMap<String<N>, _, MAX>
  • Vec<T> → Vec<T, MAX> (heapless)
  • PathBuf → Remove entirely (not needed on embedded)
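
The main behavioural change when moving to fixed-capacity collections is that insertion can fail and callers have to handle it. The following core-only sketch illustrates that contract with a hypothetical SubscriptionTable; it is not the heapless implementation and uses a linear scan instead of hashing.

```rust
/// Hypothetical fixed-capacity table keyed by record name.
/// Capacity is a compile-time constant; nothing is heap-allocated.
pub struct SubscriptionTable<V, const MAX: usize> {
    slots: [Option<(&'static str, V)>; MAX],
}

impl<V, const MAX: usize> SubscriptionTable<V, MAX> {
    pub fn new() -> Self {
        Self { slots: core::array::from_fn(|_| None) }
    }

    /// Inserts or replaces an entry. When every slot is taken the value is
    /// handed back to the caller instead of growing the table.
    pub fn insert(&mut self, key: &'static str, value: V) -> Result<(), V> {
        // Prefer the slot that already holds this key.
        if let Some(slot) = self
            .slots
            .iter_mut()
            .find(|s| matches!(s, Some((k, _)) if *k == key))
        {
            *slot = Some((key, value));
            return Ok(());
        }
        // Otherwise take the first free slot, if any.
        match self.slots.iter_mut().find(|s| s.is_none()) {
            Some(slot) => {
                *slot = Some((key, value));
                Ok(())
            }
            None => Err(value),
        }
    }

    pub fn remove(&mut self, key: &str) -> Option<V> {
        let slot = self
            .slots
            .iter_mut()
            .find(|s| matches!(s, Some((k, _)) if *k == key))?;
        slot.take().map(|(_, v)| v)
    }

    pub fn len(&self) -> usize {
        self.slots.iter().filter(|s| s.is_some()).count()
    }
}
```

A rejected insert maps naturally onto a protocol-level error such as "subscription limit reached", which the std version never has to produce.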

1.3 String Management

Replace dynamic strings with fixed-size buffers:

// Configuration
use heapless::String;

pub struct AimxConfig<const MAX_RECORD_NAME_LEN: usize = 64> {
    // socket_path removed (not applicable)
    pub security_policy: SecurityPolicy,
    pub max_connections: usize,
    pub subscription_queue_size: usize,
    pub auth_token: Option<String<32>>,  // Fixed-size
}
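
To make the trade-off concrete, here is a rough stand-in for what a fixed-size string has to do: keep the bytes inline and reject input that does not fit. FixedStr and CapacityError are hypothetical names for this sketch; heapless::String<N> would be used in practice.

```rust
/// Hypothetical stand-in for a fixed-capacity string: UTF-8 text stored inline.
#[derive(Clone, Copy)]
pub struct FixedStr<const N: usize> {
    bytes: [u8; N],
    len: usize,
}

#[derive(Debug, PartialEq)]
pub struct CapacityError;

impl<const N: usize> FixedStr<N> {
    pub const fn new() -> Self {
        Self { bytes: [0; N], len: 0 }
    }

    /// Appends `s`, or leaves the contents untouched if it would not fit.
    pub fn push_str(&mut self, s: &str) -> Result<(), CapacityError> {
        let end = self
            .len
            .checked_add(s.len())
            .filter(|&e| e <= N)
            .ok_or(CapacityError)?;
        self.bytes[self.len..end].copy_from_slice(s.as_bytes());
        self.len = end;
        Ok(())
    }

    pub fn as_str(&self) -> &str {
        // Only whole `&str` values are appended, so the prefix stays valid UTF-8.
        core::str::from_utf8(&self.bytes[..self.len]).unwrap_or("")
    }
}
```

With this shape, an over-long auth token or record name is rejected at configuration time rather than truncated silently.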

1.4 JSON Serialization

Two options:

Option A: Keep serde_json with alloc

  • Requires heap allocator (e.g., embedded-alloc, formerly published as alloc-cortex-m)
  • Acceptable for MCUs with >128KB RAM
  • Simplest migration path

Option B: Replace with serde-json-core

  • Zero-allocation JSON parser/serializer
  • Writes to fixed-size buffers
  • More complex, but lower memory footprint
  • Example:
    use serde_json_core::{to_slice, from_slice};
    let mut buf = [0u8; 256];
    let len = to_slice(&my_struct, &mut buf)?;

Recommendation: Start with Option A (serde_json + alloc), move to Option B if memory becomes critical.
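
For a sense of what Option B implies, a response line can be formatted straight into a caller-provided buffer using only core::fmt. This is a hand-rolled sketch rather than serde-json-core itself; SliceWriter and encode_ok are hypothetical, and the id/result field names are illustrative, not taken from the AimX v1 spec.

```rust
use core::fmt::{self, Write};

/// Hypothetical cursor that formats into a fixed buffer and never allocates.
pub struct SliceWriter<'a> {
    buf: &'a mut [u8],
    pos: usize,
}

impl<'a> SliceWriter<'a> {
    pub fn new(buf: &'a mut [u8]) -> Self {
        Self { buf, pos: 0 }
    }

    pub fn written(&self) -> usize {
        self.pos
    }
}

impl Write for SliceWriter<'_> {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let end = self.pos + s.len();
        // Fail instead of truncating: a cut-off JSON line would corrupt the stream.
        let dst = self.buf.get_mut(self.pos..end).ok_or(fmt::Error)?;
        dst.copy_from_slice(s.as_bytes());
        self.pos = end;
        Ok(())
    }
}

/// Encodes one newline-terminated response and returns the number of bytes used.
/// On error the buffer contents are unspecified and should be discarded.
pub fn encode_ok(id: u32, buf: &mut [u8]) -> Result<usize, fmt::Error> {
    let mut w = SliceWriter::new(buf);
    write!(w, "{{\"id\":{},\"result\":\"ok\"}}\n", id)?;
    Ok(w.written())
}
```

The same buffer-in, length-out shape is what serde-json-core's to_slice exposes, so call sites written against it would not need to change much if the hand-rolled writer were swapped out.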

1.5 Error Handling Without thiserror

Replace thiserror with manual Display implementations or use defmt:

// Before (std)
use thiserror::Error;

#[derive(Debug, Clone, Error)]
pub enum RemoteError {
    #[error("Not found: {resource}")]
    NotFound { resource: String },
}

// After (no_std with defmt)
#[derive(Debug, Clone)]
#[cfg_attr(feature = "defmt", derive(defmt::Format))]
pub enum RemoteError {
    NotFound,  // Context stored separately or in fixed-size buffer
}
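
The manual Display route mentioned above could look roughly like this. The variant set is an assumption for illustration (NotFound and ProtocolError both appear elsewhere in this issue); messages are static strings, so formatting never allocates.

```rust
use core::fmt;

#[derive(Debug, Clone, PartialEq)]
pub enum RemoteError {
    NotFound,
    ProtocolError,
}

impl fmt::Display for RemoteError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Static messages only; any context (e.g. the record name) travels separately.
        f.write_str(match self {
            RemoteError::NotFound => "not found",
            RemoteError::ProtocolError => "protocol error",
        })
    }
}

// `core::error::Error` is usable without std since Rust 1.81.
impl core::error::Error for RemoteError {}
```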

Phase 2: Protocol Adaptations

2.1 Memory-Bounded Message Parsing

The current implementation uses BufReader with unbounded line buffers. The embedded version needs fixed limits:

const MAX_LINE_LEN: usize = 512;  // Configurable per platform

async fn read_request<T: RemoteTransport>(
    transport: &mut T
) -> Result<Request, RemoteError> {
    let mut buf = [0u8; MAX_LINE_LEN];
    let len = transport.read_line(&mut buf).await?;
    
    // Parse from fixed buffer
    let request: Request = serde_json_core::from_slice(&buf[..len])
        .map_err(|_| RemoteError::ProtocolError)?
        .0;  // serde-json-core returns (T, usize)
    
    Ok(request)
}
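
Behind read_line, something has to assemble lines from whatever chunks the hardware delivers and decide what happens when a line exceeds the limit. One possible policy, sketched here with a hypothetical LineBuffer, is to discard the oversized line, report a single error, and resynchronise on the next newline.

```rust
/// Hypothetical accumulator for one '\n'-terminated line of at most `N` bytes.
pub struct LineBuffer<const N: usize> {
    buf: [u8; N],
    len: usize,
    complete: bool, // the previous push finished a line; clear before storing more
    overflow: bool, // the current line exceeded N; discard until the next '\n'
}

#[derive(Debug, PartialEq)]
pub struct LineTooLong;

impl<const N: usize> LineBuffer<N> {
    pub const fn new() -> Self {
        Self { buf: [0; N], len: 0, complete: false, overflow: false }
    }

    /// Feeds one byte. Returns `Ok(Some(line))` (without the terminator) when a
    /// line is complete, `Ok(None)` when more input is needed, and
    /// `Err(LineTooLong)` once per oversized line, at its terminator.
    pub fn push(&mut self, byte: u8) -> Result<Option<&[u8]>, LineTooLong> {
        if self.complete {
            self.len = 0;
            self.complete = false;
        }
        if byte == b'\n' {
            if self.overflow {
                // End of the oversized line: start clean on the next one.
                self.overflow = false;
                self.len = 0;
                return Err(LineTooLong);
            }
            self.complete = true;
            return Ok(Some(&self.buf[..self.len]));
        }
        if self.overflow {
            return Ok(None);
        }
        if self.len == N {
            self.overflow = true;
            return Ok(None);
        }
        self.buf[self.len] = byte;
        self.len += 1;
        Ok(None)
    }
}
```

Reporting the error only at the terminator keeps the stream aligned: the client gets one error response per bad request and the next request parses normally.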

2.2 Subscription Limits

Enforce strict limits on active subscriptions:

const MAX_SUBSCRIPTIONS_PER_CLIENT: usize = 4;  // vs. dynamic in std version
const MAX_EVENTS_IN_FLIGHT: usize = 16;         // Bounded queue
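
A bounded queue makes the "events in flight" limit explicit: when it is full, the producer gets the event back and has to choose between dropping it and waiting. The EventQueue below is a hypothetical core-only ring buffer illustrating that contract; an Embassy build would more likely use a channel type from embassy-sync.

```rust
/// Hypothetical bounded FIFO with `N` slots and no allocation.
pub struct EventQueue<T, const N: usize> {
    slots: [Option<T>; N],
    head: usize, // index of the oldest event
    len: usize,
}

impl<T, const N: usize> EventQueue<T, N> {
    pub fn new() -> Self {
        Self { slots: core::array::from_fn(|_| None), head: 0, len: 0 }
    }

    /// Enqueues an event, or returns it when the queue is full so the
    /// producer can drop it or apply backpressure.
    pub fn push(&mut self, event: T) -> Result<(), T> {
        if self.len == N {
            return Err(event);
        }
        let tail = (self.head + self.len) % N;
        self.slots[tail] = Some(event);
        self.len += 1;
        Ok(())
    }

    /// Removes and returns the oldest event, if any.
    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        let event = self.slots[self.head].take();
        self.head = (self.head + 1) % N;
        self.len -= 1;
        event
    }
}
```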

2.3 Connection Handler Pool

Instead of spawning unbounded tasks, use a fixed-size connection pool:

// Embassy task pool
#[embassy_executor::task(pool_size = 4)]
async fn connection_handler(
    transport: /* ... */,
    db: /* ... */,
) {
    // Handle connection
}

Phase 3: Transport Implementations

3.1 Serial Transport (UART/USB-CDC)

Primary debug interface for most MCUs:

use embassy_usb::class::cdc_acm::{CdcAcmClass, Receiver, Sender};

pub struct SerialTransport<'d> {
    rx: Receiver<'d, /* ... */>,
    tx: Sender<'d, /* ... */>,
    read_buf: [u8; 512],
    write_buf: [u8; 512],
}

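// RemoteTransport requires 'static, so the USB halves must be 'static as well
impl RemoteTransport for SerialTransport<'static> {
    async fn read_line(&mut self, buf: &mut [u8]) -> Result<usize, TransportError> {
        // Read until '\n', handle line buffering
        // ...
    }

    async fn write_line(&mut self, line: &[u8]) -> Result<(), TransportError> {
        // One CDC-ACM packet carries at most max_packet_size() bytes, so split long lines
        let max = self.tx.max_packet_size() as usize;
        for chunk in line.chunks(max) {
            self.tx.write_packet(chunk).await
                .map_err(|_| TransportError::IoError)?;
        }
        // '\n' terminator
        self.tx.write_packet(b"\n").await
            .map_err(|_| TransportError::IoError)?;
        Ok(())
    }

    async fn flush(&mut self) -> Result<(), TransportError> {
        // write_line sends each packet immediately; nothing is buffered here
        Ok(())
    }
}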

3.2 TCP Transport (Networked MCUs)

For ESP32, STM32H7 with Ethernet PHY:

use embassy_net::{tcp::TcpSocket, Stack};

pub struct TcpTransport<'a> {
    socket: TcpSocket<'a>,
    read_buf: [u8; 1024],
    write_buf: [u8; 1024],
}

impl RemoteTransport for TcpTransport<'static> {
    // Similar to SerialTransport, but use socket.read()/write()
}

Phase 4: Runtime Adapter Integration

4.1 Embassy Executor Integration

Modify spawn_supervisor to be runtime-agnostic:

// Current (Tokio-specific)
#[cfg(feature = "std")]
pub fn spawn_supervisor<R>(/* ... */) -> DbResult<()> {
    let listener = UnixListener::bind(&config.socket_path)?;
    tokio::spawn(async move { /* ... */ });
    Ok(())
}

// Proposed (generic over transport + runtime)
pub fn spawn_supervisor<R, T>(
    db: Arc<AimDb<R>>,
    runtime: Arc<R>,
    config: AimxConfig,
    transport_factory: impl TransportFactory<T>,
) -> DbResult<()>
where
    R: aimdb_executor::Spawn + 'static,
    T: RemoteTransport,
{
    runtime.spawn(async move {
        // Runtime-agnostic accept loop
    })?;
    Ok(())
}

4.2 Embassy Task Structure

#[embassy_executor::task]
async fn remote_supervisor(
    db: &'static AimDb<EmbassyAdapter>,
    config: AimxConfig,
) {
    // Accept connections on serial/TCP
    // Spawn connection handlers (from pool)
}

#[embassy_executor::task(pool_size = 4)]
async fn connection_handler(
    transport: /* ... */,
    db: &'static AimDb<EmbassyAdapter>,
    config: AimxConfig,
) {
    // Handle single client connection
}

Implementation Phases

Phase 1: Foundation (4-6 weeks)

  • Create RemoteTransport trait abstraction
  • Replace HashMap with heapless::FnvIndexMap
  • Replace String/PathBuf with heapless::String<N>
  • Replace Vec with heapless::Vec<T, N>
  • Add alloc feature to aimdb-core for embedded heap use
  • Remove thiserror dependency, use manual error formatting
  • Add fixed-size buffer parsing for NDJSON

Phase 2: Transport Layer (3-4 weeks)

  • Implement UdsTransport (std, wraps existing Tokio code)
  • Implement SerialTransport (Embassy, USB-CDC)
  • Implement TcpTransport (Embassy, embassy_net)
  • Add configurable buffer sizes (const generics)
  • Add transport-agnostic connection handler

Phase 3: Protocol Hardening (2-3 weeks)

  • Enforce MAX_SUBSCRIPTIONS_PER_CLIENT compile-time limit
  • Implement bounded event queues (no unbounded channels)
  • Add protocol timeout handling (embedded deadlock prevention)
  • Add memory usage metrics (stack/heap)

Phase 4: Runtime Integration (2-3 weeks)

  • Update AimDbBuilder::with_remote_access() to accept transport factory
  • Add Embassy-specific supervisor spawn method
  • Update examples to show both Tokio and Embassy usage
  • Add cross-compilation tests (thumbv7em-none-eabihf)

Phase 5: Testing & Documentation (2-3 weeks)

  • Create examples/embassy-remote-serial/ demo
  • Create examples/embassy-remote-tcp/ demo (ESP32-C3)
  • Add memory usage benchmarks
  • Update docs with embedded deployment guide
  • Create CLI tool variant for serial connection

Total Estimated Time: 13-19 weeks (3-5 months)


Memory Budgets

Embedded Target Profile (ARM Cortex-M4, 256KB RAM)

| Component | RAM Usage | Notes |
|---|---|---|
| Core AimDB | 40-80 KB | Records + buffers |
| Remote supervisor | 2 KB | Task stack |
| Connection handler (×4) | 8 KB each | 32 KB total |
| Subscription queues (×4 clients × 4 subs) | 1 KB each | 16 KB total |
| Transport buffers (×4) | 2 KB each | 8 KB total |
| JSON serialization | 4-8 KB | Temporary buffers |
| Total Overhead | 60-100 KB | ~25-40% of RAM |

Recommendation: Target MCUs with ≥256KB RAM for remote access. For smaller MCUs, remote access should be optional (feature-gated).
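
The per-component figures can be kept honest by encoding them as constants and letting the compiler check the total. The sketch below copies the remote-access rows from the table (core AimDB excluded); the constant names and the 50% guard threshold are assumptions made for illustration. Summed as listed, those rows come to 62-66 KB, at the low end of the quoted 60-100 KB range.

```rust
// Figures copied from the budget table, in KB. Names are illustrative.
const RAM_KB: u32 = 256;
const SUPERVISOR_STACK_KB: u32 = 2;
const HANDLERS_KB: u32 = 4 * 8; // 4 connection handlers, 8 KB each
const SUBSCRIPTION_QUEUES_KB: u32 = 4 * 4 * 1; // 4 clients x 4 subscriptions, 1 KB each
const TRANSPORT_BUFFERS_KB: u32 = 4 * 2; // 4 transports, 2 KB each
const JSON_MIN_KB: u32 = 4;
const JSON_MAX_KB: u32 = 8;

const FIXED_KB: u32 =
    SUPERVISOR_STACK_KB + HANDLERS_KB + SUBSCRIPTION_QUEUES_KB + TRANSPORT_BUFFERS_KB;
pub const REMOTE_MIN_KB: u32 = FIXED_KB + JSON_MIN_KB;
pub const REMOTE_MAX_KB: u32 = FIXED_KB + JSON_MAX_KB;

/// Integer percentage of total RAM, rounded down.
pub const fn percent_of_ram(kb: u32) -> u32 {
    kb * 100 / RAM_KB
}

// Compile-time guard: fail the build if the worst case exceeds half of RAM.
// The 50% threshold is an arbitrary example, not a project requirement.
const _: () = assert!(
    REMOTE_MAX_KB <= RAM_KB / 2,
    "remote access would use more than half of RAM"
);
```

Changing a pool size or buffer length then changes the budget in one place, and an over-budget configuration fails at compile time instead of on the device.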


Breaking Changes

API Changes

  • AimxConfig::socket_path removed (embedded has no filesystem)
  • New required parameter: TransportFactory or transport type selection
  • SecurityPolicy unchanged (still TypeId-based)

Feature Flags

# Before (implicit)
aimdb-core = { features = ["std"] }  # Includes remote access

# After (explicit)
aimdb-core = { 
    features = ["std", "remote-access-uds"]  # Unix sockets (Tokio)
}

# Or for embedded
aimdb-core = { 
    features = ["alloc", "remote-access-serial"]  # Serial (Embassy)
}
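
Inside the crate, these flags would typically gate code with cfg attributes and reject combinations that cannot work. A minimal sketch follows, assuming the feature names shown above; selected_transport is a hypothetical helper, not an AimDB API.

```rust
// The Unix-socket transport depends on Tokio and the OS, so it cannot be built without std.
#[cfg(all(feature = "remote-access-uds", not(feature = "std")))]
compile_error!("`remote-access-uds` requires the `std` feature");

/// Name of the transport compiled into this build (illustrative helper).
pub const fn selected_transport() -> &'static str {
    if cfg!(feature = "remote-access-uds") {
        "uds"
    } else if cfg!(feature = "remote-access-serial") {
        "serial"
    } else {
        "none"
    }
}
```

With neither feature enabled the helper reports "none", which matches the intent that remote access is optional on small targets.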

Migration Path

Existing std code remains unchanged - the UDS transport is the default for feature = "std". Embedded users explicitly opt into serial/TCP transports.


Open Questions

  1. Heap allocator requirement: Should we mandate a heap allocator (alloc) for embedded remote access, or support fully stack-based operation?

    • Recommendation: Require alloc initially, optimize to pure stack later if needed.
  2. Protocol subset: Should embedded support a reduced protocol (e.g., no record.set, subscriptions only)?

    • Recommendation: Start with full protocol parity, measure overhead, then decide.
  3. Authentication on embedded: Auth tokens over serial are problematic (plaintext). Use hardware-based security?

    • Recommendation: Document as "debug only" use case, require physical access.
  4. Multi-client support: Embedded likely only needs 1-2 concurrent clients. Enforce at compile time?

    • Recommendation: Yes, use const generics: const MAX_CLIENTS: usize = 2.
  5. Binary protocol: Should we define a binary protocol (MessagePack/CBOR) for embedded to save bandwidth?

    • Recommendation: v2 feature. Keep NDJSON for v1 parity.

Success Criteria

Minimum Viable Product (MVP)

✅ AimX v1 protocol working over USB-CDC serial on STM32H7
✅ Support 2 concurrent clients, 4 subscriptions each
✅ record.list, record.get, record.subscribe implemented
✅ Runs in ≤256KB RAM
✅ Cross-compiles for thumbv7em-none-eabihf target

Full Feature Parity

✅ All AimX v1 methods supported
✅ Read-write security policies
✅ Both serial and TCP transports
✅ Examples for STM32, ESP32, RP2040
✅ CLI tool supports serial connections (aimdb-cli --serial /dev/ttyUSB0)


Related Work

Similar Projects

  • probe-rs RTT: Real-Time Transfer for embedded logging (typically used one-way, target to host)
  • defmt: Efficient logging for embedded (not bidirectional)
  • postcard-rpc: RPC over serial for embedded Rust (similar transport model)

Learnings

  • Use fixed-size buffers for all I/O (avoid fragmentation)
  • Prioritize deterministic memory usage over flexibility
  • Serial protocols need framing (line-based NDJSON is good choice)
  • Timeout all operations (embedded can't afford deadlocks)


Conclusion

Enabling remote access on embedded requires significant architectural changes but is technically feasible with the right abstractions. The key insight is to make transport and allocation strategies pluggable rather than hardcoded to Tokio/std.

Priority assessment: This is a high-value feature for debugging embedded deployments, but not critical path for initial embedded support. Suggest targeting M4 milestone (after core embedded functionality stabilizes in M3).

Complexity: High (13-19 weeks), primarily due to:

  1. Replacing std collections with heapless equivalents
  2. Abstracting transport layer
  3. Testing across multiple MCU platforms
  4. Memory optimization for <256KB targets

Risk mitigation:

  • Keep std implementation unchanged (no regression risk)
  • Use feature flags to isolate embedded code paths
  • Incremental rollout: serial first, TCP second
  • Extensive memory profiling on real hardware
