[bot] Agent middleware non-streaming path gates token usage and timing behind content capture flag #51

@braintrust-bot

Summary

The BraintrustAgentMiddleware non-streaming path (RunCoreAsync) gates token usage metrics and time-to-first-token behind _captureMessageContent, so disabling content capture for privacy also drops all execution metrics from agent spans. The BraintrustChatClientMiddleware in this same package correctly separates metadata from content — always recording model, tokens, and timing — making this an inconsistency within the Agent Framework integration.

What is missing

In BraintrustAgentMiddleware.RunCoreAsync(), the output, token metrics, and TTFT are all gated behind _captureMessageContent (line 56):

if (activity != null && _captureMessageContent)
{
    SpanTagHelper.SetOutputMessages(activity, response.Messages);
    SpanTagHelper.SetTokenMetrics(activity, response.Usage);
    SpanTagHelper.SetTimeToFirstToken(activity, (DateTime.UtcNow - startTime).TotalSeconds);
}

When captureMessageContent is false, all three are skipped, including SetTokenMetrics and SetTimeToFirstToken, which are execution metadata rather than message content.

Correct behavior — BraintrustChatClientMiddleware (same package)

The ChatClient middleware correctly separates metadata from content (lines 53–64):

// Always set — regardless of _captureMessageContent
SpanTagHelper.SetResponseModel(activity, response.ModelId);
SpanTagHelper.SetTokenMetrics(activity, response.Usage);
SpanTagHelper.SetTimeToFirstToken(activity, timeToFirstToken);

// Only content is gated
if (_captureMessageContent)
{
    SpanTagHelper.SetOutputMessages(activity, response.Messages);
}
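
Applying the same separation to the agent non-streaming path might look roughly like the sketch below. It is illustrative only and is not the actual patch: SketchUsage, SketchResponse, AgentSpanSketch.RecordResponse, and the "sketch.*" tag names are hypothetical stand-ins so the example compiles on its own. In the real middleware the three SetTag groups would be the existing SpanTagHelper.SetTokenMetrics, SpanTagHelper.SetTimeToFirstToken, and SpanTagHelper.SetOutputMessages calls.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical stand-ins for the real response and usage types.
record SketchUsage(long InputTokens, long OutputTokens);
record SketchResponse(IReadOnlyList<string> Messages, SketchUsage Usage);

static class AgentSpanSketch
{
    // Mirrors the shape of the block in RunCoreAsync, with metadata ungated.
    public static void RecordResponse(
        Activity? activity,
        SketchResponse response,
        DateTime startTime,
        bool captureMessageContent)
    {
        if (activity == null)
        {
            return;
        }

        // Execution metadata: always recorded, independent of the privacy flag.
        // (Stand-in for SpanTagHelper.SetTokenMetrics.)
        activity.SetTag("sketch.input_tokens", response.Usage.InputTokens);
        activity.SetTag("sketch.output_tokens", response.Usage.OutputTokens);

        // (Stand-in for SpanTagHelper.SetTimeToFirstToken.)
        activity.SetTag(
            "sketch.time_to_first_token",
            (DateTime.UtcNow - startTime).TotalSeconds);

        // Message content: recorded only when content capture is enabled.
        // (Stand-in for SpanTagHelper.SetOutputMessages.)
        if (captureMessageContent)
        {
            activity.SetTag("sketch.output", string.Join("\n", response.Messages));
        }
    }
}
```

With this shape, a span recorded with captureMessageContent set to false still carries token counts and timing, and only the output messages are omitted.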

Impact

Users who disable content capture for privacy (a legitimate use case in regulated environments) will get agent-level spans with no token usage, no timing metrics, and no cost attribution — while LLM-level spans from the ChatClient middleware in the same pipeline will include them. This inconsistency makes agent-level cost tracking and performance monitoring unreliable when content capture is off.

Note: Issue #49 describes this same class of bug for OpenAI and Anthropic integrations, but specifically states "The Agent Framework integration in this same repo correctly separates metadata from content." That statement is accurate for BraintrustChatClientMiddleware but not for BraintrustAgentMiddleware.

Braintrust docs status

The Braintrust tracing docs at https://www.braintrust.dev/docs/instrument/trace-llm-calls document that token usage and timing are automatically captured. These execution metrics should not be tied to a content-capture privacy toggle. Status: supported (documented as always captured; incorrectly gated behind content capture in the agent middleware).

Upstream sources

Local files inspected

  • src/Braintrust.Sdk.AgentFramework/BraintrustAgentMiddleware.cs — non-streaming RunCoreAsync (line 56) gates SetTokenMetrics and SetTimeToFirstToken behind _captureMessageContent
  • src/Braintrust.Sdk.AgentFramework/BraintrustChatClientMiddleware.cs — non-streaming GetResponseAsync (lines 53–64) correctly separates metadata from content
  • src/Braintrust.Sdk.AgentFramework/SpanTagHelper.cs — SetTokenMetrics and SetTimeToFirstToken are available as standalone helpers
