Summary
The BraintrustAgentMiddleware non-streaming path (RunCoreAsync) gates token usage metrics and time-to-first-token behind _captureMessageContent, so disabling content capture for privacy also drops all execution metrics from agent spans. The BraintrustChatClientMiddleware in this same package correctly separates metadata from content — always recording model, tokens, and timing — making this an inconsistency within the Agent Framework integration.
What is missing
In BraintrustAgentMiddleware.RunCoreAsync(), the output, token metrics, and TTFT are all gated behind _captureMessageContent (line 56):
if (activity != null && _captureMessageContent)
{
    SpanTagHelper.SetOutputMessages(activity, response.Messages);
    SpanTagHelper.SetTokenMetrics(activity, response.Usage);
    SpanTagHelper.SetTimeToFirstToken(activity, (DateTime.UtcNow - startTime).TotalSeconds);
}
When captureMessageContent is false, all three are skipped, including SetTokenMetrics and SetTimeToFirstToken, which are execution metadata, not message content.
Correct behavior — BraintrustChatClientMiddleware (same package)
The ChatClient middleware correctly separates metadata from content (lines 53–64):
// Always set — regardless of _captureMessageContent
SpanTagHelper.SetResponseModel(activity, response.ModelId);
SpanTagHelper.SetTokenMetrics(activity, response.Usage);
SpanTagHelper.SetTimeToFirstToken(activity, timeToFirstToken);
// Only content is gated
if (_captureMessageContent)
{
    SpanTagHelper.SetOutputMessages(activity, response.Messages);
}
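The same separation could be applied to the agent path. The following is a minimal, self-contained sketch of that gating pattern, not the SDK's actual code: SpanTagSketch, RecordResponse, and the "sketch.*" tag names are hypothetical stand-ins for SpanTagHelper and its real tag keys, and only System.Diagnostics.Activity is a real API here.

```csharp
using System;
using System.Diagnostics;

// Simulate an agent run with content capture disabled for privacy.
var activity = new Activity("agent.run").Start();
SpanTagSketch.RecordResponse(activity, "model output text", totalTokens: 42,
    elapsedSeconds: 0.25, captureMessageContent: false);

// Metrics should be present; content should be absent.
Console.WriteLine($"tokens recorded: {activity.GetTagItem("sketch.tokens.total") != null}");
Console.WriteLine($"timing recorded: {activity.GetTagItem("sketch.time_to_first_token") != null}");
Console.WriteLine($"output recorded: {activity.GetTagItem("sketch.output") != null}");
activity.Stop();

// Hypothetical stand-in for the SDK's SpanTagHelper (assumption, not the real API).
static class SpanTagSketch
{
    public static void RecordResponse(Activity? activity, string outputText,
        long? totalTokens, double elapsedSeconds, bool captureMessageContent)
    {
        if (activity == null) return;

        // Execution metadata: always recorded, independent of the privacy toggle.
        if (totalTokens.HasValue)
        {
            activity.SetTag("sketch.tokens.total", totalTokens.Value);
        }
        activity.SetTag("sketch.time_to_first_token", elapsedSeconds);

        // Message content: recorded only when content capture is enabled.
        if (captureMessageContent)
        {
            activity.SetTag("sketch.output", outputText);
        }
    }
}
```

In RunCoreAsync this would correspond to moving the SetTokenMetrics and SetTimeToFirstToken calls out of the _captureMessageContent check and leaving only SetOutputMessages inside it, keeping the activity != null guard around all three.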
Impact
Users who disable content capture for privacy (a legitimate use case in regulated environments) will get agent-level spans with no token usage, no timing metrics, and no cost attribution — while LLM-level spans from the ChatClient middleware in the same pipeline will include them. This inconsistency makes agent-level cost tracking and performance monitoring unreliable when content capture is off.
Note: Issue #49 describes this same class of bug for OpenAI and Anthropic integrations, but specifically states "The Agent Framework integration in this same repo correctly separates metadata from content." That statement is accurate for BraintrustChatClientMiddleware but not for BraintrustAgentMiddleware.
Braintrust docs status
The Braintrust tracing docs at https://www.braintrust.dev/docs/instrument/trace-llm-calls document that token usage and timing are automatically captured. These execution metrics should not be tied to a content-capture privacy toggle. Status: supported (documented as always captured; incorrectly gated behind content capture in the agent middleware).
Upstream sources
Microsoft.Agents.AI v1.0.0 and Microsoft.Agents.AI.Workflows v1.0.0: AgentResponse.Usage provides token counts as response metadata, separate from message content
UsageDetails docs: https://learn.microsoft.com/en-us/dotnet/api/microsoft.extensions.ai.usagedetails
Local files inspected
src/Braintrust.Sdk.AgentFramework/BraintrustAgentMiddleware.cs: non-streaming RunCoreAsync (line 56) gates SetTokenMetrics and SetTimeToFirstToken behind _captureMessageContent
src/Braintrust.Sdk.AgentFramework/BraintrustChatClientMiddleware.cs: non-streaming GetResponseAsync (lines 53–64) correctly separates metadata from content
src/Braintrust.Sdk.AgentFramework/SpanTagHelper.cs: SetTokenMetrics and SetTimeToFirstToken are available as standalone helpers