
Draft: Add DatadogBridge for real-time APM span context propagation #15309

Open
epinzur wants to merge 4 commits into main from claude/investigate-apm-traces-e01el

Conversation

@epinzur
Member

@epinzur epinzur commented Apr 13, 2026

Description

Introduces DatadogBridge, a new observability bridge that solves a critical issue with APM span parenting in Datadog integrations. The bridge creates native dd-trace APM spans eagerly during execution (rather than retroactively) so that auto-instrumented operations (HTTP requests, database queries, etc.) made by tools and processors have the correct parent span context.

Problem Solved

The existing DatadogExporter creates LLMObs spans retroactively after execution completes. This means when tools and processors make outbound calls, there is no active dd-trace span in scope, causing dd-trace's auto-instrumentation to fall back to the nearest active span (typically the request handler) instead of the actual parent span.

Solution

DatadogBridge uses a dual-API approach:

  • tracer.startSpan() for eager APM span creation and activation in dd-trace's scope during execution
  • tracer.llmobs.trace() for retroactive LLMObs annotation and export after spans complete

This ensures:

  1. APM spans are active in dd-trace's scope when tools/processors execute
  2. Auto-instrumented calls are parented correctly
  3. LLMObs data is still emitted through dd-trace's pipeline for proper annotation and export
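The parenting difference can be illustrated without dd-trace itself. The sketch below uses a hypothetical in-memory `MiniTracer`; it is not the dd-trace API, and only the `startSpan()` / `scope().activate()` / `scope().active()` shape is borrowed from it. The names `MiniSpan`, `MiniTracer`, and `autoInstrumentedHttpCall` are assumptions made for illustration.

```typescript
// Hypothetical stand-in for a tracer; NOT dd-trace. It only mimics the
// startSpan / scope().activate() / scope().active() shape for illustration.
interface MiniSpan {
  name: string;
  parent?: MiniSpan;
}

class MiniTracer {
  private stack: MiniSpan[] = [];

  // Create a span; the parent defaults to whatever span is active in scope.
  startSpan(name: string, opts: { childOf?: MiniSpan } = {}): MiniSpan {
    return { name, parent: opts.childOf ?? this.scope().active() };
  }

  scope() {
    return {
      active: (): MiniSpan | undefined => this.stack[this.stack.length - 1],
      // Run fn with `span` active, then restore the previous scope.
      activate: <T>(span: MiniSpan, fn: () => T): T => {
        this.stack.push(span);
        try {
          return fn();
        } finally {
          this.stack.pop();
        }
      },
    };
  }
}

// Simulates an auto-instrumented HTTP call: it parents to the active span.
function autoInstrumentedHttpCall(tracer: MiniTracer): MiniSpan {
  return tracer.startSpan("http.request");
}

const miniTracer = new MiniTracer();
const requestHandler = miniTracer.startSpan("request.handler");

// Retroactive style: no tool span is active while the tool runs,
// so the HTTP span falls back to the request handler.
const retroactiveHttp = miniTracer
  .scope()
  .activate(requestHandler, () => autoInstrumentedHttpCall(miniTracer));

// Eager style: create and activate the tool span before the tool runs,
// so the HTTP span is parented under the tool span.
const eagerHttp = miniTracer.scope().activate(requestHandler, () => {
  const toolSpan = miniTracer.startSpan("mastra.tool");
  return miniTracer
    .scope()
    .activate(toolSpan, () => autoInstrumentedHttpCall(miniTracer));
});
```

In the retroactive case the HTTP span's parent is `request.handler`; in the eager case it is `mastra.tool`, which is the behavior the bridge is described as providing.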

Key Features

  • Real-time APM context: Spans are created eagerly and activated in dd-trace scope
  • Proper parent-child relationships: Both APM and LLMObs spans maintain correct hierarchy
  • Flexible configuration: Supports agentless mode, custom ML app names, and context key promotion
  • Trace state management: Handles buffering, late-arriving spans, and cleanup with configurable timeouts
  • Error handling: Properly tags and annotates error spans in both APM and LLMObs

Configuration

const bridge = new DatadogBridge({
  mlApp: 'my-app',           // Required
  apiKey: 'xxx',             // Required for agentless mode
  agentless: true,           // Default: false; set to true to send data without a local Datadog Agent
  requestContextKeys: ['tenantId', 'userId'],  // Promote to flat tags
});
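As a rough illustration of what "promote to flat tags" could mean for `requestContextKeys`, here is a minimal sketch. The helper `promoteContextKeys` and its rule of skipping missing and non-scalar values are assumptions; the bridge's actual promotion logic may differ.

```typescript
// Hypothetical helper; the bridge's real promotion logic may differ.
type RequestContext = Record<string, unknown>;

function promoteContextKeys(
  context: RequestContext,
  keys: readonly string[],
): Record<string, string> {
  const tags: Record<string, string> = {};
  for (const key of keys) {
    const value = context[key];
    // Assumption: skip missing values and nested objects, since only
    // scalars make useful flat tags.
    if (value === undefined || value === null || typeof value === "object") {
      continue;
    }
    tags[key] = String(value);
  }
  return tags;
}

// Only the listed keys are promoted; "session" is not listed and
// "missing" has no value, so neither appears in the result.
const promotedTags = promoteContextKeys(
  { tenantId: "acme", userId: 42, session: { id: "s1" } },
  ["tenantId", "userId", "missing"],
);
```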

Type of Change

  • New feature (non-breaking change that adds functionality)

Checklist

  • I have added comprehensive unit tests (713 lines of test coverage)
  • I have updated the index.ts exports to include the new bridge
  • Tests cover configuration, span creation, APM lifecycle, LLMObs emission, and error handling

https://claude.ai/code/session_01Q7w4QfZvEXyUvyY2y4XQe1

ELI5 Explanation

This PR makes Mastra create live Datadog APM spans while tasks run (instead of only reporting afterwards), so auto-instrumented work such as HTTP and DB calls is shown as a child of the correct Mastra task. It still emits LLM Observability annotations after spans finish, so Datadog receives the same LLM-specific metadata.


Overview

Adds DatadogBridge: an observability bridge that eagerly creates and activates native dd-trace APM spans during execution so dd-trace auto-instrumentation is parented under the correct Mastra span. It preserves retrospective LLM Observability emission by routing annotations through dd-trace’s llmobs API after spans complete. This resolves incorrect parenting that occurred when using the DatadogExporter alone.

DatadogBridge uses a dual API:

  • Real-time APM: tracer.startSpan() + tracer.scope().activate() to create/activate dd-trace spans during execution.
  • Retroactive LLM Observability: tracer.llmobs.trace()/annotate()/flush() to export LLM Observability data after spans end.

Key Features

  • Real-time APM context propagation so auto-instrumented ops are children of the correct Mastra span.
  • Dual emission model: eager APM spans during execution + retroactive LLM Observability annotations on span end.
  • Parent selection precedence: external parent mapping → ddSpanMap → active dd-trace scope fallback.
  • executeInContext()/executeInContextSync() to run callbacks with the dd-trace span active.
  • Per-trace buffering and traceState management to assemble and emit span trees when the root ends; supports late-arriving spans and iterative flush of chained late arrivals.
  • Preserves model/provider attributes for late-arriving MODEL_STEP spans by storing inherited model attrs and propagating them to descendants.
  • Configurable options: mlApp (required to enable), apiKey (for agentless), agentless, site, service, env, integrationsEnabled, requestContextKeys (promote request metadata to flat tags).
  • Error tagging/annotation; span type → Datadog LLM Observability kind mapping; token metrics for MODEL_STEP spans.
  • Cleanup timers including a max-lifetime window for traces and deterministic shutdown behavior (flush, force-finish APM spans, disable llmobs).
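The parent selection precedence listed above (external parent mapping, then ddSpanMap, then the active dd-trace scope) could be sketched as follows. All type and function names here are illustrative assumptions, including how the external mapping is keyed; only the precedence order comes from the description.

```typescript
// All names are illustrative; only the precedence order comes from the PR text.
interface DdSpanLike {
  id: string;
}

interface ParentSources {
  // Assumption: keyed by Mastra parent span ID.
  externalParents: Map<string, DdSpanLike>;
  // Mastra span ID -> dd span created eagerly by the bridge.
  ddSpanMap: Map<string, DdSpanLike>;
  // Whatever span the tracer currently has active in scope, if any.
  activeScopeSpan?: DdSpanLike;
}

function resolveParent(
  parentSpanId: string | undefined,
  src: ParentSources,
): DdSpanLike | undefined {
  if (parentSpanId !== undefined) {
    const external = src.externalParents.get(parentSpanId);
    if (external) return external; // 1. external parent mapping
    const tracked = src.ddSpanMap.get(parentSpanId);
    if (tracked) return tracked; // 2. span tracked in ddSpanMap
  }
  return src.activeScopeSpan; // 3. fall back to the active scope
}

const sources: ParentSources = {
  externalParents: new Map<string, DdSpanLike>(),
  ddSpanMap: new Map<string, DdSpanLike>(),
  activeScopeSpan: { id: "active" },
};
sources.externalParents.set("p1", { id: "external" });
sources.ddSpanMap.set("p1", { id: "tracked-p1" });
sources.ddSpanMap.set("p2", { id: "tracked-p2" });

const fromExternal = resolveParent("p1", sources);
const fromMap = resolveParent("p2", sources);
const fromScope = resolveParent("unknown", sources);
```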

Changes

  • New: observability/datadog/src/bridge.ts

    • Exports DatadogBridgeConfig and DatadogBridge.
    • DatadogBridge extends BaseExporter and implements ObservabilityBridge.
    • Public API: createSpan(), executeInContext(), executeInContextSync(), flush(), shutdown().
    • Implements ensureTracer(), ddSpanMap, per-trace buffering, LLM Observability emission, lifecycle management, late-arrival handling and model attribute inheritance fixes.
  • New: observability/datadog/src/bridge.test.ts

    • Comprehensive Vitest suite (~713 lines) fully mocking dd-trace.
    • Tests cover enablement/disablement rules, createSpan id/parent behavior, sync/async context activation, APM finishing semantics, LLM Observability emission and annotation enrichment, late-arriving spans and preserved unresolved children, model/provider propagation for late MODEL_STEP spans, flush/shutdown behavior, and an end-to-end eager APM + LLM Observability flow.
  • Modified: observability/datadog/src/index.ts

    • Exports DatadogBridge and DatadogBridgeConfig alongside DatadogExporter exports.
    • Docs updated to describe two modes: DatadogBridge (recommended for correct APM parenting) and DatadogExporter (LLM Observability-only, retroactive).
  • Docs & Changeset

    • New user guide and reference docs for DatadogBridge (usage, config, initialization order, agent vs agentless, span mappings, requestContextKeys promotion, troubleshooting).
    • Sidebar/reference updates and a changeset announcing the new DatadogBridge.

Commit fixes / notable behavior changes

  • Fixes for late-arriving and orphaned spans:
    • Avoids dropping unresolved child spans from the buffer so children whose parents arrive later are not discarded; emits late-arriving chains iteratively.
    • Preserves model/provider metadata for late MODEL_STEP spans by storing and reusing inherited model attributes for descendants.
  • Regression tests added to cover these scenarios.

Technical Notes

  • Bridge enables only if mlApp is provided (constructor config takes precedence over env); agentless requires DD_API_KEY.
  • Mastra-compatible hex span/trace IDs are produced; dd-trace span objects are tracked in ddSpanMap.
  • On span end, the eager APM span is finished with the correct timestamp and the Mastra span is buffered for LLM Observability emission. When the root ends, the full tree is emitted via nested tracer.llmobs.trace() calls; late children are emitted individually when parent context becomes available.
  • flush() prefers tracer.llmobs.flush(); shutdown() cancels timers, force-finishes remaining APM spans, flushes, optionally disables llmobs, clears internal state, and calls super.shutdown().

Review Impact

  • Large new implementation and tests (bridge.ts + tests + docs) — high review effort for bridge.ts, medium for tests/docs.
  • Backward-compatible: DatadogExporter remains available; DatadogBridge is recommended when dd-trace auto-instrumentation requires correct parenting.

The DatadogExporter creates LLMObs spans retroactively (after execution
completes), which means dd-trace auto-instrumented APM spans from tools
and processors get parented to the request handler instead of the correct
Mastra span. This is because no dd-trace span is active in scope during
execution.

The new DatadogBridge solves this by creating dd-trace APM spans eagerly
via tracer.startSpan() at span creation time, and activating them in
dd-trace's scope via tracer.scope().activate() during execution. This
means auto-instrumented HTTP/DB calls from MCP tools, guardrail
processors, etc. are correctly nested under their parent Mastra spans.

LLMObs annotation (model info, token usage, I/O) is still emitted
retroactively through dd-trace's own LLMObs pipeline using the existing
nested llmobs.trace() callback pattern.

Usage:
  import { DatadogBridge } from '@mastra/datadog';

  new Mastra({
    observability: {
      configs: {
        default: {
          bridge: new DatadogBridge({ mlApp: 'my-app' }),
        }
      }
    }
  });

https://claude.ai/code/session_01Q7w4QfZvEXyUvyY2y4XQe1
@vercel

vercel bot commented Apr 13, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
mastra-docs-1.x Skipped Skipped Apr 13, 2026 6:01pm


@coderabbitai
Contributor

coderabbitai bot commented Apr 13, 2026

Walkthrough

Adds a new DatadogBridge that integrates with dd-trace to create APM spans eagerly for real-time context propagation, buffers Mastra spans for retroactive LLM Observability emission via dd-trace, and includes tests, docs, exports, lifecycle (flush/shutdown) and cleanup logic.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Bridge Implementation: observability/datadog/src/bridge.ts | New DatadogBridge and DatadogBridgeConfig: dd-trace initialization, eager APM span creation with span/trace ID mapping and parent resolution, executeInContext/executeInContextSync, per-trace buffering and two-phase LLMObs emission (nested llmobs.trace()), annotation/tag handling, timers for cleanup, flush() and shutdown() logic. |
| Bridge Tests: observability/datadog/src/bridge.test.ts | Comprehensive Vitest suite mocking dd-trace (init, startSpan, scope, llmobs): verifies enablement rules, createSpan semantics and parent wiring, context activation helpers, APM lifecycle on exported events, LLMObs emission/annotations/parenting (including late/partial-tree cases), model attribute inheritance, and flush/shutdown behaviors. |
| Module Exports & Index: observability/datadog/src/index.ts | Exports DatadogBridge and DatadogBridgeConfig; documents two integration modes (DatadogBridge realtime vs DatadogExporter retrospective) and preserves existing exporter exports. |
| Documentation, Guides & Reference: docs/src/content/en/docs/observability/tracing/bridges/datadog.mdx, docs/src/content/en/reference/observability/tracing/bridges/datadog.mdx | Adds user and API reference docs for the DatadogBridge: experimental notice, config options (apiKey, mlApp, site, service, env, agentless, requestContextKeys), setup ordering, agent vs agentless behavior, span-kind mapping, examples, and method signatures. |
| Documentation, Exporter Callouts & Sidebars: docs/src/content/en/docs/observability/tracing/exporters/datadog.mdx, docs/src/content/en/reference/observability/tracing/exporters/datadog.mdx, docs/src/content/en/docs/sidebars.js, docs/src/content/en/reference/sidebars.js | Adds callouts recommending DatadogBridge for proper APM parenting with dd-trace; updates wording to “Datadog LLM Observability” and adds Datadog/DatadogBridge entries to sidebars. |
| Changeset: .changeset/datadog-bridge.md | New changeset marking a minor release for @mastra/datadog, documenting the DatadogBridge addition and guidance compared to DatadogExporter. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely describes the main change: introduction of DatadogBridge for APM span context propagation, which aligns with the PR objectives and all file changes. |


@changeset-bot

changeset-bot bot commented Apr 13, 2026

🦋 Changeset detected

Latest commit: ce8b6da

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@mastra/datadog Minor


Contributor

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
observability/datadog/src/bridge.test.ts (1)

32-97: Consider resetting apmSpanCounter in beforeEach for test isolation.

The apmSpanCounter variable continues incrementing across tests since it's not reset in the beforeEach hook. While current tests don't depend on specific counter values, this could cause fragile tests if future tests expect specific span IDs.

♻️ Suggested improvement

Add a reset mechanism to the hoisted mock:

 const {
   mockAnnotate,
   mockTrace,
   // ... other exports
   capturedAPMSpans,
+  resetApmSpanCounter,
 } = vi.hoisted(() => {
   let currentScopeSpan: any = undefined;
   const parents: any[] = [];
   const llmobsSpans: any[] = [];
   let apmSpanCounter = 0;
   // ...
   return {
     // ... existing returns
+    resetApmSpanCounter: () => { apmSpanCounter = 0; },
   };
 });

Then in beforeEach:

   beforeEach(() => {
     vi.clearAllMocks();
     traceParents.length = 0;
     capturedLLMObsSpans.length = 0;
     capturedAPMSpans.length = 0;
+    resetApmSpanCounter();
     // ...
   });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@observability/datadog/src/bridge.test.ts` around lines 32 - 97, The hoisted
mock keeps apmSpanCounter incrementing across tests which can leak state; expose
a reset function from the vi.hoisted return (e.g., resetApmSpanCounter) that
sets apmSpanCounter = 0 and then call that reset in the test suite's beforeEach
to ensure test isolation; reference the existing apmSpanCounter and
mockStartSpan in your change so the counter reset is clearly tied to the span
factory used by mockStartSpan.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 06b92b46-be15-420c-8d4e-86ade858250d

📥 Commits

Reviewing files that changed from the base of the PR and between 6544c97 and 9ea3030.

📒 Files selected for processing (3)
  • observability/datadog/src/bridge.test.ts
  • observability/datadog/src/bridge.ts
  • observability/datadog/src/index.ts

…ss to false

- Add guide page at docs/observability/tracing/bridges/datadog explaining
  when to use the bridge, how it works, setup with dd-trace, agent vs
  agentless mode, and trace hierarchy.
- Add reference page at reference/observability/tracing/bridges/datadog
  documenting DatadogBridgeConfig, methods, usage examples, span mapping,
  and environment variables.
- Update both docs and reference sidebars to include the new pages.
- Cross-reference the bridge from the Datadog exporter pages so users
  with dd-trace APM are pointed at the right tool.
- Default DatadogBridge agentless to false. Bridge users almost always
  have a local Datadog Agent (required for dd-trace APM data), so
  agentless mode would split LLMObs traffic away from APM traffic. The
  exporter remains agentless-by-default for LLMObs-only use cases.

https://claude.ai/code/session_01Q7w4QfZvEXyUvyY2y4XQe1
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@observability/datadog/src/bridge.ts`:
- Around line 431-437: buildSpanTree()/emitSpanTree()/tryEmitReadySpans
currently clear the entire state.buffer after attempting to build/emit a tree,
which drops child spans whose parent arrived later; change the logic to preserve
unresolved spans by only removing spans that were actually emitted.
Specifically, modify tryEmitReadySpans (and the branch that calls
buildSpanTree/emitSpanTree) to: 1) let buildSpanTree/emitSpanTree return the
set/list of span IDs that were successfully emitted (or the root subtree nodes),
2) remove only those emitted IDs from state.buffer, and 3) keep any spans whose
parent was unresolved in state.buffer and do not set state.treeEmitted true
unless the intended root emission completed; apply the same change for the other
occurrence mentioned (around the 495-500 block) so unresolved children remain
buffered until their parent is emitted.
- Around line 531-538: Late-arriving MODEL_STEP spans are missing inherited
model/provider metadata because the late-span path calls buildSpanOptions(span)
without the parent-derived context; update the logic that computes
childInheritedModelAttrs and the late-span emission (locations around
childInheritedModelAttrs, buildSpanOptions, emitSingleSpan) to persist or
re-derive effective model/provider attributes from the parent (e.g., store
effective attrs in state.contexts keyed by trace/span id or walk to the parent
span to extract its ModelGenerationAttributes) and pass those into
buildSpanOptions so emitted LLMObs spans include modelName/modelProvider; apply
the same fix to the other late-span areas noted (around lines referenced by the
reviewer).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 09d761e0-e0e9-416d-ab58-f998eaa03d94

📥 Commits

Reviewing files that changed from the base of the PR and between 9ea3030 and 90f3dc0.

📒 Files selected for processing (9)
  • .changeset/datadog-bridge.md
  • docs/src/content/en/docs/observability/tracing/bridges/datadog.mdx
  • docs/src/content/en/docs/observability/tracing/exporters/datadog.mdx
  • docs/src/content/en/docs/sidebars.js
  • docs/src/content/en/reference/observability/tracing/bridges/datadog.mdx
  • docs/src/content/en/reference/observability/tracing/exporters/datadog.mdx
  • docs/src/content/en/reference/sidebars.js
  • observability/datadog/src/bridge.test.ts
  • observability/datadog/src/bridge.ts
✅ Files skipped from review due to trivial changes (5)
  • docs/src/content/en/reference/sidebars.js
  • docs/src/content/en/reference/observability/tracing/exporters/datadog.mdx
  • .changeset/datadog-bridge.md
  • docs/src/content/en/docs/observability/tracing/bridges/datadog.mdx
  • docs/src/content/en/docs/sidebars.js

Comment thread observability/datadog/src/bridge.ts Outdated
Comment thread observability/datadog/src/bridge.ts
…ate spans in DatadogBridge

Two related bugs in tryEmitReadySpans/emitSpanTree/emitSingleSpan:

1. Orphan span drop: buildSpanTree() only links a child to its parent if the
   parent is present in the buffer at tree-build time. Previously, after the
   initial tree was emitted, state.buffer.clear() discarded any unresolved
   spans — including children whose parent simply hadn't ended yet (e.g.,
   child ended early, root ended before parent, parent is still in flight).
   The parent would then emit successfully via the late-arrival path, but
   the orphaned child was gone forever.

   Fix: remove only spans that actually landed in state.contexts during
   tree emission. Unresolved spans stay buffered. The late-arrival phase
   (now run unconditionally, not just in an else branch) emits them once
   their parent's context exists, iterating to a fixed point so chains of
   late spans all flush.

2. Missing model/provider on late MODEL_STEP spans: emitSpanTree propagates
   MODEL_GENERATION's model/provider down to MODEL_STEP descendants via
   childInheritedModelAttrs. But emitSingleSpan (the late-arrival path)
   called buildSpanOptions(span) with no inherited attrs, so a late
   MODEL_STEP would render without modelName/modelProvider in LLMObs.

   Fix: store childInheritedModelAttrs in state.contexts alongside each
   emitted ddSpan. When the late-arrival path emits a span, it looks up
   the parent's stored attrs and passes them through. emitSingleSpan also
   computes its own childInheritedModelAttrs for any further descendants.

Added regression tests:
- preserves unresolved children when root ends before their parent
- passes MODEL_GENERATION model/provider to a late-arriving MODEL_STEP child
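The first fix (keep unresolved spans buffered and iterate the late-arrival phase to a fixed point) can be sketched roughly as below. This is a simplified model under stated assumptions: `BufferedSpan`, `emitReadySpans`, and the use of a `Set` of emitted IDs in place of `state.contexts` are illustrative and not the bridge's actual internals.

```typescript
// Illustrative only: names and shapes are assumptions, not the bridge's internals.
interface BufferedSpan {
  id: string;
  parentId?: string;
}

// Emits every buffered span whose parent context already exists, repeating
// until a full pass makes no progress. Unresolved spans stay in the buffer.
function emitReadySpans(
  buffer: Map<string, BufferedSpan>,
  contexts: Set<string>, // IDs of spans that have already been emitted
): string[] {
  const emitted: string[] = [];
  let progressed = true;
  while (progressed) {
    progressed = false;
    for (const span of Array.from(buffer.values())) {
      const parentReady =
        span.parentId === undefined || contexts.has(span.parentId);
      if (!parentReady) continue;
      contexts.add(span.id);
      buffer.delete(span.id); // remove only what was actually emitted
      emitted.push(span.id);
      progressed = true;
    }
  }
  return emitted;
}

// The root was already emitted; "c" and "b" ended before their ancestor "a".
const contexts = new Set<string>(["root"]);
const buffer = new Map<string, BufferedSpan>();
buffer.set("c", { id: "c", parentId: "b" });
buffer.set("b", { id: "b", parentId: "a" });

const firstPass = emitReadySpans(buffer, contexts); // nothing is ready yet

// "a" arrives late; the whole chain can now flush in parent-first order.
buffer.set("a", { id: "a", parentId: "root" });
const secondPass = emitReadySpans(buffer, contexts);
```

Deleting only emitted spans is what keeps `c` and `b` alive across the first call; clearing the whole buffer at that point would reproduce the orphan-drop bug described in item 1.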

https://claude.ai/code/session_01Q7w4QfZvEXyUvyY2y4XQe1
@vercel vercel bot temporarily deployed to Preview – mastra-docs-1.x April 14, 2026 12:15 Inactive
…rthand

"LLMObs" is a code-level shorthand (matches dd-trace's tracer.llmobs API
namespace and env vars like DD_LLMOBS_ML_APP), but Datadog's user-facing
product name is "LLM Observability". Update prose in both the new bridge
docs and the existing exporter docs to use the full product name
consistently so readers aren't left decoding an acronym.

https://claude.ai/code/session_01Q7w4QfZvEXyUvyY2y4XQe1
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@observability/datadog/src/bridge.ts`:
- Around line 354-364: The root traceContext is only captured on 'span_started'
and stays stale; update it when a root span is enriched by calling
captureTraceContext(span) for root spans in the 'span_updated' branch and again
before finishApmSpan(span) in the 'span_ended' branch so the latest user/session
identifiers are used; apply the same changes to the other occurrence around
finishApmSpan/enqueueSpan (the second switch block referenced near the other
range) and ensure you only refresh for root spans (use the same root detection
logic used elsewhere).
- Around line 833-842: The early return condition `if (this.isDisabled ||
!(tracer as any).llmobs) return;` prevents the fallback
`tracer.flush()`/`tracer.shutdown()` from ever running; change the guard to only
bail when `this.isDisabled` so the code can fall through to the `tracer.llmobs`
branch or the fallback branch. Concretely, update the checks in the methods
handling flushing/shutdown so they first `if (this.isDisabled) return;` and
then: if `tracer.llmobs?.flush` call `tracer.llmobs.flush()`, else if `(tracer
as any).flush` call `tracer.flush()` (and mirror the same pattern for
`shutdown`), preserving the existing try/catch and logging around
`tracer.llmobs` and the fallback.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d6a52ea4-ed87-467e-a1cc-8a199e1b555d

📥 Commits

Reviewing files that changed from the base of the PR and between 90f3dc0 and 1bd7289.

📒 Files selected for processing (2)
  • observability/datadog/src/bridge.test.ts
  • observability/datadog/src/bridge.ts

Comment on lines +354 to +364
    switch (event.type) {
      case 'span_started':
        this.captureTraceContext(span);
        return;

      case 'span_updated':
        return;

      case 'span_ended':
        this.finishApmSpan(span);
        this.enqueueSpan(span);

⚠️ Potential issue | 🟠 Major

Refresh root trace context on updates and end events.

traceContext is latched from the first root span_started payload and never updated. Because spans can be enriched later (span_updated exists, and span_ended carries the final metadata), a root that gets userId or sessionId after start will export the whole tree with stale or empty identifiers.

🐛 Proposed fix
       case 'span_started':
         this.captureTraceContext(span);
         return;
 
       case 'span_updated':
-        return;
+        this.captureTraceContext(span);
+        return;
 
       case 'span_ended':
+        this.captureTraceContext(span);
         this.finishApmSpan(span);
         this.enqueueSpan(span);
         return;
   private captureTraceContext(span: AnyExportedSpan): void {
-    if (span.isRootSpan && !this.traceContext.has(span.traceId)) {
-      this.traceContext.set(span.traceId, {
-        userId: span.metadata?.userId,
-        sessionId: span.metadata?.sessionId,
-      });
-    }
+    if (!span.isRootSpan) return;
+
+    const existing = this.traceContext.get(span.traceId);
+    this.traceContext.set(span.traceId, {
+      userId: span.metadata?.userId ?? existing?.userId,
+      sessionId: span.metadata?.sessionId ?? existing?.sessionId,
+    });
   }

Also applies to: 408-414
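The merge semantics of the proposed `captureTraceContext` change can be shown in isolation. The sketch below is a standalone model, assuming a hypothetical `mergeTraceContext` helper and `TraceContext` shape; it mirrors the `??` fallback in the diff above rather than the bridge's actual code.

```typescript
// Standalone model of the proposed merge; names are illustrative.
interface TraceContext {
  userId?: string;
  sessionId?: string;
}

// Prefer the latest non-nullish value, otherwise keep what was captured before.
function mergeTraceContext(
  existing: TraceContext | undefined,
  latest: TraceContext,
): TraceContext {
  return {
    userId: latest.userId ?? existing?.userId,
    sessionId: latest.sessionId ?? existing?.sessionId,
  };
}

// A root span that starts empty and is enriched on later events.
let ctx = mergeTraceContext(undefined, {}); // span_started
ctx = mergeTraceContext(ctx, { userId: "u1" }); // span_updated
ctx = mergeTraceContext(ctx, { sessionId: "s1" }); // span_ended
```

With the original latch-on-first-start behavior, both fields would have stayed undefined in this sequence; with the merge, the final context carries `u1` and `s1`.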


Comment on lines +833 to +842
    if (this.isDisabled || !(tracer as any).llmobs) return;

    if (tracer.llmobs?.flush) {
      try {
        await tracer.llmobs.flush();
        this.logger.debug('Datadog llmobs flushed');
      } catch (e) {
        this.logger.error('Error flushing llmobs', { error: e });
      }
    } else if ((tracer as any).flush) {

⚠️ Potential issue | 🟠 Major

Don't return before the tracer.flush() fallback.

The !(tracer as any).llmobs guard makes the fallback branch unreachable, so flush() and shutdown() do nothing in the exact case the fallback is supposed to cover.

🐛 Proposed fix
   async flush(): Promise<void> {
-    if (this.isDisabled || !(tracer as any).llmobs) return;
+    if (this.isDisabled) return;
 
     if (tracer.llmobs?.flush) {
       try {
         await tracer.llmobs.flush();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    if (this.isDisabled || !(tracer as any).llmobs) return;
-    if (tracer.llmobs?.flush) {
-      try {
-        await tracer.llmobs.flush();
-        this.logger.debug('Datadog llmobs flushed');
-      } catch (e) {
-        this.logger.error('Error flushing llmobs', { error: e });
-      }
-    } else if ((tracer as any).flush) {
+  async flush(): Promise<void> {
+    if (this.isDisabled) return;
+    if (tracer.llmobs?.flush) {
+      try {
+        await tracer.llmobs.flush();
+        this.logger.debug('Datadog llmobs flushed');
+      } catch (e) {
+        this.logger.error('Error flushing llmobs', { error: e });
+      }
+    } else if ((tracer as any).flush) {

Contributor

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
docs/src/content/en/reference/observability/tracing/bridges/datadog.mdx (1)

284-284: Optional wording polish for readability.

This line repeats “Tags” multiple times in close succession; a small rewrite would read more smoothly.

✍️ Suggested wording tweak
-Tags supplied via `tracingOptions.tags` are converted into structured LLM Observability annotation tags. Tags formatted as `key:value` are split into separate entries; tags without a colon are set with a `true` value.
+Values supplied via `tracingOptions.tags` are converted into structured LLM Observability annotation tags. Entries formatted as `key:value` are split into separate key/value pairs; entries without a colon are set to `true`.
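The conversion this sentence describes could look roughly like the sketch below. The helper name `parseTags` and the choice to split on the first colon only are assumptions; the docs text does not say how values that themselves contain colons are handled.

```typescript
// Hypothetical helper; splitting on the FIRST colon is an assumption.
function parseTags(tags: readonly string[]): Record<string, string | boolean> {
  const out: Record<string, string | boolean> = {};
  for (const tag of tags) {
    const idx = tag.indexOf(":");
    if (idx === -1) {
      out[tag] = true; // no colon: the entry is set to true
    } else {
      out[tag.slice(0, idx)] = tag.slice(idx + 1); // key:value pair
    }
  }
  return out;
}

const parsedTags = parseTags(["env:prod", "beta", "url:https://example.com"]);
```

Under the first-colon assumption, `url:https://example.com` keeps its full URL as the value instead of being cut at the second colon.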
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/src/content/en/reference/observability/tracing/bridges/datadog.mdx` at
line 284, Reword the sentence that begins "Tags supplied via
`tracingOptions.tags`..." to improve readability and reduce repetition: rewrite
it to a single clear sentence describing that entries in tracingOptions.tags
become structured LLM Observability annotation tags, that items containing a
colon (key:value) are split into separate key and value entries, and that items
without a colon are interpreted as boolean true; update the sentence containing
`tracingOptions.tags` and the examples of `key:value` and "true" to reflect this
clearer phrasing.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: dd094d4b-842b-438f-a6ac-55863b74cb2d

📥 Commits

Reviewing files that changed from the base of the PR and between 1bd7289 and ce8b6da.

📒 Files selected for processing (4)
  • docs/src/content/en/docs/observability/tracing/bridges/datadog.mdx
  • docs/src/content/en/docs/observability/tracing/exporters/datadog.mdx
  • docs/src/content/en/reference/observability/tracing/bridges/datadog.mdx
  • docs/src/content/en/reference/observability/tracing/exporters/datadog.mdx
🚧 Files skipped from review as they are similar to previous changes (2)
  • docs/src/content/en/docs/observability/tracing/exporters/datadog.mdx
  • docs/src/content/en/docs/observability/tracing/bridges/datadog.mdx


Labels

Documentation Improvements or additions to documentation


2 participants