feat(llmobs): support submitting trace and session level evals #17530

Closed
cdfox wants to merge 4 commits into main from christopher.fox/trace-session-level-evals

Conversation

cdfox commented Apr 15, 2026

Description

  LLM Observability: Adds support for submitting trace-level and session-level evaluation metrics
  via ``LLMObs.submit_evaluation()``. Two new parameters are introduced: ``eval_scope`` (one of
  ``"span"`` (default), ``"trace"``, or ``"session"``) and ``session_id`` (required when
  ``eval_scope="session"``). Use ``eval_scope="trace"`` to associate an evaluation with an entire
  trace by providing the root span, or ``eval_scope="session"`` to associate an evaluation with a
  session by providing the ``session_id``.
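
The argument rules described above can be sketched as a small standalone check. This is a minimal illustration only: `check_eval_scope` and `VALID_EVAL_SCOPES` are hypothetical names, the sketch does not import or call `ddtrace`, and the PR's actual validation logic may differ.

```python
from typing import Optional

# Scope names come from the description above; everything else is illustrative.
VALID_EVAL_SCOPES = ("span", "trace", "session")


def check_eval_scope(eval_scope: str = "span", session_id: Optional[str] = None) -> str:
    """Hypothetical helper: validate the scope arguments and return the scope."""
    if eval_scope not in VALID_EVAL_SCOPES:
        raise ValueError(
            f"eval_scope must be one of {VALID_EVAL_SCOPES}, got {eval_scope!r}"
        )
    # session_id is only mandatory for session-level evaluations.
    if eval_scope == "session" and not session_id:
        raise ValueError("session_id is required when eval_scope='session'")
    return eval_scope
```

Under these assumptions, `check_eval_scope()` falls back to `"span"`, while `check_eval_scope("session")` without a `session_id` raises `ValueError`.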

Testing

Ran test script: https://github.com/DataDog/experimental/blob/main/users/christopher.fox/scripts/emit_eval_metrics.py

  • eval_scope set explicitly to "span"
  • eval_scope left to default ("span")
  • eval_scope "span" + join_on using tag
  • eval_scope set to "trace"
  • eval_scope set to "session"

Observed the 5 eval_metric events in EVP (3 span-level, 1 trace-level, 1 session-level):
https://dd.datad0g.com/internal/events-ui/queries?index_name=llmobs&query_string=%40event_type%3Aeval-metric%20%40ml_app%3Aemit_eval_metrics&query_type=list&timerange=1775670893207-1776275693207l&track=llmobs
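
The expected split of events can be sanity-checked with a short tally. This is a sketch under stated assumptions: the case names and dictionary keys are illustrative and are not taken from the linked test script, and `None` stands for "argument omitted".

```python
from collections import Counter

# The five cases listed in the Testing section; keys and names are illustrative.
CASES = [
    {"name": "explicit span", "eval_scope": "span"},
    {"name": "default", "eval_scope": None},  # omitted argument defaults to "span"
    {"name": "span + join_on tag", "eval_scope": "span"},
    {"name": "trace", "eval_scope": "trace"},
    {"name": "session", "eval_scope": "session"},
]


def count_by_scope(cases):
    """Count cases per effective scope, treating a missing scope as 'span'."""
    return Counter(case["eval_scope"] or "span" for case in cases)
```

With the five cases above, the tally is 3 span-level, 1 trace-level, and 1 session-level, matching the events reported in EVP.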

Risks

Additional Notes

cdfox requested a review from a team as a code owner April 15, 2026 01:08
cdfox marked this pull request as draft April 15, 2026 01:08

chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 07330929f1

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread ddtrace/llmobs/_llmobs.py Outdated
Comment thread ddtrace/llmobs/_writer.py Outdated

pr-commenter Bot commented Apr 15, 2026

Performance SLOs

Comparing candidate christopher.fox/trace-session-level-evals (0733092) with baseline main (278a9a1)
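
The report does not state how the percentage shown next to each SLO limit is derived. The figures are consistent with a signed margin relative to the limit, which is an inference from the numbers rather than from the tool's documentation. A minimal sketch, with `slo_margin_pct` as a hypothetical helper name:

```python
def slo_margin_pct(measured: float, slo_limit: float) -> float:
    """Signed distance of a measurement from its SLO limit, as a percentage of the limit.

    Negative values mean the measurement is below the limit; positive values
    mean it is above the limit.
    """
    if slo_limit <= 0:
        raise ValueError("slo_limit must be positive")
    return (measured - slo_limit) / slo_limit * 100.0
```

For the `add_aspect` timing (103.526µs against a 130.000µs limit), `round(slo_margin_pct(103.526, 130.0), 1)` gives `-20.4`, which matches the reported value.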

📈 Performance Regressions (2 suites)
📈 iastaspects - 118/118

✅ add_aspect

Time: ✅ 103.526µs (SLO: <130.000µs 📉 -20.4%) vs baseline: +2.9%

Memory: ✅ 43.851MB (SLO: <46.000MB -4.7%) vs baseline: +5.0%


✅ add_inplace_aspect

Time: ✅ 100.263µs (SLO: <130.000µs 📉 -22.9%) vs baseline: -3.0%

Memory: ✅ 43.845MB (SLO: <46.000MB -4.7%) vs baseline: +5.0%


✅ add_inplace_noaspect

Time: ✅ 28.420µs (SLO: <40.000µs 📉 -28.9%) vs baseline: +0.4%

Memory: ✅ 43.851MB (SLO: <46.000MB -4.7%) vs baseline: +5.1%


✅ add_noaspect

Time: ✅ 49.182µs (SLO: <70.000µs 📉 -29.7%) vs baseline: ~same

Memory: ✅ 44.235MB (SLO: <46.000MB -3.8%) vs baseline: +5.9%


✅ bytearray_aspect

Time: ✅ 253.273µs (SLO: <400.000µs 📉 -36.7%) vs baseline: -9.2%

Memory: ✅ 43.883MB (SLO: <46.000MB -4.6%) vs baseline: +5.1%


✅ bytearray_extend_aspect

Time: ✅ 642.018µs (SLO: <800.000µs 📉 -19.7%) vs baseline: -2.5%

Memory: ✅ 43.838MB (SLO: <46.000MB -4.7%) vs baseline: +5.0%


✅ bytearray_extend_noaspect

Time: ✅ 265.027µs (SLO: <400.000µs 📉 -33.7%) vs baseline: -1.0%

Memory: ✅ 44.326MB (SLO: <46.000MB -3.6%) vs baseline: +6.3%


✅ bytearray_noaspect

Time: ✅ 142.762µs (SLO: <300.000µs 📉 -52.4%) vs baseline: -1.0%

Memory: ✅ 43.894MB (SLO: <46.000MB -4.6%) vs baseline: +5.1%


✅ bytes_aspect

Time: ✅ 218.579µs (SLO: <300.000µs 📉 -27.1%) vs baseline: -6.5%

Memory: ✅ 43.809MB (SLO: <46.000MB -4.8%) vs baseline: +4.7%


✅ bytes_noaspect

Time: ✅ 133.282µs (SLO: <200.000µs 📉 -33.4%) vs baseline: -0.9%

Memory: ✅ 43.906MB (SLO: <46.000MB -4.6%) vs baseline: +4.8%


✅ bytesio_aspect

Time: ✅ 3.811ms (SLO: <5.000ms 📉 -23.8%) vs baseline: -2.4%

Memory: ✅ 44.157MB (SLO: <46.000MB -4.0%) vs baseline: +5.7%


✅ bytesio_noaspect

Time: ✅ 319.762µs (SLO: <420.000µs 📉 -23.9%) vs baseline: +0.3%

Memory: ✅ 43.889MB (SLO: <46.000MB -4.6%) vs baseline: +5.1%


✅ capitalize_aspect

Time: ✅ 88.681µs (SLO: <300.000µs 📉 -70.4%) vs baseline: +0.2%

Memory: ✅ 44.050MB (SLO: <46.000MB -4.2%) vs baseline: +5.3%


✅ capitalize_noaspect

Time: ✅ 264.739µs (SLO: <300.000µs 📉 -11.8%) vs baseline: +4.3%

Memory: ✅ 43.958MB (SLO: <46.000MB -4.4%) vs baseline: +5.3%


✅ casefold_aspect

Time: ✅ 89.037µs (SLO: <500.000µs 📉 -82.2%) vs baseline: -0.2%

Memory: ✅ 44.070MB (SLO: <46.000MB -4.2%) vs baseline: +5.5%


✅ casefold_noaspect

Time: ✅ 316.528µs (SLO: <500.000µs 📉 -36.7%) vs baseline: +1.0%

Memory: ✅ 43.948MB (SLO: <46.000MB -4.5%) vs baseline: +5.1%


✅ decode_aspect

Time: ✅ 86.475µs (SLO: <100.000µs 📉 -13.5%) vs baseline: -0.9%

Memory: ✅ 44.074MB (SLO: <46.000MB -4.2%) vs baseline: +5.5%


✅ decode_noaspect

Time: ✅ 155.784µs (SLO: <210.000µs 📉 -25.8%) vs baseline: -1.0%

Memory: ✅ 43.844MB (SLO: <46.000MB -4.7%) vs baseline: +4.9%


✅ encode_aspect

Time: ✅ 84.417µs (SLO: <200.000µs 📉 -57.8%) vs baseline: -0.3%

Memory: ✅ 44.078MB (SLO: <46.000MB -4.2%) vs baseline: +5.6%


✅ encode_noaspect

Time: ✅ 143.893µs (SLO: <200.000µs 📉 -28.1%) vs baseline: -1.8%

Memory: ✅ 43.876MB (SLO: <46.000MB -4.6%) vs baseline: +5.2%


✅ format_aspect

Time: ✅ 14.556ms (SLO: <19.200ms 📉 -24.2%) vs baseline: -0.2%

Memory: ✅ 44.102MB (SLO: <46.000MB -4.1%) vs baseline: +5.4%


✅ format_map_aspect

Time: ✅ 16.349ms (SLO: <21.500ms 📉 -24.0%) vs baseline: -0.5%

Memory: ✅ 44.114MB (SLO: <46.000MB -4.1%) vs baseline: +5.6%


✅ format_map_noaspect

Time: ✅ 379.110µs (SLO: <500.000µs 📉 -24.2%) vs baseline: +1.4%

Memory: ✅ 43.948MB (SLO: <46.000MB -4.5%) vs baseline: +5.2%


✅ format_noaspect

Time: ✅ 312.804µs (SLO: <500.000µs 📉 -37.4%) vs baseline: -0.6%

Memory: ✅ 43.783MB (SLO: <46.000MB -4.8%) vs baseline: +4.8%


✅ index_aspect

Time: ✅ 138.310µs (SLO: <300.000µs 📉 -53.9%) vs baseline: +7.6%

Memory: ✅ 43.783MB (SLO: <46.000MB -4.8%) vs baseline: +4.7%


✅ index_noaspect

Time: ✅ 40.396µs (SLO: <300.000µs 📉 -86.5%) vs baseline: -0.8%

Memory: ✅ 43.887MB (SLO: <46.000MB -4.6%) vs baseline: +5.3%


✅ join_aspect

Time: ✅ 211.829µs (SLO: <300.000µs 📉 -29.4%) vs baseline: -3.9%

Memory: ✅ 43.829MB (SLO: <46.000MB -4.7%) vs baseline: +4.9%


✅ join_noaspect

Time: ✅ 142.857µs (SLO: <300.000µs 📉 -52.4%) vs baseline: -2.1%

Memory: ✅ 44.319MB (SLO: <46.000MB -3.7%) vs baseline: +6.1%


✅ ljust_aspect

Time: ✅ 506.801µs (SLO: <700.000µs 📉 -27.6%) vs baseline: -3.2%

Memory: ✅ 43.929MB (SLO: <46.000MB -4.5%) vs baseline: +5.1%


✅ ljust_noaspect

Time: ✅ 271.699µs (SLO: <300.000µs -9.4%) vs baseline: +3.2%

Memory: ✅ 43.983MB (SLO: <46.000MB -4.4%) vs baseline: +5.4%


✅ lower_aspect

Time: ✅ 302.069µs (SLO: <500.000µs 📉 -39.6%) vs baseline: -2.7%

Memory: ✅ 43.740MB (SLO: <46.000MB -4.9%) vs baseline: +4.8%


✅ lower_noaspect

Time: ✅ 239.805µs (SLO: <300.000µs 📉 -20.1%) vs baseline: ~same

Memory: ✅ 43.866MB (SLO: <46.000MB -4.6%) vs baseline: +4.8%


✅ lstrip_aspect

Time: ✅ 0.272ms (SLO: <3.000ms 📉 -90.9%) vs baseline: -1.5%

Memory: ✅ 44.145MB (SLO: <46.000MB -4.0%) vs baseline: +6.2%


✅ lstrip_noaspect

Time: ✅ 0.178ms (SLO: <3.000ms 📉 -94.1%) vs baseline: +0.8%

Memory: ✅ 43.884MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%


✅ modulo_aspect

Time: ✅ 14.212ms (SLO: <18.750ms 📉 -24.2%) vs baseline: -0.6%

Memory: ✅ 44.058MB (SLO: <46.000MB -4.2%) vs baseline: +5.4%


✅ modulo_aspect_for_bytearray_bytearray

Time: ✅ 14.722ms (SLO: <19.350ms 📉 -23.9%) vs baseline: -0.7%

Memory: ✅ 43.953MB (SLO: <46.000MB -4.5%) vs baseline: +5.1%


✅ modulo_aspect_for_bytes

Time: ✅ 14.307ms (SLO: <18.900ms 📉 -24.3%) vs baseline: -0.1%

Memory: ✅ 43.977MB (SLO: <46.000MB -4.4%) vs baseline: +5.1%


✅ modulo_aspect_for_bytes_bytearray

Time: ✅ 14.538ms (SLO: <19.150ms 📉 -24.1%) vs baseline: +0.2%

Memory: ✅ 44.317MB (SLO: <46.000MB -3.7%) vs baseline: +5.7%


✅ modulo_noaspect

Time: ✅ 0.373ms (SLO: <3.000ms 📉 -87.6%) vs baseline: +0.3%

Memory: ✅ 43.884MB (SLO: <46.000MB -4.6%) vs baseline: +5.0%


✅ replace_aspect

Time: ✅ 18.384ms (SLO: <24.000ms 📉 -23.4%) vs baseline: +0.1%

Memory: ✅ 44.104MB (SLO: <46.000MB -4.1%) vs baseline: +5.3%


✅ replace_noaspect

Time: ✅ 289.584µs (SLO: <400.000µs 📉 -27.6%) vs baseline: +0.6%

Memory: ✅ 43.827MB (SLO: <46.000MB -4.7%) vs baseline: +4.9%


✅ repr_aspect

Time: ✅ 328.502µs (SLO: <420.000µs 📉 -21.8%) vs baseline: -4.3%

Memory: ✅ 43.911MB (SLO: <46.000MB -4.5%) vs baseline: +5.2%


✅ repr_noaspect

Time: ✅ 47.191µs (SLO: <90.000µs 📉 -47.6%) vs baseline: +0.4%

Memory: ✅ 44.180MB (SLO: <46.000MB -4.0%) vs baseline: +6.0%


✅ rstrip_aspect

Time: ✅ 385.581µs (SLO: <500.000µs 📉 -22.9%) vs baseline: -1.4%

Memory: ✅ 44.154MB (SLO: <46.000MB -4.0%) vs baseline: +5.5%


✅ rstrip_noaspect

Time: ✅ 182.643µs (SLO: <300.000µs 📉 -39.1%) vs baseline: -0.6%

Memory: ✅ 43.909MB (SLO: <46.000MB -4.5%) vs baseline: +5.3%


✅ slice_aspect

Time: ✅ 184.365µs (SLO: <300.000µs 📉 -38.5%) vs baseline: +0.4%

Memory: ✅ 43.867MB (SLO: <46.000MB -4.6%) vs baseline: +5.0%


✅ slice_noaspect

Time: ✅ 53.794µs (SLO: <90.000µs 📉 -40.2%) vs baseline: -0.5%

Memory: ✅ 43.780MB (SLO: <46.000MB -4.8%) vs baseline: +4.8%


✅ stringio_aspect

Time: ✅ 4.463ms (SLO: <5.000ms 📉 -10.7%) vs baseline: 📈 +14.5%

Memory: ✅ 44.071MB (SLO: <46.000MB -4.2%) vs baseline: +5.5%


✅ stringio_noaspect

Time: ✅ 348.896µs (SLO: <500.000µs 📉 -30.2%) vs baseline: -2.5%

Memory: ✅ 43.888MB (SLO: <46.000MB -4.6%) vs baseline: +5.1%


✅ strip_aspect

Time: ✅ 271.449µs (SLO: <350.000µs 📉 -22.4%) vs baseline: -1.7%

Memory: ✅ 43.875MB (SLO: <46.000MB -4.6%) vs baseline: +5.0%


✅ strip_noaspect

Time: ✅ 180.303µs (SLO: <240.000µs 📉 -24.9%) vs baseline: +0.3%

Memory: ✅ 43.954MB (SLO: <46.000MB -4.4%) vs baseline: +5.0%


✅ swapcase_aspect

Time: ✅ 339.924µs (SLO: <500.000µs 📉 -32.0%) vs baseline: -2.2%

Memory: ✅ 44.194MB (SLO: <46.000MB -3.9%) vs baseline: +5.7%


✅ swapcase_noaspect

Time: ✅ 274.872µs (SLO: <400.000µs 📉 -31.3%) vs baseline: +0.5%

Memory: ✅ 43.885MB (SLO: <46.000MB -4.6%) vs baseline: +4.6%


✅ title_aspect

Time: ✅ 320.284µs (SLO: <500.000µs 📉 -35.9%) vs baseline: -7.2%

Memory: ✅ 43.840MB (SLO: <46.000MB -4.7%) vs baseline: +4.9%


✅ title_noaspect

Time: ✅ 264.274µs (SLO: <400.000µs 📉 -33.9%) vs baseline: -1.6%

Memory: ✅ 43.941MB (SLO: <46.000MB -4.5%) vs baseline: +5.2%


✅ translate_aspect

Time: ✅ 498.039µs (SLO: <700.000µs 📉 -28.9%) vs baseline: -3.7%

Memory: ✅ 44.047MB (SLO: <46.000MB -4.2%) vs baseline: +5.4%


✅ translate_noaspect

Time: ✅ 427.279µs (SLO: <500.000µs 📉 -14.5%) vs baseline: -1.9%

Memory: ✅ 43.796MB (SLO: <46.000MB -4.8%) vs baseline: +4.9%


✅ upper_aspect

Time: ✅ 302.369µs (SLO: <500.000µs 📉 -39.5%) vs baseline: -3.6%

Memory: ✅ 43.913MB (SLO: <46.000MB -4.5%) vs baseline: +4.9%


✅ upper_noaspect

Time: ✅ 235.576µs (SLO: <400.000µs 📉 -41.1%) vs baseline: -5.5%

Memory: ✅ 43.873MB (SLO: <46.000MB -4.6%) vs baseline: +4.8%


📈 iastaspectsospath - 24/24

✅ ospathbasename_aspect

Time: ✅ 536.290µs (SLO: <700.000µs 📉 -23.4%) vs baseline: 📈 +23.9%

Memory: ✅ 43.835MB (SLO: <46.000MB -4.7%) vs baseline: +5.5%


✅ ospathbasename_noaspect

Time: ✅ 441.115µs (SLO: <700.000µs 📉 -37.0%) vs baseline: +0.8%

Memory: ✅ 43.796MB (SLO: <46.000MB -4.8%) vs baseline: +5.1%


✅ ospathjoin_aspect

Time: ✅ 628.769µs (SLO: <700.000µs 📉 -10.2%) vs baseline: -2.0%

Memory: ✅ 43.842MB (SLO: <46.000MB -4.7%) vs baseline: +5.2%


✅ ospathjoin_noaspect

Time: ✅ 645.450µs (SLO: <700.000µs -7.8%) vs baseline: -1.5%

Memory: ✅ 43.771MB (SLO: <46.000MB -4.8%) vs baseline: +5.4%


✅ ospathnormcase_aspect

Time: ✅ 356.425µs (SLO: <700.000µs 📉 -49.1%) vs baseline: -1.6%

Memory: ✅ 43.848MB (SLO: <46.000MB -4.7%) vs baseline: +5.6%


✅ ospathnormcase_noaspect

Time: ✅ 361.489µs (SLO: <700.000µs 📉 -48.4%) vs baseline: -2.8%

Memory: ✅ 43.869MB (SLO: <46.000MB -4.6%) vs baseline: +5.5%


✅ ospathsplit_aspect

Time: ✅ 500.032µs (SLO: <700.000µs 📉 -28.6%) vs baseline: -0.8%

Memory: ✅ 43.958MB (SLO: <46.000MB -4.4%) vs baseline: +5.2%


✅ ospathsplit_noaspect

Time: ✅ 510.050µs (SLO: <700.000µs 📉 -27.1%) vs baseline: -1.7%

Memory: ✅ 43.858MB (SLO: <46.000MB -4.7%) vs baseline: +5.1%


✅ ospathsplitdrive_aspect

Time: ✅ 382.758µs (SLO: <700.000µs 📉 -45.3%) vs baseline: -0.1%

Memory: ✅ 43.896MB (SLO: <46.000MB -4.6%) vs baseline: +5.5%


✅ ospathsplitdrive_noaspect

Time: ✅ 72.839µs (SLO: <700.000µs 📉 -89.6%) vs baseline: +0.7%

Memory: ✅ 43.830MB (SLO: <46.000MB -4.7%) vs baseline: +5.3%


✅ ospathsplitext_aspect

Time: ✅ 471.487µs (SLO: <700.000µs 📉 -32.6%) vs baseline: +0.8%

Memory: ✅ 43.792MB (SLO: <46.000MB -4.8%) vs baseline: +5.1%


✅ ospathsplitext_noaspect

Time: ✅ 477.400µs (SLO: <700.000µs 📉 -31.8%) vs baseline: +1.1%

Memory: ✅ 43.773MB (SLO: <46.000MB -4.8%) vs baseline: +5.5%

🟡 Near SLO Breach (6 suites)
🟡 djangosimple - 30/30

✅ appsec

Time: ✅ 21.046ms (SLO: <22.300ms -5.6%) vs baseline: ~same

Memory: ✅ 71.269MB (SLO: <73.500MB -3.0%) vs baseline: +5.0%


✅ exception-replay-enabled

Time: ✅ 1.368ms (SLO: <1.450ms -5.6%) vs baseline: -0.3%

Memory: ✅ 69.601MB (SLO: <71.500MB -2.7%) vs baseline: +5.3%


✅ iast

Time: ✅ 20.981ms (SLO: <22.250ms -5.7%) vs baseline: ~same

Memory: ✅ 71.270MB (SLO: <75.000MB -5.0%) vs baseline: +5.1%


✅ profiler

Time: ✅ 15.244ms (SLO: <16.550ms -7.9%) vs baseline: +0.7%

Memory: ✅ 60.110MB (SLO: <61.000MB 🟡 -1.5%) vs baseline: +5.5%


✅ resource-renaming

Time: ✅ 20.807ms (SLO: <21.750ms -4.3%) vs baseline: +0.4%

Memory: ✅ 71.302MB (SLO: <73.500MB -3.0%) vs baseline: +5.2%


✅ span-code-origin

Time: ✅ 21.416ms (SLO: <28.200ms 📉 -24.1%) vs baseline: +0.6%

Memory: ✅ 71.349MB (SLO: <75.000MB -4.9%) vs baseline: +5.2%


✅ tracer

Time: ✅ 21.051ms (SLO: <21.750ms -3.2%) vs baseline: ~same

Memory: ✅ 71.366MB (SLO: <75.000MB -4.8%) vs baseline: +5.4%


✅ tracer-and-profiler

Time: ✅ 21.020ms (SLO: <23.500ms 📉 -10.6%) vs baseline: +0.1%

Memory: ✅ 73.330MB (SLO: <75.000MB -2.2%) vs baseline: +5.2%


✅ tracer-dont-create-db-spans

Time: ✅ 20.999ms (SLO: <21.500ms -2.3%) vs baseline: -0.6%

Memory: ✅ 71.363MB (SLO: <75.000MB -4.8%) vs baseline: +5.3%


✅ tracer-minimal

Time: ✅ 17.872ms (SLO: <18.500ms -3.4%) vs baseline: -0.5%

Memory: ✅ 71.291MB (SLO: <75.000MB -4.9%) vs baseline: +5.2%


✅ tracer-native

Time: ✅ 20.864ms (SLO: <21.750ms -4.1%) vs baseline: -0.7%

Memory: ✅ 71.353MB (SLO: <72.500MB 🟡 -1.6%) vs baseline: +5.3%


✅ tracer-no-caches

Time: ✅ 18.844ms (SLO: <19.650ms -4.1%) vs baseline: ~same

Memory: ✅ 71.275MB (SLO: <75.000MB -5.0%) vs baseline: +5.1%


✅ tracer-no-databases

Time: ✅ 20.596ms (SLO: <21.100ms -2.4%) vs baseline: -0.9%

Memory: ✅ 71.208MB (SLO: <75.000MB -5.1%) vs baseline: +4.9%


✅ tracer-no-middleware

Time: ✅ 20.780ms (SLO: <21.500ms -3.4%) vs baseline: +0.1%

Memory: ✅ 71.307MB (SLO: <75.000MB -4.9%) vs baseline: +5.1%


✅ tracer-no-templates

Time: ✅ 20.845ms (SLO: <22.000ms -5.2%) vs baseline: +0.7%

Memory: ✅ 71.327MB (SLO: <73.500MB -3.0%) vs baseline: +5.0%


🟡 otelsdkspan - 24/24

✅ add-event

Time: ✅ 40.767ms (SLO: <42.000ms -2.9%) vs baseline: ~same

Memory: ✅ 39.086MB (SLO: <40.750MB -4.1%) vs baseline: +6.2%


✅ add-link

Time: ✅ 36.470ms (SLO: <38.550ms -5.4%) vs baseline: +0.4%

Memory: ✅ 38.987MB (SLO: <40.750MB -4.3%) vs baseline: +5.8%


✅ add-metrics

Time: ✅ 219.912ms (SLO: <232.000ms -5.2%) vs baseline: +0.2%

Memory: ✅ 38.987MB (SLO: <40.750MB -4.3%) vs baseline: +5.9%


✅ add-tags

Time: ✅ 214.035ms (SLO: <221.600ms -3.4%) vs baseline: -0.1%

Memory: ✅ 39.007MB (SLO: <40.750MB -4.3%) vs baseline: +6.1%


✅ get-context

Time: ✅ 29.293ms (SLO: <31.300ms -6.4%) vs baseline: +0.4%

Memory: ✅ 39.125MB (SLO: <40.750MB -4.0%) vs baseline: +6.4%


✅ is-recording

Time: ✅ 29.284ms (SLO: <31.000ms -5.5%) vs baseline: +0.2%

Memory: ✅ 39.066MB (SLO: <40.750MB -4.1%) vs baseline: +5.9%


✅ record-exception

Time: ✅ 63.080ms (SLO: <65.850ms -4.2%) vs baseline: -0.4%

Memory: ✅ 39.086MB (SLO: <40.750MB -4.1%) vs baseline: +6.1%


✅ set-status

Time: ✅ 31.845ms (SLO: <34.150ms -6.7%) vs baseline: +0.5%

Memory: ✅ 38.987MB (SLO: <40.750MB -4.3%) vs baseline: +6.0%


✅ start

Time: ✅ 29.581ms (SLO: <30.150ms 🟡 -1.9%) vs baseline: +2.5%

Memory: ✅ 39.046MB (SLO: <40.750MB -4.2%) vs baseline: +6.2%


✅ start-finish

Time: ✅ 33.880ms (SLO: <35.350ms -4.2%) vs baseline: +0.4%

Memory: ✅ 38.987MB (SLO: <40.750MB -4.3%) vs baseline: +6.2%


✅ start-finish-telemetry

Time: ✅ 34.036ms (SLO: <35.450ms -4.0%) vs baseline: +1.1%

Memory: ✅ 39.145MB (SLO: <40.750MB -3.9%) vs baseline: +6.3%


✅ update-name

Time: ✅ 31.165ms (SLO: <33.400ms -6.7%) vs baseline: ~same

Memory: ✅ 39.046MB (SLO: <40.750MB -4.2%) vs baseline: +6.0%


🟡 otelspan - 22/22

✅ add-event

Time: ✅ 40.806ms (SLO: <47.150ms 📉 -13.5%) vs baseline: ~same

Memory: ✅ 41.201MB (SLO: <47.000MB 📉 -12.3%) vs baseline: +5.3%


✅ add-metrics

Time: ✅ 236.258ms (SLO: <344.800ms 📉 -31.5%) vs baseline: ~same

Memory: ✅ 45.625MB (SLO: <47.500MB -3.9%) vs baseline: +4.9%


✅ add-tags

Time: ✅ 277.912ms (SLO: <330.000ms 📉 -15.8%) vs baseline: +1.8%

Memory: ✅ 45.589MB (SLO: <47.500MB -4.0%) vs baseline: +5.1%


✅ get-context

Time: ✅ 83.728ms (SLO: <92.350ms -9.3%) vs baseline: +0.2%

Memory: ✅ 41.445MB (SLO: <46.500MB 📉 -10.9%) vs baseline: +5.1%


✅ is-recording

Time: ✅ 39.121ms (SLO: <44.500ms 📉 -12.1%) vs baseline: -0.5%

Memory: ✅ 41.104MB (SLO: <47.500MB 📉 -13.5%) vs baseline: +5.1%


✅ record-exception

Time: ✅ 61.040ms (SLO: <67.650ms -9.8%) vs baseline: -0.2%

Memory: ✅ 41.884MB (SLO: <47.000MB 📉 -10.9%) vs baseline: +5.7%


✅ set-status

Time: ✅ 45.055ms (SLO: <50.400ms 📉 -10.6%) vs baseline: -0.1%

Memory: ✅ 41.100MB (SLO: <47.000MB 📉 -12.6%) vs baseline: +5.2%


✅ start

Time: ✅ 39.954ms (SLO: <44.500ms 📉 -10.2%) vs baseline: +2.9%

Memory: ✅ 41.069MB (SLO: <47.000MB 📉 -12.6%) vs baseline: +5.4%


✅ start-finish

Time: ✅ 90.341ms (SLO: <91.000ms 🟡 -0.7%) vs baseline: +0.5%

Memory: ✅ 38.869MB (SLO: <46.500MB 📉 -16.4%) vs baseline: +5.4%


✅ start-finish-telemetry

Time: ✅ 91.688ms (SLO: <92.000ms 🟡 -0.3%) vs baseline: +0.2%

Memory: ✅ 38.633MB (SLO: <46.500MB 📉 -16.9%) vs baseline: +4.9%


✅ update-name

Time: ✅ 40.190ms (SLO: <45.150ms 📉 -11.0%) vs baseline: -0.2%

Memory: ✅ 41.208MB (SLO: <47.000MB 📉 -12.3%) vs baseline: +5.3%


🟡 recursivecomputation - 8/8

✅ deep

Time: ✅ 312.077ms (SLO: <320.950ms -2.8%) vs baseline: ~same

Memory: ✅ 37.415MB (SLO: <38.750MB -3.4%) vs baseline: +5.4%


✅ deep-profiled

Time: ✅ 328.296ms (SLO: <359.150ms -8.6%) vs baseline: -0.4%

Memory: ✅ 43.726MB (SLO: <46.000MB -4.9%) vs baseline: +5.4%


✅ medium

Time: ✅ 7.390ms (SLO: <7.450ms 🟡 -0.8%) vs baseline: -0.6%

Memory: ✅ 36.255MB (SLO: <38.000MB -4.6%) vs baseline: +5.4%


✅ shallow

Time: ✅ 1.050ms (SLO: <1.050ms 🟡 ~same) vs baseline: +2.0%

Memory: ✅ 36.215MB (SLO: <38.000MB -4.7%) vs baseline: +5.1%


🟡 span - 26/26

✅ add-event

Time: ✅ 19.522ms (SLO: <22.500ms 📉 -13.2%) vs baseline: -1.5%

Memory: ✅ 38.412MB (SLO: <53.000MB 📉 -27.5%) vs baseline: +5.4%


✅ add-metrics

Time: ✅ 89.521ms (SLO: <93.500ms -4.3%) vs baseline: +0.3%

Memory: ✅ 42.920MB (SLO: <53.000MB 📉 -19.0%) vs baseline: +5.2%


✅ add-tags

Time: ✅ 148.922ms (SLO: <155.000ms -3.9%) vs baseline: +0.9%

Memory: ✅ 42.948MB (SLO: <53.000MB 📉 -19.0%) vs baseline: +5.5%


✅ get-context

Time: ✅ 18.713ms (SLO: <20.500ms -8.7%) vs baseline: -1.5%

Memory: ✅ 38.228MB (SLO: <53.000MB 📉 -27.9%) vs baseline: +5.1%


✅ is-recording

Time: ✅ 18.775ms (SLO: <20.500ms -8.4%) vs baseline: -1.3%

Memory: ✅ 38.282MB (SLO: <53.000MB 📉 -27.8%) vs baseline: +5.0%


✅ record-exception

Time: ✅ 38.387ms (SLO: <41.000ms -6.4%) vs baseline: -0.8%

Memory: ✅ 38.912MB (SLO: <53.000MB 📉 -26.6%) vs baseline: +5.5%


✅ set-status

Time: ✅ 20.583ms (SLO: <22.000ms -6.4%) vs baseline: -1.3%

Memory: ✅ 38.242MB (SLO: <53.000MB 📉 -27.8%) vs baseline: +5.0%


✅ start

Time: ✅ 19.702ms (SLO: <20.500ms -3.9%) vs baseline: +4.5%

Memory: ✅ 38.129MB (SLO: <53.000MB 📉 -28.1%) vs baseline: +4.7%


✅ start-finish

Time: ✅ 57.818ms (SLO: <58.500ms 🟡 -1.2%) vs baseline: -1.0%

Memory: ✅ 36.235MB (SLO: <38.000MB -4.6%) vs baseline: +5.3%


✅ start-finish-telemetry

Time: ✅ 58.979ms (SLO: <60.000ms 🟡 -1.7%) vs baseline: -1.0%

Memory: ✅ 36.215MB (SLO: <38.000MB -4.7%) vs baseline: +5.4%


✅ start-finish-traceid128

Time: ✅ 60.234ms (SLO: <62.000ms -2.8%) vs baseline: -1.0%

Memory: ✅ 36.196MB (SLO: <38.000MB -4.7%) vs baseline: +5.1%


✅ start-traceid128

Time: ✅ 18.628ms (SLO: <22.500ms 📉 -17.2%) vs baseline: -1.6%

Memory: ✅ 38.287MB (SLO: <53.000MB 📉 -27.8%) vs baseline: +5.3%


✅ update-name

Time: ✅ 19.328ms (SLO: <22.000ms 📉 -12.1%) vs baseline: -1.3%

Memory: ✅ 38.418MB (SLO: <53.000MB 📉 -27.5%) vs baseline: +5.7%


🟡 tracer - 6/6

✅ large

Time: ✅ 33.107ms (SLO: <32.950ms +0.5%) vs baseline: ~same

Memory: ✅ 37.749MB (SLO: <39.250MB -3.8%) vs baseline: +6.0%


✅ medium

Time: ✅ 3.338ms (SLO: <3.500ms -4.6%) vs baseline: -0.2%

Memory: ✅ 36.215MB (SLO: <38.750MB -6.5%) vs baseline: +5.2%


✅ small

Time: ✅ 386.022µs (SLO: <390.000µs 🟡 -1.0%) vs baseline: +3.4%

Memory: ✅ 36.215MB (SLO: <38.750MB -6.5%) vs baseline: +5.1%

⚠️ Unstable Tests (2 suites)
⚠️ coreapiscenario - 10/10 (1 unstable)

⚠️ context_with_data_listeners

Time: ⚠️ 13.645µs (SLO: <20.000µs 📉 -31.8%) vs baseline: ~same

Memory: ✅ 36.235MB (SLO: <38.000MB -4.6%) vs baseline: +5.2%


✅ context_with_data_no_listeners

Time: ✅ 3.577µs (SLO: <10.000µs 📉 -64.2%) vs baseline: -1.0%

Memory: ✅ 36.255MB (SLO: <38.000MB -4.6%) vs baseline: +5.5%


✅ get_item_exists

Time: ✅ 0.590µs (SLO: <10.000µs 📉 -94.1%) vs baseline: +0.5%

Memory: ✅ 36.196MB (SLO: <38.000MB -4.7%) vs baseline: +5.3%


✅ get_item_missing

Time: ✅ 0.638µs (SLO: <10.000µs 📉 -93.6%) vs baseline: +0.2%

Memory: ✅ 36.235MB (SLO: <38.000MB -4.6%) vs baseline: +5.5%


✅ set_item

Time: ✅ 24.422µs (SLO: <30.000µs 📉 -18.6%) vs baseline: -0.9%

Memory: ✅ 36.274MB (SLO: <38.000MB -4.5%) vs baseline: +5.4%


⚠️ packagesupdateimporteddependencies - 24/24 (1 unstable)

✅ import_many

Time: ✅ 155.469µs (SLO: <170.000µs -8.5%) vs baseline: -0.2%

Memory: ✅ 41.235MB (SLO: <46.000MB 📉 -10.4%) vs baseline: +5.5%


✅ import_many_cached

Time: ✅ 121.748µs (SLO: <130.000µs -6.3%) vs baseline: +0.6%

Memory: ✅ 41.307MB (SLO: <46.000MB 📉 -10.2%) vs baseline: +5.2%


✅ import_many_stdlib

Time: ✅ 0.799ms (SLO: <1.750ms 📉 -54.4%) vs baseline: +0.5%

Memory: ✅ 41.398MB (SLO: <46.000MB 📉 -10.0%) vs baseline: +5.6%


⚠️ import_many_stdlib_cached

Time: ⚠️ 0.185ms (SLO: <1.100ms 📉 -83.2%) vs baseline: +0.3%

Memory: ✅ 41.251MB (SLO: <46.000MB 📉 -10.3%) vs baseline: +5.1%


✅ import_many_unknown

Time: ✅ 831.088µs (SLO: <890.000µs -6.6%) vs baseline: ~same

Memory: ✅ 41.499MB (SLO: <46.000MB -9.8%) vs baseline: +5.3%


✅ import_many_unknown_cached

Time: ✅ 788.796µs (SLO: <870.000µs -9.3%) vs baseline: -1.3%

Memory: ✅ 41.385MB (SLO: <46.000MB 📉 -10.0%) vs baseline: +5.4%


✅ import_one

Time: ✅ 19.781µs (SLO: <30.000µs 📉 -34.1%) vs baseline: +0.4%

Memory: ✅ 41.304MB (SLO: <46.000MB 📉 -10.2%) vs baseline: +5.6%


✅ import_one_cache

Time: ✅ 6.299µs (SLO: <10.000µs 📉 -37.0%) vs baseline: +0.3%

Memory: ✅ 41.278MB (SLO: <46.000MB 📉 -10.3%) vs baseline: +5.5%


✅ import_one_stdlib

Time: ✅ 18.819µs (SLO: <20.000µs -5.9%) vs baseline: +1.6%

Memory: ✅ 41.451MB (SLO: <46.000MB -9.9%) vs baseline: +5.5%


✅ import_one_stdlib_cache

Time: ✅ 6.260µs (SLO: <10.000µs 📉 -37.4%) vs baseline: -1.0%

Memory: ✅ 41.327MB (SLO: <46.000MB 📉 -10.2%) vs baseline: +5.8%


✅ import_one_unknown

Time: ✅ 45.516µs (SLO: <50.000µs -9.0%) vs baseline: +0.4%

Memory: ✅ 41.207MB (SLO: <46.000MB 📉 -10.4%) vs baseline: +5.2%


✅ import_one_unknown_cache

Time: ✅ 6.278µs (SLO: <10.000µs 📉 -37.2%) vs baseline: -0.2%

Memory: ✅ 41.397MB (SLO: <43.000MB -3.7%) vs baseline: +5.6%

✅ All Tests Passing (16 suites)
codeprovenancefork - 2/2

✅ fork-10

Time: ✅ 2.185s (SLO: <2.300s -5.0%) vs baseline: +2.6%

Memory: ✅ 17.400MB (SLO: <20.000MB 📉 -13.0%) vs baseline: +4.6%


errortrackingdjangosimple - 6/6

✅ errortracking-enabled-all

Time: ✅ 17.504ms (SLO: <19.850ms 📉 -11.8%) vs baseline: ~same

Memory: ✅ 70.652MB (SLO: <75.000MB -5.8%) vs baseline: +4.7%


✅ errortracking-enabled-user

Time: ✅ 17.526ms (SLO: <19.400ms -9.7%) vs baseline: -0.4%

Memory: ✅ 70.621MB (SLO: <75.000MB -5.8%) vs baseline: +4.6%


✅ tracer-enabled

Time: ✅ 17.475ms (SLO: <19.450ms 📉 -10.2%) vs baseline: -0.2%

Memory: ✅ 70.700MB (SLO: <75.000MB -5.7%) vs baseline: +4.8%


errortrackingflasksqli - 6/6

✅ errortracking-enabled-all

Time: ✅ 2.121ms (SLO: <2.300ms -7.8%) vs baseline: ~same

Memory: ✅ 58.149MB (SLO: <60.000MB -3.1%) vs baseline: +4.8%


✅ errortracking-enabled-user

Time: ✅ 2.125ms (SLO: <2.250ms -5.6%) vs baseline: +0.2%

Memory: ✅ 58.156MB (SLO: <60.000MB -3.1%) vs baseline: +4.7%


✅ tracer-enabled

Time: ✅ 2.120ms (SLO: <2.300ms -7.8%) vs baseline: +0.1%

Memory: ✅ 58.177MB (SLO: <60.000MB -3.0%) vs baseline: +4.8%


flasksimple - 18/18

✅ appsec-get

Time: ✅ 3.427ms (SLO: <4.750ms 📉 -27.9%) vs baseline: +0.3%

Memory: ✅ 58.607MB (SLO: <66.500MB 📉 -11.9%) vs baseline: +5.6%


✅ appsec-post

Time: ✅ 2.910ms (SLO: <6.750ms 📉 -56.9%) vs baseline: +0.2%

Memory: ✅ 58.565MB (SLO: <66.500MB 📉 -11.9%) vs baseline: +5.6%


✅ appsec-telemetry

Time: ✅ 3.449ms (SLO: <4.750ms 📉 -27.4%) vs baseline: +1.7%

Memory: ✅ 58.558MB (SLO: <66.500MB 📉 -11.9%) vs baseline: +5.3%


✅ debugger

Time: ✅ 1.882ms (SLO: <2.000ms -5.9%) vs baseline: ~same

Memory: ✅ 49.218MB (SLO: <51.500MB -4.4%) vs baseline: +5.5%


✅ iast-get

Time: ✅ 1.871ms (SLO: <2.000ms -6.5%) vs baseline: -0.1%

Memory: ✅ 45.974MB (SLO: <49.000MB -6.2%) vs baseline: +5.6%


✅ profiler

Time: ✅ 1.924ms (SLO: <2.100ms -8.4%) vs baseline: +0.4%

Memory: ✅ 52.050MB (SLO: <53.500MB -2.7%) vs baseline: +5.9%


✅ resource-renaming

Time: ✅ 3.384ms (SLO: <3.650ms -7.3%) vs baseline: -0.2%

Memory: ✅ 58.536MB (SLO: <60.000MB -2.4%) vs baseline: +5.4%


✅ tracer

Time: ✅ 3.403ms (SLO: <3.650ms -6.8%) vs baseline: +0.3%

Memory: ✅ 58.539MB (SLO: <60.000MB -2.4%) vs baseline: +5.4%


✅ tracer-native

Time: ✅ 3.390ms (SLO: <3.650ms -7.1%) vs baseline: ~same

Memory: ✅ 58.548MB (SLO: <60.000MB -2.4%) vs baseline: +5.4%


flasksqli - 6/6

✅ appsec-enabled

Time: ✅ 2.117ms (SLO: <4.200ms 📉 -49.6%) vs baseline: -0.2%

Memory: ✅ 58.293MB (SLO: <66.000MB 📉 -11.7%) vs baseline: +5.0%


✅ iast-enabled

Time: ✅ 2.124ms (SLO: <2.800ms 📉 -24.2%) vs baseline: ~same

Memory: ✅ 58.366MB (SLO: <62.500MB -6.6%) vs baseline: +5.1%


✅ tracer-enabled

Time: ✅ 2.115ms (SLO: <2.250ms -6.0%) vs baseline: +0.2%

Memory: ✅ 58.237MB (SLO: <60.000MB -2.9%) vs baseline: +4.9%


forktime - 4/4

✅ baseline

Time: ✅ 1.941ms (SLO: <3.000ms 📉 -35.3%) vs baseline: +5.1%

Memory: ✅ 29.295MB (SLO: <33.000MB 📉 -11.2%) vs baseline: +4.7%


✅ configured

Time: ✅ 9.403ms (SLO: <13.000ms 📉 -27.7%) vs baseline: +0.1%

Memory: ✅ 58.306MB (SLO: <60.000MB -2.8%) vs baseline: +5.0%


httppropagationextract - 60/60

✅ all_styles_all_headers

Time: ✅ 81.247µs (SLO: <100.000µs 📉 -18.8%) vs baseline: +5.0%

Memory: ✅ 36.313MB (SLO: <38.000MB -4.4%) vs baseline: +5.1%


✅ b3_headers

Time: ✅ 12.893µs (SLO: <20.000µs 📉 -35.5%) vs baseline: +0.3%

Memory: ✅ 36.274MB (SLO: <38.000MB -4.5%) vs baseline: +5.3%


✅ b3_single_headers

Time: ✅ 11.919µs (SLO: <20.000µs 📉 -40.4%) vs baseline: ~same

Memory: ✅ 36.215MB (SLO: <38.000MB -4.7%) vs baseline: +4.9%


✅ datadog_tracecontext_tracestate_not_propagated_on_trace_id_no_match

Time: ✅ 60.766µs (SLO: <80.000µs 📉 -24.0%) vs baseline: ~same

Memory: ✅ 36.255MB (SLO: <38.000MB -4.6%) vs baseline: +5.2%


✅ datadog_tracecontext_tracestate_propagated_on_trace_id_match

Time: ✅ 64.463µs (SLO: <80.000µs 📉 -19.4%) vs baseline: +0.8%

Memory: ✅ 36.235MB (SLO: <38.000MB -4.6%) vs baseline: +5.0%


✅ empty_headers

Time: ✅ 1.300µs (SLO: <10.000µs 📉 -87.0%) vs baseline: +0.7%

Memory: ✅ 36.215MB (SLO: <38.000MB -4.7%) vs baseline: +5.3%


✅ full_t_id_datadog_headers

Time: ✅ 21.739µs (SLO: <30.000µs 📉 -27.5%) vs baseline: +0.5%

Memory: ✅ 36.255MB (SLO: <38.000MB -4.6%) vs baseline: +5.1%


✅ invalid_priority_header

Time: ✅ 5.958µs (SLO: <10.000µs 📉 -40.4%) vs baseline: +1.1%

Memory: ✅ 36.235MB (SLO: <38.000MB -4.6%) vs baseline: +5.4%


✅ invalid_span_id_header

Time: ✅ 5.917µs (SLO: <10.000µs 📉 -40.8%) vs baseline: +0.2%

Memory: ✅ 36.235MB (SLO: <38.000MB -4.6%) vs baseline: +4.9%


✅ invalid_tags_header

Time: ✅ 5.891µs (SLO: <10.000µs 📉 -41.1%) vs baseline: -0.2%

Memory: ✅ 36.235MB (SLO: <38.000MB -4.6%) vs baseline: +4.9%


✅ invalid_trace_id_header

Time: ✅ 5.958µs (SLO: <10.000µs 📉 -40.4%) vs baseline: +0.8%

Memory: ✅ 36.215MB (SLO: <38.000MB -4.7%) vs baseline: +5.1%


✅ large_header_no_matches

Time: ✅ 27.516µs (SLO: <30.000µs -8.3%) vs baseline: +0.6%

Memory: ✅ 36.215MB (SLO: <38.000MB -4.7%) vs baseline: +4.7%


✅ large_valid_headers_all

Time: ✅ 28.605µs (SLO: <40.000µs 📉 -28.5%) vs baseline: +0.5%

Memory: ✅ 36.274MB (SLO: <38.000MB -4.5%) vs baseline: +5.3%


✅ medium_header_no_matches

Time: ✅ 9.337µs (SLO: <20.000µs 📉 -53.3%) vs baseline: ~same

Memory: ✅ 36.372MB (SLO: <38.000MB -4.3%) vs baseline: +5.5%


✅ medium_valid_headers_all

Time: ✅ 10.642µs (SLO: <20.000µs 📉 -46.8%) vs baseline: -0.1%

Memory: ✅ 36.333MB (SLO: <38.000MB -4.4%) vs baseline: +5.5%


✅ none_propagation_style

Time: ✅ 1.394µs (SLO: <10.000µs 📉 -86.1%) vs baseline: ~same

Memory: ✅ 36.137MB (SLO: <38.000MB -4.9%) vs baseline: +4.7%


✅ tracecontext_headers

Time: ✅ 32.971µs (SLO: <40.000µs 📉 -17.6%) vs baseline: -0.2%

Memory: ✅ 36.255MB (SLO: <38.000MB -4.6%) vs baseline: +5.1%


✅ valid_headers_all

Time: ✅ 5.919µs (SLO: <10.000µs 📉 -40.8%) vs baseline: ~same

Memory: ✅ 36.255MB (SLO: <38.000MB -4.6%) vs baseline: +5.4%


✅ valid_headers_basic

Time: ✅ 5.477µs (SLO: <10.000µs 📉 -45.2%) vs baseline: ~same

Memory: ✅ 36.274MB (SLO: <38.000MB -4.5%) vs baseline: +5.1%


✅ wsgi_empty_headers

Time: ✅ 1.308µs (SLO: <10.000µs 📉 -86.9%) vs baseline: +0.8%

Memory: ✅ 36.196MB (SLO: <38.000MB -4.7%) vs baseline: +4.8%


✅ wsgi_invalid_priority_header

Time: ✅ 5.982µs (SLO: <10.000µs 📉 -40.2%) vs baseline: +0.7%

Memory: ✅ 36.294MB (SLO: <38.000MB -4.5%) vs baseline: +5.1%


✅ wsgi_invalid_span_id_header

Time: ✅ 1.306µs (SLO: <10.000µs 📉 -86.9%) vs baseline: +0.6%

Memory: ✅ 36.313MB (SLO: <38.000MB -4.4%) vs baseline: +5.5%


✅ wsgi_invalid_tags_header

Time: ✅ 5.990µs (SLO: <10.000µs 📉 -40.1%) vs baseline: +0.9%

Memory: ✅ 36.255MB (SLO: <38.000MB -4.6%) vs baseline: +5.3%


✅ wsgi_invalid_trace_id_header

Time: ✅ 5.982µs (SLO: <10.000µs 📉 -40.2%) vs baseline: ~same

Memory: ✅ 36.235MB (SLO: <38.000MB -4.6%) vs baseline: +5.3%


✅ wsgi_large_header_no_matches

Time: ✅ 28.589µs (SLO: <40.000µs 📉 -28.5%) vs baseline: +0.5%

Memory: ✅ 36.372MB (SLO: <38.000MB -4.3%) vs baseline: +5.8%


✅ wsgi_large_valid_headers_all

Time: ✅ 29.898µs (SLO: <40.000µs 📉 -25.3%) vs baseline: +0.1%

Memory: ✅ 36.313MB (SLO: <38.000MB -4.4%) vs baseline: +5.5%


✅ wsgi_medium_header_no_matches

Time: ✅ 9.499µs (SLO: <20.000µs 📉 -52.5%) vs baseline: -0.4%

Memory: ✅ 36.372MB (SLO: <38.000MB -4.3%) vs baseline: +5.3%


✅ wsgi_medium_valid_headers_all

Time: ✅ 11.071µs (SLO: <20.000µs 📉 -44.6%) vs baseline: +1.4%

Memory: ✅ 36.235MB (SLO: <38.000MB -4.6%) vs baseline: +5.2%


✅ wsgi_valid_headers_all

Time: ✅ 5.989µs (SLO: <10.000µs 📉 -40.1%) vs baseline: +0.3%

Memory: ✅ 36.215MB (SLO: <38.000MB -4.7%) vs baseline: +5.3%


✅ wsgi_valid_headers_basic

Time: ✅ 5.544µs (SLO: <10.000µs 📉 -44.6%) vs baseline: +0.2%

Memory: ✅ 36.235MB (SLO: <38.000MB -4.6%) vs baseline: +5.1%


httppropagationinject - 16/16

✅ ids_only

Time: ✅ 20.758µs (SLO: <30.000µs 📉 -30.8%) vs baseline: +3.9%

Memory: ✅ 36.255MB (SLO: <38.000MB -4.6%) vs baseline: +5.3%


✅ with_all

Time: ✅ 26.986µs (SLO: <40.000µs 📉 -32.5%) vs baseline: ~same

Memory: ✅ 36.255MB (SLO: <38.000MB -4.6%) vs baseline: +5.2%


✅ with_dd_origin

Time: ✅ 23.761µs (SLO: <30.000µs 📉 -20.8%) vs baseline: -0.7%

Memory: ✅ 36.274MB (SLO: <38.000MB -4.5%) vs baseline: +5.2%


✅ with_priority_and_origin

Time: ✅ 23.176µs (SLO: <40.000µs 📉 -42.1%) vs baseline: -0.3%

Memory: ✅ 36.235MB (SLO: <38.000MB -4.6%) vs baseline: +5.1%


✅ with_sampling_priority

Time: ✅ 20.030µs (SLO: <30.000µs 📉 -33.2%) vs baseline: -0.2%

Memory: ✅ 36.313MB (SLO: <38.000MB -4.4%) vs baseline: +5.4%


✅ with_tags

Time: ✅ 25.002µs (SLO: <40.000µs 📉 -37.5%) vs baseline: -0.6%

Memory: ✅ 36.215MB (SLO: <38.000MB -4.7%) vs baseline: +5.2%


✅ with_tags_invalid

Time: ✅ 26.554µs (SLO: <40.000µs 📉 -33.6%) vs baseline: +0.5%

Memory: ✅ 36.274MB (SLO: <38.000MB -4.5%) vs baseline: +5.5%


✅ with_tags_max_size

Time: ✅ 25.434µs (SLO: <40.000µs 📉 -36.4%) vs baseline: -0.4%

Memory: ✅ 36.255MB (SLO: <38.000MB -4.6%) vs baseline: +5.4%


iastaspectssplit - 12/12

✅ rsplit_aspect

Time: ✅ 165.096µs (SLO: <250.000µs 📉 -34.0%) vs baseline: +4.6%

Memory: ✅ 43.953MB (SLO: <46.000MB -4.5%) vs baseline: +6.1%


✅ rsplit_noaspect

Time: ✅ 160.002µs (SLO: <250.000µs 📉 -36.0%) vs baseline: -0.3%

Memory: ✅ 43.891MB (SLO: <46.000MB -4.6%) vs baseline: +5.7%


✅ split_aspect

Time: ✅ 152.596µs (SLO: <250.000µs 📉 -39.0%) vs baseline: +2.2%

Memory: ✅ 44.040MB (SLO: <46.000MB -4.3%) vs baseline: +5.9%


✅ split_noaspect

Time: ✅ 161.239µs (SLO: <250.000µs 📉 -35.5%) vs baseline: +4.1%

Memory: ✅ 43.911MB (SLO: <46.000MB -4.5%) vs baseline: +5.5%


✅ splitlines_aspect

Time: ✅ 152.517µs (SLO: <250.000µs 📉 -39.0%) vs baseline: +2.4%

Memory: ✅ 43.893MB (SLO: <46.000MB -4.6%) vs baseline: +5.2%


✅ splitlines_noaspect

Time: ✅ 158.101µs (SLO: <250.000µs 📉 -36.8%) vs baseline: +1.3%

Memory: ✅ 43.680MB (SLO: <46.000MB -5.0%) vs baseline: +5.4%


iastpropagation - 8/8

✅ no-propagation

Time: ✅ 48.390µs (SLO: <60.000µs 📉 -19.4%) vs baseline: +0.7%

Memory: ✅ 40.049MB (SLO: <42.000MB -4.6%) vs baseline: +5.2%


✅ propagation_enabled

Time: ✅ 135.387µs (SLO: <190.000µs 📉 -28.7%) vs baseline: +0.4%

Memory: ✅ 40.010MB (SLO: <42.000MB -4.7%) vs baseline: +5.0%


✅ propagation_enabled_100

Time: ✅ 1.545ms (SLO: <2.300ms 📉 -32.8%) vs baseline: -1.7%

Memory: ✅ 40.010MB (SLO: <42.000MB -4.7%) vs baseline: +5.1%


✅ propagation_enabled_1000

Time: ✅ 28.890ms (SLO: <34.550ms 📉 -16.4%) vs baseline: -0.8%

Memory: ✅ 40.029MB (SLO: <42.000MB -4.7%) vs baseline: +5.4%


packagespackageforrootmodulemapping - 4/4

✅ cache_off

Time: ✅ 341.836ms (SLO: <354.300ms -3.5%) vs baseline: +0.2%

Memory: ✅ 42.812MB (SLO: <46.000MB -6.9%) vs baseline: +5.7%


✅ cache_on

Time: ✅ 0.387µs (SLO: <10.000µs 📉 -96.1%) vs baseline: ~same

Memory: ✅ 41.335MB (SLO: <46.000MB 📉 -10.1%) vs baseline: +5.1%


rand - 2/2

✅ rand128bits

Time: ✅ 0.183µs (SLO: <21.000µs 📉 -99.1%) vs baseline: -1.5%


✅ rand64bits

Time: ✅ 0.123µs (SLO: <15.000µs 📉 -99.2%) vs baseline: +5.4%


ratelimiter - 12/12

✅ defaults

Time: ✅ 2.338µs (SLO: <10.000µs 📉 -76.6%) vs baseline: -0.2%

Memory: ✅ 36.549MB (SLO: <38.000MB -3.8%) vs baseline: +6.2%


✅ high_rate_limit

Time: ✅ 2.435µs (SLO: <10.000µs 📉 -75.7%) vs baseline: +1.2%

Memory: ✅ 36.569MB (SLO: <38.000MB -3.8%) vs baseline: +6.5%


✅ long_window

Time: ✅ 2.351µs (SLO: <10.000µs 📉 -76.5%) vs baseline: -0.4%

Memory: ✅ 36.530MB (SLO: <38.000MB -3.9%) vs baseline: +6.2%


✅ low_rate_limit

Time: ✅ 2.390µs (SLO: <10.000µs 📉 -76.1%) vs baseline: +1.4%

Memory: ✅ 36.549MB (SLO: <38.000MB -3.8%) vs baseline: +6.2%


✅ no_rate_limit

Time: ✅ 0.827µs (SLO: <10.000µs 📉 -91.7%) vs baseline: ~same

Memory: ✅ 36.530MB (SLO: <38.000MB -3.9%) vs baseline: +6.0%


✅ short_window

Time: ✅ 2.476µs (SLO: <10.000µs 📉 -75.2%) vs baseline: -0.7%

Memory: ✅ 36.549MB (SLO: <38.000MB -3.8%) vs baseline: +6.2%


samplingrules - 8/8

✅ average_match

Time: ✅ 167.980µs (SLO: <300.000µs 📉 -44.0%) vs baseline: -0.7%

Memory: ✅ 36.137MB (SLO: <38.000MB -4.9%) vs baseline: +5.1%


✅ high_match

Time: ✅ 217.120µs (SLO: <480.000µs 📉 -54.8%) vs baseline: +0.4%

Memory: ✅ 36.156MB (SLO: <38.000MB -4.9%) vs baseline: +4.9%


✅ low_match

Time: ✅ 120.298µs (SLO: <130.000µs -7.5%) vs baseline: +0.6%

Memory: ✅ 701.716MB (SLO: <780.000MB 📉 -10.0%) vs baseline: +4.9%


✅ very_low_match

Time: ✅ 3.122ms (SLO: <9.000ms 📉 -65.3%) vs baseline: +0.4%

Memory: ✅ 78.672MB (SLO: <85.000MB -7.4%) vs baseline: +5.0%


sethttpmeta - 32/32

✅ all-disabled

Time: ✅ 10.564µs (SLO: <20.000µs 📉 -47.2%) vs baseline: -0.2%

Memory: ✅ 37.179MB (SLO: <38.750MB -4.1%) vs baseline: +5.5%


✅ all-enabled

Time: ✅ 39.759µs (SLO: <50.000µs 📉 -20.5%) vs baseline: +1.5%

Memory: ✅ 37.080MB (SLO: <38.750MB -4.3%) vs baseline: +5.1%


✅ collectipvariant_exists

Time: ✅ 39.797µs (SLO: <50.000µs 📉 -20.4%) vs baseline: -0.2%

Memory: ✅ 37.238MB (SLO: <38.750MB -3.9%) vs baseline: +5.7%


✅ no-collectipvariant

Time: ✅ 39.041µs (SLO: <50.000µs 📉 -21.9%) vs baseline: +0.1%

Memory: ✅ 37.198MB (SLO: <38.750MB -4.0%) vs baseline: +5.9%


✅ no-useragentvariant

Time: ✅ 38.049µs (SLO: <50.000µs 📉 -23.9%) vs baseline: +0.9%

Memory: ✅ 37.179MB (SLO: <38.750MB -4.1%) vs baseline: +5.5%


✅ obfuscation-no-query

Time: ✅ 39.758µs (SLO: <50.000µs 📉 -20.5%) vs baseline: +0.5%

Memory: ✅ 37.120MB (SLO: <38.750MB -4.2%) vs baseline: +5.4%


✅ obfuscation-regular-case-explicit-query

Time: ✅ 75.952µs (SLO: <90.000µs 📉 -15.6%) vs baseline: +0.5%

Memory: ✅ 37.493MB (SLO: <38.750MB -3.2%) vs baseline: +6.5%


✅ obfuscation-regular-case-implicit-query

Time: ✅ 76.307µs (SLO: <90.000µs 📉 -15.2%) vs baseline: +0.2%

Memory: ✅ 37.513MB (SLO: <38.750MB -3.2%) vs baseline: +6.2%


✅ obfuscation-send-querystring-disabled

Time: ✅ 154.381µs (SLO: <170.000µs -9.2%) vs baseline: ~same

Memory: ✅ 37.493MB (SLO: <38.750MB -3.2%) vs baseline: +5.5%


✅ obfuscation-worst-case-explicit-query

Time: ✅ 149.060µs (SLO: <160.000µs -6.8%) vs baseline: +0.2%

Memory: ✅ 37.454MB (SLO: <38.750MB -3.3%) vs baseline: +5.3%


✅ obfuscation-worst-case-implicit-query

Time: ✅ 155.128µs (SLO: <170.000µs -8.7%) vs baseline: +0.1%

Memory: ✅ 37.434MB (SLO: <38.750MB -3.4%) vs baseline: +5.4%


✅ useragentvariant_exists_1

Time: ✅ 38.560µs (SLO: <50.000µs 📉 -22.9%) vs baseline: +0.2%

Memory: ✅ 37.061MB (SLO: <38.750MB -4.4%) vs baseline: +5.1%


✅ useragentvariant_exists_2

Time: ✅ 39.639µs (SLO: <50.000µs 📉 -20.7%) vs baseline: +0.3%

Memory: ✅ 37.100MB (SLO: <38.750MB -4.3%) vs baseline: +5.6%


✅ useragentvariant_exists_3

Time: ✅ 39.178µs (SLO: <50.000µs 📉 -21.6%) vs baseline: ~same

Memory: ✅ 37.080MB (SLO: <38.750MB -4.3%) vs baseline: +5.2%


✅ useragentvariant_not_exists_1

Time: ✅ 38.719µs (SLO: <50.000µs 📉 -22.6%) vs baseline: +0.4%

Memory: ✅ 37.159MB (SLO: <38.750MB -4.1%) vs baseline: +5.4%


✅ useragentvariant_not_exists_2

Time: ✅ 38.813µs (SLO: <50.000µs 📉 -22.4%) vs baseline: +0.5%

Memory: ✅ 37.336MB (SLO: <38.750MB -3.6%) vs baseline: +6.1%


telemetryaddmetric - 30/30

✅ 1-count-metric-1-times

Time: ✅ 2.319µs (SLO: <20.000µs 📉 -88.4%) vs baseline: +9.5%

Memory: ✅ 36.235MB (SLO: <38.000MB -4.6%) vs baseline: +5.4%


✅ 1-count-metrics-100-times

Time: ✅ 157.578µs (SLO: <220.000µs 📉 -28.4%) vs baseline: +2.3%

Memory: ✅ 36.255MB (SLO: <38.000MB -4.6%) vs baseline: +5.4%


✅ 1-distribution-metric-1-times

Time: ✅ 2.432µs (SLO: <20.000µs 📉 -87.8%) vs baseline: -3.2%

Memory: ✅ 36.294MB (SLO: <38.000MB -4.5%) vs baseline: +5.4%


✅ 1-distribution-metrics-100-times

Time: ✅ 166.491µs (SLO: <230.000µs 📉 -27.6%) vs baseline: -1.4%

Memory: ✅ 36.215MB (SLO: <38.000MB -4.7%) vs baseline: +4.9%


✅ 1-gauge-metric-1-times

Time: ✅ 1.990µs (SLO: <20.000µs 📉 -90.1%) vs baseline: -1.5%

Memory: ✅ 36.294MB (SLO: <38.000MB -4.5%) vs baseline: +5.5%


✅ 1-gauge-metrics-100-times

Time: ✅ 138.051µs (SLO: <150.000µs -8.0%) vs baseline: -1.0%

Memory: ✅ 36.215MB (SLO: <38.000MB -4.7%) vs baseline: +5.2%


✅ 1-rate-metric-1-times

Time: ✅ 2.302µs (SLO: <20.000µs 📉 -88.5%) vs baseline: +2.5%

Memory: ✅ 36.255MB (SLO: <38.000MB -4.6%) vs baseline: +5.3%


✅ 1-rate-metrics-100-times

Time: ✅ 171.603µs (SLO: <250.000µs 📉 -31.4%) vs baseline: +2.8%

Memory: ✅ 36.274MB (SLO: <38.000MB -4.5%) vs baseline: +5.4%


✅ 100-count-metrics-100-times

Time: ✅ 15.468ms (SLO: <22.000ms 📉 -29.7%) vs baseline: +0.2%

Memory: ✅ 36.255MB (SLO: <38.000MB -4.6%) vs baseline: +5.4%


✅ 100-distribution-metrics-100-times

Time: ✅ 1.740ms (SLO: <2.550ms 📉 -31.8%) vs baseline: -1.4%

Memory: ✅ 36.215MB (SLO: <38.000MB -4.7%) vs baseline: +5.2%


✅ 100-gauge-metrics-100-times

Time: ✅ 1.428ms (SLO: <1.550ms -7.9%) vs baseline: -0.3%

Memory: ✅ 36.235MB (SLO: <38.000MB -4.6%) vs baseline: +5.3%


✅ 100-rate-metrics-100-times

Time: ✅ 1.754ms (SLO: <2.550ms 📉 -31.2%) vs baseline: +0.9%

Memory: ✅ 36.255MB (SLO: <38.000MB -4.6%) vs baseline: +5.6%


✅ flush-1-metric

Time: ✅ 3.605µs (SLO: <20.000µs 📉 -82.0%) vs baseline: +1.8%

Memory: ✅ 36.648MB (SLO: <38.000MB -3.6%) vs baseline: +6.6%


✅ flush-100-metrics

Time: ✅ 175.990µs (SLO: <250.000µs 📉 -29.6%) vs baseline: +0.3%

Memory: ✅ 36.746MB (SLO: <38.000MB -3.3%) vs baseline: +6.0%


✅ flush-1000-metrics

Time: ✅ 2.216ms (SLO: <2.500ms 📉 -11.3%) vs baseline: +0.5%

Memory: ✅ 37.002MB (SLO: <38.750MB -4.5%) vs baseline: +5.2%
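
For readers checking the figures above: the percentage inside each `(SLO: ...)` parenthesis appears to be the measured value's distance from the SLO threshold, relative to that threshold. The sketch below reproduces a few of the reported numbers under that assumption. The function name `slo_margin_pct` is hypothetical and is not taken from the benchmarking tool.

```python
def slo_margin_pct(measured: float, threshold: float) -> float:
    """Return how far `measured` sits from `threshold`, as a percentage.

    Negative values mean the measurement is under the SLO threshold.
    Assumption: this mirrors the percentage shown in the report; the
    benchmarking tool's real formula is not visible on this page.
    """
    return round((measured - threshold) / threshold * 100, 1)


# Values copied from the report (units cancel, so only the ratio matters).
print(slo_margin_pct(1.394, 10.0))    # Time 1.394µs vs SLO <10.000µs
print(slo_margin_pct(32.971, 40.0))   # tracecontext_headers time
print(slo_margin_pct(36.137, 38.0))   # Memory 36.137MB vs SLO <38.000MB
```

Because the report prints rounded measurements, recomputing from the displayed values can differ from the reported percentage by 0.1 in the last digit.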

ℹ️ Scenarios Missing SLO Configuration (46 scenarios)

The following scenarios exist in candidate data but have no SLO thresholds configured:

  • coreapiscenario-core_dispatch_listeners
  • coreapiscenario-core_dispatch_no_listeners
  • coreapiscenario-core_dispatch_with_results_listeners
  • coreapiscenario-core_dispatch_with_results_no_listeners
  • djangosimple-baseline
  • errortrackingdjangosimple-baseline
  • errortrackingflasksqli-baseline
  • flasksimple-baseline
  • flasksqli-baseline
  • iast_aspects-re_expand_aspect
  • iast_aspects-re_expand_noaspect
  • iast_aspects-re_findall_aspect
  • iast_aspects-re_findall_noaspect
  • iast_aspects-re_finditer_aspect
  • iast_aspects-re_finditer_noaspect
  • iast_aspects-re_fullmatch_aspect
  • iast_aspects-re_fullmatch_noaspect
  • iast_aspects-re_group_aspect
  • iast_aspects-re_group_noaspect
  • iast_aspects-re_groups_aspect
  • iast_aspects-re_groups_noaspect
  • iast_aspects-re_match_aspect
  • iast_aspects-re_match_noaspect
  • iast_aspects-re_search_aspect
  • iast_aspects-re_search_noaspect
  • iast_aspects-re_sub_aspect
  • iast_aspects-re_sub_noaspect
  • iast_aspects-re_subn_aspect
  • iast_aspects-re_subn_noaspect
  • sethttpmeta-obfuscation-disabled
  • startup-baseline
  • startup-baseline_django
  • startup-baseline_flask
  • startup-ddtrace_run
  • startup-ddtrace_run_appsec
  • startup-ddtrace_run_profiling
  • startup-ddtrace_run_runtime_metrics
  • startup-ddtrace_run_send_span
  • startup-ddtrace_run_telemetry_disabled
  • startup-ddtrace_run_telemetry_enabled
  • startup-import_ddtrace
  • startup-import_ddtrace_auto
  • startup-import_ddtrace_auto_django
  • startup-import_ddtrace_auto_flask
  • startup-import_ddtrace_django
  • startup-import_ddtrace_flask

Comment thread ddtrace/llmobs/_writer.py Outdated
Comment thread ddtrace/llmobs/_writer.py Outdated
…ove debug artifacts

Restructure eval_scope validation to default to 'span', pass eval_scope
to telemetry, and remove leftover debug headers and print statement.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
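
As an illustration of the validation this commit message describes, here is a minimal sketch of defaulting `eval_scope` to `"span"` and requiring `session_id` for session-level evaluations. It is not the ddtrace implementation: the helper name `resolve_eval_scope` is hypothetical, and raising `ValueError` is an assumption (the real code may log and return instead). Only the three scope values and the `session_id` requirement come from the PR description.

```python
from typing import Optional

# Scope values named in the PR description.
VALID_EVAL_SCOPES = ("span", "trace", "session")


def resolve_eval_scope(eval_scope: Optional[str] = None,
                       session_id: Optional[str] = None) -> str:
    """Hypothetical helper: return the effective scope or raise on bad input."""
    # An omitted scope falls back to span-level evaluation.
    scope = "span" if eval_scope is None else eval_scope
    if scope not in VALID_EVAL_SCOPES:
        raise ValueError(
            f"eval_scope must be one of {VALID_EVAL_SCOPES}, got {scope!r}"
        )
    # Session-level evaluations are joined on the session, so an id is required.
    if scope == "session" and not session_id:
        raise ValueError("session_id is required when eval_scope='session'")
    return scope
```

Resolving the scope once, before building the event, would let the same value be passed to both the evaluation payload and telemetry.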
@cit-pr-commenter-54b7da

cit-pr-commenter-54b7da Bot commented Apr 15, 2026

Codeowners resolved as

releasenotes/notes/llmobs-trace-session-level-evals-c5e45391b3e52604.yaml  @DataDog/apm-python


cdfox and others added 2 commits April 15, 2026 15:26
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@cdfox cdfox changed the title from "support submitting trace and session level evals" to "feat(llmobs): support submitting trace and session level evals" Apr 15, 2026
@cdfox cdfox marked this pull request as ready for review April 16, 2026 00:44
@cdfox cdfox requested a review from a team as a code owner April 16, 2026 00:44
@cdfox cdfox closed this Apr 16, 2026
@cdfox cdfox deleted the christopher.fox/trace-session-level-evals branch April 16, 2026 19:30
