Wr/ngt @W-22143479@ @W-22143478@ #407

Open
WillieRuemmele wants to merge 18 commits into main from wr/ngt

@WillieRuemmele
Contributor

What does this PR do?

Adds NGT support to the current agent test commands.

Requires: forcedotcom/agents#272

What issues does this PR fix or reference?

@W-22143479@
@W-22143478@

@WillieRuemmele WillieRuemmele requested a review from a team as a code owner April 28, 2026 20:36

@andresrivas-sf andresrivas-sf requested review from andresrivas-sf and removed request for andresrivas-sf April 30, 2026 16:18
Contributor

@andresrivas-sf andresrivas-sf left a comment

looks good to me

@andresrivas-sf
Contributor

QA Results — manual testing against willies-orgfarm org

Tested with local builds of both @salesforce/agents (linked) and plugin-agent (sf plugins link).

agent test list

  • Type column renders correctly: testing-center for AiEvaluationDefinition, agentforce-studio for AiTestingDefinition
  • Org had both types present, both labeled correctly
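
For reviewers who want to see the labeling rule spelled out, here is a minimal sketch of the mapping checked above. Only the two metadata type names and the two labels come from these QA notes; `TestRunnerLabel`, `RUNNER_BY_METADATA_TYPE`, and `runnerLabelFor` are hypothetical names for illustration and may not match what the plugin actually uses.

```typescript
// Hypothetical names; only the metadata types and labels are from the QA notes.
type TestRunnerLabel = "testing-center" | "agentforce-studio";

// A Map avoids accidental hits on Object.prototype keys such as "toString".
const RUNNER_BY_METADATA_TYPE = new Map<string, TestRunnerLabel>([
  ["AiEvaluationDefinition", "testing-center"],
  ["AiTestingDefinition", "agentforce-studio"],
]);

// Returns the label for the Type column, or undefined for an unknown type.
function runnerLabelFor(metadataType: string): TestRunnerLabel | undefined {
  return RUNNER_BY_METADATA_TYPE.get(metadataType);
}

console.log(runnerLabelFor("AiEvaluationDefinition")); // testing-center
console.log(runnerLabelFor("AiTestingDefinition")); // agentforce-studio
```

Returning `undefined` for unrecognized types keeps the decision about how to render them with the caller; that choice is an assumption of this sketch, not observed plugin behavior.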

agent test run (auto-detect, NGT)

  • Correctly auto-detected the agentforce-studio runner for AndresTest (runId prefix 3A2)
  • Ran end-to-end with --wait 10, completed successfully

agent test run --test-runner agentforce-studio (explicit runner)

  • Explicit flag respected, ran cleanly

agent test results — human format

  • Per-test-case tables render with Scorer / Result / Expected / Actual / Reasoning columns
  • Summary table shows Status, Total/Passing/Failing counts

agent test results --result-format junit

  • Valid XML with <testsuites> / <testsuite> / <failure> structure
  • Failure messages and scorer names populated correctly
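
As a reference for the structure verified above, the following is a rough sketch of how per-scorer results could be serialized into a `<testsuites>` / `<testsuite>` / `<failure>` document. The `ScorerResult` shape, `escapeXml`, and `toJUnit` are hypothetical and written only for illustration, as are the scorer names in the sample data; the plugin's real types and formatter may differ.

```typescript
// Hypothetical result shape; not taken from the plugin's source.
interface ScorerResult {
  scorer: string;
  passed: boolean;
  expected: string;
  actual: string;
  message: string;
}

// Escape the five characters that are special in XML text and attributes.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// One <testcase> per scorer; failing scorers get a nested <failure> element.
function toJUnit(suiteName: string, results: ScorerResult[]): string {
  const failures = results.filter((r) => !r.passed).length;
  const cases = results.map((r) => {
    const name = escapeXml(r.scorer);
    if (r.passed) return `    <testcase name="${name}"/>`;
    const detail = `expected: ${escapeXml(r.expected)}, actual: ${escapeXml(r.actual)}`;
    return [
      `    <testcase name="${name}">`,
      `      <failure message="${escapeXml(r.message)}">${detail}</failure>`,
      `    </testcase>`,
    ].join("\n");
  });
  return [
    `<?xml version="1.0" encoding="UTF-8"?>`,
    `<testsuites tests="${results.length}" failures="${failures}">`,
    `  <testsuite name="${escapeXml(suiteName)}" tests="${results.length}" failures="${failures}">`,
    ...cases,
    `  </testsuite>`,
    `</testsuites>`,
  ].join("\n");
}

// Sample data is made up for the example.
const xml = toJUnit("AndresTest", [
  { scorer: "scorer_a", passed: true, expected: "hello", actual: "hello", message: "" },
  { scorer: "scorer_b", passed: false, expected: "hello", actual: "hi", message: "too short" },
]);
console.log(xml);
```

Escaping every interpolated value matters here because scorer reasoning text can contain `<`, `&`, or quotes, any of which would otherwise produce invalid XML.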

agent test results --result-format tap

  • TAP v13 output, ok/not ok lines per scorer, YAML diagnostics blocks on failures with actual/expected/message
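
To make the expected shape concrete, here is a small sketch that emits TAP version 13 with one `ok` / `not ok` line per scorer and an indented YAML diagnostics block on failures. `ScorerOutcome` and `toTap` are hypothetical names, and the sample scorers are invented; this is an illustration of the format, not the plugin's formatter.

```typescript
// Hypothetical result shape; not taken from the plugin's source.
interface ScorerOutcome {
  scorer: string;
  passed: boolean;
  expected: string;
  actual: string;
  message: string;
}

// Emits a version line, a plan line, then one test line per scorer.
function toTap(results: ScorerOutcome[]): string {
  const lines: string[] = ["TAP version 13", `1..${results.length}`];
  results.forEach((r, i) => {
    lines.push(`${r.passed ? "ok" : "not ok"} ${i + 1} - ${r.scorer}`);
    if (!r.passed) {
      // JSON.stringify yields double-quoted scalars, which are also valid YAML.
      lines.push(
        "  ---",
        `  message: ${JSON.stringify(r.message)}`,
        `  expected: ${JSON.stringify(r.expected)}`,
        `  actual: ${JSON.stringify(r.actual)}`,
        "  ..."
      );
    }
  });
  return lines.join("\n");
}

// Sample data is made up for the example.
const tap = toTap([
  { scorer: "scorer_a", passed: true, expected: "hello", actual: "hello", message: "" },
  { scorer: "scorer_b", passed: false, expected: "hello", actual: "hi", message: "too short" },
]);
console.log(tap);
```

The diagnostics block is delimited by `---` and `...`, each indented under the failing test line, which is what TAP 13 consumers look for when attaching details to a failure.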

andresrivas-sf and others added 2 commits April 30, 2026 20:01
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2 participants