feat: Add Agent Discovery Protocol (ADP) support #12756
walkojas-boop wants to merge 20 commits into Significant-Gravitas:dev
Conversation
Spec: https://github.com/walkojas-boop/agent-discovery-protocol Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This PR targets the `dev` branch; the base branch was set automatically.
> Note: Reviews paused. This branch appears to be under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. This behavior can be configured in the review settings.
Walkthrough

Adds a new ADP v0.1 lightweight client and in-memory cache: strict FQDN validation, DNS resolution with private/reserved/loopback rejection and DNS pinning, Host header/TLS SNI preservation, redirect blocking, JSON decoding into a read-only DiscoveryResult, and an optional 1-hour negative/positive cache; includes pytest coverage.
Sequence Diagram

```mermaid
sequenceDiagram
    autonumber
    participant Client
    participant discover_services
    participant Validator
    participant Cache
    participant DNSResolver
    participant HTTPClient
    participant JSONParser
    participant DiscoveryResult
    Client->>discover_services: call(domain)
    discover_services->>Validator: validate FQDN
    alt invalid domain
        Validator-->>discover_services: invalid
        discover_services-->>Client: raise ValueError / None
    else valid domain
        discover_services->>Cache: check(domain)
        alt cache hit
            Cache-->>discover_services: cached (Result|None)
            discover_services-->>Client: return cached
        else cache miss
            discover_services->>DNSResolver: resolve(domain)
            alt resolved -> blocked IP
                DNSResolver-->>discover_services: blocked
                discover_services-->>Client: None
            else resolved -> public IP
                DNSResolver-->>discover_services: public IP
                discover_services->>HTTPClient: GET https://{pinned_ip}/.well-known/agent-discovery.json Host: domain
                alt HTTP 2xx
                    HTTPClient-->>discover_services: body
                    discover_services->>JSONParser: parse(body)
                    alt valid JSON
                        JSONParser-->>discover_services: parsed_data
                        discover_services->>DiscoveryResult: create(parsed_data)
                        discover_services->>Cache: store success (1h)
                        discover_services-->>Client: DiscoveryResult
                    else invalid JSON
                        JSONParser-->>discover_services: error
                        discover_services-->>Client: None
                    end
                else HTTP 3xx (redirect)
                    HTTPClient-->>discover_services: redirect
                    discover_services-->>Client: None
                else HTTP other (404/410)
                    HTTPClient-->>discover_services: status
                    discover_services->>Cache: store negative (select statuses)
                    discover_services-->>Client: None
                end
            end
        end
    end
```
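The pinned-IP fetch step in the diagram (dial the already-resolved address, but keep the Host header, SNI, and certificate validation bound to the original domain) can be sketched roughly as below. This is an illustrative sketch, not the PR's actual code; `build_request` and `fetch_discovery_pinned` are assumed names.

```python
import socket
import ssl


def build_request(domain: str) -> bytes:
    """Raw HTTP/1.1 request; the Host header carries the original domain
    even though the TCP connection is opened to the pinned IP."""
    return (
        "GET /.well-known/agent-discovery.json HTTP/1.1\r\n"
        f"Host: {domain}\r\n"
        "User-Agent: agent-discovery/0.1\r\n"
        "Connection: close\r\n\r\n"
    ).encode("ascii")


def fetch_discovery_pinned(domain: str, pinned_ip: str, timeout: float = 5.0) -> bytes:
    """Connect to the pre-resolved IP (no second DNS lookup, closing the
    DNS-rebinding window) while server_hostname keeps SNI and certificate
    checks tied to the original domain."""
    ctx = ssl.create_default_context()
    with socket.create_connection((pinned_ip, 443), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=domain) as tls:
            tls.sendall(build_request(domain))
            chunks = []
            while chunk := tls.recv(4096):
                chunks.append(chunk)
            return b"".join(chunks)
```

Separating request construction from I/O keeps the Host-header behavior unit-testable without a network.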
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@           Coverage Diff            @@
##              dev   #12756    +/-  ##
==========================================
- Coverage   62.70%   62.20%   -0.51%
==========================================
  Files        1932     1784     -148
  Lines      138629   127063   -11566
  Branches    14987    13969    -1018
==========================================
- Hits        86934    79042    -7892
+ Misses      49042    45564    -3478
+ Partials     2653     2457     -196
```

Flags with carried forward coverage won't be shown.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@autogpt/utils/agent_discovery.py`:
- Around line 92-95: The cached DiscoveryResult payloads are returned by
reference from _cache, allowing callers to mutate shared cached objects; update
the cache retrieval in agent_discovery.py (the branches that currently do
"return DiscoveryResult(cached_result) if cached_result else None" around the
_cache lookup at lines referenced) to return a deep/shallow copy of
cached_result (e.g., using copy.deepcopy(cached_result) or an appropriate clone
method) so callers receive an independent object, and apply the same change to
the other cache-returning branch (the similar logic at the 103-105 area) to
prevent mutability leaks.
- Line 68: The __repr__ return line in DiscoveryResult exceeds max line length;
fix it by breaking the long f-string into multiple concatenated parts or using a
local variable for services, e.g., assign services = self.list_services() and
then return a wrapped f-string across implicit string literal concatenation or
parentheses so the line length is within the linter limit; update the return in
the DiscoveryResult.__repr__ (referencing self.domain and self.list_services())
accordingly.
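The `__repr__` wrapping suggested above can look like the sketch below; the class is a minimal stand-in (the real `DiscoveryResult` fields differ), shown only to illustrate keeping each physical line under the 88-character limit via implicit string concatenation.

```python
class DiscoveryResult:
    """Minimal stand-in with just enough state to demo the __repr__ fix."""

    def __init__(self, domain: str, services: dict) -> None:
        self.domain = domain
        self._services = services

    def list_services(self) -> list[str]:
        return sorted(self._services)

    def __repr__(self) -> str:
        # A local variable plus implicit f-string concatenation inside
        # parentheses keeps every physical line under 88 characters.
        services = self.list_services()
        return (
            f"DiscoveryResult(domain={self.domain!r}, "
            f"services={services!r})"
        )
```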
- Around line 97-100: The code currently interpolates the domain directly into
url and catches Exception broadly; update the function that builds url (uses
variable url and urllib.request.Request/urllib.request.urlopen) to first
validate the domain string with a safe hostname regex, resolve the domain (e.g.,
via socket.getaddrinfo) and use the ipaddress module to reject any resolved IPs
in private, loopback, link-local, or other non-public ranges before constructing
the URL, and then replace the broad except with specific exception handlers for
urllib.error.HTTPError, urllib.error.URLError, socket.timeout (or TimeoutError),
json.JSONDecodeError and ValueError so only expected errors are caught. Ensure
the validated hostname is used when creating the Request and include clear
rejection logic before opening the connection.
In `@classic/original_autogpt/autogpt/utils/agent_discovery.py`:
- Around line 92-95: The cache returns the stored JSON dict by reference which
lets callers mutate the cached object; update the cache access in
agent_discovery (the use_cache/domain/_cache logic and the DiscoveryResult
return paths) to return a deep copy of cached_result rather than the original
reference (e.g., deepcopy the dict when creating DiscoveryResult and also when
writing into _cache) and ensure you import/use copy.deepcopy (or equivalent) so
nested service/trust data cannot be mutated through the returned
DiscoveryResult.
- Line 68: The __repr__ method in class DiscoveryResult currently returns a
single long f-string that exceeds the 88-char line length; change the return to
a multi-line expression (e.g., use an f-string inside parentheses or concatenate
short strings) so it wraps under 88 chars while preserving the same output
format "DiscoveryResult(domain=..., services=...)" and still calls
self.list_services(); update the return in the __repr__ method of
DiscoveryResult accordingly.
- Around line 106-110: The current broad except Exception around the discovery
call swallows programming errors and still writes _cache[domain] = (time.time(),
None), causing negative-caching of unexpected failures; narrow the except to
only expected runtime errors (e.g., network/HTTP and parsing errors such as
requests.exceptions.RequestException, json.JSONDecodeError or ValueError) and
remove the write to _cache in those unexpected-exception cases — only set
_cache[domain] = (time.time(), None) in the normal “not found” path after a
successful call that yields no discovery; reference the logger.debug call, the
use_cache flag, domain and _cache to locate and adjust the error handling and
caching logic.
- Around line 97-100: Validate and restrict the incoming domain before
constructing the URL to prevent SSRF: in the code that builds url =
f"https://{domain}/.well-known/agent-discovery.json" (and before calling
urllib.request.Request / urllib.request.urlopen), ensure the domain variable is
a plain hostname (no scheme, no userinfo, no port), reject IP literals
(IPv4/IPv6) and hostnames that resolve to private/reserved IP ranges by
performing a DNS resolution (e.g., socket.getaddrinfo) and checking addresses
against RFC1918/loopback/link-local/mapped ranges, or alternatively enforce a
strict FQDN regex/allowlist of trusted domains; raise an error on invalid or
non-public resolution and only then proceed to create the Request and open the
URL.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a148c505-c8a8-4c09-b3e8-09a07e815fce
📒 Files selected for processing (2)
- autogpt/utils/agent_discovery.py
- classic/original_autogpt/autogpt/utils/agent_discovery.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Seer Code Review
- GitHub Check: test
- GitHub Check: build (release)
- GitHub Check: build (dev)
- GitHub Check: test
- GitHub Check: Analyze (python)
- GitHub Check: Analyze (typescript)
- GitHub Check: Check PR Status
🧰 Additional context used
📓 Path-based instructions (1)
classic/**/*.py
📄 CodeRabbit inference engine (classic/CLAUDE.md)
classic/**/*.py: Use fully qualified imports: `from forge.agent.base import BaseAgent`, `from autogpt.agents.agent import Agent`, `from direct_benchmark.harness import BenchmarkHarness`
Python target version is 3.12
Use Black for code formatting with line length of 88 characters
Use isort for import ordering with profile='black'
Include type hints in Python code for Pyright type checking
Files:
classic/original_autogpt/autogpt/utils/agent_discovery.py
🧠 Learnings (3)
📓 Common learnings
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12536
File: autogpt_platform/frontend/src/app/api/openapi.json:5770-5790
Timestamp: 2026-03-24T21:25:15.983Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12536`
File: autogpt_platform/frontend/src/app/api/openapi.json
Learning: The OpenAPI spec file is auto-generated; per established convention, endpoints generally declare only 200/201, 401, and 422 responses. Do not suggest adding explicit 403/404 response entries for single operations unless planning a repo-wide spec update. Prefer clarifying such behaviors in endpoint descriptions/docstrings instead of altering response maps.
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12592
File: autogpt_platform/frontend/AGENTS.md:1-3
Timestamp: 2026-03-27T08:39:45.696Z
Learning: In Significant-Gravitas/AutoGPT, Claude is the primary coding agent. AGENTS.md files intentionally retain Claude-specific wording (e.g., "CLAUDE.md - Frontend", "This file provides guidance to Claude Code") even though AGENTS.md is the canonical cross-agent instruction source. Do not flag Claude-specific titles or phrasing in AGENTS.md files as issues.
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/frontend/CLAUDE.md:0-0
Timestamp: 2026-04-08T17:26:23.273Z
Learning: Applies to autogpt_platform/frontend/**/AGENTS.md : Document agent responsibilities and capabilities in AGENTS.md with clear descriptions of what each agent does
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12284
File: autogpt_platform/frontend/src/app/api/openapi.json:11897-11900
Timestamp: 2026-03-04T23:58:18.476Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12284`
Backend/frontend OpenAPI codegen convention: In backend/api/features/store/model.py, the StoreSubmission and StoreSubmissionAdminView models define submitted_at: datetime | None, changes_summary: str | None, and instructions: str | None with no default. This is intentional to produce “required but nullable” fields in OpenAPI (properties appear in required[] and use anyOf [type, null]). This matches Prisma’s submittedAt DateTime? and changesSummary String?. Do not flag this as a required/nullable mismatch.
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-08T17:26:06.525Z
Learning: Applies to AGENTS.md : Document all agent configurations and capabilities in AGENTS.md
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/frontend/CLAUDE.md:0-0
Timestamp: 2026-04-08T17:26:23.273Z
Learning: Applies to autogpt_platform/frontend/**/AGENTS.md : Use AGENTS.md as the central registry for agent definitions and their capabilities
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — Backend/frontend OpenAPI codegen
Learning: For MCP schema models, required OpenAPI fields must have no defaults in Pydantic. Specifically, MCPToolInfo.input_schema must be required (no Field(default_factory=dict)) so openapi.json emits it in "required", ensuring generated TS types treat input_schema as non-optional.
📚 Learning: 2026-04-08T17:26:23.273Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/frontend/CLAUDE.md:0-0
Timestamp: 2026-04-08T17:26:23.273Z
Learning: Applies to autogpt_platform/frontend/**/AGENTS.md : Document agent responsibilities and capabilities in AGENTS.md with clear descriptions of what each agent does
Applied to files:
autogpt/utils/agent_discovery.pyclassic/original_autogpt/autogpt/utils/agent_discovery.py
📚 Learning: 2026-04-08T17:26:23.273Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/frontend/CLAUDE.md:0-0
Timestamp: 2026-04-08T17:26:23.273Z
Learning: Applies to autogpt_platform/frontend/**/AGENTS.md : Use AGENTS.md as the central registry for agent definitions and their capabilities
Applied to files:
autogpt/utils/agent_discovery.py
🪛 GitHub Actions: Classic - Python checks
autogpt/utils/agent_discovery.py
[error] 68-68: flake8 E501 line too long (90 > 88 characters)
classic/original_autogpt/autogpt/utils/agent_discovery.py
[error] 68-68: flake8 E501 line too long (90 > 88 characters)
🪛 Ruff (0.15.9)
autogpt/utils/agent_discovery.py
[error] 99-99: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
[error] 100-100: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
[warning] 106-106: Do not catch blind exception: Exception
(BLE001)
classic/original_autogpt/autogpt/utils/agent_discovery.py
[error] 99-99: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
[error] 100-100: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
[warning] 106-106: Do not catch blind exception: Exception
(BLE001)
```python
    if use_cache and domain in _cache:
        cached_at, cached_result = _cache[domain]
        if time.time() - cached_at < _CACHE_TTL:
            return DiscoveryResult(cached_result) if cached_result else None
```
Avoid returning cached mutable payloads by reference.
The cached JSON dict is reused directly; callers can mutate nested service/trust data and poison future cache reads.
Proposed fix

```diff
+from copy import deepcopy
 @@
-        return DiscoveryResult(cached_result) if cached_result else None
+        return DiscoveryResult(deepcopy(cached_result)) if cached_result else None
 @@
         data = json.loads(resp.read())
         if use_cache:
-            _cache[domain] = (time.time(), data)
-            return DiscoveryResult(data)
+            _cache[domain] = (time.time(), deepcopy(data))
+            return DiscoveryResult(deepcopy(data))
```

Also applies to: 103-105
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@classic/original_autogpt/autogpt/utils/agent_discovery.py` around lines 92 -
95, The cache returns the stored JSON dict by reference which lets callers
mutate the cached object; update the cache access in agent_discovery (the
use_cache/domain/_cache logic and the DiscoveryResult return paths) to return a
deep copy of cached_result rather than the original reference (e.g., deepcopy
the dict when creating DiscoveryResult and also when writing into _cache) and
ensure you import/use copy.deepcopy (or equivalent) so nested service/trust data
cannot be mutated through the returned DiscoveryResult.
```python
    url = f"https://{domain}/.well-known/agent-discovery.json"
    try:
        req = urllib.request.Request(url, headers={"User-Agent": "agent-discovery/0.1"})
        with urllib.request.urlopen(req, timeout=timeout) as resp:
```
🤖 🔴 Blocker: SSRF vulnerability — domain not validated before outbound fetch.
discover_services(domain) builds https://{domain}/.well-known/agent-discovery.json with zero validation. Any caller that passes user-supplied or externally-configured input can probe internal hosts (localhost, RFC-1918 IPs, metadata services). This is a standard SSRF vector.
Minimum required fix:
- Reject non-FQDN input (no IP literals, no embedded schemes/userinfo/ports).
- Optionally resolve the hostname and reject private/loopback/link-local addresses.
- Narrow the `except Exception` to specific types (`URLError`, `HTTPError`, `TimeoutError`, `json.JSONDecodeError`) so programming bugs are not silently swallowed and negative-cached.

The same code is duplicated in classic/original_autogpt/autogpt/utils/agent_discovery.py; fix both.
```diff
@@ -0,0 +1,111 @@
+"""Agent Discovery Protocol (ADP) v0.1 -- discover agent services at any domain.
```
🤖 🔴 Blocker: Module added but never imported or used anywhere in the codebase.
Neither autogpt/utils/agent_discovery.py nor classic/original_autogpt/autogpt/utils/agent_discovery.py is imported by any existing code. There are no call sites for discover_services(), no integration with the agent loop, tool system, or any other component. The PR description says 'agents discover services' but does not show where or how this is triggered.
Before this can merge:
- Show what agent code calls `discover_services`, when, and with what input.
- If this is a library utility, it needs to be wired up. Dead code should not be merged.
| """Discover agent services at a domain via the Agent Discovery Protocol. | ||
|
|
||
| Fetches /.well-known/agent-discovery.json from the given domain and returns | ||
| a parsed result. Returns None if the domain doesn't implement ADP. |
🤖 🟠 Should Fix: No tests included.
The PR adds a utility module with network I/O, caching logic, and error handling but ships zero tests. At minimum, unit tests should cover:
- Cache hit/miss behaviour
- Non-200 responses returning `None`
- Timeout returning `None` (not raising)
- Malformed JSON returning `None`
- `list_services`/`has_service`/`get_service` on a `DiscoveryResult`

Once domain validation is added (see SSRF comment), tests for rejected inputs are also needed.
```diff
@@ -0,0 +1,111 @@
+"""Agent Discovery Protocol (ADP) v0.1 -- discover agent services at any domain.
```
🤖 🟠 Should Fix: Code is duplicated verbatim across two locations.
autogpt/utils/agent_discovery.py and classic/original_autogpt/autogpt/utils/agent_discovery.py are byte-for-byte identical. If this utility is intentionally shared, the classic/ path should import from the parent module. If they are independently intended, explain why both are needed. Duplicated security-sensitive code is especially problematic as future fixes need to be applied twice.
majdyz left a comment
🤖 REQUEST CHANGES — PR #12756 (feat: Add Agent Discovery Protocol (ADP) support)
Thanks for the contribution! The code itself is clean and well-structured, but there are several blockers before this can merge.
Critical issues:

1. Dead code, no integration point. Neither file is imported or called from anywhere in the codebase. The PR description says agents "discover services" but doesn't show how or where this is triggered. Dead code that is security-sensitive should not be merged. Please show the integration: which agent component calls `discover_services`, on what trigger, and with what input source.
2. SSRF vulnerability. The `domain` parameter goes directly into `https://{domain}/...` with no validation. An attacker with any influence over the domain value can probe internal services (localhost, EC2 metadata at 169.254.169.254, RFC-1918 ranges). Add at minimum an FQDN regex check that rejects IP literals and embedded scheme/port-injection attempts. See inline comment on line 100.
3. No tests. A network-I/O utility with caching and error handling requires tests. Mocking `urllib.request.urlopen` and exercising cache hits, timeouts, 404 responses, and malformed JSON is expected coverage.
Should fix:

- Code duplicated verbatim in `autogpt/utils/` and `classic/original_autogpt/autogpt/utils/`. Security fixes will need to be applied to both files. Eliminate the duplication.
- `except Exception` too broad. It swallows programming bugs and still writes a negative cache entry. Narrow to `(urllib.error.URLError, urllib.error.HTTPError, TimeoutError, json.JSONDecodeError)`.
PR description is also thin — the summary says "Zero new deps" and "Spec: <link>" but does not explain Why (what problem it solves for AutoGPT users today), What was changed, or How it integrates. Please fill in the template sections.
CI is currently failing (flake8 E501 on line 68 in both files).
Great review, appreciate the thoroughness. You're right on all counts. Fixing; will push the update shortly.
```python
        )
        with urllib.request.urlopen(req, timeout=timeout) as resp:
```
```python
    except (
        urllib.error.URLError,
        urllib.error.HTTPError,
        TimeoutError,
        json.JSONDecodeError,
    ) as e:
        logger.debug("ADP: no discovery at %s (%s)", domain, e)
```
```python
    def test_rejects_localhost(self):
        with pytest.raises(ValueError, match="blocked address"):
            discover_services("localhost")
```
Actionable comments posted: 3
♻️ Duplicate comments (1)
autogpt/utils/agent_discovery.py (1)
79-83: ⚠️ Potential issue | 🟠 Major: Cached discovery payloads are still shared across callers.

`DiscoveryResult` keeps the incoming dicts by reference, so mutating `result.services[...]` or `get_service(...)` can still poison `_cache[domain]` for later readers. This looks like the same mutability leak that was already called out earlier. Clone the payload on entry to `DiscoveryResult` (or before caching/returning it) so each caller gets an isolated copy.

Possible fix

```diff
+from copy import deepcopy
 @@
 class DiscoveryResult:
     """Parsed agent discovery document."""

     def __init__(self, data: dict[str, Any]) -> None:
-        self._data = data
+        self._data = deepcopy(data)
         self._services = {
-            s["name"]: s for s in data.get("services", [])
+            s["name"]: s for s in self._data.get("services", [])
         }
```

Also applies to: 94-103, 156-157, 169-171
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt/utils/agent_discovery.py` around lines 79 - 83, DiscoveryResult is storing the incoming dicts by reference which lets callers mutate shared cached payloads (self._data and self._services) and poison _cache[domain]; fix by deep-copying the input payload at entry (e.g., in DiscoveryResult.__init__) or immediately before caching/returning so each caller receives an isolated copy, and apply the same defensive copy behavior to other places mentioned (the code that constructs _services and any functions that populate or return _data or service dicts such as get_service) to prevent mutability leaks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@autogpt/tests/test_agent_discovery.py`:
- Around line 89-91: The test test_rejects_localhost is asserting the wrong
error because discover_services("localhost") fails in _validate_domain's FQDN
regex and raises an "invalid domain" error before reaching the
blocked-address/DNS branch; either change the expected match to "invalid domain"
to reflect the current validation, OR update the test to pass a syntactically
valid FQDN (e.g., "example.local") and mock socket.getaddrinfo so
discover_services hits the DNS/SSRF blocked-address logic and then assert
"blocked address"; reference test_rejects_localhost, discover_services, and
_validate_domain and use mocked getaddrinfo when choosing the second option.
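The second option above (valid FQDN plus mocked `socket.getaddrinfo`) can be sketched as below; `resolve_public` is a stand-in for the DNS/SSRF branch, not the PR's actual function.

```python
import ipaddress
import socket
from unittest import mock


def resolve_public(domain: str) -> str:
    """Stand-in for the DNS/SSRF branch: resolve, reject non-public IPs."""
    infos = socket.getaddrinfo(domain, 443, proto=socket.IPPROTO_TCP)
    ip = ipaddress.ip_address(infos[0][4][0])
    if not ip.is_global:
        raise ValueError(f"blocked address: {ip}")
    return str(ip)


def run_blocked_address_case():
    """Mock DNS to return an RFC 1918 address and capture the rejection."""
    fake = [(socket.AF_INET, socket.SOCK_STREAM, 6, "", ("10.0.0.5", 443))]
    with mock.patch("socket.getaddrinfo", return_value=fake):
        try:
            resolve_public("internal.example.com")
        except ValueError as e:
            return str(e)
    return None
```

In a real pytest file the body of `run_blocked_address_case` would sit inside a `pytest.raises(ValueError, match="blocked address")` block.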
In `@autogpt/utils/agent_discovery.py`:
- Around line 147-166: The current flow validates domain once via
_validate_domain but then uses urllib.request.urlopen (with default
HTTPRedirectHandler) which can follow 3xx redirects to internal hosts; modify
the fetch logic around urllib.request.Request and urlopen so redirects are not
automatically followed: create a custom opener or HTTPRedirectHandler that
either disables automatic redirects or intercepts redirect responses,
re-validate each redirect target with _validate_domain before following, and
only perform a manual follow when _validate_domain succeeds; update the code
paths that construct the request and use urlopen (the block creating Request and
calling urlopen) to use this custom opener/handler and preserve existing caching
behavior with _cache/_CACHE_TTL and returning DiscoveryResult as before.
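A minimal sketch of the redirect-blocking opener described above; the handler name is illustrative, and a fuller implementation could instead re-validate `newurl` and follow it manually when validation passes.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow any 3xx; the caller decides whether to
    re-validate the target and retry."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        raise urllib.error.HTTPError(
            req.full_url, code, "redirect blocked", headers, fp
        )


# Using this opener, any 3xx response surfaces as HTTPError instead of
# silently fetching a possibly-internal redirect target:
opener = urllib.request.build_opener(NoRedirect)
# opener.open(url, timeout=5.0)
```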
- Around line 172-181: Currently every caught error in the discovery try/except
(the except handling urllib.error.URLError, urllib.error.HTTPError,
TimeoutError, json.JSONDecodeError) writes a negative cache entry into _cache
when use_cache is true, causing transient failures to be treated as
authoritative for _CACHE_TTL; change this so only explicit "no ADP" HTTP misses
are negative-cached: detect urllib.error.HTTPError and only set _cache[domain] =
(time.time(), None) for status codes 404 and 410 (or whatever canonical "no ADP"
codes you use) inside the HTTPError branch, and for other transient errors
(URLError, TimeoutError, JSONDecodeError) either do not write to _cache or write
with a much shorter TTL (e.g., a separate short-failure timestamp) so the
discovery function (agent_discovery.py) and its use_cache/_CACHE_TTL logic no
longer become sticky on transient failures.
---
Duplicate comments:
In `@autogpt/utils/agent_discovery.py`:
- Around line 79-83: DiscoveryResult is storing the incoming dicts by reference
which lets callers mutate shared cached payloads (self._data and self._services)
and poison _cache[domain]; fix by deep-copying the input payload at entry (e.g.,
in DiscoveryResult.__init__) or immediately before caching/returning so each
caller receives an isolated copy, and apply the same defensive copy behavior to
other places mentioned (the code that constructs _services and any functions
that populate or return _data or service dicts such as get_service) to prevent
mutability leaks.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d386b590-21be-4aaf-9928-903b14f268de
📒 Files selected for processing (2)
- autogpt/tests/test_agent_discovery.py
- autogpt/utils/agent_discovery.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Seer Code Review
- GitHub Check: conflicts
- GitHub Check: Check PR Status
- GitHub Check: Analyze (python)
🧰 Additional context used
🧠 Learnings (7)
📓 Common learnings
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12536
File: autogpt_platform/frontend/src/app/api/openapi.json:5770-5790
Timestamp: 2026-03-24T21:25:15.983Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12536`
File: autogpt_platform/frontend/src/app/api/openapi.json
Learning: The OpenAPI spec file is auto-generated; per established convention, endpoints generally declare only 200/201, 401, and 422 responses. Do not suggest adding explicit 403/404 response entries for single operations unless planning a repo-wide spec update. Prefer clarifying such behaviors in endpoint descriptions/docstrings instead of altering response maps.
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12536
File: autogpt_platform/frontend/src/app/api/openapi.json:5732-5752
Timestamp: 2026-03-24T21:27:22.326Z
Learning: Repo: Significant-Gravitas/AutoGPT — Preference: Do not add explicit 403/404 entries to FastAPI route decorators for admin endpoints just to influence OpenAPI. Keep openapi.json autogenerated and use route docstrings to document admin-only (403) and not-found (404) behavior; rely on tests for enforcement. File context: autogpt_platform/backend/backend/api/features/admin/store_admin_routes.py. PR `#12536`.
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12284
File: autogpt_platform/frontend/src/app/api/openapi.json:11897-11900
Timestamp: 2026-03-04T23:58:18.476Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12284`
Backend/frontend OpenAPI codegen convention: In backend/api/features/store/model.py, the StoreSubmission and StoreSubmissionAdminView models define submitted_at: datetime | None, changes_summary: str | None, and instructions: str | None with no default. This is intentional to produce “required but nullable” fields in OpenAPI (properties appear in required[] and use anyOf [type, null]). This matches Prisma’s submittedAt DateTime? and changesSummary String?. Do not flag this as a required/nullable mismatch.
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/frontend/CLAUDE.md:0-0
Timestamp: 2026-04-08T17:26:23.306Z
Learning: Applies to autogpt_platform/frontend/**/AGENTS.md : Document agent responsibilities and capabilities in AGENTS.md with clear descriptions of what each agent does
📚 Learning: 2026-04-08T17:26:28.252Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/frontend/src/tests/CLAUDE.md:0-0
Timestamp: 2026-04-08T17:26:28.252Z
Learning: Applies to autogpt_platform/frontend/src/tests/**/AGENTS.md : Maintain AGENTS.md as the single source of truth for agent specifications and interfaces across the codebase
Applied to files:
autogpt/tests/test_agent_discovery.py
📚 Learning: 2026-04-08T17:26:28.252Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/frontend/src/tests/CLAUDE.md:0-0
Timestamp: 2026-04-08T17:26:28.252Z
Learning: Applies to autogpt_platform/frontend/src/tests/**/AGENTS.md : Document all agents in AGENTS.md with their name, description, input/output schemas, and usage examples
Applied to files:
autogpt/tests/test_agent_discovery.py
📚 Learning: 2026-04-08T17:26:28.252Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/frontend/src/tests/CLAUDE.md:0-0
Timestamp: 2026-04-08T17:26:28.252Z
Learning: Applies to autogpt_platform/frontend/src/tests/**/AGENTS.md : Include clear usage examples in AGENTS.md for each agent to facilitate integration and reduce onboarding time
Applied to files:
autogpt/tests/test_agent_discovery.py
📚 Learning: 2026-03-19T15:10:50.676Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12483
File: autogpt_platform/backend/backend/copilot/tools/test_dry_run.py:298-303
Timestamp: 2026-03-19T15:10:50.676Z
Learning: When using Python’s `unittest.mock.patch` in tests, choose the patch target based on how the imported name is resolved:
- If the code under test uses an **eager/module-level import** (e.g., `from foo.bar import baz` at module top), patch **the module where the name is looked up** (i.e., where it is used in the SUT), e.g. `patch("mymodule.baz")`.
- If the code under test uses a **lazy import** executed later (e.g., `from foo.bar import baz` inside a function/branch), patch **the source module** (e.g., `patch("foo.bar.baz")`) because the late `from ... import` will read the (potentially patched) name from the source module at call time.
For a concrete example: if `simulate_block` is imported inside an `if dry_run:` block in the SUT, then the correct test patch target is the source module path for `simulate_block` as it exists at call time (e.g., `patch("backend.executor.simulator.simulate_block")`), not the test file’s import location.
Applied to files:
autogpt/tests/test_agent_discovery.py
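The eager-vs-lazy patch-target rule above can be demonstrated with throwaway in-memory modules. Every name here (helpers, compute, sut_eager, sut_lazy) is hypothetical, built only for illustration:

```python
import sys
import types
from unittest.mock import patch

# Fake "source" module: helpers.compute()
helpers = types.ModuleType("helpers")
helpers.compute = lambda: "real"
sys.modules["helpers"] = helpers

# SUT with an eager, module-level import: patch where the name is *used*.
sut_eager = types.ModuleType("sut_eager")
exec("from helpers import compute\ndef run():\n    return compute()", sut_eager.__dict__)
sys.modules["sut_eager"] = sut_eager

# SUT with a lazy import inside the function: patch the *source* module.
sut_lazy = types.ModuleType("sut_lazy")
exec("def run():\n    from helpers import compute\n    return compute()", sut_lazy.__dict__)
sys.modules["sut_lazy"] = sut_lazy

with patch("sut_eager.compute", return_value="mocked"):
    eager_result = sut_eager.run()   # resolves via sut_eager's globals

with patch("helpers.compute", return_value="mocked"):
    lazy_result = sut_lazy.run()     # late import reads the patched source
```

Patching `helpers.compute` would not affect `sut_eager.run()`, because its module-level import already bound the real function into `sut_eager`'s namespace.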
📚 Learning: 2026-04-03T13:53:33.653Z
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12206
File: autogpt_platform/backend/snapshots/v2_unhandled_exception_500:1-5
Timestamp: 2026-04-03T13:53:33.653Z
Learning: In Significant-Gravitas/AutoGPT (autogpt_platform), the catch-all `Exception` handler in `autogpt_platform/backend/backend/api/utils/exceptions.py` (`_handle_error()`) intentionally surfaces `str(exc)` as the `detail` field in HTTP 500 responses for non-Prisma errors. This is by design: errors are logged server-side, and the detail helps API consumers report issues. Only `PrismaError` responses are sanitized (see commit ce6910b4a). Do not flag `str(exc)` in the generic 500 handler as an information disclosure issue; the snapshot `autogpt_platform/backend/snapshots/v2_unhandled_exception_500` ("connection refused") correctly reflects this behavior.
Applied to files:
autogpt/utils/agent_discovery.py
📚 Learning: 2026-03-18T14:03:32.534Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12473
File: autogpt_platform/backend/backend/copilot/tools/agent_browser_integration_test.py:86-195
Timestamp: 2026-03-18T14:03:32.534Z
Learning: In Significant-Gravitas/AutoGPT, the integration tests in `autogpt_platform/backend/backend/copilot/tools/agent_browser_integration_test.py` intentionally use real external URLs (example.com, httpbin.org). They are gated with `pytest.mark.skipif(shutil.which("agent-browser") is None, ...)`, so they are automatically skipped in standard CI where agent-browser is not installed. They are designed to be run explicitly inside the Docker environment to verify that system Chromium (AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium) actually launches and can fetch pages end-to-end. Do not flag the use of real network calls in these tests as a flakiness concern.
Applied to files:
autogpt/utils/agent_discovery.py
🪛 Ruff (0.15.9)
autogpt/utils/agent_discovery.py
[warning] 68-68: Loop control variable family not used within loop body
Rename unused family to _family
(B007)
[error] 163-165: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
[error] 166-166: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
🔇 Additional comments (1)
autogpt/tests/test_agent_discovery.py (1)
122-223: Nice coverage on the fetch and cache branches. These tests pin the 200/404/timeout/malformed-JSON paths plus positive and negative cache behavior well.
…e cache, 2xx support
Good catch from Sentry — the pinned-IP HTTPS request would fail SSL cert validation since certs are issued for domains, not IPs. Fixed: switched from urllib.request to http.client.HTTPSConnection, connecting to the pinned IP but wrapping the socket with server_hostname=domain so SNI and certificate verification use the domain.
conn.connect()
conn.sock = ssl_ctx.wrap_socket(
    conn.sock,
    server_hostname=domain,
)
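A self-contained sketch of this domain-pinned TLS technique: TCP targets the resolved IP, while SNI and certificate checks stay bound to the domain. The class name and the TEST-NET IP 203.0.113.7 are illustrative, not taken from the PR:

```python
import http.client
import socket
import ssl

class DomainPinnedHTTPSConnection(http.client.HTTPSConnection):
    """Connect TCP to a pinned IP while TLS SNI and certificate
    verification still use the original domain."""

    def __init__(self, pinned_ip, domain, **kwargs):
        ctx = kwargs.pop("context", None) or ssl.create_default_context()
        super().__init__(pinned_ip, context=ctx, **kwargs)
        self.sni_domain = domain
        self._ctx = ctx

    def connect(self):
        # TCP targets the pinned IP (self.host), defeating DNS rebinding,
        # while server_hostname makes SNI/cert checks use the domain.
        sock = socket.create_connection((self.host, self.port), self.timeout)
        self.sock = self._ctx.wrap_socket(sock, server_hostname=self.sni_domain)

# No network needed to show the split between TCP target and TLS identity:
conn = DomainPinnedHTTPSConnection("203.0.113.7", "example.com", timeout=5)
```

Certificate validation succeeds because `ssl.create_default_context()` checks the hostname against `server_hostname`, not against the address the socket connected to.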
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@autogpt/utils/agent_discovery.py`:
- Around line 125-130: The payload is being cached before validating its shape,
causing malformed documents (e.g., {"services": {}}) to be stored and later
replayed; change the flow so you construct and validate a DiscoveryResult from
the raw dict before writing to _cache (use the DiscoveryResult(...) constructor
inside discover_services or wherever caching occurs), have
DiscoveryResult.__init__ raise ValueError on invalid ADP shapes (instead of
failing later), and ensure discover_services catches that ValueError and returns
None rather than caching the bad payload; update references to _data and
_services (in DiscoveryResult.__init__) to be set only after validation
succeeds.
- Around line 216-229: The code currently builds a URL using the pinned IP and
relies on urllib opener/_NoRedirectHandler and Request, which causes TLS SNI and
certificate verification to use the IP (failing cert validation); replace this
by creating a custom HTTPSConnection (or HTTPAdapter-like connector) that
connects to pinned_ip but calls connect/ssl_wrap_socket with server_hostname set
to the original domain so SNI and cert verification use the domain, then use
that connection for the request path (keep the Request headers including "Host":
domain) instead of swapping the URL host; update the code that constructs url,
opener, and the with opener.open(req, ...) flow to use this custom connection so
certificate validation succeeds while still pinning the TCP endpoint.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d3de5c0a-70e7-4170-bff8-08469f99ab08
📒 Files selected for processing (2)
- autogpt/tests/test_agent_discovery.py
- autogpt/utils/agent_discovery.py
✅ Files skipped from review due to trivial changes (1)
- autogpt/tests/test_agent_discovery.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Seer Code Review
- GitHub Check: Check PR Status
- GitHub Check: Analyze (python)
🧰 Additional context used
🧠 Learnings
📓 Common learnings
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12536
File: autogpt_platform/frontend/src/app/api/openapi.json:5770-5790
Timestamp: 2026-03-24T21:25:15.983Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12536`
File: autogpt_platform/frontend/src/app/api/openapi.json
Learning: The OpenAPI spec file is auto-generated; per established convention, endpoints generally declare only 200/201, 401, and 422 responses. Do not suggest adding explicit 403/404 response entries for single operations unless planning a repo-wide spec update. Prefer clarifying such behaviors in endpoint descriptions/docstrings instead of altering response maps.
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12536
File: autogpt_platform/frontend/src/app/api/openapi.json:5732-5752
Timestamp: 2026-03-24T21:27:22.326Z
Learning: Repo: Significant-Gravitas/AutoGPT — Preference: Do not add explicit 403/404 entries to FastAPI route decorators for admin endpoints just to influence OpenAPI. Keep openapi.json autogenerated and use route docstrings to document admin-only (403) and not-found (404) behavior; rely on tests for enforcement. File context: autogpt_platform/backend/backend/api/features/admin/store_admin_routes.py. PR `#12536`.
📚 Learning: 2026-03-24T10:35:39.146Z
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12536
File: autogpt_platform/backend/backend/api/features/admin/store_admin_routes.py:154-178
Timestamp: 2026-03-24T10:35:39.146Z
Learning: In Significant-Gravitas/AutoGPT autogpt_platform, the add-to-library endpoints (both public `add_marketplace_agent_to_library` in `backend/api/features/library/routes/agents.py` and admin `admin_add_agent_to_library` in `backend/api/features/admin/store_admin_routes.py`) intentionally return HTTP 201 even when the underlying helper (`add_store_agent_to_library` / `add_store_agent_to_library_as_admin`) may return an existing or restored LibraryAgent (idempotent behavior). Do not flag HTTP 201 on these endpoints as incorrect — this is an established and intentional codebase pattern.
Applied to files:
autogpt/utils/agent_discovery.py
📚 Learning: 2026-04-14T06:33:59.422Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12774
File: autogpt_platform/backend/backend/copilot/tools/e2b_sandbox.py:0-0
Timestamp: 2026-04-14T06:33:59.422Z
Learning: In `autogpt_platform/backend/backend/copilot/tools/e2b_sandbox.py`, the `asyncio.wait_for()` retry loop around `AsyncSandbox.create()` (introduced in PR `#12774`) can leak up to `_SANDBOX_CREATE_MAX_RETRIES - 1` (≤2) orphaned E2B sandboxes per hang incident because `wait_for` cancels only the client-side wait while E2B may complete server-side provisioning. With the default `on_timeout="pause"` lifecycle, leaked orphaned sandboxes are **paused** (not killed) when their original `end_at` is reached and persist indefinitely until explicitly killed — there is NO automatic E2B project-level cleanup. Operators must manage these manually or via their own cleanup jobs. The sandbox_id is not accessible from the timed-out coroutine, so recovery via `AsyncSandbox.connect(sandbox_id)` is not possible at timeout. This is an intentionally accepted trade-off; a proper fix is deferred to a follow-up PR. Do NOT flag the retry loop as a blocking issue.
Applied to files:
autogpt/utils/agent_discovery.py
🪛 Ruff (0.15.9)
autogpt/utils/agent_discovery.py
[warning] 91-93: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
[error] 221-227: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
🔇 Additional comments (1)
autogpt/utils/agent_discovery.py (1)
235-252: Nice split between authoritative misses and transient failures. Only negative-caching 404/410 avoids turning one timeout or bad payload into a one-hour sticky miss.
Both Sentry findings addressed:
- CRITICAL — Cache check moved before DNS
- HIGH — Tests now mock http.client.HTTPSConnection
The bots are doing excellent work. Keep them coming.
Actionable comments posted: 2
♻️ Duplicate comments (1)
autogpt/utils/agent_discovery.py (1)
264-269: ⚠️ Potential issue | 🟠 Major: Data cached before DiscoveryResult validates the schema.

Line 268 caches data before line 269 constructs DiscoveryResult(data). If the JSON is syntactically valid but semantically malformed (e.g., services contains objects without a "name" key), DiscoveryResult.__init__ will raise KeyError, which is not caught by the except block at lines 277-282. The exception propagates, but the bad data is already cached, poisoning future lookups.

Proposed fix: validate before caching

 if 200 <= resp.status < 300:
     data = json.loads(resp.read())
     conn.close()
+    result = DiscoveryResult(data)  # Validate schema first
     if use_cache:
-        _cache[domain] = (time.time(), data)
-    return DiscoveryResult(data)
+        _cache[domain] = (time.time(), data)
+    return result

Also consider catching KeyError/TypeError from DiscoveryResult construction and treating malformed schemas as a failed discovery (return None).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt/utils/agent_discovery.py` around lines 264 - 269, The code currently writes raw JSON into _cache[domain] before constructing DiscoveryResult(data), which can cache semantically invalid data; change the flow in the block handling successful HTTP responses so you first construct DiscoveryResult(data) (i.e., call DiscoveryResult(data) and let it validate) and only if that succeeds and use_cache is true, assign _cache[domain] = (time.time(), data); additionally wrap the DiscoveryResult construction in a try/except that catches KeyError and TypeError from DiscoveryResult.__init__ and treat those as a failed discovery (return None) instead of letting them propagate and poison the cache.
🧹 Nitpick comments (2)
autogpt/utils/agent_discovery.py (2)
107-119: Dead code: _NoRedirectHandler is unused.

This class was likely needed when the implementation used urllib.request, but after switching to http.client.HTTPSConnection (lines 227-252) with manual redirect handling (lines 254-262), this class is no longer referenced anywhere.

Proposed fix: remove the unused class

-class _NoRedirectHandler(urllib.request.HTTPRedirectHandler):
-    """Disable automatic redirect following to prevent SSRF bypass."""
-
-    def redirect_request(
-        self, req, fp, code, msg, headers, newurl
-    ):
-        raise urllib.error.HTTPError(
-            req.full_url,
-            code,
-            f"Redirect to {newurl} blocked (SSRF protection)",
-            headers,
-            fp,
-        )
-

You can also remove the unused urllib.request import at line 29 if no other code in this module uses it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt/utils/agent_discovery.py` around lines 107 - 119, The _NoRedirectHandler class is dead code and should be removed: delete the class definition named _NoRedirectHandler and any references to it, and also remove the now-unused urllib.request import if no other symbol in this module depends on urllib.request; ensure there are no remaining usages elsewhere in this file (search for "_NoRedirectHandler" and "urllib.request") before committing.
234-243: Consider documenting Python version compatibility for the _create_connection monkey-patch.

The monkey-patching of conn._create_connection on http.client.HTTPSConnection is a private implementation detail. While this pattern has remained stable across Python 3.10–3.14 (the supported range), the lack of explicit version documentation makes future maintenance harder.

Add a comment noting the Python versions this approach has been verified against, or consider a version check to fail gracefully if the private attribute structure changes unexpectedly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt/utils/agent_discovery.py` around lines 234 - 243, Document that the monkey-patch on conn._create_connection (used on http.client.HTTPSConnection) relies on private internals stable in Python 3.10–3.14 and either add a short comment stating the verified Python versions (e.g., "Verified on Python 3.10–3.14") next to the _orig_create/_pinned_create block or add a runtime check that inspects hasattr(conn, "_create_connection") and raises a clear, graceful error if the attribute shape differs; reference conn._create_connection, _pinned_create, pinned_ip and the HTTPSConnection usage so reviewers can find and update this behavior if Python internals change.
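A hedged sketch of the _create_connection pinning together with the graceful hasattr guard suggested above. The function name is hypothetical, and the demonstration stubs the socket factory so no network is touched:

```python
import http.client

def pin_connection(conn, pinned_ip):
    """Pin the TCP endpoint to pinned_ip while conn.host (the domain)
    keeps driving SNI and certificate checks. Relies on the private
    _create_connection hook of http.client (stable on CPython 3.10-3.14)."""
    if not hasattr(conn, "_create_connection"):
        raise RuntimeError("http.client internals changed; refusing to pin")
    orig = conn._create_connection

    def pinned(address, *args, **kwargs):
        # Swap in the pinned IP; keep the port from the original address.
        return orig((pinned_ip, address[1]), *args, **kwargs)

    conn._create_connection = pinned
    return conn

# Offline demonstration: stub out the real socket factory and confirm the
# wrapper rewrites the address tuple to the pinned IP.
conn = http.client.HTTPSConnection("example.com")
seen = {}
conn._create_connection = lambda address, *a, **k: seen.setdefault("addr", address)
pin_connection(conn, "203.0.113.7")
conn._create_connection(("example.com", 443))
```

By default `_create_connection` is `socket.create_connection`, so in real use the wrapper dials the pinned IP while `conn.host` still names the domain for TLS.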
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@autogpt/utils/agent_discovery.py`:
- Around line 267-268: The assignment "_cache[domain] = (time.time(), data)" is
over-indented under the "if use_cache:" check; align it to the same indentation
level as the "if" block body (single indent) so the statement is executed
conditionally as intended—locate the occurrence in agent_discovery.py near the
"if use_cache:" block and reduce the extra indentation so the assignment lines
up with other statements inside that block.
- Around line 80-104: The _resolve_and_validate function’s signature/docstring
promise a (ip, family_str) tuple but currently only returns a string and the
initial ValueError on socket.gaierror is not exception-chained; update the
function to (1) raise ValueError(...) from the caught socket.gaierror (use
"raise ValueError(...) from e") and (2) return a two-tuple matching the
annotation and docstring: compute the address family string from the chosen
address (use the _family value from addrs to map socket.AF_INET/AF_INET6 to e.g.
"ipv4"/"ipv6") and return (first_ip, family_str); keep using
_is_blocked_ip(ip_str) to filter addresses and reference
addrs/_family/first_ip/_is_blocked_ip/_resolve_and_validate when making changes.
---
Duplicate comments:
In `@autogpt/utils/agent_discovery.py`:
- Around line 264-269: The code currently writes raw JSON into _cache[domain]
before constructing DiscoveryResult(data), which can cache semantically invalid
data; change the flow in the block handling successful HTTP responses so you
first construct DiscoveryResult(data) (i.e., call DiscoveryResult(data) and let
it validate) and only if that succeeds and use_cache is true, assign
_cache[domain] = (time.time(), data); additionally wrap the DiscoveryResult
construction in a try/except that catches KeyError and TypeError from
DiscoveryResult.__init__ and treat those as a failed discovery (return None)
instead of letting them propagate and poison the cache.
---
Nitpick comments:
In `@autogpt/utils/agent_discovery.py`:
- Around line 107-119: The _NoRedirectHandler class is dead code and should be
removed: delete the class definition named _NoRedirectHandler and any references
to it, and also remove the now-unused urllib.request import if no other symbol
in this module depends on urllib.request; ensure there are no remaining usages
elsewhere in this file (search for "_NoRedirectHandler" and "urllib.request")
before committing.
- Around line 234-243: Document that the monkey-patch on conn._create_connection
(used on http.client.HTTPSConnection) relies on private internals stable in
Python 3.10–3.14 and either add a short comment stating the verified Python
versions (e.g., "Verified on Python 3.10–3.14") next to the
_orig_create/_pinned_create block or add a runtime check that inspects
hasattr(conn, "_create_connection") and raises a clear, graceful error if the
attribute shape differs; reference conn._create_connection, _pinned_create,
pinned_ip and the HTTPSConnection usage so reviewers can find and update this
behavior if Python internals change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3bf98ab4-dab5-49a4-8a00-997adb8cd769
📒 Files selected for processing (1)
autogpt/utils/agent_discovery.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Seer Code Review
- GitHub Check: Check PR Status
🧰 Additional context used
🧠 Learnings
📚 Learning: 2026-04-11T23:51:20.769Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12750
File: autogpt_platform/backend/backend/copilot/sdk/file_tools.py:0-0
Timestamp: 2026-04-11T23:51:20.769Z
Learning: In `autogpt_platform/backend/backend/copilot/sdk/file_tools.py`, the TOCTOU gap between `is_allowed_local_path(resolved, sdk_cwd)` and the subsequent `open()` in `_handle_write_non_e2b` (and analogous read/edit handlers) is an accepted trade-off. In non-E2B mode the SDK working directory is ephemeral and user-scoped (created per session under `/tmp`), making the practical attack surface minimal. The E2B sandbox already uses `readlink -f` for defense-in-depth. Additional TOCTOU hardening (e.g., `O_NOFOLLOW`) should be reconsidered only if/when persistent workspaces are introduced. Do not flag this as a blocking issue for non-E2B ephemeral workspace paths.
Applied to files:
autogpt/utils/agent_discovery.py
📚 Learning: 2026-04-12T07:32:34.139Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12747
File: autogpt_platform/backend/backend/copilot/sdk/openrouter_compat_proxy.py:329-335
Timestamp: 2026-04-12T07:32:34.139Z
Learning: In Significant-Gravitas/AutoGPT, accessing `site._server.sockets[0].getsockname()` on an `aiohttp.web.TCPSite` is the established codebase pattern for retrieving the actual bound port after an ephemeral (`port=0`) `site.start()` call. No public aiohttp API exists for this. The pattern is used consistently in `autogpt_platform/backend/backend/copilot/sdk/openrouter_compat_proxy.py`, `cli_openrouter_compat_test.py`, and `openrouter_compat_proxy_test.py`. Do not flag access to `site._server` as a private-API concern in this repository.
Applied to files:
autogpt/utils/agent_discovery.py
🪛 Ruff (0.15.9)
autogpt/utils/agent_discovery.py
[warning] 91-93: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🔇 Additional comments (1)
autogpt/utils/agent_discovery.py (1)
46-56: SSRF protections and cache isolation are well-implemented.
- Comprehensive blocked networks list covering IPv4/IPv6 private, loopback, and link-local ranges
- DNS resolution validates all returned addresses before use
- DNS-pinning via the `_create_connection` override prevents TOCTOU/rebinding attacks
- `DiscoveryResult` properly uses `deepcopy` on init and accessors to prevent cache mutation
- Redirect blocking at lines 254-262 handles 3xx responses correctly
- Only 404/410 are negative-cached (lines 274-275); transient failures return `None` without caching

Also applies to: 59-65, 122-165, 195-212, 254-262, 271-286
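The blocked-network check praised above can be sketched with the standard `ipaddress` module; the range list here is illustrative and the PR's actual list may differ:

```python
import ipaddress

# Illustrative private/reserved/loopback/link-local ranges (IPv4 and IPv6).
_BLOCKED_NETWORKS = [
    ipaddress.ip_network(n)
    for n in (
        "0.0.0.0/8", "10.0.0.0/8", "100.64.0.0/10", "127.0.0.0/8",
        "169.254.0.0/16", "172.16.0.0/12", "192.168.0.0/16",
        "::1/128", "fc00::/7", "fe80::/10",
    )
]


def is_blocked_ip(ip_str: str) -> bool:
    """Return True if the address falls in a blocked range.

    Version-mismatched membership tests (v6 address vs. v4 network)
    simply return False, so one flat list covers both families.
    """
    addr = ipaddress.ip_address(ip_str)
    return any(addr in net for net in _BLOCKED_NETWORKS)
```

Every address returned by DNS must pass this check before any of them is used, otherwise an attacker-controlled record can point the fetch at internal infrastructure.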
Actionable comments posted: 3
♻️ Duplicate comments (1)
autogpt/utils/agent_discovery.py (1)
`110-115`: ⚠️ Potential issue | 🟠 Major

Validate the ADP document shape in `DiscoveryResult`.

`json.loads()` can return a list, string, or number. Those currently hit `self._data.get(...)` and raise `AttributeError`, which bypasses the local `except (KeyError, TypeError)` and escapes to callers. `{}` is also inconsistent today: the first fetch returns a `DiscoveryResult`, but the cache-hit branch treats the cached empty dict as a miss because it relies on truthiness. Make `DiscoveryResult.__init__` reject malformed payloads up front and handle `ValueError` here as another "no ADP" case.

Suggested hardening:

```diff
 class DiscoveryResult:
     """Parsed agent discovery document."""

     def __init__(self, data: dict[str, Any]) -> None:
+        if not isinstance(data, dict):
+            raise ValueError("ADP payload must be a JSON object")
+        services = data.get("services")
+        if not isinstance(services, list):
+            raise ValueError("ADP payload must contain a services list")
+
         self._data = deepcopy(data)
-        self._services = {
-            s["name"]: s
-            for s in self._data.get("services", [])
-        }
+        self._services = {}
+        for service in self._data["services"]:
+            name = service.get("name") if isinstance(service, dict) else None
+            if not isinstance(name, str) or not name:
+                raise ValueError("ADP service entries must have a name")
+            self._services[name] = service
@@
-        except (KeyError, TypeError) as e:
+        except ValueError as e:
             logger.debug(
                 "ADP: malformed payload at %s (%s)",
                 domain,
                 e,
             )
```

Also applies to: 190-193, 255-263
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt/utils/agent_discovery.py` around lines 110 - 115, DiscoveryResult.__init__ must validate the ADP payload shape up front: ensure the incoming data is a dict and that "services" (if present) is a list of dicts; otherwise raise ValueError. Move deepcopy(data) after that validation, build self._services from self._data["services"] only when it's a list, and reject other types to avoid AttributeError when json.loads returns non-dict values. Also update callers that treat cache hits/misses (the branch that reads cached ADP) to treat ValueError the same as a "no ADP" case (instead of relying on truthiness of {}), i.e., catch ValueError and handle it as a cache miss/no-ADP scenario.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@autogpt/utils/agent_discovery.py`:
- Around line 250-265: Prevent unbounded allocation by enforcing a maximum ADP
body size before json.loads() and before caching: define a MAX_ADP_BODY_BYTES
constant and when handling the successful response in agent_discovery.py (the
block using resp.read(), json.loads(), DiscoveryResult, and _cache), first check
resp.getheader("Content-Length") and reject (log and return None) if it exceeds
the max; if no Content-Length, read the body in bounded chunks up to
MAX_ADP_BODY_BYTES and stop/abort if the body grows past the limit, then only
call json.loads() and construct DiscoveryResult if within the limit, and ensure
oversized responses are never written to _cache even if use_cache is true.
- Around line 79-104: Resolve-and-validate currently returns only the first
resolved IP which causes failures when that one is unreachable; change
_resolve_and_validate to collect and return a list of all validated IP strings
(iterating addrs and using _is_blocked_ip to filter) instead of returning
addrs[0][4][0], and update callers (notably _pinned_create) to iterate over that
list and attempt connection to each pinned address in turn before failing (try
each validated IP, catching/continuing on connection errors, and only
raise/return None if all attempts fail). Ensure callers that expect a single IP
are updated to handle the list (or try sequentially) and preserve the same
security checks already implemented.
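The body-size cap requested in the first item can be sketched as a bounded read; `read_bounded_json` and `MAX_ADP_BODY_BYTES` are illustrative names, and `resp` is any object with the `getheader`/`read` shape of `http.client.HTTPResponse`:

```python
import io
import json

MAX_ADP_BODY_BYTES = 1 * 1024 * 1024  # 1 MiB cap, per the review suggestion
_CHUNK_SIZE = 64 * 1024


def read_bounded_json(resp):
    """Read a JSON body without unbounded allocation (sketch).

    Rejects up front when Content-Length exceeds the cap, otherwise reads
    in bounded chunks and aborts as soon as the body grows past the limit.
    Returns the parsed value, or None for oversized bodies; callers must
    not cache the None outcome as a valid document.
    """
    declared = resp.getheader("Content-Length")
    if declared is not None and int(declared) > MAX_ADP_BODY_BYTES:
        return None
    buf = io.BytesIO()
    while True:
        chunk = resp.read(_CHUNK_SIZE)
        if not chunk:
            break
        buf.write(chunk)
        if buf.tell() > MAX_ADP_BODY_BYTES:
            return None  # oversized body: abort before json.loads()
    return json.loads(buf.getvalue())
```

Checking the declared length first avoids reading anything from an obviously oversized response, while the chunked loop covers servers that omit or lie about Content-Length.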
---
Duplicate comments:
In `@autogpt/utils/agent_discovery.py`:
- Around line 110-115: DiscoveryResult.__init__ must validate the ADP payload
shape up front: ensure the incoming data is a dict and that "services" (if
present) is a list of dicts; otherwise raise ValueError. Move deepcopy(data)
after that validation, build self._services from self._data["services"] only
when it's a list, and reject other types to avoid AttributeError when json.loads
returns non-dict values. Also update callers that treat cache hits/misses (the
branch that reads cached ADP) to treat ValueError the same as a "no ADP" case
(instead of relying on truthiness of {}), i.e., catch ValueError and handle it
as a cache miss/no-ADP scenario.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8572a08b-59e8-4e1f-b447-f40d23ceb1c9
📒 Files selected for processing (2)
- autogpt/tests/test_agent_discovery.py
- autogpt/utils/agent_discovery.py
✅ Files skipped from review due to trivial changes (1)
- autogpt/tests/test_agent_discovery.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Seer Code Review
- GitHub Check: Check PR Status
🧰 Additional context used
🧠 Learnings (9)
📓 Common learnings
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12536
File: autogpt_platform/frontend/src/app/api/openapi.json:5770-5790
Timestamp: 2026-03-24T21:25:15.983Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12536`
File: autogpt_platform/frontend/src/app/api/openapi.json
Learning: The OpenAPI spec file is auto-generated; per established convention, endpoints generally declare only 200/201, 401, and 422 responses. Do not suggest adding explicit 403/404 response entries for single operations unless planning a repo-wide spec update. Prefer clarifying such behaviors in endpoint descriptions/docstrings instead of altering response maps.
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12536
File: autogpt_platform/frontend/src/app/api/openapi.json:5732-5752
Timestamp: 2026-03-24T21:27:22.326Z
Learning: Repo: Significant-Gravitas/AutoGPT — Preference: Do not add explicit 403/404 entries to FastAPI route decorators for admin endpoints just to influence OpenAPI. Keep openapi.json autogenerated and use route docstrings to document admin-only (403) and not-found (404) behavior; rely on tests for enforcement. File context: autogpt_platform/backend/backend/api/features/admin/store_admin_routes.py. PR `#12536`.
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12284
File: autogpt_platform/frontend/src/app/api/openapi.json:11897-11900
Timestamp: 2026-03-04T23:58:18.476Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12284`
Backend/frontend OpenAPI codegen convention: In backend/api/features/store/model.py, the StoreSubmission and StoreSubmissionAdminView models define submitted_at: datetime | None, changes_summary: str | None, and instructions: str | None with no default. This is intentional to produce “required but nullable” fields in OpenAPI (properties appear in required[] and use anyOf [type, null]). This matches Prisma’s submittedAt DateTime? and changesSummary String?. Do not flag this as a required/nullable mismatch.
📚 Learning: 2026-04-03T13:53:33.653Z
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12206
File: autogpt_platform/backend/snapshots/v2_unhandled_exception_500:1-5
Timestamp: 2026-04-03T13:53:33.653Z
Learning: In Significant-Gravitas/AutoGPT (autogpt_platform), the catch-all `Exception` handler in `autogpt_platform/backend/backend/api/utils/exceptions.py` (`_handle_error()`) intentionally surfaces `str(exc)` as the `detail` field in HTTP 500 responses for non-Prisma errors. This is by design: errors are logged server-side, and the detail helps API consumers report issues. Only `PrismaError` responses are sanitized (see commit ce6910b4a). Do not flag `str(exc)` in the generic 500 handler as an information disclosure issue; the snapshot `autogpt_platform/backend/snapshots/v2_unhandled_exception_500` ("connection refused") correctly reflects this behavior.
Applied to files:
autogpt/utils/agent_discovery.py
📚 Learning: 2026-03-18T14:03:32.534Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12473
File: autogpt_platform/backend/backend/copilot/tools/agent_browser_integration_test.py:86-195
Timestamp: 2026-03-18T14:03:32.534Z
Learning: In Significant-Gravitas/AutoGPT, the integration tests in `autogpt_platform/backend/backend/copilot/tools/agent_browser_integration_test.py` intentionally use real external URLs (example.com, httpbin.org). They are gated with `pytest.mark.skipif(shutil.which("agent-browser") is None, ...)`, so they are automatically skipped in standard CI where agent-browser is not installed. They are designed to be run explicitly inside the Docker environment to verify that system Chromium (AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium) actually launches and can fetch pages end-to-end. Do not flag the use of real network calls in these tests as a flakiness concern.
Applied to files:
autogpt/utils/agent_discovery.py
📚 Learning: 2026-03-24T21:25:15.983Z
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12536
File: autogpt_platform/frontend/src/app/api/openapi.json:5770-5790
Timestamp: 2026-03-24T21:25:15.983Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12536`
File: autogpt_platform/frontend/src/app/api/openapi.json
Learning: The OpenAPI spec file is auto-generated; per established convention, endpoints generally declare only 200/201, 401, and 422 responses. Do not suggest adding explicit 403/404 response entries for single operations unless planning a repo-wide spec update. Prefer clarifying such behaviors in endpoint descriptions/docstrings instead of altering response maps.
Applied to files:
autogpt/utils/agent_discovery.py
📚 Learning: 2026-03-24T10:35:39.146Z
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12536
File: autogpt_platform/backend/backend/api/features/admin/store_admin_routes.py:154-178
Timestamp: 2026-03-24T10:35:39.146Z
Learning: In Significant-Gravitas/AutoGPT autogpt_platform, the add-to-library endpoints (both public `add_marketplace_agent_to_library` in `backend/api/features/library/routes/agents.py` and admin `admin_add_agent_to_library` in `backend/api/features/admin/store_admin_routes.py`) intentionally return HTTP 201 even when the underlying helper (`add_store_agent_to_library` / `add_store_agent_to_library_as_admin`) may return an existing or restored LibraryAgent (idempotent behavior). Do not flag HTTP 201 on these endpoints as incorrect — this is an established and intentional codebase pattern.
Applied to files:
autogpt/utils/agent_discovery.py
📚 Learning: 2026-03-24T21:27:22.326Z
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12536
File: autogpt_platform/frontend/src/app/api/openapi.json:5732-5752
Timestamp: 2026-03-24T21:27:22.326Z
Learning: Repo: Significant-Gravitas/AutoGPT — Preference: Do not add explicit 403/404 entries to FastAPI route decorators for admin endpoints just to influence OpenAPI. Keep openapi.json autogenerated and use route docstrings to document admin-only (403) and not-found (404) behavior; rely on tests for enforcement. File context: autogpt_platform/backend/backend/api/features/admin/store_admin_routes.py. PR `#12536`.
Applied to files:
autogpt/utils/agent_discovery.py
📚 Learning: 2026-04-11T23:51:20.769Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12750
File: autogpt_platform/backend/backend/copilot/sdk/file_tools.py:0-0
Timestamp: 2026-04-11T23:51:20.769Z
Learning: In `autogpt_platform/backend/backend/copilot/sdk/file_tools.py`, the TOCTOU gap between `is_allowed_local_path(resolved, sdk_cwd)` and the subsequent `open()` in `_handle_write_non_e2b` (and analogous read/edit handlers) is an accepted trade-off. In non-E2B mode the SDK working directory is ephemeral and user-scoped (created per session under `/tmp`), making the practical attack surface minimal. The E2B sandbox already uses `readlink -f` for defense-in-depth. Additional TOCTOU hardening (e.g., `O_NOFOLLOW`) should be reconsidered only if/when persistent workspaces are introduced. Do not flag this as a blocking issue for non-E2B ephemeral workspace paths.
Applied to files:
autogpt/utils/agent_discovery.py
📚 Learning: 2026-04-14T06:33:59.422Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12774
File: autogpt_platform/backend/backend/copilot/tools/e2b_sandbox.py:0-0
Timestamp: 2026-04-14T06:33:59.422Z
Learning: In `autogpt_platform/backend/backend/copilot/tools/e2b_sandbox.py`, the `asyncio.wait_for()` retry loop around `AsyncSandbox.create()` (introduced in PR `#12774`) can leak up to `_SANDBOX_CREATE_MAX_RETRIES - 1` (≤2) orphaned E2B sandboxes per hang incident because `wait_for` cancels only the client-side wait while E2B may complete server-side provisioning. With the default `on_timeout="pause"` lifecycle, leaked orphaned sandboxes are **paused** (not killed) when their original `end_at` is reached and persist indefinitely until explicitly killed — there is NO automatic E2B project-level cleanup. Operators must manage these manually or via their own cleanup jobs. The sandbox_id is not accessible from the timed-out coroutine, so recovery via `AsyncSandbox.connect(sandbox_id)` is not possible at timeout. This is an intentionally accepted trade-off; a proper fix is deferred to a follow-up PR. Do NOT flag the retry loop as a blocking issue.
Applied to files:
autogpt/utils/agent_discovery.py
📚 Learning: 2026-04-12T07:32:34.139Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12747
File: autogpt_platform/backend/backend/copilot/sdk/openrouter_compat_proxy.py:329-335
Timestamp: 2026-04-12T07:32:34.139Z
Learning: In Significant-Gravitas/AutoGPT, accessing `site._server.sockets[0].getsockname()` on an `aiohttp.web.TCPSite` is the established codebase pattern for retrieving the actual bound port after an ephemeral (`port=0`) `site.start()` call. No public aiohttp API exists for this. The pattern is used consistently in `autogpt_platform/backend/backend/copilot/sdk/openrouter_compat_proxy.py`, `cli_openrouter_compat_test.py`, and `openrouter_compat_proxy_test.py`. Do not flag access to `site._server` as a private-API concern in this repository.
Applied to files:
autogpt/utils/agent_discovery.py
Thanks for another solid round. All three findings addressed in 12716ee and afb6176:

1. DiscoveryResult schema validation (the fetch path already catches the resulting `ValueError`)
2. Multi-IP failover
3. Body size cap (1 MiB)

New tests cover all three. Smoke-tested locally; CI will verify end-to-end.
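The multi-IP failover described in item 2 comes down to trying each validated, DNS-pinned address in order. A minimal sketch; `connect_any` and the injectable `connect` parameter are illustrative, not the PR's actual API:

```python
import socket
from typing import Callable, Sequence


def _default_connect(ip: str) -> socket.socket:
    return socket.create_connection((ip, 443), timeout=5.0)


def connect_any(
    ips: Sequence[str],
    connect: Callable[[str], object] = _default_connect,
):
    """Try each validated address in turn (sketch).

    `connect` is injected so the failover order can be tested without a
    network. Transport errors fall through to the next address; the last
    error is re-raised only when every address fails.
    """
    last_error: OSError | None = None
    for ip in ips:
        try:
            return connect(ip)
        except OSError as err:
            last_error = err
    raise last_error if last_error is not None else OSError("no addresses to try")
```

Because only addresses that already passed the blocked-IP check are handed in, failover never widens the security boundary.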
```python
    DiscoveryResult(cached_result)
    if cached_result
    else None
)
```
This comment was marked as outdated.
Good catch — valid bug. Fixed in 70c0c9a (+ regression test in ad121dc). An empty-dict ADP payload was cached but then treated as a miss on the next lookup, because the cache-hit branch relied on truthiness. Added an explicit presence check so cached empty documents are returned instead of re-fetched.
| except ( | ||
| OSError, | ||
| TimeoutError, | ||
| ssl.SSLError, | ||
| ) as e: | ||
| last_error = e | ||
| logger.debug( | ||
| "ADP: transport failure at %s via %s (%s)", |
This comment was marked as outdated.
Valid HIGH — real leak. Fixed in 89719a5 (+ regression tests in b5ca521). Refactored the per-IP attempt so the connection is closed on every failure path before moving on to the next address. Three regression tests added to the test suite.

Verified locally: repeated calls to a failing domain no longer accumulate file descriptors.
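The leak fix amounts to a `try`/`finally` around each per-IP attempt. A minimal sketch; `attempt_fetch`, `open_conn`, and `do_request` are hypothetical stand-ins for the real HTTPS setup and request:

```python
def attempt_fetch(ip, open_conn, do_request):
    """One per-IP attempt that cannot leak its connection (sketch).

    The connection is closed on every exit path, including the transport
    failures (OSError, TimeoutError, SSL errors) that feed the failover
    loop, so repeated failures cannot accumulate file descriptors.
    """
    conn = open_conn(ip)
    try:
        return do_request(conn)
    finally:
        conn.close()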
closing as this isn't used anywhere so is functionally adding dead code. please re-open if you find another integration path
Summary
Adds ADP support -- agents discover services at any domain via `/.well-known/agent-discovery.json`. Zero new deps.
Spec: https://github.com/walkojas-boop/agent-discovery-protocol
🤖 Generated with Claude Code