Skip to content

[dbus] add API to fetch netdiag TLVs#3149

Open
jdswensen wants to merge 1 commit intoopenthread:mainfrom
jdswensen:jds/dbus-netdiag-get
Open

[dbus] add API to fetch netdiag TLVs#3149
jdswensen wants to merge 1 commit intoopenthread:mainfrom
jdswensen:jds/dbus-netdiag-get

Conversation

@jdswensen
Copy link
Copy Markdown
Contributor

Add an API to fetch arbitrary network diagnostic TLVs. This new API mirrors the ot-ctl command networkdiagnostic get.

A generic approach was preferred over adding individual APIs so that an external application can support fetching future TLVs that OTBR may not support.

Depends on: openthread/openthread#12191

@gemini-code-assist
Copy link
Copy Markdown

Summary of Changes

Hello @jdswensen, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the OpenThread Border Router (OTBR) by adding a new D-Bus API for retrieving network diagnostic Type-Length-Value (TLV) data. This allows external applications to query detailed diagnostic information from the Thread network in a flexible and future-proof manner, aligning with existing ot-ctl capabilities.

Highlights

  • New D-Bus API: Introduced a new D-Bus API method, GetNetworkDiagnosticTlvs, to fetch arbitrary network diagnostic TLVs.
  • Generic Approach: The API is designed generically to support fetching future TLVs without requiring further API modifications, ensuring extensibility for external applications.
  • ot-ctl Mirror: This new API mirrors the functionality of the ot-ctl command networkdiagnostic get, providing a programmatic equivalent.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@jdswensen
Copy link
Copy Markdown
Contributor Author

Example command:

dbus-send --reply-timeout=30000 --print-reply --system --type="method_call" --dest=io.openthread.BorderRouter.wpan0 /io/openthread/BorderRouter/wpan0 io.openthread.BorderRouter.GetNetworkDiagnosticTlvs array:byte:0xfd,0xbb,0x3d,0x6a,0x2b,0xdf,0xd9,0x43,0x00,0x00,0x00,0xff,0xfe,0x00,0xbc,0x00 array:byte:1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18

And reply:

method return time=1764697971.150622 sender=:1.138 -> destination=:1.142 serial=12 reply_serial=2
   array of bytes [
      01 02 bc 00 02 01 0f 04 0a 00 00 00 00 00 4a 01 05 00 01 05 0a 4a 00 00
      00 00 00 01 00 00 01 06 08 0a 11 6d 69 41 ae 50 2f 07 66 08 04 0b 02 ab
      40 0b 0e 80 01 01 0d 09 bc 00 75 00 05 00 00 0e 10 03 14 00 40 fd 8d 24
      87 4c fd 00 01 05 04 bc 00 f1 00 07 02 11 40 03 08 00 07 fc 01 03 bc 00
      00 03 13 00 60 fd 8d 24 87 4c fd 00 02 00 00 00 00 01 03 bc 00 e0 0b 19
      81 01 5d 0d 14 bc 00 fd bb 3d 6a 2b df d9 43 b7 aa 91 97 32 44 7b 0d d1
      20 08 80 fd bb 3d 6a 2b df d9 43 00 00 00 ff fe 00 fc 11 fd 8d 24 87 4c
      fd 00 01 00 93 f1 7b 31 c9 b9 18 fd bb 3d 6a 2b df d9 43 00 00 00 ff fe
      00 fc 38 fd bb 3d 6a 2b df d9 43 00 00 00 ff fe 00 fc 10 fd bb 3d 6a 2b
      df d9 43 00 00 00 ff fe 00 fc 00 fd bb 3d 6a 2b df d9 43 00 00 00 ff fe
      00 bc 00 fd bb 3d 6a 2b df d9 43 b7 aa 91 97 32 44 7b 0d fe 80 00 00 00
      00 00 00 9c b2 1e fd 6f e9 01 81 09 24 00 00 00 00 00 00 00 00 00 00 00
      00 00 00 00 00 00 00 00 2c 00 00 00 00 00 00 00 00 00 00 01 ca 00 00 00
      00 10 00 11 01 00
   ]

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new D-Bus API for fetching network diagnostic TLVs, which is a valuable addition. The implementation is mostly well-structured, but there's a critical logic issue in how asynchronous responses are handled. The current implementation only processes the first response and ignores any subsequent ones, which would not work correctly in a network with multiple devices. I've provided a detailed comment and a code suggestion to fix this behavior. Additionally, there are a couple of minor suggestions to improve code clarity and consistency.

Comment on lines +242 to +270
void ThreadHelper::NetDiagCallback(otError aError, otMessage *aMessage, const otMessageInfo *aMessageInfo)
{
otError error = OT_ERROR_NONE;
const uint16_t replySize = 1024;
uint8_t reply[replySize] = {};
uint16_t written = 0;

OTBR_UNUSED_VARIABLE(aMessageInfo);
VerifyOrExit(mDiagHandler);

if (aError != OT_ERROR_NONE)
{
mDiagHandler(aError, {});
}
else
{
SuccessOrExit(error = otThreadGetRawDiagnosticTlvs(aMessage, replySize, reply, &written));
mNetworkDiagTlvResults.insert(mNetworkDiagTlvResults.end(), reply, reply + written);
mDiagHandler(error, mNetworkDiagTlvResults);
}

exit:
if (error != OT_ERROR_NONE && mDiagHandler)
{
mDiagHandler(error, {});
}

mDiagHandler = nullptr;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current implementation of NetDiagCallback incorrectly handles responses. It calls the handler and clears it upon receiving the first diagnostic response, which means any subsequent responses from other devices will be ignored. The otThreadSendDiagnosticGet function can trigger multiple callbacks (one per responding device) before a final timeout callback.

The correct approach is to accumulate results from all successful responses and only invoke the final handler mDiagHandler when the transaction is complete, which is typically signaled by a timeout (aError != OT_ERROR_NONE).

Additionally, allocating a 1KB buffer on the stack (uint8_t reply[replySize]) is risky and could lead to stack overflow. Using std::vector is a safer alternative.

The suggested fix below addresses both of these issues.

void ThreadHelper::NetDiagCallback(otError aError, otMessage *aMessage, const otMessageInfo *aMessageInfo)
{
    otError error = OT_ERROR_NONE;
    const uint16_t replySize = 1024;
    std::vector<uint8_t> reply(replySize);
    uint16_t written = 0;

    OTBR_UNUSED_VARIABLE(aMessageInfo);
    VerifyOrExit(mDiagHandler);

    if (aError == OT_ERROR_NONE)
    {
        SuccessOrExit(error = otThreadGetRawDiagnosticTlvs(aMessage, reply.size(), reply.data(), &written));
        reply.resize(written);
        mNetworkDiagTlvResults.insert(mNetworkDiagTlvResults.end(), reply.begin(), reply.end());
        // Wait for more responses or timeout.
        ExitNow();
    }

    // End of transaction (timeout or other error), reply with what we have.
    mDiagHandler(mNetworkDiagTlvResults.empty() ? aError : OT_ERROR_NONE, mNetworkDiagTlvResults);
    mDiagHandler = nullptr;

exit:
    if (error != OT_ERROR_NONE)
    {
        if (mDiagHandler)
        {
            mDiagHandler(error, {});
            mDiagHandler = nullptr;
        }
    }
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a mechanism for defining timeouts within the dbus architecture? I think that might be required as an input if the caller wants to define how long to wait for responses.

For now, I refactored things so the dbus API will reject multicast addresses.

Comment thread src/dbus/server/dbus_thread_object_rcp.cpp Outdated
Comment thread src/host/thread_helper.cpp Outdated
@jdswensen jdswensen force-pushed the jds/dbus-netdiag-get branch from 433ae3d to bf49e25 Compare December 19, 2025 19:19
@jdswensen
Copy link
Copy Markdown
Contributor Author

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new D-Bus API to fetch network diagnostic TLVs, which is a valuable addition. However, the current implementation has a few critical issues that need to be addressed. There's a use-after-free vulnerability due to incorrect handling of a buffer's lifetime in an asynchronous call, and a buffer over-read when processing the response message. Additionally, there's a logic error where a handler is not cleared in an error path, which could render the feature unusable after the first failure. I've provided detailed comments and suggestions to fix these problems.

Comment thread src/host/thread_helper.cpp
Comment thread src/host/thread_helper.cpp Outdated
Comment thread src/host/thread_helper.cpp Outdated
Comment thread src/dbus/server/dbus_thread_object_rcp.cpp Outdated
Add an API to fetch arbitrary network diagnostic TLVs. This new API
mirrors the `ot-ctl` command `networkdiagnostic get`.

A generic approach was preferred over adding individual APIs so that an
external application can support fetching future TLVs that OTBR may not
support.
@jdswensen jdswensen force-pushed the jds/dbus-netdiag-get branch from bf49e25 to d2f43b6 Compare January 5, 2026 19:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant