Commit c851c45

add todo on DNS
1 parent f62e5b8 commit c851c45

1 file changed

Lines changed: 67 additions & 0 deletions

TODO.md

@@ -38,3 +38,70 @@ Currently we identify trackers purely through DNS interception. Parsing the TLS
- SNI should be a secondary/fallback layer only
- Need to handle fragmented ClientHello messages across multiple TCP segments
- Consider whether the IP exposure trade-off is acceptable given TC's threat model (local VPN, no remote proxy to hide behind)

## Tracker blocking ambiguity: global DNS evidence vs per-app decisions

Tracker detection currently relies on two separate data stores with different levels of attribution:

- The `access` table stores `uid`, destination IP, block decision, and the `uncertain` flag.
- The `dns` table stores only `qname`, `aname`, `resource`, `time`, and `ttl`. It does **not** store `uid`.

That means `ServiceSinkhole` can ask `getQAName(uid, ip, ...)`, but the current implementation cannot truly answer "which hostname did this app resolve for this IP?". It can only answer "which hostnames have recently been seen for this IP globally?". The `uid` parameter is accepted by the method, but it is not used in the SQL query.
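
The gap can be illustrated with a minimal sketch. This is not the app's actual code: it uses Python's `sqlite3` with an in-memory table that only mirrors the `dns` columns listed above, and the helper `get_qnames`, the hostnames, the IP, and the uid values are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-in for the `dns` table: there is no `uid` column.
conn.execute(
    "CREATE TABLE dns (time INTEGER, qname TEXT, aname TEXT, resource TEXT, ttl INTEGER)"
)

# Two hypothetical apps resolve different hostnames that share one IP.
conn.executemany(
    "INSERT INTO dns VALUES (?, ?, ?, ?, ?)",
    [
        (1000, "tracker.example", "tracker.example", "203.0.113.7", 300),  # seen for app A
        (1001, "cdn.example", "cdn.example", "203.0.113.7", 300),  # seen for app B
    ],
)


def get_qnames(uid: int, ip: str) -> list:
    """Hypothetical analogue of getQAName: `uid` is accepted but never queried."""
    rows = conn.execute(
        "SELECT DISTINCT qname FROM dns WHERE resource = ? ORDER BY qname", (ip,)
    ).fetchall()
    return [qname for (qname,) in rows]


# Every uid receives the same global answer for this IP.
assert get_qnames(10001, "203.0.113.7") == get_qnames(10002, "203.0.113.7")
```

Because the table carries no `uid`, there is nothing the query could filter on even if the parameter were wired in.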

### Why this matters

Tracker blocking is applied per app, but the DNS evidence used to infer whether an IP belongs to a tracker is global. This creates two related ambiguity problems:

1. **Shared-IP ambiguity**
   - Multiple unrelated hostnames can legitimately resolve to the same IP.
   - Some of those hostnames may be trackers; some may not.
   - The current code already models this via the `uncertain` states and blocking-mode policy.

2. **Cross-app attribution ambiguity**
   - Even if app A triggered the DNS observation, app B may later connect to the same IP.
   - The code may then reuse the global hostname evidence when deciding whether to block app B.
   - This is conceptually consistent with the current database model, but it can still lead to surprising per-app blocking decisions.
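
The two problems above can be sketched together. The three-way `Verdict`, the `classify_ip` helper, and the `block_uncertain` flag are assumptions made for illustration; the actual `uncertain` states and blocking-mode policy are not spelled out here and may differ.

```python
from enum import Enum


class Verdict(Enum):
    NOT_TRACKER = "not_tracker"
    TRACKER = "tracker"
    UNCERTAIN = "uncertain"


def classify_ip(hostnames: set, tracker_hosts: set) -> Verdict:
    """Classify an IP from the hostnames globally observed for it (hypothetical rule)."""
    hits = hostnames & tracker_hosts
    if not hits:
        return Verdict.NOT_TRACKER  # no tracker hostname seen (or no evidence at all)
    if hits == hostnames:
        return Verdict.TRACKER  # every observed hostname is a tracker
    return Verdict.UNCERTAIN  # shared IP: tracker and non-tracker hostnames mixed


def should_block(verdict: Verdict, block_uncertain: bool) -> bool:
    """Hypothetical policy switch standing in for the blocking mode."""
    if verdict is Verdict.TRACKER:
        return True
    return verdict is Verdict.UNCERTAIN and block_uncertain


# Global evidence for one IP: app A resolved a tracker, app B resolved a CDN host.
seen_for_ip = {"tracker.example", "cdn.example"}
verdict = classify_ip(seen_for_ip, tracker_hosts={"tracker.example"})
```

In this sketch app B's connection to the shared IP is judged `UNCERTAIN` purely because of app A's lookup, and whether it is blocked depends on the policy flag rather than on anything app B did.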
64+
65+
### Possible low-risk improvement
66+
67+
A relatively small runtime-only change would be to make the in-memory tracker verdict cache UID-aware:
68+
69+
- current shape: cache by destination IP only
70+
- safer shape: cache by `(uid, destination IP)`
71+
72+
This would reduce cross-app cache contamination, because one app's inferred tracker verdict would no longer automatically carry over to another app. Importantly, this would **not** change the underlying database model and would **not** make DNS attribution truly app-specific. It would only make cache reuse more conservative.
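
A minimal sketch of the two cache shapes follows. The `VerdictCache` class, the `infer_verdict` stand-in, and the uid and IP values are hypothetical; the real cache lives in the app's own code and may be structured differently.

```python
from typing import Callable, Hashable


class VerdictCache:
    """Memoizes tracker verdicts under a caller-chosen cache key."""

    def __init__(
        self,
        compute: Callable[[int, str], bool],
        key_fn: Callable[[int, str], Hashable],
    ):
        self._compute = compute
        self._key_fn = key_fn
        self._entries = {}

    def is_tracker(self, uid: int, ip: str) -> bool:
        key = self._key_fn(uid, ip)
        if key not in self._entries:
            self._entries[key] = self._compute(uid, ip)
        return self._entries[key]


computed_for = []  # records which uid triggered each real inference


def infer_verdict(uid: int, ip: str) -> bool:
    """Stand-in for the real (still global) tracker inference."""
    computed_for.append(uid)
    return True


# Current shape: keyed by destination IP only.
by_ip = VerdictCache(infer_verdict, key_fn=lambda uid, ip: ip)
# Safer shape: keyed by (uid, destination IP).
by_uid_ip = VerdictCache(infer_verdict, key_fn=lambda uid, ip: (uid, ip))
```

With `by_ip`, a second app connecting to the same IP is served the first app's cached verdict without a fresh inference. With `by_uid_ip`, each app triggers its own inference, although that inference still reads the same global DNS evidence.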

### Why this is still not a full fix

Even with a UID-aware cache:

- the DNS table would still be global
- `getQAName(...)` would still return globally observed hostnames for an IP
- uncertainty handling would still be necessary

So a UID-aware cache is only a partial correctness improvement. It does not solve the deeper attribution problem.

### Real fix if stronger attribution is desired

If the goal is to make per-app tracker blocking decisions rest on genuinely per-app DNS evidence, the persistence model would need redesign. Options include:

- storing DNS observations with a UID
- linking DNS evidence to specific access observations
- moving to a different attribution model entirely

This is a larger change because it affects schema, collection logic, query logic, migration, and likely the semantics of `uncertain`.
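
As a rough sketch of the first option only, the table below adds a `uid` column and filters on it. The column layout, index name, and `qnames_for_app` helper are hypothetical, migration is ignored, and the sketch assumes the collector can attribute each DNS response to a uid, which is not established here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical redesigned table: each DNS observation records the requesting uid.
conn.execute(
    "CREATE TABLE dns ("
    "uid INTEGER NOT NULL, time INTEGER, qname TEXT, "
    "aname TEXT, resource TEXT, ttl INTEGER)"
)
# Lookups are by (uid, resource), so index that pair.
conn.execute("CREATE INDEX idx_dns_uid_resource ON dns (uid, resource)")

conn.executemany(
    "INSERT INTO dns VALUES (?, ?, ?, ?, ?, ?)",
    [
        (10001, 1000, "tracker.example", "tracker.example", "203.0.113.7", 300),
        (10002, 1001, "cdn.example", "cdn.example", "203.0.113.7", 300),
    ],
)


def qnames_for_app(db: sqlite3.Connection, uid: int, ip: str) -> list:
    """Return only the hostnames that this uid itself resolved to this IP."""
    rows = db.execute(
        "SELECT DISTINCT qname FROM dns WHERE uid = ? AND resource = ? ORDER BY qname",
        (uid, ip),
    ).fetchall()
    return [qname for (qname,) in rows]
```

Under this model an app with no matching rows has no evidence at all for the IP, so the meaning of `uncertain` would need to be redefined for that case.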

### Current decision

Do **not** change this yet.

Reasoning:

- The current behavior matches the limitations of the existing database design.
- A UID-aware runtime cache would be easy to add, but it only partially addresses the issue.
- A true attribution fix is more invasive and needs a deliberate product decision: how conservative should `standard` be when evidence is global and ambiguous?

If revisited later, start by deciding whether the desired goal is:

- just to reduce cross-app cache bleed, or
- to redesign tracker attribution so "per-app blocking" is backed by per-app evidence.
