Commit 96e731b
ContentDateEnrichment: Filter _update_by_query to only unresolved documents (#3118)
* ContentDateEnrichment: Filter _update_by_query to only unresolved documents
The _update_by_query in ResolveContentDatesAsync was re-indexing every
document in both the lexical and semantic indices. On the semantic index,
this triggered ML inference for all 6 semantic_text fields on every
document — causing the deploy workflow to hang for 3+ hours.
After HashedBulkUpdate, unchanged documents (noop) retain their resolved
content_last_updated from the previous run. Only new/changed documents
have the field at the default DateTimeOffset.MinValue (0001-01-01). The
filter restricts _update_by_query to only these unresolved documents,
reducing the typical deploy from hundreds of thousands of documents to
just the changed ones.
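The filter described above can be sketched as a request body. This is a minimal illustration in Python, not the code from the commit: the field name `content_last_updated` and the `0001-01-01` sentinel come from the message, while the helper name `build_unresolved_filter`, the choice of a `term` query, and the exact serialized form of the sentinel are assumptions.

```python
import json

# Sentinel from the commit message: DateTimeOffset.MinValue (0001-01-01).
# The exact string stored in the index is an assumption for this sketch.
UNRESOLVED_SENTINEL = "0001-01-01T00:00:00+00:00"


def build_unresolved_filter(field="content_last_updated",
                            sentinel=UNRESOLVED_SENTINEL):
    """Build an _update_by_query body that matches only unresolved documents.

    Hypothetical helper: documents whose date field still holds the default
    sentinel are selected; already-resolved documents are not matched, so
    they are not re-indexed.
    """
    return {"query": {"term": {field: {"value": sentinel}}}}


if __name__ == "__main__":
    # Print the body that would be sent alongside the update script.
    print(json.dumps(build_unresolved_filter(), indent=2))
```

Without a `query`, `_update_by_query` applies to every document in the index, which matches the behavior the commit removes.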
Also enhances integration tests to use real HashedBulkUpdate-style
scripted upserts with full DocumentationDocument serialization, and adds
tests proving the filter behavior.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Simplify: Clean up review findings
- Extract query JsonObject to static readonly string (avoid re-allocation per call)
- Remove debug output.WriteLine from discovery test
- Fix double JsonNode.Parse — reuse parsed node for params.doc
- Use const for hash field name
- Fix misleading comment on doc1 in FilteredResolve test
- Remove unnecessary null-forgiving operators
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Tests: Assert bulk response has no item-level errors
The existing check only verified HTTP status code, which can be 200 even
when individual bulk items fail. Parse the response body and assert
"errors": false.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Tests: Fail explicitly on missing or malformed bulk errors field
Assert bulkResult and its "errors" property exist before checking the
boolean value, rather than defaulting to false on missing/malformed JSON.
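The stricter check described in the last two changes might look like the following. It is a sketch in Python rather than the project's test code; the helper name `assert_bulk_succeeded` is hypothetical, and the only API detail relied on is the top-level boolean `errors` field of a bulk response.

```python
import json


def assert_bulk_succeeded(response_body: str) -> None:
    """Fail explicitly unless the bulk response reports "errors": false.

    A 200 status alone is not enough, because individual items can fail
    while the request as a whole succeeds.
    """
    try:
        result = json.loads(response_body)
    except json.JSONDecodeError as exc:
        raise AssertionError(f"bulk response is not valid JSON: {exc}")

    # Missing or malformed structure is a failure, never an implicit False.
    if not isinstance(result, dict):
        raise AssertionError("bulk response is not a JSON object")
    if "errors" not in result:
        raise AssertionError("bulk response has no 'errors' field")
    if not isinstance(result["errors"], bool):
        raise AssertionError("'errors' field is not a boolean")
    if result["errors"]:
        raise AssertionError("bulk response reports item-level errors")


if __name__ == "__main__":
    assert_bulk_succeeded('{"errors": false, "items": []}')
    print("bulk response accepted")
```

Checking presence and type before the value keeps a truncated or unexpected body from passing silently.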
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 5c018e0 commit 96e731b
2 files changed
Lines changed: 271 additions & 65 deletions
File tree
- src/Elastic.Markdown/Exporters/Elasticsearch
- tests-integration/Elastic.ContentDateEnrichment.IntegrationTests
Lines changed: 25 additions & 1 deletion