Conversation
Implement near-optimal vector quantization based on "TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate" (Zandieh et al., 2025). The algorithm uses randomized Walsh-Hadamard rotation + optimal per-coordinate scalar quantization to achieve ~2.7x of theoretical distortion limits. Data-oblivious and per-vector, making it ideal for streaming RAG pipelines. - Add turboQuantize/turboDequantize in @workglow/util/schema - Add turboQuantizedInnerProduct/turboQuantizedCosineSimilarity for direct similarity on quantized vectors - Extend VectorQuantizeTask with "turbo" method option and turboBits/ turboSeed parameters - Add 29 tests covering roundtrip quality, compression, and similarity https://claude.ai/code/session_01YD75mdbcw6ygET7hdjQdWD
…tion or class' Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Adds a new TurboQuant vector quantization implementation to @workglow/util, exports it via the util schema entrypoint, and wires an optional “turbo” path into VectorQuantizeTask, along with a dedicated TurboQuant test suite.
Changes:
- Added
TurboQuantize.tsimplementing TurboQuant quantize/dequantize + similarity helpers and storage sizing utilities. - Exported TurboQuant APIs from
packages/util/src/schema-entry.ts. - Added
VectorQuantizeTaskinput options for selecting linear vs turbo behavior. - Added
TurboQuantizeunit tests.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| packages/util/src/vector/TurboQuantize.ts | Implements TurboQuant quantize/dequantize + quantized similarity and storage helpers. |
| packages/util/src/schema-entry.ts | Re-exports TurboQuant APIs for public consumption via @workglow/util/schema. |
| packages/test/src/test/util/TurboQuantize.test.ts | Adds unit coverage for TurboQuant roundtrip, similarity estimates, determinism, and utilities. |
| packages/ai/src/task/VectorQuantizeTask.ts | Adds method selection and TurboQuant configuration to the vector quantization task. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| readonly bits: number; | ||
| /** Seed for deterministic random rotation. If omitted, uses a fixed default seed. */ | ||
| readonly seed: number | undefined; |
There was a problem hiding this comment.
TurboQuantizeOptions makes seed (and bits) required properties even though the implementation treats them as optional via defaults. This prevents callers from passing { bits: 4 } or {}. Make these fields optional (e.g., bits?: number; seed?: number) or provide a separate TurboQuantizeOptionsInput type that reflects the defaulting behavior.
| readonly bits: number; | |
| /** Seed for deterministic random rotation. If omitted, uses a fixed default seed. */ | |
| readonly seed: number | undefined; | |
| readonly bits?: number; | |
| /** Seed for deterministic random rotation. If omitted, uses a fixed default seed. */ | |
| readonly seed?: number; |
| function createPrng(seed: number): () => number { | ||
| let state = seed | 0 || 1; | ||
| return () => { | ||
| state ^= state << 13; | ||
| state ^= state >> 17; | ||
| state ^= state << 5; | ||
| // Convert to [0, 1) range | ||
| return (state >>> 0) / 4294967296; |
There was a problem hiding this comment.
createPrng coerces seed = 0 to state 1 (seed | 0 || 1), so a caller-provided seed of 0 will not be honored. Either document this explicitly or map 0 to a non-zero constant in a way that preserves the input seed’s determinism contract (e.g., hash/mix the seed instead of treating 0 specially).
| function randomRotate(values: Float64Array, seed: number): Float64Array { | ||
| const d = values.length; | ||
| // Pad to next power of 2 for Hadamard transform | ||
| const paddedLen = nextPowerOf2(d); | ||
| const result = new Float64Array(paddedLen); | ||
| result.set(values); | ||
|
|
||
| const prng = createPrng(seed); | ||
|
|
||
| // Apply 3 rounds for good mixing (standard practice for randomized Hadamard) | ||
| for (let round = 0; round < 3; round++) { | ||
| // Random sign flips (diagonal Rademacher matrix) | ||
| for (let i = 0; i < paddedLen; i++) { | ||
| if (prng() < 0.5) { | ||
| result[i] = -result[i]; | ||
| } | ||
| } | ||
|
|
||
| // Fast Walsh-Hadamard transform (in-place, normalized) | ||
| fastWalshHadamard(result); | ||
| } | ||
|
|
||
| // Return only the first d dimensions (drop padding) | ||
| return result.subarray(0, d); | ||
| } |
There was a problem hiding this comment.
The padding/truncation in randomRotate breaks orthogonality/invertibility for non-power-of-two dimensions: you rotate in paddedLen space but then drop the padded coordinates (subarray(0, d)). This loses information and means inverseRandomRotate() cannot correctly undo the rotation (and inner products won’t be preserved) for common dimensions like 768. Consider keeping/quantizing all paddedLen coordinates (and storing paddedLen in the result) or using an orthogonal transform that supports arbitrary lengths; alternatively, explicitly require power-of-two dimensions and throw otherwise.
| function unpackCodes(packed: Uint8Array, bits: number, count: number): number[] { | ||
| const codes: number[] = new Array(count); | ||
|
|
||
| let bitPos = 0; | ||
| for (let i = 0; i < count; i++) { | ||
| let code = 0; | ||
| let remaining = bits; | ||
| let shift = 0; | ||
| while (remaining > 0) { | ||
| const byteIdx = bitPos >> 3; | ||
| const bitOffset = bitPos & 7; | ||
| const bitsToRead = Math.min(remaining, 8 - bitOffset); | ||
| const mask = (1 << bitsToRead) - 1; | ||
| code |= ((packed[byteIdx] >> bitOffset) & mask) << shift; | ||
| shift += bitsToRead; | ||
| bitPos += bitsToRead; | ||
| remaining -= bitsToRead; | ||
| } | ||
| codes[i] = code; |
There was a problem hiding this comment.
unpackCodes() does not validate that packed.length is large enough for count * bits bits. If a truncated/invalid buffer is passed, typed-array out-of-bounds reads yield undefined which is coerced to 0, silently producing wrong codes (and potentially masking data corruption). Add an explicit length check (expected bytes = ceil(count * bits / 8)) and throw on mismatch.
| let quantized: TypedArray[]; | ||
|
|
||
| if (method === QuantizationMethod.TURBO) { | ||
| quantized = vectors.map((v) => { | ||
| const result = turboQuantize(v, { bits: turboBits, seed: turboSeed }); | ||
| return turboDequantize(result); | ||
| }); | ||
| } else { | ||
| quantized = vectors.map((v) => this.vectorQuantize(v, targetType, normalize)); | ||
| } |
There was a problem hiding this comment.
In the TURBO branch, the task returns turboDequantize(...) (a Float32Array) but still reports targetType as the requested type, and does not actually quantize to targetType. This is an observable mismatch (e.g., targetType: INT8 can return a Float32Array) and defeats the task’s “reduce storage” purpose. Either (1) change the output schema to return TurboQuant’s packed codes + metadata, (2) set targetType to FLOAT32 for the turbo path, and/or (3) post-process the dequantized vector through vectorQuantize(..., targetType, ...) if you intend turbo to be a preconditioning step.
| override async executeReactive(input: VectorQuantizeTaskInput): Promise<VectorQuantizeTaskOutput> { | ||
| const { vector, targetType, normalize = true } = input; | ||
| const { | ||
| vector, | ||
| targetType, | ||
| normalize = true, | ||
| method = QuantizationMethod.LINEAR, | ||
| turboBits = 4, | ||
| turboSeed = 42, | ||
| } = input; | ||
| const isArray = Array.isArray(vector); | ||
| const vectors = isArray ? vector : [vector]; | ||
| const originalType = this.getVectorType(vectors[0]); | ||
|
|
||
| const quantized = vectors.map((v) => this.vectorQuantize(v, targetType, normalize)); | ||
| let quantized: TypedArray[]; | ||
|
|
||
| if (method === QuantizationMethod.TURBO) { | ||
| quantized = vectors.map((v) => { | ||
| const result = turboQuantize(v, { bits: turboBits, seed: turboSeed }); | ||
| return turboDequantize(result); | ||
| }); | ||
| } else { | ||
| quantized = vectors.map((v) => this.vectorQuantize(v, targetType, normalize)); | ||
| } |
There was a problem hiding this comment.
TurboQuant support in VectorQuantizeTask isn’t covered by the existing VectorQuantizeTask tests (they only exercise the linear path). Add at least one test case that sets method: 'turbo' and asserts the returned type/metadata behavior you intend (and that it is deterministic for a fixed seed).
| /** | ||
| * Computes optimal quantization boundaries and reconstruction points for | ||
| * coordinates of a rotated unit vector. | ||
| * | ||
| * After random rotation, each coordinate of a d-dimensional unit vector follows | ||
| * approximately N(0, 1/d). For practical purposes with moderate dimensions (>50), | ||
| * we use uniform quantization over the range [-c/sqrt(d), c/sqrt(d)] where c | ||
| * controls the coverage (we use c ≈ 3 for 99.7% coverage). | ||
| */ | ||
| function getQuantizationParams( | ||
| bits: number, | ||
| dimensions: number | ||
| ): { readonly levels: number; readonly scale: number } { | ||
| const levels = 1 << bits; // 2^bits quantization levels | ||
| // After rotation, coordinates are approximately N(0, 1/d). | ||
| // Standard deviation is 1/sqrt(d). Cover ±3 standard deviations. | ||
| const coverage = 3.0; | ||
| const scale = coverage / Math.sqrt(dimensions); | ||
| return { levels, scale }; |
There was a problem hiding this comment.
The docstrings here describe “optimal quantization boundaries / Beta distribution” behavior, but getQuantizationParams() actually implements a fixed-range uniform quantizer (±3/√d) and doesn’t compute any boundaries or reconstruction points beyond a linear scale. Please align the documentation with the implemented quantizer, or implement the described non-uniform/optimal scalar quantization so callers aren’t misled about distortion guarantees.
| turboBits: { | ||
| type: "number", | ||
| title: "TurboQuant Bits", | ||
| description: | ||
| "Bits per dimension for TurboQuant method (1-8). Lower = more compression. 4 bits gives ~8x compression with near-lossless quality.", | ||
| default: 4, | ||
| minimum: 1, | ||
| maximum: 8, | ||
| }, | ||
| turboSeed: { | ||
| type: "number", | ||
| title: "TurboQuant Seed", | ||
| description: | ||
| "Seed for the random rotation in TurboQuant. All vectors in the same collection must use the same seed for similarity search to work.", | ||
| default: 42, | ||
| }, |
There was a problem hiding this comment.
turboBits and turboSeed are modeled as type: "number" in the input schema, but the underlying implementation treats bits as an integer (and bitwise PRNG logic will effectively truncate non-integer seeds). Consider using type: "integer" (or multipleOf: 1) for these fields to match runtime expectations and fail validation earlier.
|
@copilot apply changes based on the comments in this thread |
…uantizeTask Agent-Logs-Url: https://github.com/workglow-dev/workglow/sessions/6f050f81-5568-4d3e-90f0-422553b01ee8 Co-authored-by: sroussey <127349+sroussey@users.noreply.github.com>
Applied all 8 review changes in commit 051f0b7:
|
…tible output TurboQuant's rotation + optimal scalar quantization now outputs directly into byte-aligned TypedArrays (Int8Array, Uint8Array, Int16Array, Uint16Array) with the same .length as the input vector. This means the output works transparently with all existing storage backends and cosineSimilarity search — no dimensional mismatch. - Add turboQuantizeToTypedArray() that rotates then quantizes into the target integer type at its native bit width - Update VectorQuantizeTask turbo branch to call turboQuantizeToTypedArray directly instead of quantize+dequantize roundtrip - Remove turboBits parameter (bit width determined by targetType) - Add 14 tests for the new function covering type output, similarity preservation, determinism, range bounds, and edge cases https://claude.ai/code/session_01YD75mdbcw6ygET7hdjQdWD
TurboQuantizeOptions- makebitsandseedoptional fieldscreatePrngseed=0 handling by XOR-mixing seed with golden-ratio constantpaddedDimensionsthroughout (avoids dropping coordinates for non-power-of-2 dims)getQuantizationParams(accurately describes uniform quantizer now)unpackCodes()VectorQuantizeTaskto reporttargetType: FLOAT32(matches actual Float32Array output)turboBits/turboSeedschema fields fromtype: "number"totype: "integer"VectorQuantizeTask.test.ts(type/metadata, determinism, array-of-vectors)TurboQuantize.test.tsstorage/compression tests for padded-dimension calculations