Context / Problem

We are integrating filecoin-pin (via the Synapse SDK) into a project where large CAR files (>1 GiB) are pre-generated on disk. To use the Synapse upload capabilities, we must currently load the entire CAR file into memory before calling uploadToSynapse, because the function signature accepts only Uint8Array.
This results in peak memory usage of ~2.3 GiB for a 1 GiB file (original data + CAR format + memory buffers), which causes instability in resource-constrained environments. While the underlying Synapse SDK already supports streaming uploads via ReadableStream<Uint8Array> to minimize memory usage, the current uploadToSynapse implementation in filecoin-pin does not expose this capability, forcing a buffering strategy.
This limitation contradicts the goal of supporting files larger than 1GiB. As noted in synapse-sdk#110:
IMO : Loading 1GB file into memory is not ideal and we need this change to fully support what @rjan90 said back in December.
To truly support larger files without crashing or requiring excessive RAM, uploadToSynapse needs to accept streams so that the file can be uploaded directly from disk rather than being loaded into memory first.
Comparison with Issue #288

This request is distinct from #288 (True Streaming CAR Uploads).
Issue #288 (True Streaming CAR Uploads) addresses creating a CAR and uploading it simultaneously from source data, avoiding the need to buffer the CAR during creation. It is currently blocked by upstream Synapse API changes that require a split store/commit flow to handle metadata timing.
This issue addresses uploading an already-existing CAR file from disk. The CAR is fully built and the root CID is known before the upload call. The goal is to pass this CAR to uploadToSynapse as a stream rather than a Uint8Array, avoiding the memory overhead of loading the whole file into RAM.
Since the CAR is pre-generated, the metadata timing blockers in #288 do not apply here. This request can be fulfilled by updating the uploadToSynapse signature to accept streams.
Requested Changes

Update filecoin-pin to support ReadableStream<Uint8Array> for the CAR data payload, enabling zero-copy (or low-memory) uploads from disk.
Update uploadToSynapse signature: Modify the function to accept ReadableStream<Uint8Array> in addition to Uint8Array.
```typescript
export async function uploadToSynapse (
  synapseService: SynapseService,
  carData: Uint8Array | ReadableStream<Uint8Array>, // ← Support streaming
  rootCid: CID,
  logger: Logger,
  options: SynapseUploadOptions = {}
): Promise<SynapseUploadResult>
```
Propagate Stream: Ensure the stream is passed directly through to synapse.storage.upload() without accumulating the entire payload in memory.
Documentation: Document this usage for scenarios involving pre-generated CARs on disk.
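For illustration, the pass-through step could normalize both accepted payload shapes into a stream before handing off to synapse.storage.upload(). This is a sketch, not the actual filecoin-pin implementation; the helper name toUploadStream and the single-chunk wrapping are assumptions:

```typescript
// Sketch: normalize either accepted payload shape to a ReadableStream
// so the upload path never concatenates chunks into one large buffer.
type CarPayload = Uint8Array | ReadableStream<Uint8Array>

export function toUploadStream (carData: CarPayload): ReadableStream<Uint8Array> {
  if (carData instanceof Uint8Array) {
    // Wrap the buffer in a single-chunk stream for a uniform downstream path.
    return new ReadableStream<Uint8Array>({
      start (controller) {
        controller.enqueue(carData)
        controller.close()
      }
    })
  }
  // Already a stream: pass it through untouched, with no accumulation.
  return carData
}
```

Keeping the normalization at the boundary means the rest of the upload path only ever sees a stream, so the Uint8Array overload costs nothing extra.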
Why This Matters
Without ReadableStream support in filecoin-pin, consumers are forced to implement intermediate buffering layers when uploading CARs from disk, resulting in O(n) memory usage relative to file size. Supporting streams allows memory usage to drop to O(chunk size), enabling reliable handling of large files without requiring significant RAM allocation.
Environment / Repro
Client: Integration using filecoin-pin to upload pre-generated CAR files.
Scenario: Attempting to upload a CAR file (~1GiB) located on disk.
Current Behavior: The file must be read entirely into a buffer (e.g., via fs.readFile or Deno.readFile) to satisfy the Uint8Array requirement of uploadToSynapse, causing high memory usage.
Desired Behavior: A file stream can be opened and passed directly to uploadToSynapse.
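As a sketch of the desired caller-side flow in Node.js (the streaming-capable uploadToSynapse call is the proposed API and does not exist yet; openCarStream is a hypothetical helper):

```typescript
import { createReadStream } from 'node:fs'
import { Readable } from 'node:stream'

// Hypothetical helper: open a pre-generated CAR file as a web
// ReadableStream<Uint8Array>. Only one chunk (64 KiB by default) is
// resident in memory at a time, regardless of file size.
export function openCarStream (path: string): ReadableStream<Uint8Array> {
  return Readable.toWeb(createReadStream(path)) as ReadableStream<Uint8Array>
}

// Proposed usage once uploadToSynapse accepts streams:
// await uploadToSynapse(synapseService, openCarStream('./bundle.car'), rootCid, logger)
```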
Thanks for considering—happy to test a streaming implementation and provide feedback.
Upstream code to align with: https://github.com/FilOzone/synapse-sdk/blob/4be67a838c5b602608b3ac33f9702c1343f87e09/packages/synapse-sdk/src/storage/context.ts#L907