Support ReadableStream in uploadToSynapse for memory-efficient uploads #325

@HavenCTO

Context / Problem

We are integrating filecoin-pin (via the Synapse SDK) into a project where large CAR files (> 1 GiB) are pre-generated on disk. To use the Synapse upload capabilities, we currently have to load the entire CAR file into memory before calling uploadToSynapse, because the function signature accepts only Uint8Array.

This results in peak memory usage of ~2.3 GiB for a 1 GiB file (original data + CAR format + memory buffers), which causes instability in resource-constrained environments. While the underlying Synapse SDK already supports streaming uploads via ReadableStream<Uint8Array> to minimize memory usage, the current uploadToSynapse implementation in filecoin-pin does not expose this capability, forcing a buffering strategy.

This limitation contradicts the goal of supporting files larger than 1GiB. As noted in synapse-sdk#110:

IMO : Loading 1GB file into memory is not ideal and we need this change to fully support what @rjan90 said back in December.

To truly support larger files without crashing or requiring excessive RAM, uploadToSynapse needs to accept streams so that the file can be uploaded directly from disk rather than being loaded into memory first.

Comparison with Issue #288

This request is distinct from #288 (True Streaming CAR Uploads).

  • Issue #288 (True Streaming CAR Uploads) addresses creating a CAR and uploading it simultaneously from source data, avoiding the need to buffer the CAR during creation. It is currently blocked by upstream Synapse API changes requiring a split store/commit flow to handle metadata timing.
  • This issue addresses uploading an already existing CAR file from disk. The CAR is fully built and the root CID is known prior to the upload call. The goal is to pass this CAR to uploadToSynapse as a stream rather than a Uint8Array to prevent the memory overhead of loading the file into RAM.

Since the CAR is pre-generated, the metadata timing blockers in #288 do not apply here. This request can be fulfilled by updating the uploadToSynapse signature to accept streams.

Requested Changes

Upstream code to align with: https://github.com/FilOzone/synapse-sdk/blob/4be67a838c5b602608b3ac33f9702c1343f87e09/packages/synapse-sdk/src/storage/context.ts#L907

Update filecoin-pin to support ReadableStream<Uint8Array> for the CAR data payload to enable zero-copy (or low-memory) uploads from disk.

  1. Update uploadToSynapse signature: Modify the function to accept ReadableStream<Uint8Array> in addition to Uint8Array.

    export async function uploadToSynapse(
      synapseService: SynapseService,
      carData: Uint8Array | ReadableStream<Uint8Array>, // ← Support streaming
      rootCid: CID,
      logger: Logger,
      options: SynapseUploadOptions = {}
    ): Promise<SynapseUploadResult>

  2. Propagate the stream: Ensure the stream is passed directly through to synapse.storage.upload() without accumulating the entire payload in memory.

  3. Documentation: Document this usage for scenarios involving pre-generated CARs on disk.
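The pass-through described in items 1 and 2 might look roughly like the sketch below. It is an illustration under assumptions, not filecoin-pin's implementation: CarPayload, UploadFn, isReadableStream, and forwardPayload are hypothetical names, and UploadFn is only a stand-in for synapse.storage.upload(), whose real options and result type are not modeled here. The point is that a byte counter for logging can sit in the pipeline without collecting chunks.

```typescript
// Hypothetical union type mirroring the proposed carData parameter.
type CarPayload = Uint8Array | ReadableStream<Uint8Array>

// Hypothetical stand-in for synapse.storage.upload(); the real call
// takes options and returns a result object that is not modeled here.
type UploadFn = (data: CarPayload) => Promise<void>

// A Uint8Array has no getReader method, so this distinguishes the two cases.
function isReadableStream(data: CarPayload): data is ReadableStream<Uint8Array> {
  return typeof (data as ReadableStream<Uint8Array>).getReader === 'function'
}

// Forwards the payload to `upload` and returns the number of bytes sent.
// For streams, bytes are counted as chunks pass through; nothing is buffered
// beyond the chunk currently in flight.
async function forwardPayload(data: CarPayload, upload: UploadFn): Promise<number> {
  if (!isReadableStream(data)) {
    await upload(data)
    return data.byteLength
  }

  let bytes = 0
  const counter = new TransformStream<Uint8Array, Uint8Array>({
    transform(chunk, controller) {
      bytes += chunk.byteLength
      controller.enqueue(chunk) // pass the chunk on unchanged
    },
  })

  // The count is complete only once `upload` has consumed the whole stream.
  await upload(data.pipeThrough(counter))
  return bytes
}
```

Branching on the payload type in one place keeps the existing Uint8Array path unchanged for current callers.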

Why This Matters

Without ReadableStream support in filecoin-pin, consumers are forced to implement intermediate buffering layers when uploading CARs from disk, resulting in O(n) memory usage relative to file size. Supporting streams allows memory usage to drop to O(chunk size), enabling reliable handling of large files without requiring significant RAM allocation.
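To make the O(n) versus O(chunk size) difference concrete, here is a small sketch using only the standard Web Streams API. The names syntheticStream and consume are hypothetical helpers for illustration; the synthetic stream stands in for a CAR file read from disk. The consumer sees every byte but only ever holds one chunk at a time.

```typescript
// Builds a pull-based stream that creates each chunk only when the
// consumer asks for it, so chunks are never all resident at once.
function syntheticStream(count: number, chunkSize: number): ReadableStream<Uint8Array> {
  let produced = 0
  return new ReadableStream<Uint8Array>({
    pull(controller) {
      if (produced === count) {
        controller.close()
        return
      }
      produced++
      controller.enqueue(new Uint8Array(chunkSize))
    },
  })
}

// Reads the stream to the end, tracking the total byte count and the
// largest single chunk the consumer had to hold.
async function consume(
  stream: ReadableStream<Uint8Array>
): Promise<{ total: number; largestChunk: number }> {
  const reader = stream.getReader()
  let total = 0
  let largestChunk = 0
  for (;;) {
    const result = await reader.read()
    if (result.done) break
    total += result.value.byteLength
    largestChunk = Math.max(largestChunk, result.value.byteLength)
    // result.value goes out of scope here and can be garbage collected
  }
  return { total, largestChunk }
}
```

With a buffered approach the consumer would hold `total` bytes at once; with the loop above, the working set is bounded by `largestChunk` plus whatever the stream's internal queue holds.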

Environment / Repro

  • Client: Integration using filecoin-pin to upload pre-generated CAR files.
  • Scenario: Attempting to upload a CAR file (~1GiB) located on disk.
  • Current Behavior: The file must be read entirely into a buffer (e.g., via fs.readFile or Deno.readFile) to satisfy the Uint8Array requirement of uploadToSynapse, causing high memory usage.
  • Desired Behavior: A file stream can be opened and passed directly to uploadToSynapse.
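The desired behavior could be sketched on Node.js as follows. This assumes a Node version that provides Readable.toWeb; openCarStream and countBytes are hypothetical helper names, and the final uploadToSynapse call appears only as a comment because the stream-accepting signature is the proposal in this issue, not an existing API.

```typescript
import { createReadStream, statSync } from 'node:fs'
import { Readable } from 'node:stream'

// Current behavior, for contrast (loads the whole file into memory):
//   const carData = await readFile(path)   // Uint8Array of the full CAR

// Opens a CAR file as a web ReadableStream without reading it into memory.
// The size is returned too, since it is known up front for a file on disk.
function openCarStream(path: string): { stream: ReadableStream<Uint8Array>; size: number } {
  const size = statSync(path).size
  // createReadStream reads in chunks (64 KiB by default); Readable.toWeb
  // adapts the Node stream to the web ReadableStream interface.
  const stream = Readable.toWeb(createReadStream(path)) as unknown as ReadableStream<Uint8Array>
  return { stream, size }
}

// Drains a stream and returns the number of bytes seen; useful as a
// sanity check that the stream delivers the whole file.
async function countBytes(stream: ReadableStream<Uint8Array>): Promise<number> {
  const reader = stream.getReader()
  let total = 0
  for (;;) {
    const result = await reader.read()
    if (result.done) break
    total += result.value.byteLength
  }
  return total
}

// Proposed usage (assumes the signature change requested in this issue):
//   const { stream } = openCarStream('/path/to/file.car')
//   await uploadToSynapse(synapseService, stream, rootCid, logger)
```

On Deno, the equivalent would likely be the `readable` property of the file returned by Deno.open, which is already a web ReadableStream.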

Thanks for considering. Happy to test a streaming implementation and provide feedback.

    Labels

    team/filecoin-pin: "Filecoin Pin" project is a stakeholder for this work.
    team/fs-wg: FOC working group is a stakeholder for this work, and thus wants to track it on their project board.