To read from/write to MinIO buckets:

  • Set up the configuration for connecting to the MinIO cluster with the SparkConfActor:

    actor:
      type: spark-conf-actor
      properties:
        hadoopConfigs:
          fs.s3a.path.style.access: "true"
          fs.s3a.impl: "org.apache.hadoop.fs.s3a.S3AFileSystem"
          fs.s3a.aws.credentials.provider: "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"
          fs.s3a.connection.ssl.enabled: "true"
          fs.s3a.connection.timeout: "30000"
          fs.s3a.endpoint: "${minio.host}:${minio.port}"
          fs.s3a.access.key: "${minio.access_key}"
          fs.s3a.secret.key: "${minio.secret_key}"

    Note:

    • ${minio.host} - the fully qualified hostname of a node in the MinIO cluster.
    • ${minio.port} - the port on that host for connecting to the MinIO cluster.
    • ${minio.access_key} & ${minio.secret_key} - the access key and secret key for connecting to the MinIO cluster and accessing objects in buckets.
  • Use FileReader/FileWriter, IcebergReader/IcebergWriter, and/or DeltaReader/DeltaWriter for read/write operations.

    The following example reads CSV objects from the input path under the users bucket:

    actor:
      type: file-reader
      properties:
        format: csv
        options:
          header: true
          delimiter: "|"
        fileUri: "s3a://users/input"
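
    Writing back to a bucket is symmetrical. The following is a sketch that assumes FileWriter accepts the same format, options, and fileUri properties as FileReader; the output path s3a://users/output is a hypothetical example, not a path from this project:

    actor:
      type: file-writer
      properties:
        format: csv
        options:
          header: true
          delimiter: "|"
        # hypothetical output path under the users bucket
        fileUri: "s3a://users/output"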