Dataframe doesn't properly implement ArrowStream export interface #1166

@johnnyg

Describe the bug
Passing a DataFrame to another library that expects the Arrow stream export interface fails with the following error:

TypeError: argument 'input': DataFrame.__arrow_c_stream__() missing 1 required positional argument: 'requested_schema'

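This happens because DataFrame.__arrow_c_stream__() declares requested_schema as a required argument, whereas it should be optional. Consumers such as arro3 call the method without passing it.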
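To illustrate the failure mode without needing datafusion installed, here is a minimal sketch using two hypothetical stand-in classes (StrictExporter and LenientExporter are not part of datafusion or arro3, and the returned string is only a placeholder for a real PyCapsule). It assumes the consumer calls __arrow_c_stream__() with no arguments, which is what the error message above suggests.

```python
import inspect


class StrictExporter:
    """Hypothetical exporter whose signature mirrors the one in the error above."""

    def __arrow_c_stream__(self, requested_schema):
        # Placeholder return value; a real exporter returns a PyCapsule.
        return "stream-capsule-placeholder"


class LenientExporter:
    """Hypothetical exporter where requested_schema defaults to None."""

    def __arrow_c_stream__(self, requested_schema=None):
        return "stream-capsule-placeholder"


def consume(obj):
    # Stand-in for a consumer that calls the method with no arguments.
    return obj.__arrow_c_stream__()


def schema_is_optional(obj):
    # True when the requested_schema parameter has a default value.
    param = inspect.signature(obj.__arrow_c_stream__).parameters["requested_schema"]
    return param.default is not inspect.Parameter.empty


try:
    consume(StrictExporter())
    strict_error = None
except TypeError as exc:
    # Same kind of TypeError as reported in this issue.
    strict_error = str(exc)

lenient_result = consume(LenientExporter())
```

The schema_is_optional helper could also serve as a quick regression check against the real DataFrame class once a fix lands, though that use is untested here.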

To Reproduce

from arro3.core import RecordBatchReader
import datafusion

data = [{"num": 42}]
ctx = datafusion.SessionContext()
df = ctx.from_pylist(data)
reader = RecordBatchReader.from_arrow(df)

Expected behavior
The above should run without error.

Additional context
Replacing RecordBatchReader.from_arrow(df) with RecordBatchReader.from_arrow(df.df) works around the bug.
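For code that would rather not reach into the df attribute, another possible workaround is a thin adapter that supplies the missing default. This is only a sketch: OptionalSchemaStream and StrictExporter are hypothetical names, and it assumes the wrapped object accepts None for requested_schema, which has not been verified against datafusion.

```python
class OptionalSchemaStream:
    """Hypothetical adapter that makes requested_schema optional.

    It forwards to the wrapped object's __arrow_c_stream__ and passes
    None when the consumer does not supply a schema.
    """

    def __init__(self, inner):
        self._inner = inner

    def __arrow_c_stream__(self, requested_schema=None):
        # Assumption: the wrapped exporter treats None as "no preference".
        return self._inner.__arrow_c_stream__(requested_schema)


class StrictExporter:
    """Stand-in for an exporter that requires requested_schema."""

    def __arrow_c_stream__(self, requested_schema):
        # Echo the argument so the forwarding behavior is visible.
        return ("stream-capsule-placeholder", requested_schema)


wrapped = OptionalSchemaStream(StrictExporter())
result = wrapped.__arrow_c_stream__()
```

With the real libraries this would presumably be used as RecordBatchReader.from_arrow(OptionalSchemaStream(df)), but that call is untested here.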

Labels: bug (Something isn't working)
