Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ordered and app_metadata attributes to PyArrow FlightInfo #1

Closed
wants to merge 6 commits into from

Conversation

EnricoMi
Copy link
Owner

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@@ -844,7 +844,7 @@ cdef class FlightInfo(_Weakrefable):
return obj

def __init__(self, Schema schema, FlightDescriptor descriptor, endpoints,
total_records, total_bytes):
total_records, total_bytes, ordered, app_metadata):
Copy link

@adamreeve adamreeve Jul 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
total_records, total_bytes, ordered, app_metadata):
total_records, total_bytes, ordered=False, app_metadata=b''):

These new parameters should be optional for backwards compatibility

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried that, this is not supported by Cython 😞.

Copy link

@adamreeve adamreeve Jul 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's odd, there are other cdef classes in this same file that have optional parameters in their __init__, eg. FlightError also has an optional bytes parameter. What is the error you get? (Disclaimer: I'm not very familiar with Cython)

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it works this time. 🥳

the total bytes in this flight, or -1 if unknown.
ordered : boolean
Whether endpoints are in the same order as the data.
app_metadata : str
Copy link

@adamreeve adamreeve Jul 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
app_metadata : str
app_metadata : bytes, optional

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess if we wanted to be a bit more flexible we could allow bytes or str, and call tobytes on this.

@@ -950,7 +972,9 @@ cdef class FlightInfo(_Weakrefable):
f"descriptor={self.descriptor} "
f"endpoints={self.endpoints} "
f"total_records={self.total_records} "
f"total_bytes={self.total_bytes}>")
f"total_bytes={self.total_bytes} "
f"ordered={'true' if self.ordered else 'false'} "

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpicking, but I'd expect to have True and False in a Python repr, as the builtin bool constants are capitalized.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I'll flag this with the arrow devs.

"total_records=1 "
"total_bytes=42 "
"ordered=true "
"app_metadata='7465737420617070206d65746164617461'>")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behaviour of using the hex representation might be a little confusing. It might be best to stick with the default bytes repr, ie. app_metadata=b'test app metadata'. This will still handle non-ascii bytes ok by displaying them like \x00.

Although I'm also a bit concerned that this field could be quite large, maybe we just want something like app_metadata=<bytes with length 17>

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what the c++ implementation does. I guess they use hex because it is opaque metadata. I'd leave this to / flag this with the arrow devs.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK yeah matching the C++ behaviour makes sense then.

@EnricoMi EnricoMi force-pushed the more-flightinfo-attributes branch 2 times, most recently from 3d69078 to 2c8bfa3 Compare July 26, 2024 07:43
@EnricoMi
Copy link
Owner Author

EnricoMi commented Aug 2, 2024

Thanks for the code review, raised upstream: apache#43537.

@EnricoMi EnricoMi closed this Aug 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants