Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add api for loading plans of all types #80

Merged
merged 11 commits into from
Aug 26, 2023

Conversation

EpsilonPrime
Copy link
Member

features:
* supports reading/writing all of the major formats

caveats:
* only read/writes to filenames so in-memory use should use other interfaces
* does not support compression
* does not have a zero copy interface

features:
    * supports reading/writing all of the major formats

caveats:
    * only read/writes to filenames so in-memory use should use other interfaces
    * does not support compression
    * does not have a zero copy interface
@EpsilonPrime EpsilonPrime marked this pull request as ready for review July 29, 2023 02:03
Copy link
Member

@westonpace westonpace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial review, just from looking at the public header.

include/substrait/common/Io.h Outdated Show resolved Hide resolved
include/substrait/common/Io.h Outdated Show resolved Hide resolved
include/substrait/common/Io.h Outdated Show resolved Hide resolved
include/substrait/common/Io.h Outdated Show resolved Hide resolved
src/substrait/common/Io.cpp Outdated Show resolved Hide resolved
src/substrait/common/Io.cpp Show resolved Hide resolved
include/substrait/common/Io.h Outdated Show resolved Hide resolved
@EpsilonPrime
Copy link
Member Author

@westonpace Is there anything else you'd like to see done here?

Copy link
Member

@westonpace westonpace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with the API so I'm marking this approved. I did go through and have a closer look at the rest. I have a bunch of minor suggestions but I don't think any of them really impact correctness. Please don't feel obliged to fix them all. Feel free to resolve, ignore, or defer to a follow-up at your leisure.

include/substrait/common/Io.h Outdated Show resolved Hide resolved
include/substrait/common/Io.h Outdated Show resolved Hide resolved
include/substrait/common/Io.h Outdated Show resolved Hide resolved
include/substrait/common/Io.h Outdated Show resolved Hide resolved
src/substrait/common/Io.cpp Outdated Show resolved Hide resolved
src/substrait/textplan/converter/SaveBinary.cpp Outdated Show resolved Hide resolved
}

std::string output;
auto status = ::google::protobuf::util::MessageToJsonString(plan, &output);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

output here is created from an ofstream. In the binary method it is created from google::protobuf::io::FileOutputStream. Does FileOutputStream not work with MessageToJsonString?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I attempted to use it originally and had to fall back to a separate implementation.

}
stream << output;
if (stream.fail()) {
return absl::UnknownError("Failed to write the plan as a JSON protobuf.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very minor but this message could be annoyingly vague (e.g. maybe this happened because we ran out of disk space?) Perhaps https://en.cppreference.com/w/cpp/io/basic_ios/exceptions could be used to generate exceptions immediately (which would hopefully be more informative).

That being said, these failures will be rare, and we can always fix this later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to push this to later. I would like to get better messages but don't want to create an exception to get them. Maybe more use of std::error_code is the answer.

src/substrait/textplan/converter/SaveBinary.cpp Outdated Show resolved Hide resolved

} // namespace

absl::StatusOr<::substrait::proto::Plan> loadPlan(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the load and save methods return StatusOr then I, as a user, would not expect to get an exception. I think we still rely on exceptions in some places (maybe this isn't true?) If so, we should probably wrap these methods with a try/catch and wrap the exception in an invalid status.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed most of the cases where we use SUBSTRAIT_FAIL. There are about 5 remaining and most of those are for nearly impossible cases. I'd prefer to finish removing them over wrapping this code in an exception. If it turns out we're getting exceptions from other libraries then we will be forced to add it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(And the fuzz testing work will help illuminate if we need to wrap the code.)

@EpsilonPrime EpsilonPrime merged commit ecfd5b8 into substrait-io:main Aug 26, 2023
3 checks passed
@EpsilonPrime EpsilonPrime deleted the readwrite2 branch August 26, 2023 01:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants