I'm looking at the async API for reading Parquet files, and it offers creating a stream of `RecordBatch`es. Right now I'm using `parquet::file::serialized_reader::SerializedFileReader` to get an iterator over the rows of the file, but I'm a bit lost on how to achieve similar access when using async and `RecordBatch`, which only gives raw access to columns.
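For concreteness, here's roughly what my current sync code looks like (simplified; the file name is made up, and depending on the crate version the iterator yields `Row` or `Result<Row>`):

```rust
use parquet::file::reader::FileReader;
use parquet::file::serialized_reader::SerializedFileReader;
use std::fs::File;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let file = File::open("data.parquet")?;
    let reader = SerializedFileReader::new(file)?;
    // `get_row_iter` comes from the `FileReader` trait; `None` means no
    // column projection, i.e. read all columns as `Row` values.
    for row in reader.get_row_iter(None)? {
        println!("{:?}", row);
    }
    Ok(())
}
```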
Is this kind of conversion into `Row` objects easy to perform with the existing APIs? Should a row reader over `RecordBatch` be added as a new feature to the `parquet::record` module? Or is there going to be a `parquet::file::async_reader::AsyncFileReader`?
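For comparison, here's a sketch of the async path as I understand it (using the crate's `async` feature; the column name and type, `id: Int64`, are made up for illustration). The manual downcasting at the end is exactly the per-row access I'd like a nicer API for:

```rust
use arrow::array::Int64Array;
use futures::StreamExt;
use parquet::arrow::async_reader::ParquetRecordBatchStreamBuilder;
use tokio::fs::File;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let file = File::open("data.parquet").await?;
    // Builds a stream of `Result<RecordBatch>` over the whole file.
    let stream = ParquetRecordBatchStreamBuilder::new(file).await?.build()?;
    let mut stream = stream;
    while let Some(batch) = stream.next().await {
        let batch = batch?;
        // A RecordBatch is columnar: to read values row by row, each column
        // must be downcast to its concrete Arrow array type and indexed.
        if let Some(ids) = batch.column(0).as_any().downcast_ref::<Int64Array>() {
            for i in 0..batch.num_rows() {
                println!("row {}: id = {}", i, ids.value(i));
            }
        }
    }
    Ok(())
}
```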