Replies: 1 comment
-
Are you able to provide a simplified example of your parquet file to help us replicate your issue?
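If sharing the file is not possible, the column encodings recorded in its footer may help narrow this down. One possible explanation, which is only an assumption and not confirmed for this file, is that the boolean columns were written with an encoding the installed reader does not support. Below is a minimal sketch, assuming pyarrow is installed and a local copy of the file is available; the helper names `encodings_by_column` and `encodings_for_file` are hypothetical and not part of any library.

```python
from collections import defaultdict


def encodings_by_column(file_metadata):
    """Map each column path to the set of encodings used across all row groups.

    `file_metadata` is expected to behave like pyarrow.parquet.FileMetaData.
    Only the footer is inspected, so no data pages are decoded.
    """
    result = defaultdict(set)
    for rg_index in range(file_metadata.num_row_groups):
        row_group = file_metadata.row_group(rg_index)
        for col_index in range(row_group.num_columns):
            chunk = row_group.column(col_index)
            result[chunk.path_in_schema].update(chunk.encodings)
    return dict(result)


def encodings_for_file(path):
    """Read the footer of a local parquet file and summarize its encodings."""
    # Imported lazily so the helper above has no hard dependency on pyarrow.
    import pyarrow.parquet as pq

    return encodings_by_column(pq.read_metadata(path))
```

For example, `encodings_for_file("local_copy.parquet")` (the file name is a placeholder) returns a dict keyed by column name, so the entries for "iscritical" and "iscyclic" can be compared with those of the columns that read successfully.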
-
I have a Python AWS Lambda function that tries to read a parquet file with 46 columns, two of which are of boolean type.
When I exclude those two boolean columns, named "iscritical" and "iscyclic", from the input columns list, the read_parquet operation succeeds.
Code snippet:
import awswrangler as wr

# Read every column except the two boolean ones.
valid_cols = [col for col in parquet_file_cols_metadata if col not in ("iscritical", "iscyclic")]
stage_file_full_data_df = wr.s3.read_parquet(
    path=stage_file,
    ignore_empty=True,
    use_threads=True,
    columns=valid_cols,
)
When I try to read the entire data (including the boolean columns), the read_parquet operation fails with the exception: "Unknown encoding"
Code snippet:
stage_file_full_data_df = wr.s3.read_parquet(
    path=stage_file,
    ignore_empty=True,
    use_threads=True,
)
My question is: why can't wr.s3.read_parquet() handle columns with a boolean dtype?
Thanks.