Iceberg Metadata Ingestion failing due to S3 FileSystem Initialisation #18512

Closed
Prajwal214 opened this issue Nov 4, 2024 · 0 comments · Fixed by #18521
Labels: bug (Something isn't working), Ingestion

Comments

@Prajwal214
Contributor

Affected module
Does it impact the UI, backend or Ingestion Framework?
-- Ingestion

Describe the bug
When ingesting Iceberg table metadata in OpenMetadata version 1.5.10, the ingestion process fails with a TypeError raised during S3FileSystem initialization. The error message is "expected bytes, pydantic_core._pydantic_core.Url found", which indicates that a pydantic Url object reaches the pyarrow S3FileSystem constructor where a plain string is expected. The S3 file system setup used by the Iceberg ingestion appears to need a fix.

To Reproduce

Screenshots or steps to reproduce

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/iceberg/metadata.py", line 183, in get_tables_name_and_type 
    table = self.iceberg.load_table(table_identifier)
  File "/home/airflow/.local/lib/python3.10/site-packages/pyiceberg/catalog/hive.py", line 358, in load_table 
    return self._convert_hive_into_iceberg(hive_table, io)
  File "/home/airflow/.local/lib/python3.10/site-packages/pyiceberg/catalog/hive.py", line 239, in _convert_hive_into_iceberg 
    file = io.new_input(metadata_location)
  File "/home/airflow/.local/lib/python3.10/site-packages/pyiceberg/io/pyarrow.py", line 369, in new_input 
    fs=self.fs_by_scheme(scheme),
  File "/home/airflow/.local/lib/python3.10/site-packages/pyiceberg/io/pyarrow.py", line 319, in _initialize_fs 
    return S3FileSystem(**client_kwargs)
  File "pyarrow/_s3fs.pyx", line 356, in pyarrow._s3fs.S3FileSystem.__init__ 
  File "<stringsource>", line 15, in string.from_py.__pyx_convert_string_from_py_6libcpp_6string_std__in_string 
TypeError: expected bytes, pydantic_core._pydantic_core.Url found
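
The last frames suggest that one of the properties handed to pyiceberg (possibly the S3 endpoint) is still a pydantic Url object when it reaches pyarrow's S3FileSystem, which needs plain strings there. Below is a minimal sketch of a possible workaround, under stated assumptions: convert such values with str() before the catalog is built. StandInUrl and stringify_properties are hypothetical names used only for illustration (StandInUrl mimics pydantic_core's Url so the snippet runs without pydantic), the property values are made up, and this is not necessarily the change made in #18521.

```python
from typing import Any, Dict


class StandInUrl:
    """Hypothetical stand-in for pydantic_core.Url.

    Like the real type, it is not a str subclass, but str() yields the URL text.
    """

    def __init__(self, url: str) -> None:
        self._url = url

    def __str__(self) -> str:
        return self._url


def stringify_properties(properties: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy where every non-primitive value is converted with str().

    Hypothetical helper: values that are already str, bytes, bool, int,
    float or None are passed through unchanged.
    """
    plain_types = (str, bytes, bool, int, float, type(None))
    return {
        key: value if isinstance(value, plain_types) else str(value)
        for key, value in properties.items()
    }


# Assumed example properties; keys follow pyiceberg's "s3.*" naming.
raw = {
    "s3.endpoint": StandInUrl("http://localhost:9000"),
    "s3.region": "us-east-1",
}
clean = stringify_properties(raw)
```

With a real pydantic Url, str() may return a normalized form of the address (for example with a trailing slash), so the resulting endpoint value is worth checking before it is passed on.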

Expected behavior
The ingestion process should complete successfully, loading metadata for Iceberg tables without raising errors related to S3FileSystem.

Version:

  • OS: (not specified)
  • Python version: 3.10 (per the paths in the traceback)
  • OpenMetadata version: v1.5.10
  • OpenMetadata Ingestion package version: (not specified)

