Metadata ingestion fails for Iceberg tables with nested partition column #18491

Open
tomasko-labuda opened this issue Oct 31, 2024 · 0 comments


tomasko-labuda commented Oct 31, 2024

Affected module
Ingestion Framework

Describe the bug
Metadata ingestion fails for Iceberg tables that are partitioned by a nested column (a field inside a STRUCT).

To Reproduce
Metadata ingestion works for this table, which is partitioned by the top-level column b:
CREATE TABLE catalog1.db1.table1 (a STRUCT<b: STRING>, b STRING) PARTITIONED BY (b)

Metadata ingestion fails for this table, which is partitioned by the nested field a.b:
CREATE TABLE catalog1.db1.table1 (a STRUCT<b: STRING>, b STRING) PARTITIONED BY (a.b)

Error:

[2024-10-31T13:58:37.779+0000] {status.py:91} WARNING - Failed to ingest CreateTableRequest [table1] due to api request failure: Invalid column name found in table partition
[2024-10-31T13:58:37.779+0000] {status.py:92} DEBUG - Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/ometa/client.py", line 243, in _one_request
    resp.raise_for_status()
  File "/home/airflow/.local/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: http://openmetadata-server:8585/api/v1/tables
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/sink/metadata_rest.py", line 146, in _run
    return self._run_dispatch(record)
  File "/usr/local/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/sink/metadata_rest.py", line 137, in _run_dispatch
    return self.write_create_request(record)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/sink/metadata_rest.py", line 167, in write_create_request
    created = self.metadata.create_or_update(entity_request)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/ometa/ometa_api.py", line 280, in create_or_update
    return self._create(data=data, method="put")
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/ometa/ometa_api.py", line 271, in _create
    resp = fn(self.get_suffix(entity), data=data.model_dump_json())
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/utils/execution_time_tracker.py", line 195, in inner
    result = func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/ometa/client.py", line 324, in put
    return self._request("PUT", path, data)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/ometa/client.py", line 212, in _request
    return self._one_request(method, url, opts, retry)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/ometa/client.py", line 263, in _one_request
    raise APIError(error, http_error) from http_error
metadata.ingestion.ometa.client.APIError: Invalid column name found in table partition

Expected behavior
Metadata ingestion succeeds for tables with a nested partition column.
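
To make the expected behavior concrete, here is a minimal sketch of the difference between matching a partition name against top-level columns only and resolving a dotted path such as a.b through nested columns. The validation code that produces "Invalid column name found in table partition" is not shown in this report, so the top-level-only check is an assumption about the cause. `Column`, `matches_top_level_only`, and `resolve_partition_column` are hypothetical names used for illustration and are not part of the OpenMetadata API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Column:
    """Hypothetical, simplified stand-in for a table column definition."""
    name: str
    children: List["Column"] = field(default_factory=list)


def matches_top_level_only(columns: List[Column], partition_name: str) -> bool:
    """Assumed current behavior: compare only against top-level column names."""
    return any(col.name == partition_name for col in columns)


def resolve_partition_column(columns: List[Column], path: str) -> Optional[Column]:
    """Walk a dotted path such as "a.b" through nested columns.

    Returns the matching column, or None if any path segment is missing.
    """
    current = columns
    found: Optional[Column] = None
    for segment in path.split("."):
        found = next((col for col in current if col.name == segment), None)
        if found is None:
            return None
        current = found.children
    return found


# Schema from the report: (a STRUCT<b: STRING>, b STRING)
columns = [Column("a", [Column("b")]), Column("b")]

# The top-level check accepts "b" but rejects "a.b".
print(matches_top_level_only(columns, "b"))
print(matches_top_level_only(columns, "a.b"))

# Path resolution finds the nested field for "a.b".
print(resolve_partition_column(columns, "a.b"))
```

Splitting on "." is ambiguous for column names that themselves contain a dot, so a real fix would probably need to carry the path as a list of segments instead of a single string.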

Version:

  • OpenMetadata version: 1.5.10
  • OpenMetadata Ingestion package version: 1.5.10