Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: call try_merge recursively for list field #5852

Merged
merged 1 commit into from
Jun 13, 2024

Conversation

mnpw
Copy link
Contributor

@mnpw mnpw commented Jun 7, 2024

Which issue does this PR close?

Closes #5843.

Rationale for this change

Merging fields recursively is already performed for a field with datatype Struct. This PR extends that recursive merge to fields with List and LargeList datatypes.

What changes are included in this PR?

  • Updated Field::try_merge to call merge recursively if datatype is List and LargeList
  • Added tests for megre for nested datatypes (like Stuct and List) with Null fields

Are there any user-facing changes?

N/A

@github-actions github-actions bot added the arrow Changes to the arrow crate label Jun 7, 2024
@mnpw mnpw changed the title feat(breaking): call try_merge recursively for field with DataType::L… feat(breaking): call try_merge recursively for list field Jun 7, 2024
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the contribution @mnpw -- this PR makes sense to me.

Nicely tested as well 👍

Let me know about DataType::LargeList

@@ -483,6 +483,18 @@ impl Field {
));
}
},
DataType::List(field) => match &from.data_type {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also add support for DataType::LargeList https://docs.rs/arrow/latest/arrow/datatypes/enum.DataType.html#variant.LargeList

If you don't want to add it in this PR, I'll file a ticket to track and add it in a follow on

@mnpw
Copy link
Contributor Author

mnpw commented Jun 11, 2024

Thanks @alamb for the review. As per your suggestion I have added the change for DataType::LargeList too.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mnpw

I noticed that this PR is marked "breaking" but it seems to me like it isn't a breaking API change (in the sense that code that compiles before this PR will also compile after this PR)

struct1
);

let mut list1 = Field::new(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any chance we could add a test for LargeList too?

@mnpw mnpw force-pushed the list-nested-merge branch from df8528a to 68e7d6f Compare June 12, 2024 15:58
@mnpw mnpw changed the title feat(breaking): call try_merge recursively for list field feat: call try_merge recursively for list field Jun 12, 2024
@mnpw
Copy link
Contributor Author

mnpw commented Jun 12, 2024

@alamb

  • Added test for LargeList
  • Updated commit to reflect the non-breaking nature of change.

I added breaking to highlight the different outcome Field::try_merge after my changes. It seems more fit to only add breaking if there are user-facing API changes as you suggest.

Lmk if this looks good

@alamb
Copy link
Contributor

alamb commented Jun 13, 2024

Thanks @mnpw

@alamb alamb merged commit 2d17bf0 into apache:master Jun 13, 2024
26 checks passed
@alamb
Copy link
Contributor

alamb commented Jun 13, 2024

Thanks again @mnpw

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Schema::try_merge should be able to merge List of any data type with List of Null data type
2 participants