[BUG][Spark] issue when merge using autoMerge property #3336
Comments
I can confirm that this issue is valid. This is why it can happen:
The fix, however, is not trivial and carries some risk. We could change the matching logic to treat the first part of the column name as an alias, but this will fail in the following example:
What would 't.col2' match? I'll do some research to see how people run this kind of query and decide on the next step.
I think you meant
Well, in this case, there is a clash between the alias for the Delta table and the struct column name. Making sure that the alias does not match the column name would probably solve the issue, but it sounds like a workaround.
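The ambiguity discussed in these comments can be sketched as follows. This is a hypothetical illustration, not code from the issue: the table names, the `id` join column, and the struct column literally named `t` are invented for the example.

```scala
import io.delta.tables.DeltaTable

// Hypothetical setup: the target table is aliased "t", but the source
// data also contains a top-level struct column named "t" with a field
// "col2", i.e. the source schema includes  t: struct<col2: string>.
val target = DeltaTable.forName(spark, "events").as("t")
val source = spark.read.table("staging_events").as("s")

target
  .merge(source, "t.id = s.id")
  .whenMatched()
  // Ambiguous: does "t.col2" mean target alias "t" + column "col2",
  // or the source struct column "t" and its field "col2"?
  .updateExpr(Map("t.col2" -> "s.t.col2"))
  .execute()
```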
Which Delta project/connector is this regarding?
Describe the problem
A merge operation whose updateExpr/insertExpr map references the target table through an alias fails when the configuration
"spark.databricks.delta.schema.autoMerge.enabled"
is enabled. Aliasing works fine when the parameter is off. Suppose the target table has the alias t and the source has the alias s, and new_col is an additional column for the target table. Passing a map such as
Map("t.new_col" -> "s.new_col")
raises an error.
Steps to reproduce
Run the code below. Update the variable passed to updateExpr/insertExpr to reproduce the issue.
```scala
// spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "true")
// <- this cannot be enabled at run time; it has to be set when the Spark session is initialized.
```
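Most of the original reproduction code did not survive extraction. A minimal sketch of what it might look like follows; the table name, columns, and session setup here are assumptions, not the reporter's actual code, and the autoMerge flag is set at session build time as the comment above notes.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

// autoMerge must be set before the session is created, not at run time.
val spark = SparkSession.builder()
  .appName("automerge-alias-repro")
  .config("spark.databricks.delta.schema.autoMerge.enabled", "true")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog",
          "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

import spark.implicits._

// Hypothetical target table without new_col.
Seq((1, "a")).toDF("id", "value")
  .write.format("delta").saveAsTable("target_tbl")

// Source with the extra column new_col, which autoMerge should add.
val source = Seq((1, "b", "x")).toDF("id", "value", "new_col")

val goodColumnsMap = Map("new_col" -> "source.new_col")         // works
val badColumnsMap  = Map("target.new_col" -> "source.new_col")  // raises an error

DeltaTable.forName(spark, "target_tbl").as("target")
  .merge(source.as("source"), "target.id = source.id")
  .whenMatched().updateExpr(badColumnsMap)
  .whenNotMatched().insertExpr(badColumnsMap)
  .execute()
```

Swapping badColumnsMap for goodColumnsMap should make the merge succeed, since the target column is then referenced without the alias.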
Observed results
When
val goodColumnsMap = Map("new_col" -> "source.new_col")
is given as the updateExpr/insertExpr map, the merge runs smoothly as expected. When
val badColumnsMap = Map("target.new_col" -> "source.new_col")
is given as the updateExpr/insertExpr map, an error is raised.
Merge Op
ERROR LOG
Expected results
I expect no difference in behavior between the two cases, so the merge should run smoothly in both.
Further details
Environment information
Willingness to contribute
The Delta Lake Community encourages bug fix contributions. Would you or another member of your organization be willing to contribute a fix for this bug to the Delta Lake code base?