materialize-databricks: specifically fail validation for base64 text fields with pre-existing string columns

We need this hack for now to make sure that v1 tasks cannot be updated to v2 without failing
validation for pre-existing string columns that should be binary. A re-backfill of these tables is
required.
williamhbaker committed May 8, 2024
1 parent 2edd184 commit 0522254
Showing 1 changed file with 11 additions and 0 deletions.
11 changes: 11 additions & 0 deletions materialize-databricks/sqlgen.go
@@ -158,6 +158,17 @@ var databricksDialect = func() sql.Dialect {
// stringCompatible allow strings of any format, arrays, objects, or fields with multiple types to
// be materialized since they are all converted to strings.
func stringCompatible(p pf.Projection) bool {
	// TODO(whb): This is a hack for making sure that pre-existing base64 encoded columns that were
	// materialized as string columns in v1 fail validation in v2, which will materialize these
	// columns as binary. This is needed because we currently validate a pre-existing "string"
	// column positively with a string field having any format, content-type, or content-encoding,
	// see https://github.com/estuary/connectors/issues/1501.
	if sql.TypesOrNull(p.Inference.Types, []string{"string"}) {
		if p.Inference.String_.ContentEncoding == "base64" {
			return false
		}
	}

	return sql.StringCompatible(p) || sql.JsonCompatible(p)
}
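
For readers without the repository checked out, the added guard can be tried in isolation with the rough sketch below. It is not the connector's code: `projection`, `stringInference`, `typesOrNull`, and `rejectsAsString` are hypothetical stand-ins, and the behavior given to `typesOrNull` (every inferred type is either "null" or in the allowed list) is an assumption about what `sql.TypesOrNull` does, not something this commit shows.

```go
package main

import "fmt"

// stringInference and projection are hypothetical stand-ins for the parts of
// pf.Projection that the guard reads. They are not the real types.
type stringInference struct {
	ContentEncoding string
}

type projection struct {
	Types  []string         // inferred JSON types, e.g. "string", "null"
	String *stringInference // string-specific inference, may be nil
}

// typesOrNull reports whether every entry in actual is "null" or appears in
// allowed. ASSUMPTION: this mirrors sql.TypesOrNull; the commit does not show
// that function's body.
func typesOrNull(actual, allowed []string) bool {
	for _, t := range actual {
		if t == "null" {
			continue
		}
		found := false
		for _, a := range allowed {
			if t == a {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	return true
}

// rejectsAsString reports whether the guard would refuse to validate the
// field against a pre-existing string column: the field is string-only
// (optionally nullable) and declares base64 content encoding.
func rejectsAsString(p projection) bool {
	if !typesOrNull(p.Types, []string{"string"}) {
		return false
	}
	return p.String != nil && p.String.ContentEncoding == "base64"
}

func main() {
	plain := projection{Types: []string{"string"}, String: &stringInference{}}
	encoded := projection{
		Types:  []string{"string", "null"},
		String: &stringInference{ContentEncoding: "base64"},
	}
	mixed := projection{
		Types:  []string{"string", "integer"},
		String: &stringInference{ContentEncoding: "base64"},
	}

	fmt.Println(rejectsAsString(plain), rejectsAsString(encoded), rejectsAsString(mixed))
	// prints: false true false
}
```

Under these assumptions only the string-or-null base64 field is rejected. A field that can also be an integer skips the guard, which matches the original code falling through to the `sql.StringCompatible(p) || sql.JsonCompatible(p)` check for such fields.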
