Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

source-postgres: Support _citext columns #2166

Merged
merged 1 commit into from
Nov 22, 2024

Conversation

willdonnelly
Copy link
Member

@willdonnelly willdonnelly commented Nov 22, 2024

Description:

Arrays of citext are tricky because citext is an extension type which means that it has no stable OID, and as a result the PGX client library doesn't have any default type mappings for it. In the absence of any registered codec, values are always captured in their stringified text-protocol format.

However our discovery logic still notices that it's an array and emits the appropriate schema for an array of text values, which means there's a mismatch and we're not satisfying our own schema. Which is bad. While citext is an extension type, it's a common one and the citext extension ships as part of the Postgres release, so I believe we should still support it as best we can.

This commit fixes that by extending the "datatype tweaks" logic with a query against pg_catalog.pg_type looking for specific extension types we support (which currently is just citext) and for each result row we register custom handlers for the base type and arrays of that type.

Since this is all getting fairly complex, I've also gone ahead and added test cases for citext and _citext datatype support. But of course those don't work against a database without that extension installed, so the CI database setup script has been modified to install it. (The citext extension is shipped as part of Postgres itself, it just isn't enabled by default, so we just have to run CREATE EXTENSION citext).

Workflow steps:

Columns of type citext[] should just work going forward.


This change is Reviewable

Arrays of `citext` are tricky because `citext` is an extension type
which means that it has no stable OID, and as a result the PGX
client library doesn't have any default type mappings for it. In
the absence of any registered codec, values are always captured
in their stringified text-protocol format.

However our discovery logic still notices that it's an array and
emits the appropriate schema for an array of text values, which
means there's a mismatch and we're not satisfying our own schema.
Which is bad.

This commit fixes that by extending the "datatype tweaks" logic
with a query against `pg_catalog.pg_type` looking for specific
extension types we support (which currently is just `citext`)
and for each result row we register custom handlers for the
base type and arrays of that type.

Since this is all getting fairly complex, I've also gone ahead
and added test cases for `citext` and `_citext` datatype support.
But of course those don't work against a database without that
extension installed, so the CI database setup script has been
modified to install it. (The `citext` extension is shipped as
part of Postgres itself, it just isn't enabled by default, so
we just have to run `CREATE EXTENSION citext`).
@willdonnelly willdonnelly added the change:unplanned Unplanned change, useful for things like doc updates label Nov 22, 2024
@willdonnelly willdonnelly requested a review from a team November 22, 2024 19:33
Copy link
Member

@williamhbaker williamhbaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@willdonnelly willdonnelly merged commit 1280804 into main Nov 22, 2024
52 of 53 checks passed
@willdonnelly willdonnelly deleted the wgd/postgres-citext-arrays-20241122 branch November 22, 2024 20:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
change:unplanned Unplanned change, useful for things like doc updates
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants