
Add Snowflake connector #17909

Merged
merged 3 commits into trinodb:master on Mar 7, 2024
Conversation

@dprophet (Contributor) commented Jun 14, 2023

Description

New Snowflake connector

Equal acknowledgements go to

Erik Anderson, Bloomberg
Yu Teng, ForePaaS

This Snowflake connector is a collaboration between both individuals at separate companies. Developing and maintaining such a connector has taken about two years of effort. The community came together to solve a challenge greater than any one person or one company. A great deal of time and energy has gone into creating, maintaining, and open-sourcing this connector.

This work references pull request #10387

@cla-bot cla-bot bot added the cla-signed label Jun 14, 2023
@findepi findepi requested a review from hashhar June 14, 2023 20:44
@mosabua mosabua changed the title Snowflake connector Add Snowflake connector Jun 14, 2023
@bitsondatadev bitsondatadev requested a review from ebyhr June 14, 2023 22:21
@ebyhr (Member) left a comment

Just skimmed.

Please rebase on master and squash commits into one.

<parent>
<groupId>io.trino</groupId>
<artifactId>trino-root</artifactId>
<version>413-SNAPSHOT</version>
Member

Please update this version.

Contributor

Updated to 422-SNAPSHOT

@@ -0,0 +1,180 @@
<?xml version="1.0"?>
Member

Please update ci.yml.

Contributor

Updated

</dependency>
</dependencies>

<build>
Member

Please add a cloud-tests profile. You can refer to other SaaS connectors, e.g. BigQuery.
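As a hedged illustration, such a profile could look roughly like the sketch below, loosely modeled on trino-bigquery; the surefire wiring and test class names are assumptions, not this PR's actual pom:

<!-- Hypothetical sketch of a cloud-tests profile -->
<profiles>
    <profile>
        <id>cloud-tests</id>
        <build>
            <plugins>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-surefire-plugin</artifactId>
                    <configuration>
                        <includes>
                            <!-- assumed test class names -->
                            <include>**/TestSnowflakeConnectorTest.java</include>
                            <include>**/TestSnowflakeTypeMapping.java</include>
                        </includes>
                    </configuration>
                </plugin>
            </plugins>
        </build>
    </profile>
</profiles>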

Contributor

Updated.

Comment on lines 111 to 112
private final Type jsonType;
private final AggregateFunctionRewriter aggregateFunctionRewriter;
Member

nit: Move these under the static final fields.

Contributor

Done

Comment on lines 119 to 122
private static final DateTimeFormatter snowballDateTimeFormatter = DateTimeFormatter.ofPattern("y-MM-dd'T'HH:mm:ss.SSSSSSSSSXXX");
private static final DateTimeFormatter snowballDateFormatter = DateTimeFormatter.ofPattern("uuuu-MM-dd");
private static final DateTimeFormatter snowballTimestampFormatter = DateTimeFormatter.ofPattern("y-MM-dd'T'HH:mm:ss.SSSSSSSSS");
private static final DateTimeFormatter snowballTimeFormatter = DateTimeFormatter.ofPattern("HH:mm:ss.SSSSSSSSS");
Member @ebyhr commented Jun 14, 2023

Rename the prefix to snowflake and uppercase these constants.
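A minimal sketch of the requested rename, as field declarations with the patterns copied unchanged from the diff above (imports omitted, matching the excerpt):

private static final DateTimeFormatter SNOWFLAKE_DATE_TIME_FORMATTER = DateTimeFormatter.ofPattern("y-MM-dd'T'HH:mm:ss.SSSSSSSSSXXX");
private static final DateTimeFormatter SNOWFLAKE_DATE_FORMATTER = DateTimeFormatter.ofPattern("uuuu-MM-dd");
private static final DateTimeFormatter SNOWFLAKE_TIMESTAMP_FORMATTER = DateTimeFormatter.ofPattern("y-MM-dd'T'HH:mm:ss.SSSSSSSSS");
private static final DateTimeFormatter SNOWFLAKE_TIME_FORMATTER = DateTimeFormatter.ofPattern("HH:mm:ss.SSSSSSSSS");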

Contributor

Updated

import static io.trino.spi.type.VarcharType.createVarcharType;
import static java.time.ZoneOffset.UTC;

public class TestSnowflakeTypeMapping
Member

Add a test case for timestamp types.

Contributor

Already exists at line 308:

@Test
public void testTimestamp()
{
    testTimestamp(UTC);
    testTimestamp(jvmZone);
    testTimestamp(vilnius);
    testTimestamp(kathmandu);
    testTimestamp(TestingSession.DEFAULT_TIME_ZONE_KEY.getZoneId());
}

Member @ebyhr commented Dec 8, 2023

That's timestamp without time zone. A timestamp with time zone test is still missing.
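A hedged sketch of what the missing coverage could look like, in the same SqlDataTypeTest style as the PR's existing tests; the Snowflake type name, precision, and the snowflakeCreateAndInsert helper are assumptions, not taken from this PR:

// Hypothetical sketch only; type names and the setup helper are assumed
@Test
public void testTimestampWithTimeZone()
{
    SqlDataTypeTest.create()
            .addRoundTrip("timestamp_tz(3)", "'2020-01-01 12:34:56.123 +01:00'",
                    createTimestampWithTimeZoneType(3), "TIMESTAMP '2020-01-01 12:34:56.123 +01:00'")
            .addRoundTrip("timestamp_tz(3)", "NULL",
                    createTimestampWithTimeZoneType(3), "CAST(NULL AS timestamp(3) with time zone)")
            .execute(getQueryRunner(), snowflakeCreateAndInsert("tpch.test_timestamp_tz"));
}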

Member

Reminder.

Comment on lines 682 to 687
private static LongWriteFunction timestampWriter()
{
return (statement, index, value) -> statement.setString(index, StandardColumnMappings.fromTrinoTimestamp(value).toString());
}

private static Optional<ColumnMapping> getUnsignedMapping(JdbcTypeHandle typeHandle)
Member

Remove unused methods.

Contributor

Removed

public void testCharVarcharComparison()
{
// Override and skip it because snowflake not support this feature
assertThatThrownBy(super::testCharVarcharComparison).isInstanceOf(AssertionError.class);
Member

Please verify the message instead of the class instance. Same for other places.
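A minimal sketch of the suggested assertion; the expected message text here is a placeholder, not the connector's actual error:

assertThatThrownBy(super::testCharVarcharComparison)
        .hasMessageContaining("<actual Snowflake error text>"); // placeholder message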

Contributor

Updated for all tests

@Override
public void testCreateTable()
{
// Override and skip it because snowflake not support this feature
Member

Snowflake supports creating a table. This comment should be updated. Same for other places.

Contributor

Overridden. In fact, the only difference is that null is not equal to '', so I overrode it.

``char(n)`` ``varchar(n)``
``binary(n)`` ``varbinary``
``varbinary`` ``varbinary``
``date`` ``date``
Member

timestamp type is missing.

Contributor

Updated

-------------

To configure the Snowflake connector, create a catalog properties file
in ``etc/catalog`` named, for example, ``snowflake.properties``, to
Member

We use example.properties as the template. See other connector docs.
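For reference, a hedged sketch of such an etc/catalog/example.properties template; the property names mirror typical JDBC-based connector docs and are assumptions, not this PR's final documentation:

connector.name=snowflake
connection-url=jdbc:snowflake://<account>.snowflakecomputing.com
connection-user=root
connection-password=secret
snowflake.database=database
snowflake.warehouse=warehouse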

Contributor

Updated

Member

> Updated

Did you forget to push the changes?

Contributor

Because this PR was created from Erik's repository, I can't update it directly. I have to send him a new PR first, and after he accepts it, you will see the changes here. I'll let you know when it's done.

Member

> I'll let you know when it's done.

Please avoid leaving "Updated" comments until this PR is actually updated. The comment is confusing and not actionable.

<dependency>
<groupId>net.snowflake</groupId>
<artifactId>snowflake-jdbc</artifactId>
<version>3.13.29</version>
Member

The latest version is 3.13.32.

Contributor

Updated

public void testCreateTable()
{
// Override and skip it because snowflake not support this feature
assertThatThrownBy(super::testCreateTable).isInstanceOf(AssertionError.class);
Member @ebyhr commented Jun 15, 2023

We shouldn't ship untested features to users. I would recommend making this connector read-only by overriding some methods in SnowflakeClient, e.g. createTable.
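A minimal sketch of the suggested read-only override, assuming SnowflakeClient extends the standard JDBC base client; the exact signature in this PR may differ:

// Hypothetical sketch: reject table creation until the write path is tested
@Override
public JdbcOutputTableHandle createTable(ConnectorSession session, ConnectorTableMetadata tableMetadata)
{
    throw new TrinoException(NOT_SUPPORTED, "This connector does not support creating tables");
}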

Contributor

Maybe we can support createTable.

@bitsondatadev (Member)

@dprophet @yuuteng any updates here?

@yuuteng (Contributor) commented Jul 10, 2023

> @dprophet @yuuteng any updates here?

I will send a PR to @dprophet addressing the issues mentioned.

@yuuteng (Contributor) left a comment

I've checked all the reviews and will post an update in a follow-up comment.


return Optional.ofNullable(timestampNoTimezoneAsUTC);
}

@Config("snowflake.timestampNoTimezoneAsUtc")
Contributor

Updated


// .put(Types.TIME, columnMappingPushdown(timeColumnMapping()))
// .buildOrThrow();

private static final Map<String, ColumnMappingFunction> SHOWFLAKE_COLUMN_MAPPINGS = ImmutableMap.<String, ColumnMappingFunction>builder()
Contributor

I don't understand what you mean by "Inline". Can you explain what I need to do, or give an example?


@yuuteng (Contributor) commented Jul 13, 2023

@ebyhr Here is my new pull request: bloomberg#7
It needs to be merged by @dprophet. I'll let you take a look and merge it.

public Object[][] snowflakeIntegerTypeProvider()
{
// INT , INTEGER , BIGINT , SMALLINT , TINYINT , BYTEINT, DECIMAL , NUMERIC are aliases for NUMBER(38, 0) in snowflake
// https://docs.snowflake.com/en/sql-reference/data-types-numeric.html#int-integer-bigint-smallint-tinyint-byteint
Contributor

Add unsupported test cases, e.g. testUnsupportedTinyint with values min-1 and max+1.
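A hedged sketch of such a test, modeled on other connectors' type-mapping suites; the onRemoteDatabase() executor, the assertSnowflakeQueryFails helper, and the expected message are assumptions:

// Hypothetical sketch: values one past the Trino type's range should be rejected
@Test
public void testUnsupportedTinyint()
{
    try (TestTable table = new TestTable(onRemoteDatabase(), "tpch.test_unsupported_tinyint", "(data tinyint)")) {
        assertSnowflakeQueryFails(
                "INSERT INTO " + table.getName() + " VALUES (-129)", // min - 1
                "out of range"); // placeholder message
        assertSnowflakeQueryFails(
                "INSERT INTO " + table.getName() + " VALUES (128)", // max + 1
                "out of range"); // placeholder message
    }
}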

.addRoundTrip("decimal(24, 4)", "CAST('12345678901234567890.31' AS decimal(24, 4))", createDecimalType(24, 4), "CAST('12345678901234567890.31' AS decimal(24, 4))")
.addRoundTrip("decimal(30, 5)", "CAST('3141592653589793238462643.38327' AS decimal(30, 5))", createDecimalType(30, 5), "CAST('3141592653589793238462643.38327' AS decimal(30, 5))")
.addRoundTrip("decimal(30, 5)", "CAST('-3141592653589793238462643.38327' AS decimal(30, 5))", createDecimalType(30, 5), "CAST('-3141592653589793238462643.38327' AS decimal(30, 5))")
// .addRoundTrip("decimal(38, 0)", "CAST('27182818284590452353602874713526624977' AS decimal(38, 0))", createDecimalType(38, 0), "CAST('27182818284590452353602874713526624977' AS decimal(38, 0))")
Contributor

Remove

@Override
public void testReadMetadataWithRelationsConcurrentModifications()
{
throw new SkipException("Test fails with a timeout sometimes and is flaky");
Contributor

Is it possible to enable the test and annotate it with @Flaky instead of skipping it?
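A minimal sketch of the suggested annotation, assuming Trino's io.trino.testng.services.Flaky; the issue URL and match pattern are placeholders:

@Test
@Override
@Flaky(issue = "https://github.com/trinodb/trino/issues/XXXXX", match = "Test fails with a timeout")
public void testReadMetadataWithRelationsConcurrentModifications()
{
    super.testReadMetadataWithRelationsConcurrentModifications();
}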

public void testDropAmbiguousRowFieldCaseSensitivity()
{
// Override and skip it because snowflake not support this feature
assertThatThrownBy(super::testDropAmbiguousRowFieldCaseSensitivity);
Contributor

It would be better to add more checks, e.g. a hasMessage call.

public void testBinary()
{
SqlDataTypeTest.create()
.addRoundTrip("binary(18)", "NULL", VARBINARY, "CAST(NULL AS varbinary)")
Contributor

We can extract the common test cases, like TestIgniteTypeMapping.binaryTest.
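A hedged sketch of the suggested extraction; the helper shape is an assumption modeled on TestIgniteTypeMapping, with the first round trip copied from the diff above and the second illustrative:

// Hypothetical shared helper: common binary round trips, parameterized by input type
private SqlDataTypeTest binaryTest(String inputType)
{
    return SqlDataTypeTest.create()
            .addRoundTrip(inputType, "NULL", VARBINARY, "CAST(NULL AS varbinary)")
            .addRoundTrip(inputType, "X''", VARBINARY, "X''");
}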

Contributor

If it passes, can we leave this as-is for the moment? I'll optimize it later.

" shippriority bigint,\n" +
" comment varchar(79)\n" +
")\n" +
"COMMENT ''");
Contributor

Why does this have a COMMENT field even though we declare SUPPORTS_CREATE_TABLE_WITH_TABLE_COMMENT and SUPPORTS_COMMENT_ON_TABLE as unsupported?

return true;
}

private ColumnMapping jsonColumnMapping()
Contributor

Why add this?

Member

Please remove this method.

Contributor

Removed

@afranzi commented Sep 22, 2023

Hello team, wondering how active this PR is and how much effort is required to make it happen 🚀?

@github-actions github-actions bot added ui Web UI hive Hive connector bigquery BigQuery connector labels Oct 31, 2023
@ebyhr (Member) commented Nov 1, 2023

Files changed: 133

Please revert unrelated changes. Also, please squash the commits into one.

@luisfalconeri

Hello there! Firstly, nice work :-) Wondering if there are any updates on the status here? Thanks a lot

@hongbo-miao commented Nov 29, 2023

Hi @ebyhr, considering the substantial size of this pull request, squashing it early might pose challenges for ongoing work, such as addressing conflicts. Once the pull request is ready, it would be great to have the team's approval and proceed with a squash merge. Looking forward to a merge soon! ☺️

@yuuteng (Contributor) commented Nov 29, 2023

After updating this PR to the latest Trino version, there are some new errors in the previously passing tests. I am trying to fix them, and I'm also trying to reach @dprophet so he can set aside time to review and merge. As we know, each version upgrade raises new issues, so we have to merge as soon as possible after I fix things again. 😄

@bitsondatadev (Member)

@colebow @mosabua I am a bit pressed for time this week, but I would be available in the following weeks to help land this. Would one of you mind starting to reach out to everyone involved to create a plan for landing this?

I think at this point we need to coordinate what is needed on the PR with @ebyhr, figure out when Yuya and @dprophet will have availability to quickly review, and synchronize the update and review process around the same times, to avoid long pauses that then require large merges and break previous iterations. That's certainly not fun for @yuuteng to keep up with. This is a brand-new connector with quite a few extra concerns around testing, and I think a coordinated effort will keep this last mile from stagnating. Let me know if either of you is up for it; otherwise I'm happy to raise my hand to coordinate this in the coming weeks.

@mosabua (Member) commented Nov 29, 2023

Agreed .. we really should try to get this over the finish line .. ideally before Trino Summit in two weeks ;-)

I don't know about availability for these efforts from @dprophet and @yuuteng, or where the current holdup is; the next step is probably to resolve the conflicts.

@bitsondatadev (Member)

Right, but I think @yuuteng doesn't want to keep doing this unless he knows this will be The Last Merge Conflict™.

So I think having DevRel do some upfront coordination to make sure Teng isn't wasting his time would be good at this point.

@yuuteng (Contributor) commented Dec 4, 2023

@bitsondatadev Thanks for understanding. I will take some time to fix it today, and I think I may need some help from @ebyhr. As for the process, I think it will be better if I create a new PR directly from my repository and add @dprophet as a reviewer, so I don't have to first find a time with @dprophet while he also spends time resolving the conflicts. I guess it will not be finished in one commit, so this would be more efficient. I'm discussing it with him at the same time to get his opinion.

@github-actions github-actions bot added release-notes jdbc Relates to Trino JDBC driver hudi Hudi connector labels Dec 5, 2023
@ebyhr (Member) commented Feb 26, 2024

Please fix CI failures.

errorprone

Error:  Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.12.1:compile (default-compile) on project trino-snowflake: Compilation failure: Compilation failure: 
Error:  /home/runner/work/trino/trino/plugin/trino-snowflake/src/main/java/io/trino/plugin/snowflake/SnowflakeClient.java:[367,16] [UnusedVariable] The local variable 'jdbcTypeName' is never read.
Error:      (see https://errorprone.info/bugpattern/UnusedVariable)
Error:    Did you mean to remove this line or 'typeHandle.getJdbcTypeName()'?
Error:  /home/runner/work/trino/trino/plugin/trino-snowflake/src/main/java/io/trino/plugin/snowflake/SnowflakeClient.java:[369,13] [UnusedVariable] The local variable 'type' is never read.
Error:      (see https://errorprone.info/bugpattern/UnusedVariable)
Error:    Did you mean to remove this line or 'typeHandle.getJdbcType();'?

trino-snowflake

Error:  Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.2.5:test (default-test) on project trino-snowflake: No tests were executed!  (Set -DfailIfNoTests=false to ignore this error.) -> [Help 1]

trino-snowflake cloud-tests

Error:  Errors: 
Error:    TestSnowflakeConnectorTest>AbstractTestQueryFramework.init:113->createQueryRunner:29 » ExceptionInInitializer
Error:    TestSnowflakeTypeMapping>AbstractTestQueryFramework.init:113->createQueryRunner:78 » NoClassDefFound Could not initialize class io.trino.plugin.snowflake.TestingSnowflakeServer

and product tests

@ebyhr ebyhr removed release-notes jdbc Relates to Trino JDBC driver tests:hive hudi Hudi connector iceberg Iceberg connector delta-lake Delta Lake connector hive Hive connector bigquery BigQuery connector mongodb MongoDB connector labels Feb 28, 2024

Update ci and delete no use lines (#20)

Approved
Update according to reviews 11/01/2024

Various style fixes and cleanup (#15) (#17)

Co-authored-by: Martin Traverso <mtraverso@gmail.com>
Various style fixes and cleanup (#15)

Update the github CI (#12)

* Add Snowflake JDBC Connector

* Add snowflake in the ci
Add Snowflake JDBC Connector (#11)

Had to redo the connector because all the rebases caused havoc

// Mappings for JDBC column types to internal Trino types
final Map<Integer, ColumnMapping> standardColumnMappings = ImmutableMap.<Integer, ColumnMapping>builder()
.put(Types.BOOLEAN, StandardColumnMappings.booleanColumnMapping())
Member @lxynov commented Mar 5, 2024

This needs to be updated.

For example, the types INT, INTEGER, BIGINT, SMALLINT, TINYINT, and BYTEINT are actually synonymous with NUMBER(38, 0) in Snowflake (ref). Their internal "data_type" is always the constant FIXED (ref). When they're read in via JDBC, their SQL type depends on the Snowflake session property JDBC_TREAT_DECIMAL_AS_INT. By default it's true, so the SQL type is always Types.BIGINT. This logic is in the Snowflake JDBC code: https://github.com/snowflakedb/snowflake-jdbc/blob/d4ca9f0e99b142fa0c1a3dffe5cc7bde4fdfaf61/src/main/java/net/snowflake/client/jdbc/SnowflakeUtil.java#L177.

So:

  1. We actually don't need mappings for INT, INTEGER, SMALLINT, TINYINT, etc., as they will never be used.
  2. There will be an issue if a user creates a BIGINT column in Snowflake and uses it to store a value larger than Long.MAX_VALUE.

You can reproduce the example issue:

In Snowflake:

CREATE TABLE testdb.test1.t_large_bigint (c_bigint BIGINT);

INSERT INTO testdb.test1.t_large_bigint VALUES(99999999999999999999999999999999999998);

This query:

SELECT * FROM testdb.test1.t_large_bigint

will succeed in Snowflake but fail in the current connector implementation with the error:

Query 20240305_052805_00005_6tebh failed: Cannot convert value in the driver from type:FIXED(null,null) to type:Long, value=99999999999999999999999999999999999998.

A safer way is to set JDBC_TREAT_DECIMAL_AS_INT to false and always read values in as decimals.

The above is just one example; it applies to other data types as well, e.g., DOUBLE is used for all floating-point numbers, VARCHAR is used for all text types, etc. We will have to reference both the Snowflake docs (here and here) and the JDBC code as the sources of truth so that type-conversion correctness is ensured: https://github.com/snowflakedb/snowflake-jdbc/blob/d4ca9f0e99b142fa0c1a3dffe5cc7bde4fdfaf61/src/main/java/net/snowflake/client/jdbc/SnowflakeUtil.java#L205-L321

cc @ebyhr @dprophet @yuuteng
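A hedged sketch of the safer approach described above; Snowflake's JDBC driver accepts session parameters as connection properties, though how this connector actually wires them may differ:

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

// Hypothetical sketch: read fixed-point values as decimals instead of risking long overflow
static Connection openSnowflakeConnection(String jdbcUrl, String user, String password)
        throws Exception
{
    Properties properties = new Properties();
    properties.setProperty("user", user);
    properties.setProperty("password", password);
    properties.setProperty("JDBC_TREAT_DECIMAL_AS_INT", "false");
    return DriverManager.getConnection(jdbcUrl, properties);
}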

Contributor

I will try to fix it when I have time. If you can send me a PR directly, that would be helpful. 😄
@dprophet If you have time, you can also take a look at it.

Member

Thank you @yuuteng! Actually, I was working on co-authoring a new one, but it took longer than I expected. I just commented here in case you folks are working on updates in parallel. I'll see how much I can get done this week. By the way, please feel free to work on it on your side as well; I could help comment in that case.

Contributor

As you can see, I am still struggling with fixes for the tests and the CI problems. I will come back to this once I've cleaned them up. 😄

Contributor (Author)

@lxynov Let's not fix this right now. Connectors don't need to be perfect on day one. This PR has been going on for over a year. We can do supplemental PRs once this one is merged.

Member @ebyhr commented Mar 6, 2024

@dprophet Skipping this comment is fine. However, please file an issue at https://github.com/trinodb/trino/issues/new

Contributor

Done. #20977

@ebyhr (Member) commented Mar 5, 2024

@yuuteng Please fix the following failures:

Error:  Failures: 
Error:    TestSnowflakeConnectorTest>BaseConnectorTest.testCommentTable:3914 
expected: null
 but was: ""
Error:    TestSnowflakeConnectorTest>BaseConnectorTest.testDataMappingSmokeTest:5579->BaseConnectorTest.testDataMapping:5611->AbstractTestQueryFramework.assertQuery:350 Execution of 'actual' query 20240305_101440_01101_pf2sh failed: SELECT row_id FROM test_data_mapping_smoke_timestampas9yk3wb3e WHERE rand() = 42 OR value IS NULL
Error:    TestSnowflakeConnectorTest>BaseConnectorTest.testSelectInformationSchemaColumns:2107->AbstractTestQueryFramework.assertQuery:350 Execution of 'actual' query 20240305_100625_00287_pf2sh failed: SELECT table_schema FROM information_schema.columns WHERE table_schema = 'tpch' GROUP BY table_schema
Error:  Errors: 
Error:    TestSnowflakeConnectorTest>BaseConnectorTest.testCommentTableSpecialCharacter:5522->BaseConnectorTest.testCommentTableSpecialCharacter:5538->AbstractTestQueryFramework.assertUpdate:405->AbstractTestQueryFramework.assertUpdate:410 » QueryFailed This connector does not support setting table comments
Error:    TestSnowflakeTypeMapping.testTimestamp:306->testTimestamp:344 » QueryFailed Unsupported column type(93):timestampntz
[INFO] 
Error:  Tests run: 313, Failures: 3, Errors: 2, Skipped: 100

https://github.com/trinodb/trino/actions/runs/8154337829/job/22287613018?pr=20895

yuuteng and others added 2 commits March 6, 2024 11:34
* fixup! Add Snowflake JDBC Connector

* Fix tests

* Add the handle for timestamptz back to fix testDataMappingSmokeTest

---------

Co-authored-by: Yuya Ebihara <ebyhry@gmail.com>, Erik Anderson <eanders@pobox.com>
@ebyhr ebyhr merged commit 1cdf55d into trinodb:master Mar 7, 2024
96 of 98 checks passed
@github-actions github-actions bot added this to the 440 milestone Mar 7, 2024
@hongbo-miao

Wow, congratulations! It's finally merged! Big thanks to @yuuteng and @dprophet and the entire Trino team for all the hard work on this! 🎉 🥳 🎊 🥂

@bitsondatadev (Member)

❤️🎉🎊Congratulations team ❄️🐇!! Incredible work and persistence @dprophet, @yuuteng, @ebyhr, @martint, @mosabua, and @findepi.

Special thanks to Bloomberg and ForePaaS for dedicating a large portion of time to get this right. Thanks to Bloomberg for providing the integration testing infrastructure (Snowflake resources) to the Trino Software Foundation.

This massive undertaking wouldn't have been possible without you all!

We started investigating why the first two attempts to create a Snowflake connector failed. It turned out that each time, the discussion of how to do integration testing stalled the PR. After seeing this, we noticed @yuuteng had already done a significant amount of work on a third attempt to contribute the Snowflake connector in use at ForePaaS. Not long after I reached out to Teng to brainstorm a solution, @dprophet reached out to get my thoughts on how to go about contributing Bloomberg's connector. Great, now we had two connector implementations and still no solution.

During the first Trino Contributor Congregation, Erik and I brought up Bloomberg's desire to contribute a Snowflake connector and I articulated the testing issue. Erik said something to the effect of, "Oh I wish I would have known that's the problem, the solution is simple, Bloomberg will provide the TSF a Snowflake account." Done!

Shortly after Erik, Teng, and I discussed the best ways to merge their work, set up the Snowflake accounts for the TSF, and start the arduous process of building a thorough test suite with the help of Yuya, Piotr, Manfred, and Martin.

Despite all of the setbacks and infra issues, everyone persisted. We were almost sure this was going to land in late summer of 2023, so much so that the goal was to discuss the release during the July presentation of this connector!

What a great outcome and example of open source bringing positive sum value to the world of data and analytics. Y'all are incredible!

Federate them all!

❤️❤️

-Bits

@yuuteng (Contributor) commented Mar 7, 2024

I'm overjoyed to see the Snowflake connector merged today! 🎉 This journey started over two years ago, on December 22nd, 2021, and it wouldn't have been possible without the tireless efforts of everyone involved. ❄️🐇

@bitsondatadev, thank you for rescuing this project from limbo and revitalizing it, leading to the incredible collaboration between Bloomberg, the Trino community, and ForePaaS!

@ebyhr, thank you for your patience and guidance over the past two years, reviewing our code and answering our questions like a wise teacher. You've played a vital role in my growth!

@pgagnon, the first contributor who paved the way! I wouldn't know where to start without your work. Thank you for your initial contribution, and I hope you're proud of seeing it merged today!

@dprophet, my dear partner, it's been two amazing years working with you! We overcame time zones and different holidays while working in different countries and companies. Finally, our "baby" is merged! Thank you for making our work shine ✨. You're a fantastic partner!

@cploujoux and @drappier-charles, my managers, and all my colleagues who cared about this project. Thank you for giving me the opportunity to work on this challenging and rewarding task. Your support and advice have been invaluable!

To all the friends who offered help and reviewed our code: Thank you for your enthusiasm and dedication! Your contributions, big or small, have made a real difference. Although I can't list all your names here, your kindness and support are deeply appreciated.

P.S. Sorry for turning this commit message into a heartfelt essay. The past two years and three months have been an unforgettable journey. Thanks for reading this far, and see you all on the next adventure!

Best wishes,

❤️❤️

Teng (@yuuteng)
