feat(csharp/src/Drivers/Apache/Spark): low latency test cases #1948

Draft
wants to merge 2 commits into main

Conversation

@jadewang-db (Contributor)

No description provided.

github-actions bot added this to the ADBC Libraries 13 milestone on Jun 27, 2024
CurtHagenlocher changed the title from "low latency test cases" to "feat(csharp/src/Drivers/Apache/Spark): low latency test cases" on Jul 18, 2024
@CurtHagenlocher (Contributor)

I'm sorry; I seem to have missed this. It looks like there might be some imminent merge conflicts with #2014 (attn @birschick-bq).

{
TExecuteStatementReq executeRequest = new TExecuteStatementReq(this.connection.sessionHandle, this.SqlQuery);
SetStatementProperties(executeRequest);
executeRequest.GetDirectResults = new TSparkGetDirectResults(BatchSizeDefault);
Contributor:

Set it to the "current" value of BatchSize, in case it's been updated through configuration or programmatically.

Suggested change
executeRequest.GetDirectResults = new TSparkGetDirectResults(BatchSizeDefault);
executeRequest.GetDirectResults = new TSparkGetDirectResults(BatchSize);

ThriftHttpTransport transport = new ThriftHttpTransport(httpClient, config);
_httpClient.BaseAddress = new UriBuilder(Uri.UriSchemeHttps, hostName, -1, path).Uri;
_httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", token);
_httpClient.DefaultRequestHeaders.UserAgent.ParseAdd("SimbaSparkJDBCDriver/2.06.15 Python/PyHive");
Contributor:

I'd update the UserAgent value and leave a comment as to why it needs to be this particular constant. Ideally, the ADBC driver agent name could also be added to your whitelist.
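
For illustration, the suggestion could be sketched like this; the factory class name is hypothetical, and the allow-listing rationale in the comments simply restates the review note above:

```csharp
using System.Net.Http;

internal static class SparkHttpClientFactory // hypothetical name for this sketch
{
    // The server currently allow-lists this specific agent string, so changing it
    // would break connectivity; keep the explanation next to the constant.
    private const string UserAgentValue = "SimbaSparkJDBCDriver/2.06.15 Python/PyHive";

    public static HttpClient CreateHttpClient()
    {
        HttpClient httpClient = new HttpClient();
        // TODO: switch to an ADBC-specific agent name once it has been allow-listed.
        httpClient.DefaultRequestHeaders.UserAgent.ParseAdd(UserAgentValue);
        return httpClient;
    }
}
```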

private static readonly char[] Characters = "0123456789abcdef".ToCharArray();
private static readonly Random Random = new Random();

public string Generate()
Contributor:

Needs a more descriptive name

Suggested change
public string Generate()
public string GenerateTraceId()

long count = 0;
long target = 80;
Queue<String> log = new Queue<string>();
string FileName = $@"c:\temp\hmsmeta\{DateTime.Now.ToString("yyyyMMdd_HHmm")}__{target}.log";
Contributor:

I would add a logFolder property to the configuration file properties; the hard-coded path is not likely to be portable or desirable.
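
For illustration, the path construction could be sketched as below, assuming a hypothetical LogFolder property on the test configuration, with the system temp directory as a fallback:

```csharp
// Hypothetical: read the log folder from configuration instead of hard-coding c:\temp,
// and fall back to the system temp directory so the test stays portable.
// Requires: using System; using System.IO;
string logFolder = TestConfiguration.LogFolder ?? Path.GetTempPath();
string fileName = Path.Combine(logFolder, $"{DateTime.Now:yyyyMMdd_HHmm}__{target}.log");
```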

Comment on lines +308 to +311
for (int i = 0; i < 32; i++)
{
result.Append(Characters[Random.Next(Characters.Length)]);
}
Contributor:

This could be extracted to a function with the length as the parameter and then reused for lines 315-318.
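
A sketch of such a helper, reusing the Characters and Random fields shown in the earlier diff; the method name is a placeholder:

```csharp
// Requires: using System.Text;
// Generates a random lowercase-hex string of the requested length, so the
// 32-character and 16-character loops can share one implementation.
private static string GenerateHexString(int length)
{
    StringBuilder result = new StringBuilder(length);
    for (int i = 0; i < length; i++)
    {
        result.Append(Characters[Random.Next(Characters.Length)]);
    }
    return result.ToString();
}
```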

@@ -32,6 +32,17 @@ internal SparkStatement(SparkConnection connection)
: base(connection)
{
}
public override QueryResult ExecuteQuery()
Contributor:

nit: add an empty line between the closing brace above and ExecuteQuery().

@@ -32,6 +32,17 @@ internal SparkStatement(SparkConnection connection)
: base(connection)
{
}
public override QueryResult ExecuteQuery()
Contributor:

Should this override code be applied to ExecuteQueryAsync, too?

Comment on lines +37 to +42
var conn = (connection as SparkConnection);

if(conn != null)
{
conn.ResetTraceId();
}
Contributor:

More concise?

Suggested change
if (connection is SparkConnection sparkConnection) sparkConnection.ResetTraceId();

statement.SqlQuery = TestConfiguration.Query;

QueryResult queryResult = statement.ExecuteQuery();
for (int i = 0; i < target; i++)
Contributor:

This test could be refactored to accept any test "kernel" (a lambda) and wrap it, so that we could capture performance information in a log file. What do you think?
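
For illustration, such a harness could be sketched as below; the class and method names are placeholders and the log format is an assumption:

```csharp
using System;
using System.Diagnostics;
using System.IO;

internal static class PerfHarness // hypothetical name
{
    // Runs the supplied test kernel the requested number of times and appends
    // one line of timing information per iteration to the given log file.
    public static void RunWithTiming(Action kernel, int iterations, string logPath)
    {
        using StreamWriter log = File.AppendText(logPath);
        for (int i = 0; i < iterations; i++)
        {
            Stopwatch stopwatch = Stopwatch.StartNew();
            kernel();
            stopwatch.Stop();
            log.WriteLine($"{DateTime.Now:O}\titeration={i}\telapsedMs={stopwatch.ElapsedMilliseconds}");
        }
    }
}
```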

Comment on lines +536 to +540
using AdbcStatement statement = adbcConnection.CreateStatement();
statement.SqlQuery = TestConfiguration.Query;

QueryResult queryResult = statement.ExecuteQuery();
Tests.DriverTests.CanExecuteQuery(queryResult, TestConfiguration.ExpectedResultsCount);
Contributor:

This could be the lambda fed into the performance logger.
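
Continuing the hypothetical harness sketch above, the quoted block would become the kernel; adbcConnection and TestConfiguration come from the existing test, while the iteration count and log path are placeholders:

```csharp
// Requires: using System.IO; (for Path)
PerfHarness.RunWithTiming(() =>
{
    using AdbcStatement statement = adbcConnection.CreateStatement();
    statement.SqlQuery = TestConfiguration.Query;

    QueryResult queryResult = statement.ExecuteQuery();
    Tests.DriverTests.CanExecuteQuery(queryResult, TestConfiguration.ExpectedResultsCount);
}, iterations: 80, logPath: Path.Combine(Path.GetTempPath(), "spark-low-latency.log"));
```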

@jadewang-db (Contributor, Author)

Sorry, I didn't mean to ask for a review right away, since this is still at an initial stage; I'm not sure how the reviewers got added.

@birschick-bq (Contributor)

You can set it to be a Draft PR

jadewang-db marked this pull request as draft on July 19, 2024 at 19:24.