Add document filter to retrieval query to improve results on large sets #126

Merged 1 commit on Mar 7, 2024
@@ -157,7 +157,7 @@
COMMIT;";

command.Parameters.AddWithValue("@index", index);
command.Parameters.AddWithValue("@key", record.Id);

Check warning on line 160 in src/KernelMemory.MemoryStorage.SqlServer/SqlServerMemory.cs

GitHub Actions / build

In externally visible method 'Task SqlServerMemory.DeleteAsync(string index, MemoryRecord record, CancellationToken cancellationToken = default(CancellationToken))', validate parameter 'record' is non-null before using it. If appropriate, throw an 'ArgumentNullException' when the argument is 'null'. (https://learn.microsoft.com/dotnet/fundamentals/code-analysis/quality-rules/ca1062)

await command.ExecuteNonQueryAsync(cancellationToken).ConfigureAwait(false);

@@ -306,6 +306,8 @@

using (SqlCommand command = connection.CreateCommand())
{
+ var generatedFilters = GenerateFilters(index, command.Parameters, filters);

command.CommandText = $@"
WITH
[embedding] as
@@ -331,6 +333,10 @@
[embedding]
INNER JOIN
{this.GetFullTableName($"{this._config.EmbeddingsTableName}_{index}")} ON [embedding].vector_value_id = {this.GetFullTableName($"{this._config.EmbeddingsTableName}_{index}")}.vector_value_id
+ INNER JOIN
+ {this.GetFullTableName(this._config.MemoryTableName)} ON {this.GetFullTableName($"{this._config.EmbeddingsTableName}_{index}")}.[memory_id] = {this.GetFullTableName(this._config.MemoryTableName)}.[id]
+ WHERE 1=1
+ {generatedFilters}
GROUP BY
{this.GetFullTableName($"{this._config.EmbeddingsTableName}_{index}")}.[memory_id]
ORDER BY
@@ -348,7 +354,7 @@
{this.GetFullTableName(this._config.MemoryTableName)} ON [similarity].[memory_id] = {this.GetFullTableName(this._config.MemoryTableName)}.[id]
WHERE 1=1
AND [cosine_similarity] >= @min_relevance_score
- {GenerateFilters(index, command.Parameters, filters)}
+ {generatedFilters}
ORDER BY [cosine_similarity] desc";

command.Parameters.AddWithValue("@vector", JsonSerializer.Serialize(embedding.Data.ToArray()));
@@ -445,7 +451,7 @@
COMMIT;";

command.Parameters.AddWithValue("@index", index);
command.Parameters.AddWithValue("@key", record.Id);

Check warning on line 454 in src/KernelMemory.MemoryStorage.SqlServer/SqlServerMemory.cs

GitHub Actions / build

In externally visible method 'Task<string> SqlServerMemory.UpsertAsync(string index, MemoryRecord record, CancellationToken cancellationToken = default(CancellationToken))', validate parameter 'record' is non-null before using it. If appropriate, throw an 'ArgumentNullException' when the argument is 'null'. (https://learn.microsoft.com/dotnet/fundamentals/code-analysis/quality-rules/ca1062)
command.Parameters.AddWithValue("@payload", JsonSerializer.Serialize(record.Payload) ?? (object)DBNull.Value);
command.Parameters.AddWithValue("@tags", JsonSerializer.Serialize(record.Tags) ?? (object)DBNull.Value);
command.Parameters.AddWithValue("@embedding", JsonSerializer.Serialize(record.Vector.Data.ToArray()));
@@ -590,7 +596,7 @@
return filterBuilder.ToString();
}

private async Task<MemoryRecord> ReadEntryAsync(SqlDataReader dataReader, bool withEmbedding, CancellationToken cancellationToken = default)

Check warning on line 599 in src/KernelMemory.MemoryStorage.SqlServer/SqlServerMemory.cs

GitHub Actions / build

Member 'ReadEntryAsync' does not access instance data and can be marked as static (https://learn.microsoft.com/dotnet/fundamentals/code-analysis/quality-rules/ca1822)
{
var entry = new MemoryRecord();

@@ -625,7 +631,7 @@
index = Constants.DefaultIndex;
}

index = s_replaceIndexNameCharsRegex.Replace(index.Trim().ToLowerInvariant(), ValidSeparator);

Check warning on line 634 in src/KernelMemory.MemoryStorage.SqlServer/SqlServerMemory.cs

GitHub Actions / build

In method 'NormalizeIndexName', replace the call to 'ToLowerInvariant' with 'ToUpperInvariant' (https://learn.microsoft.com/dotnet/fundamentals/code-analysis/quality-rules/ca1308)


return index;