Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature request] Force AI to use existing tags (instead of creating them) #612

Merged
merged 3 commits into from
Nov 24, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
43 changes: 42 additions & 1 deletion apps/workers/openaiWorker.ts
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
import { and, Column, eq, inArray, sql } from "drizzle-orm";
import { DequeuedJob, Runner } from "liteque";
import { buildImpersonatingTRPCClient } from "trpc";
import { z } from "zod";

import type { InferenceClient } from "@hoarder/shared/inference";
Expand Down Expand Up @@ -200,7 +201,47 @@ async function fetchCustomPrompts(
},
});

return prompts.map((p) => p.text);
let promptTexts = prompts.map((p) => p.text);
if (containsTagsPlaceholder(prompts)) {
promptTexts = await replaceTagsPlaceholders(promptTexts, userId);
}

return promptTexts;
}

async function replaceTagsPlaceholders(
prompts: string[],
userId: string,
): Promise<string[]> {
const api = await buildImpersonatingTRPCClient(userId);
const tags = (await api.tags.list()).tags;
const tagsString = `[${tags.map((tag) => tag.name).join(",")}]`;
const aiTagsString = `[${tags
.filter((tag) => tag.numBookmarksByAttachedType.human ?? 0 == 0)
.map((tag) => tag.name)
.join(",")}]`;
const userTagsString = `[${tags
.filter((tag) => tag.numBookmarksByAttachedType.human ?? 0 > 0)
.map((tag) => tag.name)
.join(",")}]`;

return prompts.map((p) =>
p
.replaceAll("$tags", tagsString)
.replaceAll("$aiTags", aiTagsString)
.replaceAll("$userTags", userTagsString),
);
}

function containsTagsPlaceholder(prompts: { text: string }[]): boolean {
return (
prompts.filter(
(p) =>
p.text.includes("$tags") ||
p.text.includes("$aiTags") ||
p.text.includes("$userTags"),
).length > 0
);
}

async function inferTagsFromPDF(
Expand Down
5 changes: 5 additions & 0 deletions docs/docs/03-configuration.md
Original file line number Diff line number Diff line change
Expand Up @@ -60,6 +60,11 @@ Either `OPENAI_API_KEY` or `OLLAMA_BASE_URL` need to be set for automatic taggin
| INFERENCE_LANG | No | english | The language in which the tags will be generated. |
| INFERENCE_JOB_TIMEOUT_SEC | No | 30 | How long to wait for the inference job to finish before timing out. If you're running ollama without powerful GPUs, you might want to increase the timeout a bit. |

:::info
- You can append additional instructions to the prompt used for automatic tagging, in the `AI Settings` (in the `User Settings` screen)
- You can use the placeholders `$tags`, `$aiTags`, `$userTags` in the prompt. These placeholders will be replaced with all tags, ai generated tags or human created tags when automatic tagging is performed (e.g. `[hoarder, computer, ai]`)
:::

## Crawler Configs

| Name | Required | Default | Description |
Expand Down
Loading