diff --git a/fern/docs/pages/manual/ingestion.mdx b/fern/docs/pages/manual/ingestion.mdx
index ece3690138..009dcf8bb1 100644
--- a/fern/docs/pages/manual/ingestion.mdx
+++ b/fern/docs/pages/manual/ingestion.mdx
@@ -33,16 +33,20 @@ Are you running out of memory when ingesting files?
 
 To do not run out of memory, you should ingest your documents without the LLM loaded in your (video) memory.
 To do so, you should change your configuration to set `llm.mode: mock`.
-In other words, you should update your `settings.yaml` (or your custom configuration file) to set the
-following **before** ingesting your documents:
+You can also use the existing `PGPT_PROFILES=mock` profile, which will set the following configuration for you:
+
 ```yaml
 llm:
   mode: mock
+embedding:
+  mode: local
 ```
 
+This configuration allows you to use hardware acceleration for creating embeddings while avoiding loading the full LLM into (video) memory.
+
 Once your documents are ingested, you can set the `llm.mode` value back to `local` (or your previous custom value).
 
-You can also use the existing `PGPT_PROFILES=mock` that will set the `llm.mode` to `mock` for you.
+
 
 ## Supported file formats
 
diff --git a/settings-mock.yaml b/settings-mock.yaml
index ab16fae49b..8f9c01f7c7 100644
--- a/settings-mock.yaml
+++ b/settings-mock.yaml
@@ -1,5 +1,8 @@
 server:
   env_name: ${APP_ENV:mock}
 
+# This configuration allows you to use GPU for creating embeddings while avoiding loading the LLM into vRAM
 llm:
   mode: mock
+embedding:
+  mode: local
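
For reference, with this change applied, `settings-mock.yaml` would read roughly as follows (a sketch: the surrounding lines and inline comments are reconstructed from the diff context, not taken from the full file):

```yaml
server:
  env_name: ${APP_ENV:mock}

# Use GPU acceleration for creating embeddings while avoiding loading the LLM into vRAM
llm:
  mode: mock      # the LLM is mocked, so no model weights occupy (video) memory
embedding:
  mode: local     # embeddings are still computed locally during ingestion
```

Running ingestion with `PGPT_PROFILES=mock` should apply this profile on top of the base `settings.yaml`, so only the `llm` and `embedding` sections shown here are overridden; once documents are ingested, dropping the profile restores the previous `llm.mode`.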