AI runtime configuration scheme #373
nikos-livathinos started this conversation in Ideas
Replies: 2 comments 1 reply
In terms of resolution priority, we should do (from higher priority to lower):
To streamline this config mgmt part, let's try to leverage Pydantic Settings as much as possible, as it already covers many of the pain points.
Objective
Introduce the logic to do the resolution of input parameters and decide the AI runtime configuration based on:
Proposal
A. Introduce input parameters in the `docling` package:
- […] `PipelineOptions` to introduce the fields: `device`, `num_threads`.
- `device` parameter can be an `Enum` with values: `[AUTO, CUDA, CPU, MPS]`. The default value is `AUTO`.
- `num_threads` parameter is an integer. The default value is 4.
- […] `[DOCLING_DEVICE, DOCLING_NUM_THREADS]` or `[DOCLING_DEVICE, OMP_NUM_THREADS]`.
- […] `device` and `num_threads`.

B. Introduce configuration resolution logic as a utility function in the `docling-ibm-models` package:
- […] `[CUDA, MPS, CPU]`.
- 5a. If `device == AUTO`, use the first available device according to the precedence.
- 5b. If `device` is explicitly set by the user but is not available in the system, replace it with the next available device.

C. Usage:
- […] `pipeline_options`.
- […] `docling-ibm-models` is used to resolve the runtime configuration.
- […] `docling-ibm-models` package or by the "models" inside the `docling` package.
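To make the resolution rules in part B concrete, here is a minimal standard-library sketch. All names (`RuntimeConfig`, `resolve_device`, `resolve_runtime`) are hypothetical, and the set of available devices is passed in rather than probed. One assumption to flag: "next available device" in 5b is read here as the first available device that comes after the requested one in the `[CUDA, MPS, CPU]` precedence; the proposal could also be read as restarting from the top of the list.

```python
from dataclasses import dataclass
from enum import Enum
from typing import AbstractSet


class Device(str, Enum):
    AUTO = "auto"
    CUDA = "cuda"
    CPU = "cpu"
    MPS = "mps"


# Precedence from the proposal: CUDA first, then MPS, then CPU.
PRECEDENCE = (Device.CUDA, Device.MPS, Device.CPU)


@dataclass(frozen=True)
class RuntimeConfig:
    """Hypothetical container for the resolved runtime configuration."""

    device: Device
    num_threads: int


def resolve_device(requested: Device, available: AbstractSet[Device]) -> Device:
    """Pick a concrete device given the request and what the system offers."""
    if requested is Device.AUTO:
        # 5a: walk the whole precedence list.
        candidates = PRECEDENCE
    else:
        # 5b (assumed reading): start at the requested device and fall
        # through to the devices that follow it in the precedence list.
        candidates = PRECEDENCE[PRECEDENCE.index(requested):]
    for device in candidates:
        if device in available:
            return device
    # Assumption: CPU is always usable as a last resort.
    return Device.CPU


def resolve_runtime(
    device: Device = Device.AUTO,
    num_threads: int = 4,
    available: AbstractSet[Device] = frozenset({Device.CPU}),
) -> RuntimeConfig:
    """Combine the requested options into a resolved configuration."""
    return RuntimeConfig(resolve_device(device, available), num_threads)
```

In a real implementation the `available` set would presumably be built from runtime checks such as `torch.cuda.is_available()` and `torch.backends.mps.is_available()`; it is kept as a plain argument here so the logic can be tested without a GPU.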