model parallelism #243
I would like to ask when the models will support parallel inference.
Answered by zhuohan123 on Jun 25, 2023
Thanks for the question. All our models already support tensor-parallel execution. For example, if you have 2 GPUs, you can pass the argument `--tensor-parallel-size 2` or `-tp 2`. We will add documentation on distributed execution (#206).
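For reference, here is a minimal sketch of the equivalent setting in vLLM's offline Python API, where the `tensor_parallel_size` argument mirrors the `--tensor-parallel-size` CLI flag (the model name is only an example):

```python
from vllm import LLM, SamplingParams

# Shard the model across 2 GPUs via tensor parallelism.
# tensor_parallel_size mirrors the --tensor-parallel-size CLI flag.
llm = LLM(model="facebook/opt-13b", tensor_parallel_size=2)  # example model

sampling_params = SamplingParams(temperature=0.8, max_tokens=128)
outputs = llm.generate(["Hello, my name is"], sampling_params)
for output in outputs:
    print(output.outputs[0].text)
```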