Running multiple jobs, one per TPU core? #16629

Answered by ayaka14732
rog77 asked this question in Q&A

Solution: https://twitter.com/ayaka14732/status/1589274652354162690

I am also writing a library for this: https://github.com/ayaka14732/llama-jax/blob/main/lib/proc_init_utils/initialisation.py

Usage (one script per job, each calling initialise_tpu with n_devices=1 and a distinct rank):

1.py

from lib.proc_init_utils import initialise_tpu; initialise_tpu('v4-16', n_devices=1, rank=0)

2.py

from lib.proc_init_utils import initialise_tpu; initialise_tpu('v4-16', n_devices=1, rank=1)

3.py

from lib.proc_init_utils import initialise_tpu; initialise_tpu('v4-16', n_devices=1, rank=2)

4.py

from lib.proc_init_utils import initialise_tpu; initialise_tpu('v4-16', n_devices=1, rank=3)
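
If maintaining four nearly identical files gets tedious, the same idea can probably be driven from one small launcher. The following is only a sketch under assumptions, not part of the library above: `job.py` is a hypothetical script that reads its rank from the command line and passes it to `initialise_tpu`, and `build_commands` and `launch_all` are names made up for this example. It uses only the Python standard library.

```python
import subprocess
import sys


def build_commands(n_jobs, script="job.py"):
    """Return one command line per rank: [python, script, rank].

    `script` is a hypothetical file name; it is assumed to start with
    something like:
        import sys
        from lib.proc_init_utils import initialise_tpu
        initialise_tpu('v4-16', n_devices=1, rank=int(sys.argv[1]))
    """
    return [[sys.executable, script, str(rank)] for rank in range(n_jobs)]


def launch_all(commands):
    """Start every command concurrently, then wait for all of them.

    Returns the exit codes in the same order as `commands`.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]


# Example (not executed here): run four jobs, ranks 0 to 3, in parallel.
#   exit_codes = launch_all(build_commands(4))
```

Each job still runs in its own process, which matches the four-script usage above; the launcher only removes the duplication and waits for all jobs to finish.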

Answer selected by rog77