Hello,

I have been using custom_partitioning quite a lot and it works very well. But sometimes I want to define a per-shard implementation for something I can write in pure JAX (I don't want to rewrite the differentiation rule). For example:
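A minimal sketch of such a custom_partitioning definition, assuming an illustrative elementwise my_fn (the partition rules are equally illustrative); the per-shard lowering is where a CUDA/C++ kernel would typically be called:

```python
import jax
from jax.experimental.custom_partitioning import custom_partitioning

@custom_partitioning
def my_fn(x):
    # Global (unpartitioned) semantics, used e.g. for abstract evaluation.
    return x + 1

def partition(mesh, arg_shapes, result_shape):
    arg_shardings = jax.tree_util.tree_map(lambda s: s.sharding, arg_shapes)
    result_sharding = result_shape.sharding

    def per_shard_fn(x):
        # Runs on each shard; this is where a CUDA/C++ primitive slots in.
        return x + 1

    return mesh, per_shard_fn, result_sharding, arg_shardings

def infer_sharding_from_operands(mesh, arg_shapes, result_shape):
    # Output keeps the sharding of the (single) operand.
    return arg_shapes[0].sharding

my_fn.def_partition(
    infer_sharding_from_operands=infer_sharding_from_operands,
    partition=partition,
)
```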
This is very convenient if I have a CUDA or C++ primitive that I want to use in my custom-partitioned function. But for something in pure JAX I have to do:
```python
from functools import partial

import jax
from jax.experimental import mesh_utils
from jax.experimental.shard_map import shard_map
from jax.sharding import Mesh, PartitionSpec as P

devices = mesh_utils.create_device_mesh((2, 2))
mesh = Mesh(devices, axis_names=('a', 'b'))
sharding = jax.sharding.NamedSharding(mesh, P('a', 'b'))

@partial(shard_map, mesh=mesh,
         in_specs=(P('a', 'b'), P('a', 'b')),
         out_specs=P('a', 'b'))
def my_sharded_fn(x, y):
    # do something with x and y
    return x + y

out = my_sharded_fn(x, y)
```
This is not as convenient as the first example.
My workaround is to define a custom partitioning rule with a pure JAX implementation, but then I have to rewrite the differentiation rule by hand.
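Concretely, that means pairing the custom-partitioned function with jax.custom_vjp, roughly like this sketch (reusing the illustrative my_fn from above; the hand-written VJP is exactly the rule JAX would otherwise derive for free):

```python
import jax

@jax.custom_vjp
def my_op(x):
    return my_fn(x)  # the custom-partitioned forward pass

def my_op_fwd(x):
    # No residuals are needed for the illustrative x + 1.
    return my_fn(x), None

def my_op_bwd(_, g):
    # Hand-written differentiation rule: d(x + 1)/dx = 1.
    return (g,)

my_op.defvjp(my_op_fwd, my_op_bwd)
```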
Does anyone know a better way to do this?
Can we somehow use the context mesh with shard_map?