Data Parallel - Managing State Variables #3264

peterdavidfagan · 2023-08-07T16:22:37Z


Hi Flax Community,

I am starting to adapt my Flax training pipeline to run across multiple devices, and I've been reading the following guide to do so. I have some clarifying questions, which I'm posting here in case the answers are also useful to other users.

Managing State Variables
For layers that maintain variables such as nn.BatchNorm (API docs), is it sufficient to pass the logical axis name to the axis_name parameter for state variables to be correctly tracked when training across multiple devices? The API docs for BatchNorm reference pmap but don't mention partitioning with jit. Does this parameter also apply to partitioning with jit as outlined in the guide? I presume yes, but I haven't delved into the codebase for jit yet to verify.
