-
I think this discussion is relevant:
-
Hi @totor31 -- thanks for reaching out, and sorry for the late reply!
It's up to you -- I've seen projects where the full model returns
I agree -- do you mind filing a GitHub issue so that we can track this? IIUC you're asking for an assertion that the shapes of variables are the same as what's defined in the call to
Indeed it's a bit annoying to do this kind of "top-level variable dict surgery" when manipulating params and other state separately.
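For illustration, here is a minimal sketch of that kind of surgery: split one top-level collection off the variables dict, transform it, and merge it back. It assumes the variables are a plain nested `dict`; the helper names `split_collections`, `map_leaves` and `merge_collections` are hypothetical (not part of Flax), and plain floats stand in for arrays.

```python
from typing import Any, Callable, Dict, Tuple

Tree = Dict[str, Any]


def split_collections(variables: Tree, name: str = "params") -> Tuple[Tree, Tree]:
    """Split a variables dict into (the named collection, all other collections)."""
    selected = variables.get(name, {})
    rest = {key: value for key, value in variables.items() if key != name}
    return selected, rest


def map_leaves(fn: Callable[[Any], Any], tree: Any) -> Any:
    """Apply fn to every non-dict leaf and return a new tree with the same keys."""
    if isinstance(tree, dict):
        return {key: map_leaves(fn, value) for key, value in tree.items()}
    return fn(tree)


def merge_collections(selected: Tree, rest: Tree, name: str = "params") -> Tree:
    """Rebuild the top-level variables dict from the two pieces."""
    return {name: selected, **rest}


# Toy variables dict: floats are stand-ins for real parameter arrays.
variables = {
    "params": {"Conv_0": {"kernel": 2.0, "bias": 0.5}},
    "batch_stats": {"BatchNorm_0": {"mean": 0.1, "var": 1.0}},
}

params, state = split_collections(variables)
scaled = map_leaves(lambda x: x * 0.5, params)  # only the params are touched
updated = merge_collections(scaled, state)
```

With real arrays, `jax.tree_util.tree_map` can play the role of `map_leaves`; the split and merge steps stay the same.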
-
Hello, I don't know if this is the right place for these comments, but they may be of some use to your team.
First, I am having a lot of fun playing with the new linen interface of Flax. Congrats to the team for your work, this is huge.
Everything is crystal clear; it is good to be able to know exactly what is going on, step by step.
As a side note, I often train networks for production purposes, and I am really anxious about how much big frameworks like Keras and Torch hide underneath. I really need to interact with the code at many levels in order to control what is going on during training, inference, debugging and unit testing, hence my interest in your work.
From the architecture point of view, I am interested in two big things:
Following many discussions, I was able to implement a toy example quite easily for the first case, i.e. siamese network instantiation and transfer. Here is the example: https://colab.research.google.com/drive/1rplADiwAoDI8byx_foHE-Eoq2e_6ur4y?usp=sharing
While playing with the example, it raised a couple of questions:
- Can a module's `__call__` return multiple outputs, such as a `tuple` `x1, x2` or a dict of outputs? Are there any constraints or best practices? Should it be avoided?
- While transferring weights from one network to another, I was able to overwrite the module definition (`n_filters` of `Conv`) without raising any error. I think it might be in the Flax philosophy to have sanity checks when initializing a module with params, in order to avoid silently overwriting the module definition with init parameters...
- I have mixed feelings about the fact that the `keys` of the nested dict of params can represent two different things: the `name` of a block of parameters in the lower levels of the tree, or the `type` of parameters in the first nodes (for instance `batch_stats` vs `params`). I love the fact that it is simple to handle, but in the end I feel like I will have to write a couple of utility functions to handle the structure (such as splitting into N different trees, applying a function, and merging back; I actually had to write something like this in the transfer example). Maybe there is something to figure out here, or maybe the current state is so easy to understand that it is the best way of doing things.
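As a rough sketch of the sanity check asked for in the second question, the snippet below compares a transferred parameter tree against the tree the target module would expect, and reports every key or shape mismatch. It is an assumption-laden illustration, not Flax's behaviour: `shape_mismatches` and `leaf` are hypothetical helpers, the trees are plain nested dicts, leaves only need a `.shape` attribute, and the shapes are made up.

```python
from types import SimpleNamespace
from typing import Any, List


def shape_mismatches(expected: Any, loaded: Any, path: str = "") -> List[str]:
    """Return a description of every key or shape mismatch between two trees."""
    if isinstance(expected, dict) != isinstance(loaded, dict):
        return [f"{path or '<root>'}: one side is a dict, the other is a leaf"]
    problems: List[str] = []
    if isinstance(expected, dict):
        for key in sorted(set(expected) | set(loaded)):
            child = f"{path}/{key}" if path else key
            if key not in expected:
                problems.append(f"{child}: unexpected key")
            elif key not in loaded:
                problems.append(f"{child}: missing key")
            else:
                problems.extend(shape_mismatches(expected[key], loaded[key], child))
        return problems
    # Both sides are leaves: compare their shapes.
    if tuple(expected.shape) != tuple(loaded.shape):
        problems.append(
            f"{path}: expected shape {tuple(expected.shape)}, got {tuple(loaded.shape)}"
        )
    return problems


def leaf(*shape: int) -> SimpleNamespace:
    """Stand-in for an array: an object that only carries a shape."""
    return SimpleNamespace(shape=shape)


# Hypothetical case: the target module is defined with 16 filters,
# while the transferred params come from a module with 32 filters.
expected = {"Conv_0": {"kernel": leaf(3, 3, 1, 16), "bias": leaf(16)}}
loaded = {"Conv_0": {"kernel": leaf(3, 3, 1, 32), "bias": leaf(32)}}

problems = shape_mismatches(expected, loaded)
for line in problems:
    print(line)
```

In practice the `expected` tree could come from a fresh initialization of the target module, so that a transfer which changes the effective `n_filters` raises an error instead of passing silently.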