Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sequentials _maybe_rebuild does not make sense #19916

Closed
DLumi opened this issue Jun 25, 2024 · 7 comments
Closed

Sequentials _maybe_rebuild does not make sense #19916

DLumi opened this issue Jun 25, 2024 · 7 comments

Comments

@DLumi
Copy link

DLumi commented Jun 25, 2024

If we check the code for build it has a failsafe for cases when the first layer of the model is not an Input layer:

Here's the snippet from the source code

        if isinstance(self._layers[0], InputLayer):
            ...
        else:
            dtype = self._layers[0].compute_dtype
            self._layers = [
                InputLayer(batch_shape=input_shape, dtype=dtype)
            ] + self._layers

Yet the _maybe_rebuild's code ignores rebuilding entirely if input layer is missing:

    def _maybe_rebuild(self):
        self.built = False
        self._functional = None
        if isinstance(self._layers[0], InputLayer) and len(self._layers) > 1:
            input_shape = self._layers[0].batch_shape
            self.build(input_shape)

Due to that the rebuild=True argument in add() is basically ignored entirely.
The only option here is to call build() explicitly after you added all the layers.
Am I missing something? Is this the intended behavior? If so, it would be great to expand documentation on that.

@mehtamansi29 mehtamansi29 added the keras-team-review-pending Pending review by a Keras team member. label Jul 15, 2024
@mattdangerw
Copy link
Member

@DLumi is the issue that certain layers won't get rebuilt without the InputLayer on the model?

If you have a colab showing the actual issue this causes, that would help us!

@DLumi
Copy link
Author

DLumi commented Jul 18, 2024

@mattdangerw yes, you got it. The model will not be rebuilt on .add() if the first layer of Sequential subclass is not InputLayer.
It's not a super huge deal, and more of unnecessary inconvenience which also happens to be a bit obscure.
I could make a colab, but there's nothing to show, really. It's just a custom Sequential subclass and some layers added in __init__

@mattdangerw
Copy link
Member

mattdangerw commented Jul 19, 2024

This might be expected.

Thinking about the simple case.

class CustomSequential(...):
    __init__():
        ...
        self.add(Dense(10))
        self.add(Dense(10))
        self.add(Dense(10))

In this case, we have no idea what the input shape is. So we can't build yet.

I think this is how it's supposed to work.

  • When add() is called and the input shape is known via InputLayer or input_shape we can build.
  • When add() is called and the input shape is unknown we cannot build.
  • After build() is called, the first layer will be guaranteed to be an InputLayer. So subsequent call to add() after build() will always keep the model built.

But if we really don't know the input shape yet -- the model does not define it, the model has not been called or built -- then it is correct for maybe rebuild to just skip.

If this still seems like a bug, a colab might help understand the use case better.

@mattdangerw mattdangerw added stat:awaiting response from contributor and removed keras-team-review-pending Pending review by a Keras team member. labels Jul 19, 2024
@DLumi
Copy link
Author

DLumi commented Jul 19, 2024

@mattdangerw I thought that might be expected, but then I've noticed that there's a failsafe for that that implicitly inserts InputLayer if it's not provided by user. So if build ensures InputLayer as a first layer regardless of what there actually is, it makes no sense that _maybe_rebuild restricts the model from actually calling build.

That's why for me it's not exactly a bug, rather an inconsistent behavior. And to address consistency, I'd lean into one or the other. So either prompt the user to explicitly provide an InputLayer as the model cannot be built without it, or keep it implicit but remove restriction in _maybe_rebuild

@mattdangerw
Copy link
Member

mattdangerw commented Jul 19, 2024

@DLumi there is a reason for that difference actually. If you look at the top of the build() method, it will early return without doing anything if no shape is supplied. Calling build() without an input shape in _maybe_rebuild() would do absolutely nothing.

In add() and anything downstream, we don't know the real shape of the input, which is needed to build the model. Unless the user has specified it with InputLayer or input_shape.

Build can be triggered a few ways.

sequential.build(real_input_shape)
# or
sequential(real_input)

In both those cases we now know the shape that was previously unknown. And it will be propagated to the build call. So that's the answer to the first question of why can't we just always rebuild in add().

@mattdangerw
Copy link
Member

mattdangerw commented Jul 19, 2024

As for the second question. Why can't we just demand an InputLayer is first in the model?

Honestly I suspect we would if we could. That's a fairly simple approach, and means the model can always be built "eagerly" much like we do with functional models.

However, a lot of code out in the wild depends on the fact that you can make a sequential with an unknown shape, which will be built the first time it is called. We don't want to ship a change that breaking.

@mattdangerw
Copy link
Member

I think this is all works as intended. But feel free to reopen if above explanations seem incorrect!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants