Sequentials `_maybe_rebuild` does not make sense #19916

DLumi · 2024-06-25T14:26:40Z

If we check the code for build it has a failsafe for cases when the first layer of the model is not an Input layer:

Here's the snippet from the source code

        if isinstance(self._layers[0], InputLayer):
            ...
        else:
            dtype = self._layers[0].compute_dtype
            self._layers = [
                InputLayer(batch_shape=input_shape, dtype=dtype)
            ] + self._layers

Yet the _maybe_rebuild's code ignores rebuilding entirely if input layer is missing:

    def _maybe_rebuild(self):
        self.built = False
        self._functional = None
        if isinstance(self._layers[0], InputLayer) and len(self._layers) > 1:
            input_shape = self._layers[0].batch_shape
            self.build(input_shape)

Due to that the rebuild=True argument in add() is basically ignored entirely.
The only option here is to call build() explicitly after you added all the layers.
Am I missing something? Is this the intended behavior? If so, it would be great to expand documentation on that.

The text was updated successfully, but these errors were encountered:

mattdangerw · 2024-07-18T17:16:40Z

@DLumi is the issue that certain layers won't get rebuilt without the InputLayer on the model?

If you have a colab showing the actual issue this causes, that would help us!

DLumi · 2024-07-18T17:35:23Z

@mattdangerw yes, you got it. The model will not be rebuilt on .add() if the first layer of Sequential subclass is not InputLayer.
It's not a super huge deal, and more of unnecessary inconvenience which also happens to be a bit obscure.
I could make a colab, but there's nothing to show, really. It's just a custom Sequential subclass and some layers added in __init__

mattdangerw · 2024-07-19T01:54:32Z

This might be expected.

Thinking about the simple case.

class CustomSequential(...):
    __init__():
        ...
        self.add(Dense(10))
        self.add(Dense(10))
        self.add(Dense(10))

In this case, we have no idea what the input shape is. So we can't build yet.

I think this is how it's supposed to work.

When add() is called and the input shape is known via InputLayer or input_shape we can build.
When add() is called and the input shape is unknown we cannot build.
After build() is called, the first layer will be guaranteed to be an InputLayer. So subsequent call to add() after build() will always keep the model built.

But if we really don't know the input shape yet -- the model does not define it, the model has not been called or built -- then it is correct for maybe rebuild to just skip.

If this still seems like a bug, a colab might help understand the use case better.

DLumi · 2024-07-19T06:49:43Z

@mattdangerw I thought that might be expected, but then I've noticed that there's a failsafe for that that implicitly inserts InputLayer if it's not provided by user. So if build ensures InputLayer as a first layer regardless of what there actually is, it makes no sense that _maybe_rebuild restricts the model from actually calling build.

That's why for me it's not exactly a bug, rather an inconsistent behavior. And to address consistency, I'd lean into one or the other. So either prompt the user to explicitly provide an InputLayer as the model cannot be built without it, or keep it implicit but remove restriction in _maybe_rebuild

mattdangerw · 2024-07-19T17:18:54Z

@DLumi there is a reason for that difference actually. If you look at the top of the build() method, it will early return without doing anything if no shape is supplied. Calling build() without an input shape in _maybe_rebuild() would do absolutely nothing.

In add() and anything downstream, we don't know the real shape of the input, which is needed to build the model. Unless the user has specified it with InputLayer or input_shape.

Build can be triggered a few ways.

sequential.build(real_input_shape)
# or
sequential(real_input)

In both those cases we now know the shape that was previously unknown. And it will be propagated to the build call. So that's the answer to the first question of why can't we just always rebuild in add().

mattdangerw · 2024-07-19T17:21:05Z

As for the second question. Why can't we just demand an InputLayer is first in the model?

Honestly I suspect we would if we could. That's a fairly simple approach, and means the model can always be built "eagerly" much like we do with functional models.

However, a lot of code out in the wild depends on the fact that you can make a sequential with an unknown shape, which will be built the first time it is called. We don't want to ship a change that breaking.

mattdangerw · 2024-07-23T21:57:20Z

I think this is all works as intended. But feel free to reopen if above explanations seem incorrect!

github-actions bot assigned mehtamansi29 Jun 25, 2024

mehtamansi29 added the type:Bug label Jul 4, 2024

mehtamansi29 added the keras-team-review-pending Pending review by a Keras team member. label Jul 15, 2024

sachinprasadhs added the stat:awaiting response from contributor label Jul 18, 2024

google-ml-butler bot removed the stat:awaiting response from contributor label Jul 18, 2024

mattdangerw added stat:awaiting response from contributor and removed keras-team-review-pending Pending review by a Keras team member. labels Jul 19, 2024

google-ml-butler bot removed the stat:awaiting response from contributor label Jul 19, 2024

mattdangerw added stat:awaiting response from contributor and removed type:Bug labels Jul 23, 2024

mattdangerw closed this as completed Jul 23, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Sequentials `_maybe_rebuild` does not make sense #19916

Sequentials `_maybe_rebuild` does not make sense #19916

DLumi commented Jun 25, 2024

mattdangerw commented Jul 18, 2024

DLumi commented Jul 18, 2024 •

edited

Loading

mattdangerw commented Jul 19, 2024 •

edited

Loading

DLumi commented Jul 19, 2024

mattdangerw commented Jul 19, 2024 •

edited

Loading

mattdangerw commented Jul 19, 2024 •

edited

Loading

mattdangerw commented Jul 23, 2024

Sequentials _maybe_rebuild does not make sense #19916

Sequentials _maybe_rebuild does not make sense #19916

Comments

DLumi commented Jun 25, 2024

mattdangerw commented Jul 18, 2024

DLumi commented Jul 18, 2024 • edited Loading

mattdangerw commented Jul 19, 2024 • edited Loading

DLumi commented Jul 19, 2024

mattdangerw commented Jul 19, 2024 • edited Loading

mattdangerw commented Jul 19, 2024 • edited Loading

mattdangerw commented Jul 23, 2024

Sequentials `_maybe_rebuild` does not make sense #19916

Sequentials `_maybe_rebuild` does not make sense #19916

DLumi commented Jul 18, 2024 •

edited

Loading

mattdangerw commented Jul 19, 2024 •

edited

Loading

mattdangerw commented Jul 19, 2024 •

edited

Loading

mattdangerw commented Jul 19, 2024 •

edited

Loading