Model inference speed is slower than training speed #3786
-
Hi, I want to create an EncoderDecoder model and use the encoder part later. The code design looks like this:

```python
import flax.linen as nn

class Encoder(nn.Module):
    ...

class Decoder(nn.Module):
    ...

class EncoderDecoder(nn.Module):
    @nn.compact
    def __call__(self, x):
        y = Encoder()(x)
        y = Decoder()(y)
        ...
        return y
```

The model state is saved using the checkpoint method described here. While training, the model runs much faster. But when I loaded the module, extracted the encoder part, and put it in another loop, I could only get about 6 it/s. This is how I used the encoder:

```python
for step, batch in tqdm(...):
    y = encoder.apply(encode_state["params"], batch, ...)
```

I checked the GPU monitor, and my GPU is busy, which suggests the model is running on the GPU.
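For reference, loading the checkpoint and pulling out the encoder sub-tree might look roughly like this (a sketch, assuming the `flax.training.checkpoints` API and Flax's default submodule naming, where the first unnamed `Encoder` instance is stored under `"Encoder_0"`; `ckpt_dir` and `encoder_variables` are illustrative names, with `encoder_variables` playing the role of `encode_state["params"]` in the loop above):

```python
from flax.training import checkpoints

# Restore the raw checkpoint as a nested dict
# (target=None returns the stored pytree as plain dicts).
restored = checkpoints.restore_checkpoint(ckpt_dir="./ckpts", target=None)

# Flax names unnamed submodules ClassName_<index>, so the encoder's
# parameters live under "Encoder_0" in the EncoderDecoder param tree.
encoder = Encoder()
encoder_variables = {"params": restored["params"]["Encoder_0"]}

# Later: y = encoder.apply(encoder_variables, batch)
```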
-
nvm... This post helped: huggingface/transformers#15581

So, I should `jit` the `encoder.apply` function and use the jitted version in the loop.
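Concretely, the fix looks something like this (an untested sketch; `batches` stands in for the data iterator, and `encoder`/`encode_state` are the objects from the question). Without `jax.jit`, each `encoder.apply` call dispatches its ops one by one instead of running a single compiled program, while the training step was presumably already jitted, which explains the speed gap:

```python
import jax

# Compile encoder.apply once, outside the loop, and reuse it.
encode = jax.jit(encoder.apply)

for step, batch in enumerate(batches):
    y = encode(encode_state["params"], batch)
```

Note that `jax.jit` recompiles whenever the input shapes change, so the speedup only holds if the batch shapes stay fixed across iterations.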