Prediction outputs differ for different batch sizes [BUG] #158

Open · melisande-c opened this issue Jun 21, 2024 · 0 comments
Labels: bug (Something isn't working)

melisande-c (Member) commented:
Describe the bug
If different batch sizes are used, the predictions are subtly different. This might be caused by PyTorch itself. In the example below I turned off tta_transforms to make sure test-time augmentation wasn't the cause.
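
A quick way to check whether this is a PyTorch-level effect is to compare forward passes at different batch sizes outside CAREamics entirely; the tiny conv model here is just an illustrative stand-in (an assumption), not the actual CAREamics network:

import torch

torch.manual_seed(0)

# Illustrative stand-in for the network; any conv layer will do.
model = torch.nn.Conv2d(1, 8, kernel_size=3, padding=1)
model.eval()

x = torch.rand(2, 1, 8, 8)  # two "tiles"

with torch.no_grad():
    # Forward the two samples one at a time (batch size 1) ...
    out_bs1 = torch.cat([model(x[i : i + 1]) for i in range(2)], dim=0)
    # ... and together in a single batch (batch size 2).
    out_bs2 = model(x)

# Depending on the backend and kernel selection this may be non-zero,
# which would point at the batched forward pass rather than CAREamics.
print((out_bs1 - out_bs2).abs().max())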

To Reproduce
Code snippet to reproduce the behaviour:

import numpy as np
import matplotlib.pyplot as plt

from careamics import CAREamist
from careamics.config import create_n2v_configuration

config = create_n2v_configuration(
    experiment_name="PredBatchingTest",
    data_type="array",
    axes="SYX",
    patch_size=[8, 8],
    batch_size=1,
    num_epochs=1,
    n_channels=1,
)
image = np.random.random((1, 32, 32))

engine = CAREamist(source=config)
engine.train(train_source=image)
pred1 = engine.predict(
    source=image,
    batch_size=1,
    tile_size=(8, 8),
    tile_overlap=(2, 2),
    tta_transforms=False,
)
pred2 = engine.predict(
    source=image,
    batch_size=2,
    tile_size=(8, 8),
    tile_overlap=(2, 2),
    tta_transforms=False,
)
plt.imshow(abs(pred2-pred1))
plt.colorbar()

This produces the image:
[Image: plot of abs(pred2 - pred1), non-zero everywhere except the last tile]

As you can see, the last tile produces the same output; this is because, for that tile, the batch size is effectively 1 in both predictions (it is the only tile left in the final batch).
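
For reference, the discrepancy can also be quantified rather than just visualised; a minimal sketch, assuming pred1 and pred2 are NumPy arrays of the same shape as returned above, with an illustrative tolerance:

import numpy as np

# Maximum absolute deviation between the two predictions.
diff = np.abs(np.asarray(pred1) - np.asarray(pred2))
print("max abs difference:", diff.max())

# Check whether they agree within the ~0.001 magnitude seen in practice.
print("equal within atol=1e-3:", np.allclose(pred1, pred2, atol=1e-3))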

Additional context
I added the test below at one point, but it is currently skipped.

@pytest.mark.skip(
    reason=(
        "This might be a problem at the PyTorch level during `forward`. Values up to "
        "0.001 different."
    )
)
def test_batched_prediction(tmp_path: Path, minimum_configuration: dict):
    """Compare outputs when a batch size of 1 or 2 is used."""
    tile_size = (16, 16)
    tile_overlap = (4, 4)
    shape = (32, 32)
    train_array = random_array(shape)

    # create configuration
    config = Configuration(**minimum_configuration)
    config.training_config.num_epochs = 1
    config.data_config.axes = "YX"
    config.data_config.batch_size = 2
    config.data_config.data_type = SupportedData.ARRAY.value

    # instantiate CAREamist
    careamist = CAREamist(source=config, work_dir=tmp_path)

    # train CAREamist
    careamist.train(train_source=train_array)

    # predict with batch size 1 and batch size 2
    pred_bs_1 = careamist.predict(
        train_array, batch_size=1, tile_size=tile_size, tile_overlap=tile_overlap
    )
    pred_bs_2 = careamist.predict(
        train_array, batch_size=2, tile_size=tile_size, tile_overlap=tile_overlap
    )

    assert np.array_equal(pred_bs_1, pred_bs_2)
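
If this turns out to be an unavoidable floating-point effect of batched kernels rather than something fixable in CAREamics, one option (just a sketch, not a decision) would be to un-skip the test with a tolerance-based comparison instead of exact equality; the atol value mirrors the ~0.001 from the skip reason:

# Possible replacement for the exact-equality assertion above.
np.testing.assert_allclose(pred_bs_1, pred_bs_2, atol=1e-3)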

melisande-c added the bug (Something isn't working) label on Jun 21, 2024
melisande-c self-assigned this on Jun 21, 2024