Speed up histogram in plot options (alternative to #2763) #2771

Merged
merged 12 commits into from
May 8, 2024
3 changes: 3 additions & 0 deletions CHANGES.rst
@@ -127,6 +127,9 @@ Other Changes and Additions
 Bug Fixes
 ---------

+- Histogram in Plot Options now uses random sampling to better
+  represent the data without sacrificing performance. [#2771]
+
 Cubeviz
 ^^^^^^^

49 changes: 23 additions & 26 deletions jdaviz/configs/default/plugins/plot_options/plot_options.py
@@ -4,9 +4,8 @@
 import matplotlib
 import numpy as np

-from astropy.visualization import (
-    ManualInterval, ContrastBiasStretch, PercentileInterval
-)
+from astropy.visualization import ManualInterval, ContrastBiasStretch
+
 from echo import delay_callback
 from traitlets import Any, Dict, Float, Bool, Int, List, Unicode, observe

@@ -36,6 +35,8 @@

 __all__ = ['PlotOptions']

+RANDOM_SUBSET_SIZE = 10_000
+

 def _register_random_cmap(
     cmap_name,
@@ -973,13 +974,8 @@ def _update_stretch_histogram(self, msg={}):
                 x_max = x_limits.max()
                 y_min = max(y_limits.min(), 0)
                 y_max = y_limits.max()
-                arr = comp.data[y_min:y_max, x_min:x_max]
-                if self.config == "imviz":
-                    # Downsample input data to about 400px (as per compass.vue) for performance.
-                    xstep = max(1, round(arr.shape[1] / 400))
-                    ystep = max(1, round(arr.shape[0] / 400))
-                    arr = arr[::ystep, ::xstep]
-                sub_data = arr.ravel()
+
+                sub_data = comp.data[y_min:y_max, x_min:x_max]

             else:
                 # spectrum-2d-viewer, for example. We'll assume the viewer
@@ -996,28 +992,29 @@ def _update_stretch_histogram(self, msg={}):
                                 (y_data >= viewer.state.y_min) &
                                 (y_data <= viewer.state.y_max))

-                sub_data = comp.data[inds].ravel()
+                sub_data = comp.data[inds]

         else:
-            if self.config == "imviz":
-                # Downsample input data to about 400px (as per compass.vue) for performance.
-                xstep = max(1, round(data.shape[1] / 400))
-                ystep = max(1, round(data.shape[0] / 400))
-                arr = comp[::ystep, ::xstep]
-            else:
-                # include all data, regardless of zoom limits
-                arr = comp.data
-            sub_data = arr.ravel()
-
-        # filter out nans (or else bqplot will fail)
-        if np.any(np.isnan(sub_data)):
-            sub_data = sub_data[~np.isnan(sub_data)]
+            # include all data, regardless of zoom limits
+            sub_data = comp.data
+
+        self.stretch_histogram.viewer.state.random_subset = RANDOM_SUBSET_SIZE
         self.stretch_histogram._update_data('histogram', x=sub_data)

         if len(sub_data) > 0:
-            interval = PercentileInterval(95)
-            hist_lims = interval.get_limits(sub_data)
+
+            # Use glue to compute the statistics since this allows us to use
+            # a random subset of the data to compute the histogram.
+            # The 2.5 and 97.5 hardcoded here is equivalent to
+            # PercentileInterval(95).get_limits(sub_data)
+            glue_data = self.stretch_histogram.app.data_collection['histogram']
+            hist_lims = (
+                glue_data.compute_statistic('percentile', glue_data.id['x'],
+                                            percentile=2.5, random_subset=RANDOM_SUBSET_SIZE),
Contributor

Is hardcoding the percentile here OK, given that we also let users change the percentile in Plot Options? Would users care or notice anything?

Collaborator Author

I think I was just writing something equivalent to what was there before, but we can easily change it if you think users should be able to customise it.

Contributor

I don't have a strong opinion here. I always think of this histogram as a quick look that should not be taken too seriously, but @camipacifici or @kecnry might have different opinions, so let's see if they comment.

Contributor

@camipacifici, Apr 11, 2024

I do not know what this specific line does exactly, so not sure how to comment, sorry.

Member

I think this is something we'll want to deal with eventually, since it isn't always ideal, but this was hardcoded before, so I think it's reasonable to keep the same assumption for now.

+                glue_data.compute_statistic('percentile', glue_data.id['x'],
+                                            percentile=97.5, random_subset=RANDOM_SUBSET_SIZE)
+            )

             # set the stepsize for vmin/vmax to be approximately 1% of the range of the
             # histogram (within the percentile interval), rounded to 1-2 significant digits
             # to avoid random step sizes. This logic is somewhat arbitrary and can be safely
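The code comment in this hunk notes that the hardcoded 2.5 and 97.5 percentiles are equivalent to `PercentileInterval(95)`: a central 95% interval leaves 2.5% in each tail. The following is a minimal standalone sketch of that idea, using only the Python standard library rather than glue or astropy. The helpers `percentile` and `central_limits` are hypothetical names introduced for illustration and are not part of jdaviz or glue; only `RANDOM_SUBSET_SIZE` mirrors the constant added in this PR.

```python
import random

RANDOM_SUBSET_SIZE = 10_000  # same value as the constant added in this PR


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty sequence."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def central_limits(values, coverage=95.0, subset_size=None, rng=None):
    """Return (lower, upper) bounds of the central `coverage` percent of values.

    If `subset_size` is given and the data are larger than that, the bounds
    are estimated from a random subset instead of the full data.
    """
    if subset_size is not None and len(values) > subset_size:
        rng = rng or random.Random()
        values = rng.sample(values, subset_size)
    tail = (100.0 - coverage) / 2.0  # 2.5 when coverage is 95
    return percentile(values, tail), percentile(values, 100.0 - tail)


# Illustrative data: 200,000 draws from a standard normal distribution.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(200_000)]

full = central_limits(data)
approx = central_limits(data, subset_size=RANDOM_SUBSET_SIZE, rng=rng)
```

For well-behaved data the subset limits in `approx` should sit close to the full-data limits in `full`, while only a fixed number of values is sorted regardless of how large the image is. That trade-off is what the review thread above describes as acceptable for a quick-look histogram.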
8 changes: 4 additions & 4 deletions jdaviz/core/template_mixin.py
@@ -4631,13 +4631,13 @@ def _update_data(self, label, reset_lims=False, **kwargs):
         data = self.app.data_collection[label]

         # if not provided, fallback on existing data
-        length_mismatch = False
+        shape_mismatch = False
         for component in self._viewer_components:
             kwargs.setdefault(component, data[component])
-            if len(kwargs[component]) != len(data[component]):
-                length_mismatch = True
+            if np.asarray(kwargs[component]).shape != data[component].shape:
+                shape_mismatch = True

-        if not length_mismatch:
+        if not shape_mismatch:
             # then we can update the existing entry
             components = {c.label: c for c in data.components}
             data.update_components({components[comp]: kwargs[comp]
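Because `sub_data` is no longer flattened with `.ravel()` before being passed to `_update_data`, the incoming component can be multi-dimensional, and `len()` only compares the first axis. The sketch below illustrates why a shape comparison is the safer check. It uses plain nested lists instead of NumPy arrays so that it runs without dependencies; `shape_of` is a hypothetical helper standing in for the `.shape` attribute used in the actual change.

```python
def shape_of(nested):
    """Return the shape of a rectangular nested list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        if not nested:
            break
        nested = nested[0]
    return tuple(shape)


existing = [[0.0] * 6 for _ in range(4)]  # 4 rows x 6 columns
incoming = [[0.0] * 3 for _ in range(4)]  # 4 rows x 3 columns

# The old check only sees the number of rows, so it misses the change.
length_mismatch = len(incoming) != len(existing)

# The new check compares every axis.
shape_mismatch = shape_of(incoming) != shape_of(existing)
```

Here `length_mismatch` is `False` while `shape_mismatch` is `True`, which is the case the old length-based check would have treated as a safe in-place update.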
7 changes: 6 additions & 1 deletion jdaviz/core/tests/test_tools.py
@@ -74,6 +74,11 @@ def test_stretch_bounds(imviz_helper):


 def test_stretch_bounds_and_spline(imviz_helper):
+
+    # As the histogram randomly samples the array, we should make sure the
+    # values used here are reproducible
+    np.random.seed(42)
Contributor

Hmm, is there no way to localize it more within the histogram algorithm itself?

Maybe something like po._obj.stretch_histogram.set_seed(...) or something?

Collaborator Author

I don't think we actually want to do that, right? It's more just for testing? In practice it's probably safer not to constrain the random numbers?

Contributor

Don't you also want to set the seed for testing on the glue side?

Collaborator Author

Yes but I can always do that in the relevant glue tests if needed. I think there shouldn't be a seed set at runtime.
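
One way to reconcile the two positions in this thread (reproducible tests, no seed at runtime) is to seed only for the duration of a test and restore the generator state afterwards. The sketch below shows that pattern with the standard-library `random` module. `temporary_seed` and `sample_subset` are hypothetical names for illustration and are not part of jdaviz or glue. The test in this PR seeds NumPy's legacy global generator instead, for which `np.random.get_state` and `np.random.set_state` would play the same role.

```python
import contextlib
import random


@contextlib.contextmanager
def temporary_seed(seed):
    """Seed the global `random` generator, then restore its previous state on exit."""
    saved_state = random.getstate()
    random.seed(seed)
    try:
        yield
    finally:
        random.setstate(saved_state)


def sample_subset(values, size):
    """Hypothetical stand-in for code that draws from the global generator."""
    return random.sample(values, size)


values = list(range(1_000))

# Two runs under the same temporary seed draw the same subset...
with temporary_seed(42):
    first = sample_subset(values, 10)
with temporary_seed(42):
    second = sample_subset(values, 10)

# ...while code outside the `with` blocks is unaffected by the seed.
```

With this approach the seed stays local to the test body, so other tests and any runtime code keep drawing unconstrained random numbers.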


     image_1 = NDData(make_4gaussians_image(), unit=u.nJy)
     imviz_helper.load_data(image_1)
     po = imviz_helper.plugins["Plot Options"]
@@ -93,7 +98,7 @@ def test_stretch_bounds_and_spline(imviz_helper):

     knots_after_drag_move = (
         [0.0, 0.1, 0.21712585033417825, 0.7, 1.0],
-        [0.0, 0.05, 0.2900993441358025, 0.9, 1.0],
+        [0.0, 0.05, 0.2852214046563617, 0.9, 1.0],
     )

     stretch_tool.on_mouse_event(knot_move_msg)
4 changes: 2 additions & 2 deletions pyproject.toml
@@ -12,8 +12,8 @@ dependencies = [
     "traitlets>=5.0.5",
     "bqplot>=0.12.37",
     "bqplot-image-gl>=1.4.11",
-    "glue-core>=1.18.0",
-    "glue-jupyter>=0.20",
+    "glue-core>=1.20.0",
+    "glue-jupyter>=0.21.0",
     "echo>=0.5.0",
     "ipykernel>=6.19.4",
     "ipyvue>=1.6",