Add regiondata #2442

Merged
merged 20 commits into from
Oct 23, 2023
Changes from 5 commits
Commits
eabe4e3
Add ExtendedComponent and RegionData classes
jfoster17 Sep 13, 2023
5d7d0ee
Add action to convert subset of regions to subset over coordinates
jfoster17 Sep 13, 2023
26c062a
Change name for subset conversion action
jfoster17 Sep 13, 2023
a44a27e
Codestyle fixes
jfoster17 Sep 13, 2023
9db0dd2
Fix documentation links
jfoster17 Sep 14, 2023
7404528
Add extended type to data get_kind
jfoster17 Sep 18, 2023
8cead7a
Merge branch 'glue-viz:main' into add-regiondata
jfoster17 Sep 19, 2023
4ffc1c8
Move RegionData to own file and motivate use
jfoster17 Sep 22, 2023
7f6b846
Add convenience funcs for checking validity and transforming regions
jfoster17 Sep 22, 2023
60da317
Improve and test the validation and transformation logic
jfoster17 Sep 22, 2023
7bcd9dd
Make linked_to_center_comp work with MultiLinks
jfoster17 Sep 22, 2023
585469c
Update to work for MultiLink links
jfoster17 Sep 25, 2023
fbc88bb
Consolidate to a single function for 1D/2D transformation function
jfoster17 Sep 25, 2023
e22e6bc
Change transform signature to work with shapely transform
jfoster17 Sep 25, 2023
3d63e11
Codestyle fix
jfoster17 Sep 25, 2023
288f713
Move test file to parallel location
jfoster17 Sep 25, 2023
0005024
Save ComponentID uuids to allow region_data save/restore
jfoster17 Sep 26, 2023
08fc900
Add documentation for why we use RegionData
jfoster17 Sep 26, 2023
395602a
Fix naming for _save_shapely_geometry
jfoster17 Sep 26, 2023
75f8bff
Merge remote-tracking branch 'refs/remotes/origin/add-regiondata' int…
jfoster17 Sep 26, 2023
1 change: 1 addition & 0 deletions doc/conf.py
@@ -53,6 +53,7 @@
"astropy": ("https://docs.astropy.org/en/stable/", None),
"echo": ("https://echo.readthedocs.io/en/latest/", None),
"pandas": ("https://pandas.pydata.org/pandas-docs/stable/", None),
"shapely": ("https://shapely.readthedocs.io/en/stable/", None),
}

# -- Options for HTML output -------------------------------------------------
105 changes: 104 additions & 1 deletion glue/core/component.py
@@ -2,6 +2,7 @@

import numpy as np
import pandas as pd
import shapely

from glue.core.coordinate_helpers import dependent_axes, pixel2world_single_axis
from glue.utils import shape_to_string, coerce_numeric, categorical_ndarray
@@ -13,7 +14,7 @@
DASK_INSTALLED = False

__all__ = ['Component', 'DerivedComponent', 'CategoricalComponent',
-'CoordinateComponent', 'DateTimeComponent']
+'CoordinateComponent', 'DateTimeComponent', 'ExtendedComponent']


class Component(object):
@@ -107,6 +108,13 @@
"""
return False

@property
def extended(self):
"""
Whether or not the datatype represents an extended region
"""
return False

def __str__(self):
return "%s with shape %s" % (self.__class__.__name__, shape_to_string(self.shape))

@@ -549,3 +557,98 @@
@property
def datetime(self):
return False


class ExtendedComponent(Component):
"""
A data component representing an extent or a region.

This component can be used when a dataset describes regions or ranges
and is typically used with a :class:`~glue.core.data.RegionData` object.
For example, a :class:`~glue.core.data.RegionData` object might provide
properties of geographic regions, and the boundaries of these regions
would be an ExtendedComponent.

Data loaders are required to know how to convert regions to a list
of Shapely objects which can be used to initialize an ExtendedComponent.

A circular region can be represented as:

>>> circle = shapely.Point(x, y).buffer(rad)

A range in one dimension can be represented as:

>>> range = shapely.LineString([[x0,0],[x1,0]])

(This is a bit of an odd representation, since we are forced to specify a y
coordinate for this line. We adopt a convention of y == 0.)

ExtendedComponents are NOT used directly in linking. Instead, ExtendedComponents
always have corresponding ComponentIDs that represent the x (and y) coordinates
over which the regions are defined. If not specified otherwise, a
:class:`~glue.core.data.RegionData` object will create `representative points`
for each region, representing a point near the center of the region that is
guaranteed to be inside the region.

NOTE: this implementation does not support regions in more than
two dimensions. (Shapely has limited support for 3D shapes, but not more.)

Parameters
----------
data : list of ``shapely.Geometry`` objects
The data to store.
center_comp_ids : list of :class:`glue.core.component_id.ComponentID` objects
The ComponentIDs of the `center` of the extended region. These do not
have to be the literal center of the region, but they must be in the x (and y)
coordinates of the regions. These ComponentIDs are used in the linking
framework to allow an ExtendedComponent to be linked to other components.
units : `str`, optional
Unit description.

Attributes
----------
x : ComponentID
The ComponentID of the x coordinate at the center of the extended region.
y : ComponentID
The ComponentID of the y coordinate at the center of the extended region.

Raises
------
TypeError
If data is not a list of ``shapely.Geometry`` objects
ValueError
If center_comp_ids is not a list of length 1 or 2
"""
    def __init__(self, data, center_comp_ids, units=None):
        if not all(isinstance(s, shapely.Geometry) for s in data):
            raise TypeError(
                "Input data for an ExtendedComponent should be a list of shapely.Geometry objects"
            )
        if len(center_comp_ids) == 2:
            self.x = center_comp_ids[0]
            self.y = center_comp_ids[1]
        elif len(center_comp_ids) == 1:
            self.x = center_comp_ids[0]
            self.y = None
        else:
            raise ValueError(
                "ExtendedComponent must be initialized with one or two ComponentIDs"
            )
        self.units = units
        self._data = data

    @property
    def extended(self):
        return True

    @property
    def numeric(self):
        return False

    @property
    def datetime(self):
        return False

    @property
    def categorical(self):
        return False
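The way the constructor treats `center_comp_ids` can be sketched without glue or Shapely. This is a minimal stand-in rather than the library code: `split_center_ids` is a hypothetical helper that exists only for this illustration, and plain strings take the place of real ComponentID objects.

```python
def split_center_ids(center_comp_ids):
    """Return (x, y) center IDs; y is None for a 1D range.

    Hypothetical helper that mirrors the length check in
    ExtendedComponent.__init__. Strings stand in for ComponentIDs.
    """
    if len(center_comp_ids) == 2:
        return center_comp_ids[0], center_comp_ids[1]
    if len(center_comp_ids) == 1:
        return center_comp_ids[0], None
    raise ValueError(
        "ExtendedComponent must be initialized with one or two ComponentIDs"
    )


# 2D regions carry both an x and a y center; 1D ranges carry only x.
x2d, y2d = split_center_ids(["lon", "lat"])
x1d, y1d = split_center_ids(["wavelength"])
```

Under this reading, a 1D range leaves `y` as `None`, so any code that consumes the component has to handle a missing y coordinate.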
171 changes: 169 additions & 2 deletions glue/core/data.py
@@ -7,6 +7,7 @@

import numpy as np
import pandas as pd
import shapely

from fast_histogram import histogram1d, histogram2d

@@ -36,7 +37,7 @@
# Note: leave all the following imports for component and component_id since
# they are here for backward-compatibility (the code used to live in this
# file)
-from glue.core.component import Component, CoordinateComponent, DerivedComponent
+from glue.core.component import Component, CoordinateComponent, DerivedComponent, ExtendedComponent
from glue.core.component_id import ComponentID, ComponentIDDict, PixelComponentID

try:
@@ -45,7 +46,7 @@
except ImportError:
DASK_INSTALLED = False

-__all__ = ['Data', 'BaseCartesianData', 'BaseData']
+__all__ = ['Data', 'BaseCartesianData', 'BaseData', 'RegionData']


class BaseData(object, metaclass=abc.ABCMeta):
@@ -2057,3 +2058,169 @@
if 1 <= ndim <= 3:
label += " [{0}]".format('xyz'[ndim - 1 - i])
return label


class RegionData(Data):
"""
A glue Data object for storing data that is associated with a region.

This object can be used when a dataset describes 2D regions or 1D ranges. It
contains exactly one :class:`~glue.core.component.ExtendedComponent` object
which contains the boundaries of the regions, and must also contain
one or two components that give the center of the regions in whatever data
coordinates the regions are described in. Links in glue are not made
directly on the :class:`~glue.core.component.ExtendedComponent`, but instead
on the center components.

Thus, a subset that includes the center of a region will include that region,
but a subset that includes just a little part of the region will not include
that region. These center components are not the same as the pixel components. For
example, a dataset that is a table of 2D regions will have a single
:class:`~glue.core.component.CoordinateComponent`, but must have two of these center
components.

A typical use case for this object is to store the properties of geographic
regions, where the boundaries of the regions are stored in an
:class:`~glue.core.component.ExtendedComponent` and the centers of the
regions are stored in two components, one for the longitude and one for the
latitude. Additional components may describe arbitrary properties of these
geographic regions (e.g. population, area, etc).


Parameters
----------
label : `str`, optional
The label of the data.
coords : :class:`~glue.core.coordinates.Coordinates`, optional
The coordinates associated with the data.
**kwargs
All other keyword arguments are passed to the :class:`~glue.core.data.Data`
constructor.

Attributes
----------
extended_component_id : :class:`~glue.core.component_id.ComponentID`
The ID of the :class:`~glue.core.component.ExtendedComponent` object
that contains the boundaries of the regions.
center_x_id : :class:`~glue.core.component_id.ComponentID`
The ID of the Component object that contains the x-coordinate of the
center of the regions. This is actually stored in the component
with the extended_component_id, but it is convenient to have it here.
center_y_id : :class:`~glue.core.component_id.ComponentID`
The ID of the Component object that contains the y-coordinate of the
center of the regions. This is actually stored in the component
with the extended_component_id, but it is convenient to have it here.

Examples
--------

There are two main options for initializing a :class:`~glue.core.data.RegionData`
object. The first is to simply pass in a list of ``shapely.Geometry`` objects
with dimensionality N, from which we will create N+1 components: one
:class:`~glue.core.component.ExtendedComponent` with the boundaries, and N
regular Component(s) with the center coordinates computed from the Shapely
method :meth:`~shapely.GeometryCollection.representative_point`:

>>> geometries = [shapely.geometry.Point(0, 0).buffer(1), shapely.geometry.Point(1, 1).buffer(1)]
>>> my_region_data = RegionData(label='My Regions', boundary=geometries)

This will create a :class:`~glue.core.data.RegionData` object with three
components: one :class:`~glue.core.component.ExtendedComponent` with label
"boundary" and two regular Components with labels "Center [x] for boundary"
and "Center [y] for boundary".

The second is to explicitly create an :class:`~glue.core.component.ExtendedComponent`
(which requires passing in the ComponentIDs for the center coordinates) and
then use `add_component` to add this component to a :class:`~glue.core.data.RegionData`
object. You might use this approach if your dataset already contains points that
represent the centers of your regions and you want to avoid re-calculating them. For example:

>>> center_x = [0, 1]
>>> center_y = [0, 1]
>>> geometries = [shapely.geometry.Point(0, 0).buffer(1), shapely.geometry.Point(1, 1).buffer(1)]

>>> my_region_data = RegionData(label='My Regions')
>>> # ComponentIDs are created and returned when we add a Component to a Data object
>>> cen_x_id = my_region_data.add_component(center_x, label='Center [x]')
>>> cen_y_id = my_region_data.add_component(center_y, label='Center [y]')
>>> extended_comp = ExtendedComponent(geometries, center_comp_ids=[cen_x_id, cen_y_id])
>>> my_region_data.add_component(extended_comp, label='boundaries')

"""

    def __init__(self, label="", coords=None, **kwargs):
        self._extended_component_id = None
        self._center_x_id = None
        self._center_y_id = None
        # __init__ calls add_component which deals with ExtendedComponent logic
        super().__init__(label=label, coords=coords, **kwargs)

    def __repr__(self):
        return f'RegionData (label: {self.label} | extended_component: {self.extended_component_id})'

    @property
    def center_x_id(self):
        return self.get_component(self.extended_component_id).x

    @property
    def center_y_id(self):
        return self.get_component(self.extended_component_id).y

    @property
    def extended_component_id(self):
        return self._extended_component_id

@contract(component='component_like', label='cid_like')
def add_component(self, component, label):
""" Add a new component to this data set, allowing only one :class:`~glue.core.component.ExtendedComponent`

If component is an array of Shapely objects then we use
:meth:`~shapely.GeometryCollection.representative_point` to
create two new components for the center coordinates of the regions and
add them to the :class:`~glue.core.data.RegionData` object as well.

If component is an :class:`~glue.core.component.ExtendedComponent`,
then we simply add it to the :class:`~glue.core.data.RegionData` object.

We do this here instead of extending ``Component.autotyped`` because
we only want to use :class:`~glue.core.component.ExtendedComponent` objects
in the context of a :class:`~glue.core.data.RegionData` object.

Parameters
----------
component : :class:`~glue.core.component.Component` or array-like
Object to add. If this is an array of Shapely objects, then we
create two new components for the center coordinates of the regions
as well.
label : `str` or :class:`~glue.core.component_id.ComponentID`
The label. If this is a string, a new
:class:`glue.core.component_id.ComponentID`
with this label will be created and associated with the Component.

Raises
------
`ValueError`, if the :class:`~glue.core.data.RegionData` already has an extended component
"""

        if not isinstance(component, Component):
            if all(isinstance(s, shapely.Geometry) for s in component):
                center_x = []
                center_y = []
                for s in component:
                    rep = s.representative_point()
                    center_x.append(rep.x)
                    center_y.append(rep.y)
                cen_x_id = super().add_component(np.asarray(center_x), f"Center [x] for {label}")
                cen_y_id = super().add_component(np.asarray(center_y), f"Center [y] for {label}")
                ext_component = ExtendedComponent(np.asarray(component), center_comp_ids=[cen_x_id, cen_y_id])
                self._extended_component_id = super().add_component(ext_component, label)
                return self._extended_component_id

        if isinstance(component, ExtendedComponent):
            if self.extended_component_id is not None:
                raise ValueError(f"Cannot add another ExtendedComponent; existing extended component: {self.extended_component_id}")
            else:
                self._extended_component_id = super().add_component(component, label)
                return self._extended_component_id
        else:
            return super().add_component(component, label)
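The RegionData docstring says that a subset selects a region only when it contains the region's center. The sketch below illustrates that rule with plain Python, using no glue or Shapely. The helper `centers_in_box` and the rectangular selection are assumptions made for this example and are not part of the PR.

```python
def centers_in_box(centers, box):
    """Return indices of regions whose center lies inside box.

    centers: list of (x, y) tuples, one per region.
    box: (xmin, ymin, xmax, ymax) selection rectangle.
    Hypothetical helper, used only to illustrate center-based selection.
    """
    xmin, ymin, xmax, ymax = box
    return [i for i, (x, y) in enumerate(centers)
            if xmin <= x <= xmax and ymin <= y <= ymax]


# Two unit-radius circular regions, described here only by their centers.
centers = [(0.0, 0.0), (3.0, 0.0)]

# The box covers the first center and clips the edge of the second circle
# (which spans x in [2, 4]), so only the first region is selected.
selected = centers_in_box(centers, (-0.5, -0.5, 2.5, 0.5))
```

The second region overlaps the selection but is left out because its center falls outside it. This matches the docstring's statement that links are made on the center components rather than on the boundaries.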
41 changes: 41 additions & 0 deletions glue/core/regions.py
@@ -0,0 +1,41 @@
"""
Functions to support data that defines regions
"""
import numpy as np

from glue.core.roi import PolygonalROI
from glue.core.data import RegionData

from glue.config import layer_action
from glue.core.subset import RoiSubsetState, MultiOrState


def reg_to_roi(reg):
    if reg.geom_type == "Polygon":
        ext_coords = np.array(reg.exterior.coords.xy)
        roi = PolygonalROI(vx=ext_coords[0], vy=ext_coords[1])  # Need to account for interior rings
        return roi


@layer_action(label='Subset of regions -> Subset over region extent', single=True, subset=True)
def layer_to_subset(layer, data_collection):
    """
    This should be limited to the case where subset.Data is RegionData
    and/or return a warning when applied to some other kind of data.
    """
    if isinstance(layer.data, RegionData):

        extended_comp = layer.data.extended_component_id
        regions = layer[extended_comp]
        list_of_rois = [reg_to_roi(region) for region in regions]

        roisubstates = [RoiSubsetState(layer.data.center_x_id,
                                       layer.data.center_y_id,
                                       roi=roi
                                       )
                        for roi in list_of_rois]
        if len(list_of_rois) > 1:
            composite_substate = MultiOrState(roisubstates)
        else:
            composite_substate = roisubstates[0]
        _ = data_collection.new_subset_group(subset_state=composite_substate)
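The idea behind `reg_to_roi` and `layer_to_subset` is to turn each region's exterior ring into a polygon test and then OR the tests together. A rough sketch of that idea in plain Python follows. The helper names are hypothetical, the point-in-polygon test is a generic even-odd ray cast rather than glue's `PolygonalROI` implementation, and interior rings are ignored, as they are in the code in this PR.

```python
def exterior_to_vertices(exterior):
    """Split a list of (x, y) exterior points into vx and vy lists."""
    vx = [p[0] for p in exterior]
    vy = [p[1] for p in exterior]
    return vx, vy


def point_in_polygon(x, y, vx, vy):
    """Even-odd ray-casting test; interior rings (holes) are ignored."""
    inside = False
    n = len(vx)
    for i in range(n):
        j = (i - 1) % n
        # Only edges that straddle the horizontal line through y can cross.
        if (vy[i] > y) != (vy[j] > y):
            x_cross = vx[i] + (y - vy[i]) * (vx[j] - vx[i]) / (vy[j] - vy[i])
            if x < x_cross:
                inside = not inside
    return inside


def in_any_region(x, y, exteriors):
    """OR the per-region tests together, in the spirit of MultiOrState."""
    return any(point_in_polygon(x, y, *exterior_to_vertices(e))
               for e in exteriors)


# Closed rings (first point repeated at the end), as Shapely exteriors are.
squares = [
    [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)],
    [(2, 0), (3, 0), (3, 1), (2, 1), (2, 0)],
]
hit_first = in_any_region(0.5, 0.5, squares)
hit_gap = in_any_region(1.5, 0.5, squares)
```

A point in the gap between the two squares is rejected, while a point inside either square is accepted. That is the behaviour one would expect from a union of per-region ROI subset states.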