Refactoring Spot Finding TAKE 2 #1500

shanaxel42 · 2019-08-20T00:18:26Z

Spot Finding

As I started to execute on the original small module design proposal #1450 I realized each method didn't fit quite as nicely as I'd hoped. The outputs of each spot finding method were two varied and it would have caused too much confusion around the difference between a located spot and a spot trace. After some more speculation I've been coming around to the idea that what spot finding needs is not smaller modules but instead more abstraction. This proposal represents the "more abstraction" approach:

starfish.spots would include the following two packages:

FindSpots
DecodeSpots

Their base contracts would be:

ImageStack -> FindSpots -> SpotFindingResults

codebook,
SpotFindingResults -> DecodeSpots -> DecodedIntensityTable

FindSpots will handle
-locating and measuring spots in every r/ch pair

DecodeSpots will handle
-building spot traces
-decoding the spots traces

To support each step the following two new data models are required:

SpotFindingResults : a dictionary of round/ch indices and their corresponding SpotAttribtues
DecodedIntensityTable : aka decoded spot traces. A representation of spots, their attributes and their corresponding traces, for mulitplexed methods this trace is a vector. For non multiplexed methods the trace is a singe value. Each spot also has a decoded target value

With this refactor we are abstracting the idea of an IntensityTable from the user. Instead spot traces will be built when needed as part of the decoding step. From a user perspective they will go straight from finding spots to decoding them.

The new workflows for each notebook would be:

osmFISH*
osmFish currently uses a LocalMaxPeakFinder with no reference image. It's new workflow would be:

lmp = FindSpots.LocalMaxPeakFinder(
    min_distance=6,
    stringency=0,
    min_obj_area=6,
    max_obj_area=600,
)
spot_results= lmp.run(imagestack)

decoder = Decode.PerRoundMaxChannel(codebook)
decoded_intensites = decoder.run(spot_results)

ISS
ISS currently uses a blob detector on a blobs images. It's new workflow would be:

p = FindSpots.BlobDetector(
    min_sigma=1,
    max_sigma=10,
    num_sigma=30,
    threshold=0.01,
    measurement_type='mean',
)
spot_results = p.run(reference_image=dots, image=imagestack)

decoder = Decode.PerRoundMaxChannel(codebook)
decoded_intensites = decoder.run(spot_results)

BaristaSeq
Uses PixelSpotDecoder. I think trying to rework PixelSpotDecoding into this modal is too much work for it's worth. But I do think that explaining it as just a Decoder that goes directly from an ImageStack to DecdodedIntensityTable will be clearer then what we currently have.

starMap
Starmap currently uses LocalSearchBlobDetector. It's new workflow would be:
We'll actually be able to delete LocalSearchBlobDetector after this refactor and instead use a combination of regular BlobDetector and trace building using a 'nearest_neighbors' method.

spot_finder = FindSpots.BlobDetector()
spot_results = spot_finder.run(imagestack)

decoder = Decode.PerRoundMaxChannel(codebook)
decoded_intensites = decoder.run(spot_locations, build_traces='nearest_neighbor')

smFISH
smFish currently used TrackpyLocalMaxPeakFinder with no reference image, it's new workflow would be:

spot_locater = FindSpots.LocalSearchBlob()
spot_results = spot_locater.run(imagestack)

decoder = Decode.PerRoundMaxChannel(codebook)
decoded_intensites = decoder.run(spot_results)

The high level plan of attack for the refactor is:

Create DecodedIntensityTable (already done Adding DecodedIntensityTable class #1489)
Create FindSpots module with contract ImageStack -> SpotFindingResults
Create DecodeSpots module with contract SpotFindingResults -> DecodedIntensityTable
Fix provenance logging with new modules
Move measure_spot_intensity_code to a utility file in the FindSpots module
Move existing decoders to DecodeSpots module.
Refactor existing spot finding methods to always take in an imagestack and use and output SpotFindingResults (I'll probably first just make copies of each one under the new FindSpots module so I don't break everything)
Refactor existing notebooks to use this new framework
Delete the DetectSpots package.

By abstracting the step of building spot traces to decoding we allow for more flexibility in the data structures and process used. For example Gabriel's graph based approach #1482

For SeqFish the workflow would be LocateSpots.LocalSearchBlob() to get the spot locations. And then a new Decoder that would handle creating the possible trace groupings and decoding.

The text was updated successfully, but these errors were encountered:

ttung · 2019-08-20T03:47:02Z

I have this notion that there's an assay that uses the max projected image to locate spots, but gets the intensities from original image. How would that work in this model?

ambrosejcarr · 2019-08-20T14:23:45Z

This looks great! I have a couple questions but overall the flow of the refactor looks solid.

Create LocateSpots module with contract ImageStack -> SpotAttributes

This is more properly a rename operation on DetectSpots, followed by exchanging IntensityTable construction with a container-like SpotAttributes class that mimics dict[Indices, DataFrame] right?

Move measure_spot_intensity_code to a utility file in the DecodeSpots module

Can you remind me why DecodeSpots needs the "measure" operation instead of simply putting it with LocateSpots? I get that the now defunct GroupSpots needs to be in DecodeSpots. If MeasureSpots stayed in LocateSpots, the ImageStack would always be present for measurement, and the SpotAttributes would always have an intensity field. This would address Tony's question too, I think.

[...] I think trying to rework PixelSpotDecoding into this modal is too much work for it's worth. But I do think that explaining it as just a Decoder that goes directly from an ImageStack to DecdodedIntensityTable will be clearer then what we currently have.

Agree, but the PixelSpotDecoder use the same decoders as the other spot finders. If you're modifying DecodeSpots to make it emit a 2D IntensityTable with shape (features, r * c), this will need to be ported over to the Pixel approach soon after this refactor.

Create DecodedIntensityTable (already done #1489)

Semantic question: If we're deleting the user-facing IntensityTable in favor of SpotAttributes, could we just use the existing IntensityTable as the output of decode (in lieu of the more complex name DecodedIntensityTable), but require it always have a gene field?

Raising for later: the same confusion that prompted you to address IntensityTable -> DecodedIntensityTable will make you want to create DecodedIntensityTableWithCellAssignments. Not sure what the right solution here is. :-)

shanaxel42 · 2019-08-20T16:37:06Z

I have this notion that there's an assay that uses the max projected image to locate spots, but gets the intensities from original image. How would that work in this model?

So you'd LocateSpots on the max projected image to get the spot locations then pass the imagestack into Decode for the measure_intensities step. It should work the same as if they did the spot finding on a reference image.

Can you remind me why DecodeSpots needs the "measure" operation instead of simply putting it with LocateSpots? I get that the now defunct GroupSpots needs to be in DecodeSpots. If MeasureSpots stayed in LocateSpots, the ImageStack would always be present for measurement, and the SpotAttributes would always have an intensity field. This would address Tony's question too, I think.

I wanted to keep LocateSpots completely in individual spot space (not traces). by adding the measure spot intensities step there you have spot traces as a potential output of LocateSpots, but not always (like in the case of LocalSearchBlobDetector). This is essentially the model we have now with IntensityTable being the output of Detect.

Semantic question: If we're deleting the user-facing IntensityTable in favor of SpotAttributes, could we just use the existing IntensityTable as the output of decode (in lieu of the more complex name DecodedIntensityTable), but require it always have a gene field?

We could. What I'd really like to do is actually just rename DecodedIntensityTable to DecodedSpotTraces to make things more clear. But I'm also wary of introducing too many changes, so open to just having IntensityTable.

ambrosejcarr · 2019-08-20T16:40:01Z

I wanted to keep LocateSpots completely in individual spot space (not traces). by adding the measure spot intensities step there you have spot traces as a potential output of LocateSpots, but not always (like in the case of LocalSearchBlobDetector). This is essentially the model we have now with IntensityTable being the output of Detect.

I either disagree or don't understand. The relationship between spot and the measured intensity is 1:1. There's no efficiency gained by building traces at the time of measurement; measurement and trace building should be safely separable. It should be possible for MeasureSpots to live in LocateSpots and GroupSpots to live in DecodeSpots unless I'm missing something important.

shanaxel42 · 2019-08-20T17:08:44Z

The relationship between spot and the measured intensity is 1:1

This is not always true. There's a difference between a spot found in a reference image and a spot found in an imagestack using local search. For the later, there's a need to treat these spots as unique entities you can group in varying ways to build your traces (seqfish). For the first I agree the spot and measured trace are 1:1. Maybe the confusion is around what the measure spot intensities code is doing?

ambrosejcarr · 2019-08-20T17:25:24Z

This is not always true. There's a difference between a spot found in a reference image and a spot found in an imagestack using local search. For the later, there's a need to treat these spots as unique entities you can group in varying ways to build your traces (seqfish)

Maybe! I think measure spot intensities is identifying the intensity of a spot given a set of indices.

Concretely I see it as a method that is run on each SpotAttributes table in cases where the Intensity is not generated by LocateSpots:

detect_spots(SpotAttributes[-intensity], ImageStack) -> SpotAttributes[+intensity]

And, I'm arguing this should be in LocateSpots because you've defined the chain of operations as:

def LocateSpots(ImageStack) -> SpotAttributes[?intensity]
    pass
def DecodeSpots(SpotAttributes, Optional[ImageStack]) -> IntensityTable
    pass

But if my suggestion is feasible, we'd instead have:

def LocateAndMeasureSpots(ImageStack) -> SpotAttributes
    pass
def DecodeSpots(SpotAttributes) -> IntensityTable
    pass

Addressing your question, I think that In both of your cases, the intensity of each spot can be identified during the localization. I agree that merging into traces is different, but don't see why that must be related to measuring the spot.

I recall during our earlier discussions we decided that the spot could be measured anywhere in the chain of operations because SeqFISH doesn't really use the spot intensity in its approach.

shanaxel42 · 2019-08-20T17:39:13Z

ah ok yes we're thinking of two different things. So the existing code within measure_spot_intensities builds your spot traces based off locations found in a reference or max projected image then outputs an intensityTable. The method should really be called build_spot_traces or something. And I'm arguing that anything related to building traces or grouping spots into traces should happen in the decoding step. So I think we agree, and yea in that scenario I could move the measuring of the individual spot intensities to the Locate step but it's slightly more work.

ambrosejcarr · 2019-08-20T18:09:56Z

Ok, great. I agree -- "trace" building should all go to DetectSpots.

I feel, based on how it simplifies the contracts, that the refactor to put MeasureSpots into LocateSpots is worthwhile, and in the long run will result in simpler, cleaner code.

ttung · 2019-08-20T18:30:34Z

I'm generally confused. This seems pretty similar to the original proposal in #1450 except there's no MeasureSpots.

Also, what is the distinction between SpotAttributes and IntensityTable?

ttung · 2019-08-20T18:31:42Z

I have to dig out my notes for what we talked about in Toronto...

ambrosejcarr · 2019-08-20T23:26:02Z

Had an offline conversation which concluded that we should de-risk the cost of the no-op "group spots by identical position" which arises if measurement of spot intensities is always executed during spot locating in the following case: Previously if you find spots in a Max projection, your measurement would directly populate an intensitytable. In the new model you'd make a bunch of spotattributes and have to match them back up.

…

On Tue, Aug 20, 2019, 2:31 PM Tony Tung ***@***.***> wrote: I have to dig out my notes for what we talked about in Toronto... — You are receiving this because you were assigned. Reply to this email directly, view it on GitHub <#1500?email_source=notifications&email_token=ABH7C4DJOMN2EYKJYY6VX6LQFQ2A7A5CNFSM4INJMSNKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4XH4OI#issuecomment-523140665>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ABH7C4ENVWAYGXDE6WB7I7LQFQ2A7ANCNFSM4INJMSNA> .

shanaxel42 assigned ttung, dganguli and ambrosejcarr Aug 20, 2019

This was referenced Aug 26, 2019

Refactoring spot finding #1450

Closed

Implement SpotFinding Refactor #1514

Closed

shanaxel42 closed this as completed Nov 18, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactoring Spot Finding TAKE 2 #1500

Refactoring Spot Finding TAKE 2 #1500

shanaxel42 commented Aug 20, 2019 •

edited

Loading

ttung commented Aug 20, 2019

ambrosejcarr commented Aug 20, 2019 •

edited

Loading

shanaxel42 commented Aug 20, 2019

ambrosejcarr commented Aug 20, 2019 •

edited

Loading

shanaxel42 commented Aug 20, 2019 •

edited

Loading

ambrosejcarr commented Aug 20, 2019

shanaxel42 commented Aug 20, 2019 •

edited

Loading

ambrosejcarr commented Aug 20, 2019

ttung commented Aug 20, 2019

ttung commented Aug 20, 2019

ambrosejcarr commented Aug 20, 2019 via email

Refactoring Spot Finding TAKE 2 #1500

Refactoring Spot Finding TAKE 2 #1500

Comments

shanaxel42 commented Aug 20, 2019 • edited Loading

Spot Finding

ttung commented Aug 20, 2019

ambrosejcarr commented Aug 20, 2019 • edited Loading

shanaxel42 commented Aug 20, 2019

ambrosejcarr commented Aug 20, 2019 • edited Loading

shanaxel42 commented Aug 20, 2019 • edited Loading

ambrosejcarr commented Aug 20, 2019

shanaxel42 commented Aug 20, 2019 • edited Loading

ambrosejcarr commented Aug 20, 2019

ttung commented Aug 20, 2019

ttung commented Aug 20, 2019

ambrosejcarr commented Aug 20, 2019 via email

shanaxel42 commented Aug 20, 2019 •

edited

Loading

ambrosejcarr commented Aug 20, 2019 •

edited

Loading

ambrosejcarr commented Aug 20, 2019 •

edited

Loading

shanaxel42 commented Aug 20, 2019 •

edited

Loading

shanaxel42 commented Aug 20, 2019 •

edited

Loading