Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MLEstimateComponentBasedNormalisation refactor #1499

Open
wants to merge 36 commits into
base: master
Choose a base branch
from

Conversation

robbietuk
Copy link
Collaborator

@robbietuk robbietuk commented Aug 29, 2024

Changes in this pull request

WIP

Refactor of ML_estimate_component_based_normalisation.

  • Created a class.
  • Simplified the system logic
  • Add the ability option to save the data or not

Remaining tasks

  • Add methods to get_efficiencies(), get_norm_geo_data() and get_norm_block_data()
  • Add methods to get and set the boolean values of do_geo, do_block, do_symmetry_per_block, do_KL, do_display, do_save_to_file
  • Improve documentation defining the efficiencies, norm_geo_data, norm_block_data variables (even after all this work I am still slightly confused).
  • Add unit tests

Testing performed

Added a new unit test for basic functionality.

Related issues

Fixes #1498

Checklist before requesting a review

  • [] I have performed a self-review of my code
  • [] I have added docstrings/doxygen in line with the guidance in the developer guide
  • [] I have implemented unit tests that cover any new or modified functionality (if applicable)
  • [] The code builds and runs on my machine
  • [] documentation/release_XXX.md has been updated with any functionality change (if applicable)

@robbietuk robbietuk changed the title Ml estimate component based normalisation redesign MLEstimateComponentBasedNormalisation refactor Aug 29, 2024
@robbietuk robbietuk self-assigned this Aug 29, 2024
@robbietuk
Copy link
Collaborator Author

Thus far, I have not (intentionally) changed any of the functionality of the find_ML_normfactors3D utility. I have only used refactoring to move the logic into a class, split functionality into component methods, created members.

The only new introduction is do_save_to_file.

Copy link
Collaborator Author

@robbietuk robbietuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments. I am not 100% sure about updating to using shared_ptrs but the idea is that the data can be accessed without copying.

Additionally, I am not convinced there is any testing of this code.

Copy link
Collaborator

@KrisThielemans KrisThielemans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some initial comments. I didn't review everything.

I'm not sure if we should provide access to the efficiencies_sptr etc. shared_ptr are nice, but to be avoided when we can. If we have a const auto get_efficiencies() member, there seems tobe no point in making efficiencies etc into a shared_ptr.

@KrisThielemans
Copy link
Collaborator

Additionally, I am not convinced there is any testing of this code.

there are tests in the recon_test_pack I believe.

@robbietuk
Copy link
Collaborator Author

At this point, I want to add an interface with BinNormalisationPETFromComponents. Is the any specific reason for no direct setters of the eff, geo, block factors into BinNormalisationPETFromComponents?

This class does not actually set the relevant factors. That is left to external
methods via the crystal_efficiencies(), geometric_factors() and block_factors()
members. If they are not set, the factor is not applied (i.e. assumed to be 1).

apply_normfactors3D uses streams to set these values
if (do_eff)
{
char* in_filename = new char[in_filename_prefix.size() + 30];
sprintf(in_filename, "%s_%s_%d_%d.out", in_filename_prefix.c_str(), "eff", iter_num, eff_iter_num);
std::ifstream in(in_filename);
in >> norm.crystal_efficiencies();
if (!in)
{
warning("Error reading %s, using all 1s instead\n", in_filename);
do_eff = false;
}
delete[] in_filename;
}
. I would prefer not to do this.

Comment on lines 65 to 66
const ProjData& measured_projdata_v,
const ProjData& model_projdata_v,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a check that these ProjData are the same "shape". Are there any tests for that?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They should be. Not sure if it's checked anywhere.

@KrisThielemans
Copy link
Collaborator

Is the any specific reason for no direct setters of the eff, geo, block factors into BinNormalisationPETFromComponents?

I was just running out of time...

Note that BinNormalisationPETFromComponents was intended to form a parent class for other component-based norms, including Siemens and GE. Not 100% sure how feasible that'll be, but it'd remove a lot of obscure/copied code. I'm not saying this because you need to do it, but just to be aware of the longer term plan. Clearly, adding setters/getters is not a problem.

On another note, the "block" stuff really doesn't work and shouldn't be used. It was intended to cope with timing-alignment factors (for many scanners these are per block) but it introduces too much freedom (which I didn't know when I wrote that code). Timing alignment should be parametrised by 1 factor per block, not a norm-factor for every block-pair. Just saying that such that you don't spend too much time on that (i.e. maybe don't expose it)

Comment on lines 176 to 189
void
BinNormalisationPETFromComponents::allocate(shared_ptr<const ProjDataInfo> pdi_sptr, MLEstimateComponentBasedNormalisation& normalization_estimator)
{
if (!normalization_estimator.has_processed_data())
{
error("BinNormalisationPETFromComponents: internal error: allocate called on MLEstimateComponentBasedNormalisation "
"without it having processed the data");
}
this->proj_data_info_sptr = pdi_sptr;
this->efficiencies = *normalization_estimator.get_efficiencies();
this->geo_data = *normalization_estimator.get_geo_data();
this->block_data = *normalization_estimator.get_block_data();
this->_already_allocated = true;
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is the cleanest way to implement all these changes. Here is a use case from my cpp program.

    MLEstimateComponentBasedNormalisation ml_estimator(out_filename_tmp, *measured_projdata, *model_activity_projdata,
                                                       num_eff_iterations, num_iterations, do_geo, do_block,
                                                       do_symmetry_per_block, do_KL, do_display, do_save_to_file);
    ml_estimator.process();

    // Apply the normalization factors
    BinNormalisationPETFromComponents bin_normalisation;
    bin_normalisation.allocate(measured_projdata->get_proj_data_info_sptr(), ml_estimator);
    bin_normalisation.set_up(measured_projdata->get_exam_info_sptr(), measured_projdata->get_proj_data_info_sptr());

    // Compute the projdata
    ProjDataInMemory normalization_projdata(*measured_projdata);
    normalization_projdata.fill(1.F);
    bin_normalisation.undo(normalization_projdata);

Copy link
Collaborator Author

@robbietuk robbietuk Aug 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I acknowledge #1498 (comment), however this does work and provides an alternative allocation of the data.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather do it the other way around I'm afraid. Your way is very different from anything else in STIR.

I very much prefer to let MLEstimateComponentBasedNormalisation have a shared_ptr<BinNormalisationPETFromComponents> normalisation_sptr member, which we can get via get_output() (returning a shared_ptr<const BinNorm...>). The MLEstimateComponentBasedNormalisation::get_efficiencies() would then just forward to normalisation_sptr->get_efficiencies() (note that I wouldn't let those return shared_ptr then, but references). Any reason not to do it this way?

@robbietuk
Copy link
Collaborator Author

Currently seeing the oddest error. The minimum tangential position is filled with inf.

TEST(TestNormalizationCalculator, TBD)
{
    auto scanner = PETScanner();
    auto measured_projdata = get_empty_projdata(scanner, 1);
    measured_projdata->fill(1.F);

    auto model_projdata = std::make_shared<ProjDataInMemory>(*measured_projdata);
    model_projdata->fill(1.F);


    int num_eff_iterations = 1;
    int num_iterations = 1;
    bool do_geo = false;
    bool do_block = false;
    bool do_symmetry_per_block = false;
    bool do_KL = false;
    bool do_display = false;
    bool do_save_to_file = false;
    stir::MLEstimateComponentBasedNormalisation ml_estimator("", *measured_projdata, *model_projdata,
                                                             num_eff_iterations, num_iterations, do_geo, do_block,
                                                             do_symmetry_per_block, do_KL, do_display, do_save_to_file);
    ml_estimator.process();

    // Apply the normalization factors
    BinNormalisationPETFromComponents bin_normalisation;
    bin_normalisation.allocate(measured_projdata->get_proj_data_info_sptr(), ml_estimator);
    bin_normalisation.set_up(measured_projdata->get_exam_info_sptr(), measured_projdata->get_proj_data_info_sptr());

    // Compute the projdata
    ProjDataInMemory normalization_projdata(*measured_projdata);
    normalization_projdata.fill(1.F);
    bin_normalisation.apply(normalization_projdata);
    std::cout << "Complete calculating normalization factors" << std::endl;
    std::cout << "Max: " << normalization_projdata.find_max() << std::endl;
    std::cout << "Min: " << normalization_projdata.find_min() << std::endl;
    auto filename = "test_norm";
    normalization_projdata.write_to_file(filename);

It outputs

Complete calculating normalization factors
Max: inf
Min: 1

Visualization

image
All voxels are 1 for every slice, except the tangential_position = 0.


I don't think this is a preexisting issue but I will check the following later

find_ML_normfactors3D.exe test_exe .\ones.hs .\twos.hs 1 1
apply_normfactors3D.exe test_apply_exe test_exe .\ones.hs 0 1 1

@KrisThielemans
Copy link
Collaborator

Infinities tend to come from 1/0. A candidate would be the TEMP FIX stuff.

@KrisThielemans
Copy link
Collaborator

Note that you'll have to rename class and file to MLEstimateComponentBasedNormalisation (no _ in class names according to STIR conventions)

@robbietuk
Copy link
Collaborator Author

robbietuk commented Sep 4, 2024

This is failing because of a divide by 0. This because the projdata is 192 tangential positions but the fan size is calculated as 191 by

   const int half_fan_size = min(proj_data_info.get_max_tangential_pos_num(), -proj_data_info.get_min_tangential_pos_num()); 
   fan_size = 2 * half_fan_size + 1; 

in here. There seems to be some requirement for the fan size to be odd?


Regardless, during BindNormalisationPETFromComponents::set_up is called it constructs the projdata, which eventually reaches

/// **** This function make proj_data from fan_data while adding the intermodule gaps **** ////
/// *** fan_data doesn't have gaps, proj_data has gaps *** ///
template <class TProjDataInfo>
static void
set_fan_data_add_gaps_help(ProjData& proj_data,
int num_rings,
int num_detectors_per_ring,
int max_delta,
int fan_size,
const TProjDataInfo& proj_data_info,
const FanProjData& fan_data,
const float gap_value = 0.F)
{
const int half_fan_size = fan_size / 2;
const int num_virtual_axial_crystals_per_block = proj_data_info.get_scanner_sptr()->get_num_virtual_axial_crystals_per_block();
const int num_virtual_transaxial_crystals_per_block
= proj_data_info.get_scanner_sptr()->get_num_virtual_transaxial_crystals_per_block();
const int num_transaxial_crystals_per_block = proj_data_info.get_scanner_sptr()->get_num_transaxial_crystals_per_block();
const int num_axial_crystals_per_block = proj_data_info.get_scanner_sptr()->get_num_axial_crystals_per_block();
const int num_physical_transaxial_crystals_per_block
= num_transaxial_crystals_per_block - num_virtual_transaxial_crystals_per_block;
const int num_physical_axial_crystals_per_block = num_axial_crystals_per_block - num_virtual_axial_crystals_per_block;
#ifdef STIR_OPENMP
# pragma omp parallel for schedule(dynamic)
#endif
for (int segment = proj_data.get_min_segment_num(); segment <= proj_data.get_max_segment_num(); ++segment)
{
Bin bin;
shared_ptr<SegmentBySinogram<float>> segment_ptr;
bin.segment_num() = segment;
segment_ptr.reset(new SegmentBySinogram<float>(proj_data.get_empty_segment_by_sinogram(bin.segment_num())));
for (bin.axial_pos_num() = proj_data.get_min_axial_pos_num(bin.segment_num());
bin.axial_pos_num() <= proj_data.get_max_axial_pos_num(bin.segment_num());
++bin.axial_pos_num())
for (bin.view_num() = 0; bin.view_num() < num_detectors_per_ring / 2; bin.view_num()++)
for (bin.tangential_pos_num() = -half_fan_size; bin.tangential_pos_num() <= half_fan_size; ++bin.tangential_pos_num())
{
int ra = 0, a = 0;
int rb = 0, b = 0;
proj_data_info.get_det_pair_for_bin(a, ra, b, rb, bin);
(*segment_ptr)[bin.axial_pos_num()][bin.view_num()][bin.tangential_pos_num()] = gap_value;
proj_data_info.get_det_pair_for_bin(a, ra, b, rb, bin);
int a_in_block = a % num_transaxial_crystals_per_block;
if (a_in_block >= num_physical_transaxial_crystals_per_block)
continue;
int new_a = a - (a / num_transaxial_crystals_per_block) * num_virtual_transaxial_crystals_per_block;
int ra_in_block = ra % num_axial_crystals_per_block;
if (ra_in_block >= num_physical_axial_crystals_per_block)
continue;
int new_ra = ra - (ra / num_axial_crystals_per_block) * num_virtual_axial_crystals_per_block;
int b_in_block = b % num_transaxial_crystals_per_block;
if (b_in_block >= num_physical_transaxial_crystals_per_block)
continue;
int new_b = b - (b / num_transaxial_crystals_per_block) * num_virtual_transaxial_crystals_per_block;
int rb_in_block = rb % num_axial_crystals_per_block;
if (rb_in_block >= num_physical_axial_crystals_per_block)
continue;
int new_rb = rb - (rb / num_axial_crystals_per_block) * num_virtual_axial_crystals_per_block;
(*segment_ptr)[bin.axial_pos_num()][bin.view_num()][bin.tangential_pos_num()]
= fan_data(new_ra, new_a, new_rb, new_b);
}
proj_data.set_segment(*segment_ptr);
}
}

The issue is in the last for loop and the bin.tangential_pos_num() being initialized at -half_fan_size = -95 whereas the proj_data.get_min_tangential_pos_num() returns -96. Therefore, the first row tangential_pos is treated as a gap and set to 0. This 0 is later used in the denominator of a division that leads to the infinity values in the test.

@KrisThielemans
Copy link
Collaborator

There seems to be some requirement for the fan size to be odd?

yes, that is the case at the moment. Code could be enhanced, but I was too lazy (having 1% extra data won't make or break the efficiency estimation)

@KrisThielemans
Copy link
Collaborator

Therefore, the first row tangential_pos is treated as a gap and set to 0. This 0 is later used in the denominator of a division that leads to the infinity values in the test.

yes, so the TEMP FIX could/should handle this (the model has to be zero in that bin). Or you could trim the proj-data first (with SSRB)

@robbietuk
Copy link
Collaborator Author

Therefore, the first row tangential_pos is treated as a gap and set to 0. This 0 is later used in the denominator of a division that leads to the infinity values in the test.

yes, so the TEMP FIX could/should handle this (the model has to be zero in that bin). Or you could trim the proj-data first (with SSRB)

The TEMP FIX code sits in the MLEstimateComponentBasedNormalisation class. This class correctly populates the DetectorEfficiencies with uniform 0.707* values in the test. It also has the shape (24,384) = (num_rings, num_det_per_ring). I'm not sure that is the problem.

The issue is when the DetectorEfficiencies are applied to the invnorm_proj_data_sptr in BindNormalisationPETFromComponents::set_up. Because of the half_fan_size calculation, the min tangential position is not accounted for in the efficiencies.

I am not a huge fan of trimming the projdata with SSRB. I am trying to understand the code to figure out why the min_tangential_pos_num is not able to be populated

@KrisThielemans
Copy link
Collaborator

KrisThielemans commented Sep 5, 2024

It's not the efficiencies themselves that are the problem, but the FanData

fan_indices[ra][a][rb]
= IndexRange<1>(a + num_detectors_per_ring / 2 - half_fan_size, a + num_detectors_per_ring / 2 + half_fan_size);

Moreover, there's plenty of loops doing

for (bin.tangential_pos_num() = -half_fan_size; bin.tangential_pos_num() <= half_fan_size; ++bin.tangential_pos_num())

This isn't going to be so easy to fix.

@robbietuk robbietuk mentioned this pull request Sep 6, 2024
3 tasks
@KrisThielemans
Copy link
Collaborator

Note that #1430 updates the ML_norm files as well. We didn't manage to finish that one...

@robbietuk
Copy link
Collaborator Author

I am not up to date with the theory in the papers. Do you expect any issue if I fix the fan data by removing the half_fan_size and fan_size calculations and replace them with methods that better derive from get_num_tangential_poss(). I feel this is the problem now.

@robbietuk
Copy link
Collaborator Author

I changed the scanner to "Discovery 690", with an odd number of tangential positions (381 I believe). This allows the test to pass.

The previous issue with 0 and +inf values in the resulting projection data is pushed to issue #1511

Copy link
Collaborator Author

@robbietuk robbietuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few check before I believe this is done.

@@ -73,7 +73,7 @@ main(int argc, char** argv)
char* in_filename = new char[in_filename_prefix.size() + 30];
sprintf(in_filename, "%s_%s_%d_%d.out", in_filename_prefix.c_str(), "eff", iter_num, eff_iter_num);
std::ifstream in(in_filename);
in >> norm.crystal_efficiencies();
in >> norm.get_crystal_efficiencies();
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returns by reference. This allows backwards compatibility with this code snippet but in general get_crystal_efficiencies() method returning the efficiencies object by reference might be dangerous?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const & would do the trick

@robbietuk
Copy link
Collaborator Author

Pretty sure this PR is ready for review. I can't find any more outstanding issues. #1511 covers the other issue with the fan data and even numbers. Ill add to the release notes.

@robbietuk robbietuk added this to the v6.3 milestone Dec 22, 2024
@robbietuk
Copy link
Collaborator Author

This should be merged into v6.3. I have been using a branch derived from here for a while with no major issues.

Copy link
Collaborator

@KrisThielemans KrisThielemans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small changes required, but unfortunately quite a lot of them.

We cannot rename functions/files in 6.3, so I've flagged those up. Deprecations need to be listed in the release notes.

efficiencies = eff;
}

DetectorEfficiencies& get_crystal_efficiencies()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok for changing to get/set pair, but then I think we should have

Suggested change
DetectorEfficiencies& get_crystal_efficiencies()
const DetectorEfficiencies& get_crystal_efficiencies() const

or would this break something?

However, cannot change backwards compatibility in a minor version update, so keep

#if STIR_VERSION < 070000
STIR_DEPRECATED DetectorEfficiencies& crystal_efficiencies()
...
#endif

Obviously the same for the others.

geo_data = geo;
}

GeoData3D& get_geometric_factors()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also const

block_data = block;
}

BlockData3D& get_block_factors()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also const

/*!
\file
\ingroup recon_buildblock
\brief Declaration of stir::ML_estimate_component_based_normalisation
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
\brief Declaration of stir::ML_estimate_component_based_normalisation
\brief Declaration of stir::ML_estimate_component_based_normalisation and
stir::MLEstimateComponentBasedNormalisation

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

obviously, need to adjust when keeping ML_estimate_component_based_normalisation in its original file.

\param do_display_v Whether to display the progress of the algorithm.
\param do_save_to_file_v Whether to save the each iteration of the efficiencies, geo data and block data to file.
*/
MLEstimateComponentBasedNormalisation(std::string out_filename_prefix_v,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally STIR uses const std::string&

\ingroup recon_test
\brief Test program to ensure MLEstimateComponentBasedNormalisation works as expected and
produces the correct normalization factors from components.
This is a very basic test that ensure the basic functionality.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This is a very basic test that ensure the basic functionality.
This is a test on basic functionality only.

Comment on lines +46 to +47
const auto scanner = std::make_shared<Scanner>(*Scanner::get_scanner_from_name("Discovery 690"));
const auto exam_info = std::make_shared<ExamInfo>(ImagingModality::PT);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to scanner_sptr and exam_info_sptr

const auto exam_info = std::make_shared<ExamInfo>(ImagingModality::PT);
exam_info->patient_position = PatientPosition(PatientPosition::HFS);

const shared_ptr<ProjDataInfo> projdata_info(ProjDataInfo::ProjDataInfoCTI(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use ProjDataInfo::construct_proj_data_info (same signature)

/*views*/ scanner->get_num_detectors_per_ring() / 2,
/*tang_pos*/ scanner->get_default_num_arccorrected_bins(),
/*arc_corrected*/ false,
/*tof_mash_factor*/ -1));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why -1? Set to 0 for non-TOF.

const auto model_projdata = std::make_shared<ProjDataInMemory>(*measured_projdata);
model_projdata->fill(2.F);

constexpr int num_eff_iterations = 6;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this smaller (to run faster)?

@KrisThielemans
Copy link
Collaborator

Also, you'll need to run pre-commit run --all (see the devel doc). Also check the Codacy report.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ML_estimate_component_based_normalisation and apply_norm_factors3D cannot be used without saving to disk
2 participants