Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Large scale Refactoring #1232

Closed
wants to merge 10 commits into from
Closed

WIP: Large scale Refactoring #1232

wants to merge 10 commits into from

Conversation

Hckjs
Copy link
Collaborator

@Hckjs Hckjs commented Feb 14, 2024

WIP: It is the first very very draft on refactoring the whole lstchain code to

  • Move to the standard ctapipe data format (by preferably using the DataWriter)
  • Switch to the ctapipe config system
  • Rewrite and untangle the spaghettis by using/importing as much as possible from ctapipe - also preparing to integrate code step by step into ctapipe.

The first idea is to implement a LSTProcessor-Tool analogous to ctapipe's Processor Tool with LST-specific Components inheriting as much as possible from ctapipe classes. The aim here would be to mainly substitute the r0_to_dl1, dl1_ab and dl1_to_dl2 scripts to have just one tool that you can feed with different configs regarding your desired analysis step. The new analysis flow could look like:

1) R0 to DL1

lstchain_process --input r0_events.fits.fz --config config.yaml --output events.dl1.h5

Processing R0 data up to dl1 with Cat-A calibrations all defined in a base config.yaml. It can optionally write out muons and interleaved events (maybe also allow to only write out the interleaved pedestal events directly for Cat-B calibrations)

2) Reprocess DL1 (DL1ab)

lstchain_process --input events_cat_A.dl1.h5 --cat-b-calibrations cat_B-calibrations.h5  --config config.yaml config_dlab.yaml --output events_cat_B.dl1.h5

Not only for reprocessing dl1 data (e.g. with different cleaning settings...), but also for applying Cat-B calibrations and including pedestal cleaning. All defined in an additional config_dl1ab.yaml.

3) DL1 to DL2

lstchain_process --input Cat_B.dl1.h5 --config config_dl2.yaml --output events.dl2.h5

Process DL1 to DL2 with a specific (src (in)dependet) config_dl2.yaml.

Or directly from R0 to DL2

lstchain_process --input r0_events.fits.fz --cat-b-calibrations cat_b_calibrations.h5 --config config.yaml --output events.dl2.h5

To process directly from R0 to DL2 (with Cat-B calibs) you can first write out the interleaved events with the 'interleaved-only'-mode and compute the cat-B calibrations with the separate tool. After that you hand in your config, the cat-B calibrations and RF-Models (maybe trained by ctapipe's ML module - ctapipe-train-...) to the processor tool and process directly to DL2.

  • MC processing should be analogous to the above scheme.

Tasks

The tool would basically look like ctapipe's Processor tool:

for event in source:

    if event.trigger.event_type in {
        EventType.FLATFIELD,
        EventType.SKY_PEDESTAL,
    }:
        self.process_interleaved(event)

    if self.should_tune_waveforms:
        self.tune_waveforms(event)

    if self.should_calibrate:
        self.calibrate(event)

    if self.should_compute_dl1:
        self.process_images(event)

    if self.should_compute_muon_parameters:
        self.process_muons(event)

    if self.should_compute_lhfit_parameters:
        self.process_lhfit(event)

    if self.should_compute_dl2:
        self.process_shower(event)

    if not self.write_interleaved_only:
        self.write(event)

with different Components passing and filling the ArrayEventContainer. That includes:

  • LSTInterleavedProcessor component which process the interleaved pedestal and flatfield events with the CalibrationCalculator. There should also be the possibility of writing only the r1 waveforms to an extra .h5 file to calculate the cat-B coefficients. For that it would be also nice to add a flag in ctapipe_io_lst for the gain selection of calibration events, so we don't have to calibrate at this level.
  • WaveformNSBTuner component to calculate (based on a target file) and apply NSB on waveform-level to MC-data.
  • LSTCameraCalibrator component which inherits from ctapipe's CameraCalibrator. It needs to be adapted for applying the cat-B calibrations consisting of either filling the DL1CameraCalibrationContainer and applying them in the ._calibrate_dl1 step or applying them afterwards on peak times and charges if no waveforms are available (for 'dl1ab').
  • LSTImageProcessor component inheriting from ctapipe's ImageProcessor which uses its own LSTImageCleaner. It needs to be adapted regarding the API of the ImageCleaner to allow pedestal cleaning.
  • LSTMuonProcessor component inheriting from ctapipe's MuonProcessor adding an own CameraCalibrator instance which can use a different extractor (e.g. GlobalPeakWindowSum) for the muon analysis. It also creates its on DataWriter instance for writing promising muon events to an extra .h5 file.
  • LHFitProcessor component which computes lhfit parameters, fills its own container an write it to the output file.
  • ShowerProcessor component. I did not yet dive deep enough into ctapipe's ML module and lstchains dl1_to_dl2 step to know exactly what we can use/inherit from. WIP.

A first basic config can be found in lstchain/data/lstchain_base_config.yaml
For contributing please open new PR's to this branch and reference them in the associated tasks above to not overload this PR for reviewers. I would suggest at least one PR per component

Additional tasks:

  • Update datachecks
  • Update (onsite)-scripts (and maybe convert some of them to tools)
  • Update notebooks
  • Delete obsolete modules/functions

I'm looking forward to all your ideas and comments. Feel free to edit/adapt the tasks.

@Hckjs Hckjs added enhancement New feature or request help wanted Extra attention is needed workinprogress labels Feb 14, 2024
@moralejo
Copy link
Collaborator

Hi @Hckjs , I don't understand why to implement this in lstchain a la ctapipe, rather than adapting lstchain to create (for the existing data) DL0 files that can then be fully processed with ctapipe. That is the idea behind #1176 (nothing done yet as far as I know, just informal discussion off-github). Doesn't that make more sense?

@maxnoe
Copy link
Member

maxnoe commented Feb 16, 2024

@moralejo this is the more long standing issue of #972

Are you saying we should skip that step and directly go converting DL0 files and using only ctapipe? I think this will take longer to get everything we currently do differently / additionally into ctapipe, and this exercise here is actually a large part of what is needed to do that (identify what is different, convert code into how ctapipe would do things, then start moving those to ctapipe itself).

I think this is largely independent of the question if we read already calibrated DL0 using ctapipe_io_zfits or do the calibraation on the fly using ctapipe_io_lst from R0 / R1 files. This is about the steps that come after.

@moralejo
Copy link
Collaborator

@moralejo this is the more long standing issue of #972

Are you saying we should skip that step and directly go converting DL0 files and using only ctapipe?

I think that is a better long-term solution for dealing with the existing LST-1 data.

I think this will take longer to get everything we currently do differently / additionally into ctapipe, and this exercise here is actually a large part of what is needed to do that (identify what is different, convert code into how ctapipe would do things, then start moving those to ctapipe itself).

Those additional developments (to process LST DL0 data in the way we want to, i.e. with the features currently implemented in lstchain) should be done in ctapipe. Why implementing them in lstchain before, and then export them to ctapipe? Let's focus instead in producing LST1-DL0 which can be used in the improvement of the ctapipe pipeline for real data. I see no advantage in doing it first in lstchain, and at the same time I see a large potential for accidentally introducing problems in our working (though cumbersome) system.

I think this is largely independent of the question if we read already calibrated DL0 using ctapipe_io_zfits or do the calibration on the fly using ctapipe_io_lst from R0 / R1 files. This is about the steps that come after.

Indeed, but if one gets to CTA-standard DL0 with lstchain, the later steps can be done in ctapipe.

Also, doing the developments in ctapipe will make more clear that they belong (as they should) to the DPPS IKCs, rather than being LST-internal developments.

@Hckjs Hckjs closed this Aug 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed workinprogress
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants