Rewrite the NXdata scaling_factor and offset fields #1333
I'd argue this doesn't need a vote, since it doesn't change functionality and the change to NXmx only clarifies functionality that was already there. Feedback welcome! |
The original description for the gain sounds like I know you confirmed the order of the gain and the offset, but I don't know if the definition of the gain was confirmed. This post is just to double check it. |
Ack, I think you might be right and I misinterpreted this field!
So I think I now agree this is saying the stored value has already had the scaling factor applied. But that reads differently from offset:
That implies the offset hasn't been applied yet! So, @nexusformat/developers, how do you use scaling_factor and offset in NXdata, if at all? |
Bump! @nexusformat/developers :) How do you use scaling_factor and offset in NXdata, if at all? |
I don't use these fields but it seems that if you want to document the stored values then |
I have always interpreted these fields as things that need to be done to the |
I second #1333 (comment) by @benajamin with the exception that I always think about stored value vs. plotted value (meaning the coordinate in the plotting coordinate system) since NXdata is meant to represent "data to be plotted"
|
Ok I like switching the discussion to "stored" vs. "plotted". Based on that, just to recap without math, there are two general ways these can be interpreted:
I hope the answer is 1. to match @biochem-fan's use case, but it seems to line up with some of the comments above. I do note that both @benajamin and @woutdenolf switched the order of operations to match the "bias" version that @PeterC-DLS suggested, compared to the pedestal version that I originally suggested: a. a. makes more intuitive sense to me as it more closely matches the equation of a line |
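To make the two orderings concrete, here is a minimal sketch in Python. The pedestal form is the formula used in this PR, `(data + offset) * scaling_factor`. The bias form is assumed here to be `stored * scaling_factor + offset`, because the surviving text of the thread does not spell it out. The function names are hypothetical and exist only for illustration.

```python
def pedestal_form(stored, offset, scaling_factor):
    """Add the offset first, then apply the gain (the formula in this PR)."""
    return (stored + offset) * scaling_factor


def bias_form(stored, offset, scaling_factor):
    """Apply the gain first, then add the offset (assumed form, see lead-in)."""
    return stored * scaling_factor + offset


# Same stored value and constants, two different results.
stored = 50
offset = 10
scaling_factor = 2

pedestal_result = pedestal_form(stored, offset, scaling_factor)  # (50 + 10) * 2 = 120
bias_result = bias_form(stored, offset, scaling_factor)          # 50 * 2 + 10 = 110
```

The two forms agree only when the offset is zero or the scaling factor is one, which is why the documentation has to state the order of operations explicitly.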
Change NXdata scaling_factor to refer to "plotted" data
Change NXdata to refer to "corrected" data, in addition to "physical" data, since it describes units of photons
I've made the change to use the "plotted" nomenclature for NXdata, but I've kept the pedestal formula for now, until we get a bit more discussion here ( |
LGTM. For clarity's sake in your comments, I think |
This resolves ambiguity if there is more than one signal
For NXmx specify data_scaling_factor and data_offset since the field data is named in the NXdata group
Feedback from Telco addressed. If there are no more comments, we'll send this to a vote. |
Hello, please vote by providing an emoji on this comment. Thanks. |
Not happy with the term |
@prjemian |
Since |
Wait it's not being changed in the PR. For NXdata it's Pedestal is just a documentation term. Does that help? |
|
||
This formula will derive the corrected value, when necessary. | ||
|
||
Use these fields to specify gain and/or pedestal constants that need to be applied |
Use these fields to specify gain and/or pedestal constants that need to be applied | |
Use these fields to specify gain and/or offset constants that need to be applied |
Happy to add more clarification in a different PR after this vote.
to the data to correct it to physical values. For example, if the detector gain | ||
is 10 counts per photon and a constant background of 400 needs to be subtracted | ||
off the pixels, specify data_scaling_factor as 0.1 and data_offset as -400 to | ||
specifiy the required conversion from raw counts to pedestal-corrected photons. It |
specifiy the required conversion from raw counts to pedestal-corrected photons. It | |
specify the required conversion from raw counts to offset-corrected photons. It |
Happy to add more clarification and fix the typo in a different PR after this vote.
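The worked example in the proposed NXmx text (gain of 10 counts per photon, constant background of 400) can be sketched as follows. This assumes the pedestal-form formula from this PR, `(data + offset) * scaling_factor`, and scalar constants. The function name `correct_pixels` and the sample pixel values are hypothetical.

```python
def correct_pixels(raw_counts, data_scaling_factor, data_offset):
    """Convert raw counts to background-subtracted photons, pixel by pixel.

    Assumes the pedestal form: (raw + data_offset) * data_scaling_factor.
    """
    return [
        [(value + data_offset) * data_scaling_factor for value in row]
        for row in raw_counts
    ]


# Values from the proposed documentation text.
data_scaling_factor = 0.1  # 1 / (10 counts per photon)
data_offset = -400         # constant background to subtract

# Hypothetical 2x2 image of raw counts.
raw_counts = [[400, 410], [500, 1400]]
photons = correct_pixels(raw_counts, data_scaling_factor, data_offset)
```

With these constants a raw count of 400 maps to zero photons and a raw count of 1400 maps to roughly 100 photons, up to floating-point rounding.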
Depending on the field, various terms are used: "offset", "pedestal" and "bias". For example, ThermoFisher's electron detectors use "bias". Perhaps keeping "offset" for NXdata is a good idea, but I would prefer to have the other terms mentioned as well in the NXmx documentation for better searchability. |
Should reserved suffixes be updated too? |
Yes
|
Happy to modify reserved suffixes in a different PR after this vote. |
.. code-block::

    plotted_data = (data + offset) * scaling_factor
I strongly disagree with the use of "plotted_data" and "data" in this equation. Firstly, "data" is ambiguous and should be something like "stored values", "dataset values", or "recorded values". Secondly, "plotted data" makes no sense because nothing is plotted (past tense) at the time when one is considering this equation, and it is also meaningless because one could plot any old values. The NeXus manual says that we strive to record physically meaningful values. This equation is to be used when we do not record physically meaningful values, so its purpose should be to convert to physically meaningful values. Therefore, I would argue that "physical values" should definitely be used instead of "plotted values".
</doc> | ||
</field> | ||
|
||
<field name="scaling_factor" type="NX_FLOAT" deprecated="Use FIELDNAME_scaling_factor instead"> | ||
<doc> | ||
Due to scaling_factor being ambiguous in the case of multiple signals, use |
A couple of points (both minor):
- I suggest having some throw-away comment describing the intended semantics (e.g., "Had similar semantics to FIELDNAME_scaling_factor"). This is to allow someone reading the spec to understand how to interpret existing data, where scaling_factor has already been used. This comment also applies to the offset field.
- Since the ambiguity comes in the case where multiple signals are present (per the proposed doc), I'm assuming the ambiguity doesn't exist if there is a single signal. For single signal data, is scaling_factor still deprecated? I would imagine so, but perhaps the wording could be made more explicit; for example, by making the statement "use FIELDNAME_scaling_factor instead" a separate sentence, perhaps qualifying it by saying something like "all future data should use ...". This comment also applies to the offset field.

I consider neither comment blocking.
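For readers who need to handle both new and existing files, one possible reading of the deprecation is sketched below: prefer the per-signal FIELDNAME_scaling_factor and FIELDNAME_offset fields, and fall back to the deprecated unprefixed fields when they are absent. This is only an illustration. The group is modelled as a plain dict rather than a real NeXus/HDF5 group, the function name `resolve_correction` is hypothetical, and the defaults of 1.0 and 0.0 are an assumption, not something stated in this thread.

```python
def resolve_correction(fields, signal_name):
    """Return (scaling_factor, offset) for one signal in an NXdata-like group.

    `fields` is a plain dict standing in for the group's fields.
    Per-signal names take priority over the deprecated unprefixed names.
    Defaults (1.0 and 0.0) are an assumption for this sketch.
    """
    scaling_factor = fields.get(
        f"{signal_name}_scaling_factor", fields.get("scaling_factor", 1.0)
    )
    offset = fields.get(f"{signal_name}_offset", fields.get("offset", 0.0))
    return scaling_factor, offset


# New-style group: per-signal fields win even if a legacy field is present.
new_style = {"data_scaling_factor": 0.1, "data_offset": -400.0, "scaling_factor": 5.0}
new_result = resolve_correction(new_style, "data")

# Legacy group: only the deprecated scaling_factor field exists.
legacy = {"scaling_factor": 2.0}
legacy_result = resolve_correction(legacy, "data")
```

Whether a reader should fall back this way for single-signal data is exactly the open question raised in the review comment above.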
LGTM 😄 |
Vote did not pass (got 12 votes, needed 13 for quorum), which is fine given all the discussion. For the sake of clarity I'm closing this PR, removing it and the associated issue from the milestone, and I will make a new PR that addresses the feedback here. We'll try again then. |
Closes #1332
Also adds clarification in NXmx that these fields can be used as pedestal and gain correction fields, and elaborates on the possible rank options. These rank options were implied (in my opinion) in the original wording, but in NXmx I made them more explicit.
Tagging @biochem-fan for reference