Simplify feature bounds using np.nan instead of masking #283

jdtsmith · 2024-05-09T01:43:31Z

Partially masked arrays are not really well supported, and required elaborate workarounds for pretty printing. Here we switch to using a structured array type (val, min, max fields), and represent "fixed" bounds with a pair of np.nan values for min and max, instead of masking them. I.e. nan is interpreted as the same as negative or positive infinity, unless both bounds are nan, which indicates a fixed parameter.

"Fully open" (unconstrained) bounds are expressed just as before, with (-np.inf, np.inf). But in general: don't do those ;).

Partially masked arrays are not well supported, and required elaborate workaround for pretty printing. Here we use a structured array type, and represent "fixed" bounds with np.nan instead of masking. I.e. non is interpreted as the same as negative or positive infinity, unless both bound are nan, which indicates a fixed parameter.

A normal masked column using structured numpy arrays and nan for both bounds when fixed now suffices.

drvdputt

I fully support the main concept of this pull request. The ('val' 'min' 'max') datatype feels nice and explicit. Much better than the cryptic 3-tuples from before (where you always had to index on 0, 1, or 2 to addres the needed value)

There are some mistakes, but I fixed them myself to see how the code would continue.

I got it working up to where the Features table is loaded. But the new data type and way of addressing the values, makes it so the code in base.py no longer works. I /could/ try to fix the base.py code. But would very much like to avoid this, as we will remove it soon...

pahfit/features/features.py

jdtsmith · 2024-05-09T18:59:12Z

While working on the representation, I also stamped out the dtype printing on __repr__ (which is oddly on by default), and allow now a meta['pahfit_format'] on the table and/or individual columns, to affect the printed representation. E.g.

f.meta['pahfit_format']='.2f'
f['wavelength'].meta['pahfit_format']='.3f'

yielding:

...
    H2_S(7)     H2_lines           line              --             --  5.511 (fixed) 0.00 (0.00, ∞)              --              --       --
    H2_S(6)     H2_lines           line              --             --  6.109 (fixed) 0.00 (0.00, ∞)              --              --       --
    H2_S(5)     H2_lines           line              --             --  6.909 (fixed) 0.00 (0.00, ∞)              --              --       --
    H2_S(4)     H2_lines           line              --             --  8.026 (fixed) 0.00 (0.00, ∞)              --              --       --
...

To be documented somewhere.

drvdputt · 2024-05-09T20:41:03Z

It looks like all of this is working as intended. At least until the actual until the Features -> astropy model part.

We cannot really make all the tests succeed for this, without revising all the pieces of code that directly address the Features table (I tried to make things work for a bit, but there are more of these than I expected). Especially, I cannot justify revising all of base.py just to make the tests work, since I aim to delete it soon, when replaced by APFitter.

If we merge this, then we will have to continue in this broken state until I have integrated and adapted APFitter.

I suggest we merge the units pull request first (the extended version I just suggested, #284 ). In that case, the tests are still working, at least on my machine. And the plot rework #281 too.

After that, we can do the big break: merge this, and then include my internal API rework (which will be easier to adapt to the new Features format).

jdtsmith · 2024-05-09T20:52:58Z

This sounds good. What type of updates to APFitter do you mean? Features is still accessible in the same way, and in fact p[0] and p['val'] both work.

drvdputt · 2024-05-09T21:01:54Z

Things that do not work anymore

access like this: f['fwhm'][:, 0] now has to be f['fwhm']['val']. So I expect there to be more cases where not all the ways of accessing still work.
access to the .mask attribute does not seem to work in some cases

The new columns are a structured array, instead of a regular 2D array, so there are some restrictions.

jdtsmith · 2024-05-09T21:14:51Z

OK. Overall I think the structured array is much cleaner. I intend to merge your extend units PR #284 once finalized, then #281, then this #283.

jdtsmith · 2024-05-09T21:15:48Z

Also do not worry about soon-to-be-forgotten base.py.

jdtsmith mentioned this pull request May 9, 2024

Obscure bug in Features table formatter #273

Closed

jdtsmith changed the base branch from master to dev May 9, 2024 13:08

jdtsmith added 4 commits May 9, 2024 09:14

Add missing colon

3017dda

Remove BoundedMaskedColumn

ec97932

A normal masked column using structured numpy arrays and nan for both bounds when fixed now suffices.

Fix bounded_is_missing arg usage

afc0bc9

features: doc improvements

ef094af

This was referenced May 9, 2024

features.util.bounded_is_fixed() does not work for dust_feature-only science packs #274

Closed

Workarounds for issues with Features column definition and formatting #282

Closed

drvdputt suggested changes May 9, 2024

View reviewed changes

pahfit/features/features.py Outdated Show resolved Hide resolved

pahfit/features/features.py Outdated Show resolved Hide resolved

pahfit/features/features.py Show resolved Hide resolved

pahfit/features/features.py Outdated Show resolved Hide resolved

jdtsmith added 8 commits May 9, 2024 12:56

Fix dtype var mention

4fefb0a

_construct_table: fix dtypes formulation

3c9fb7a

Make linter happier

2729f89

_construct_table: hard-code str for column types for no-bounds cols

ce49a70

Fix dtype and format for structured array Features cells

1cd8404

Suppress mention of [val, min, max] in column headers

cf180ee

Features: Strip dtype from default __repr__

be7cca1

Allow meta['pahfit_format'] on table and columns

19e9c25

jdtsmith mentioned this pull request May 9, 2024

Standardize units (extended) #284

Merged

jdtsmith merged commit 58038bc into PAHFIT:dev May 13, 2024
3 of 15 checks passed

jdtsmith mentioned this pull request May 13, 2024

Crash in astropy.table.pprint when using lines-only science pack #266

Closed

This was referenced May 13, 2024

New internal API: agnostic functional model setup / results retrieval #257

Closed

Piecewise/partial integration of drvdputt/experimental #280

Closed

Fitter API and make test cases work #289

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplify feature bounds using np.nan instead of masking #283

Simplify feature bounds using np.nan instead of masking #283

jdtsmith commented May 9, 2024 •

edited

Loading

drvdputt left a comment

jdtsmith commented May 9, 2024

drvdputt commented May 9, 2024 •

edited

Loading

jdtsmith commented May 9, 2024

drvdputt commented May 9, 2024 •

edited

Loading

jdtsmith commented May 9, 2024

jdtsmith commented May 9, 2024

Simplify feature bounds using np.nan instead of masking #283

Simplify feature bounds using np.nan instead of masking #283

Conversation

jdtsmith commented May 9, 2024 • edited Loading

drvdputt left a comment

Choose a reason for hiding this comment

jdtsmith commented May 9, 2024

drvdputt commented May 9, 2024 • edited Loading

jdtsmith commented May 9, 2024

drvdputt commented May 9, 2024 • edited Loading

jdtsmith commented May 9, 2024

jdtsmith commented May 9, 2024

jdtsmith commented May 9, 2024 •

edited

Loading

drvdputt commented May 9, 2024 •

edited

Loading

drvdputt commented May 9, 2024 •

edited

Loading