Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workarounds for issues with Features column definition and formatting #282

Closed
wants to merge 2 commits into from

Conversation

drvdputt
Copy link
Contributor

@drvdputt drvdputt commented May 7, 2024

There were a series of strange issues with the Features class in certain edge cases

To solve #274, both the Column and MaskedColumn definitions of Features were set to BoundedMaskedColumn.
The formatting hacks for Features were revised to deal with the above change and with #273 and #266.

The new formatting method, takes the default output of the formatter (using if/else to cover all possible strings created for a 3-tuple), and modifies it through string replacements.

This pull request is independent from #281, so one can go in after the other (at most a simple rebase required).

Dries Van De Putte added 2 commits May 7, 2024 17:11
Avoids the hack where the shape() property is temporarily redefined, a
source of a few crashed I encountered. The new method works by parsing
and replacing the standard output of the pretty printer.
The differences between BoundedMaskedColumn and Column were causing
hard-to-debug crashes under very specific circumstances.
@jdtsmith
Copy link
Contributor

jdtsmith commented May 8, 2024

A very relevant conversation about the pretty-printing is here. You can see me dealing with the fact that the default printer omits the middle of a 3-tuple (like our bounded param).

Update: I'm pretty queasy about adding string hacks on top of hacks like this. Using structured arrays for the data without anything else will work, but right now any masked value makes Table mask the entire row's value (i.e. all "fixed" values show up as --). I think that's why I ended up with the current hack (temporarily lying about the shape of the underlying data). Let's see if we can resolve the issue you saw more simply. Update: simple solution in #283.

@jdtsmith
Copy link
Contributor

jdtsmith commented May 9, 2024

I think this is entirely superseded by the much simpler #283; please have a look at that. Here's my proposal: let's merge

  1. Standardize units #259
  2. Rewrite of Model.plot() #281 (once review is addressed)
  3. Simplify feature bounds using np.nan instead of masking #283 (pending your review)

to dev, test that combo. If all's well on dev proceed with the next steps (targeting dev until we're happy).

@drvdputt
Copy link
Contributor Author

drvdputt commented May 9, 2024

Ok, I will review the above three pull requests. I'll post another comment here once I'm done with this, and then we'll pull those into dev. I will then test test some of my edge cases in dev, to see if we still need any extra hacks on top of #283.

@jdtsmith
Copy link
Contributor

I think this has been superseded by #283 and can be closed, yes?

@drvdputt
Copy link
Contributor Author

Yes, we have a different approach in the other pull request now. Closing.

@drvdputt drvdputt closed this Jun 25, 2024
@drvdputt drvdputt deleted the features_bugs branch July 3, 2024 18:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants