Hey everybody,
Niels from NannyML engineering here to tell you about our 0.6.2 release.
Installing / upgrading
You can get this latest version by using pip:
pip install -U nannyml
Or conda:
conda install -c conda-forge nannyml
What's new?
In this release we've focused on lowering the threshold for trying out NannyML even more: the timestamp_column_name argument is now optional. A quick refresher, maybe?
The timestamp column is a column in your dataset that represents the time your model was invoked, the time at which your model made a prediction for a given set of features. Having this timestamp allows NannyML to calculate, visualize and track your model performance over time. In a production setting you'll most likely have access to this information, since you're gathering the inputs and outputs of a model deployed somewhere.
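To make the role of the timestamp concrete, here is a minimal sketch in plain Python of how a per-prediction timestamp lets you group results by period and follow a metric over time. It is only an illustration and not NannyML's implementation: the `records` data and the `accuracy_per_month` helper are made up for this example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical prediction log: (time the model was invoked, was the prediction correct?)
records = [
    (datetime(2022, 1, 5), True),
    (datetime(2022, 1, 20), False),
    (datetime(2022, 2, 3), True),
    (datetime(2022, 2, 17), True),
]


def accuracy_per_month(rows):
    """Group rows by calendar month and return {(year, month): accuracy}."""
    buckets = defaultdict(list)
    for timestamp, correct in rows:
        buckets[(timestamp.year, timestamp.month)].append(correct)
    # Sort by period so the result follows calendar order
    return {
        period: sum(values) / len(values)
        for period, values in sorted(buckets.items())
    }


monthly = accuracy_per_month(records)
for period, accuracy in monthly.items():
    print(period, accuracy)
```

Without the timestamp there is no way to form these calendar buckets, which is why period-based chunking depends on it.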
However, not everyone is using NannyML within a production context. You might just want to evaluate it. From this version on you'll no longer have to craft an artificial timestamp column just to be able to use NannyML.
There are some side effects of doing this, of course.
When no timestamp is given, results are plotted differently: instead of a time-based X-axis, the plots use the index of each chunk as the X-axis. By splitting your data into chunks, you impose an ordering onto your data. Metrics will be plotted in that order.
When no timestamp is given, you can no longer use the PeriodBasedChunker to chunk your data according to a particular date offset.
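The idea of using the chunk index as the X-axis can be sketched in plain Python as follows. This is a conceptual illustration under our own assumptions, not NannyML's actual chunking code: `chunk_by_size` and `correct_flags` are hypothetical names, and how NannyML treats an incomplete last chunk may differ.

```python
def chunk_by_size(rows, chunk_size):
    """Split rows into consecutive chunks of at most chunk_size rows, keeping their order."""
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]


# Hypothetical data without any timestamps: was each prediction correct?
correct_flags = [True, True, False, True, False, False, True]

chunks = chunk_by_size(correct_flags, chunk_size=3)

# Each point pairs the chunk index (the X value) with the chunk's accuracy (the Y value)
points = [(index, sum(chunk) / len(chunk)) for index, chunk in enumerate(chunks)]

for index, accuracy in points:
    print(f"chunk {index}: accuracy={accuracy:.2f}")
```

The ordering on the X-axis comes purely from the row order of the data, so it is worth making sure the rows are in a meaningful order before chunking.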
So, what does that mean for you?
Any code you currently have will still work. 🥳
You can drop the timestamp_column_name argument from any calculator or estimator initializer.
import nannyml as nml

chunk_size = 5000  # example value, not from the release notes; pick what suits your data

# This will still work, as before
estimator_with_timestamp = nml.CBPE(
    timestamp_column_name='timestamp',
    y_pred_proba='y_pred_proba',
    y_pred='y_pred',
    y_true='work_home_actual',
    metrics=['roc_auc'],
    chunk_size=chunk_size,
    problem_type='classification_binary',
)

# But this is also valid now!
# Initialize the estimator and specify the required data columns, without a timestamp
estimator = nml.CBPE(
    y_pred_proba='y_pred_proba',
    y_pred='y_pred',
    y_true='work_home_actual',
    metrics=['roc_auc'],
    chunk_size=chunk_size,
    problem_type='classification_binary',
)
We documented this behavior a bit more in our data requirements docs.
What's changed?
We've added the missing s3fs dependency that caused our CLI to fail when trying to work with S3 buckets for reading/writing data.
We've fixed some outdated plotting kind constants being used in the Runner class used by the CLI, causing some plot renders to fail.
Some documentation fixes.
We've added a load of tests, mainly concerning plotting and the Runner class.
What's up next?
We're now officially in 🌴 downtime 🍸, meaning we get to work on some of our passion projects ❤️
The results of those will be announced soon.
I wish you all a fully recharging weekend 🔋
Niels