Update revision_summary
#540
to ease epipredict transition.
* Produce error rather than default selection when user provides a tidyselection in `...` but it selects zero columns.
* Change time_within_x_latest to take `values` as a vector
* Use `.data` instead of `pick` etc. in some places
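A minimal sketch of the first bullet's behavior, assuming a hypothetical helper `select_value_cols()` and a hypothetical `key_cols` argument (neither is taken from the package): an empty `...` falls back to a default selection, while a tidyselection that matches nothing raises an error instead of silently falling back.

```r
library(tidyselect)
library(rlang)

# Hypothetical helper for illustration only; not the package's implementation.
select_value_cols <- function(data, ..., key_cols = "geo_value") {
  if (...length() == 0L) {
    # No tidyselection supplied: use a default (here, all non-key columns).
    return(setdiff(names(data), key_cols))
  }
  # Documented tidyselect pattern for evaluating a selection passed via `...`.
  selection <- expr(c(...))
  pos <- eval_select(selection, data = data)
  if (length(pos) == 0L) {
    # A selection was supplied but matched nothing: error rather than default.
    abort("The tidyselection in `...` selected zero columns.")
  }
  names(pos)
}

df <- data.frame(geo_value = "ca", cases = 1, deaths = 0)
default_cols <- select_value_cols(df)
picked_cols <- select_value_cols(df, starts_with("ca"))
empty_result <- tryCatch(
  select_value_cols(df, starts_with("zzz")),
  error = function(e) e
)
```

Here `default_cols` holds the non-key columns, `picked_cols` holds the single matching column, and `empty_result` holds the captured error condition.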
So it is not misinterpreted as "the amount of time that it has been near the latest".
and avoid unnecessary `abs()`
* This may fix some behaviors and emit more sensible error messages on yearmonths given yearmonth-incompatible settings.
* This should express time differences for weekly data in terms of weeks, and may emit errors given weekly-"incompatible" settings.
* This appears to be computationally faster (vs. `as.integer(version) - as.integer(time_value)`).
It's probably best to immediately ungroup after performing grouped operations in our documentation, as leaving things grouped accidentally is a source of errors.
Sometime we should consider an overhaul to use `by =` and `.by =` where appropriate (sorting effects not needed) and available (not all operations support this syntax yet).
There were already 0s in the example data, so "highlight" with words the effects of completion + note one potential surprise in other applications.
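A small sketch of the ungrouping advice above, using made-up data (the column names and values are assumptions for illustration). It contrasts the `group_by()` then `ungroup()` pattern with the `.by =` argument, which requires dplyr 1.1.0 or later and never leaves the result grouped.

```r
library(dplyr)

# Made-up example data; not from the package's documentation.
df <- tibble(
  geo_value = c("ca", "ca", "ny", "ny"),
  cases = c(1, 0, 3, 4)
)

# Grouped operation followed immediately by ungroup().
grouped_then_ungrouped <- df %>%
  group_by(geo_value) %>%
  mutate(cases_total = sum(cases)) %>%
  ungroup()

# Per-operation grouping with `.by`; the output is never grouped.
with_by <- df %>%
  mutate(cases_total = sum(cases), .by = geo_value)
```

Forgetting the `ungroup()` in the first form would leave `geo_value` grouping attached to the result, which is the kind of accidental state the commit message warns about.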
- Note `...` optional in args of slide comp fn.
- Push toward computations returning tibbles rather than vanilla data.frames.
- Highlight `na.rm = TRUE`'s operation (not the only type of 7dsum/7dav), mention we also show sum.
- Immediately ungroup output + save a line using new autogroup-ungroup behavior of epi_slide_opt&co.
So that naming, docs, and implementation all match.
yes, I'm no longer in 100% forecast pipeline mode. I'll put this as high priority
I added a commit with some minor logic refactors and comments for things I found unclear (some of which came from stuff I wrote long enough ago). `lag` is definitely a better name for a bunch of these columns than `time`.
I think committing to using `week`s or `month`s for `version` comes with some assumptions about versioning behavior that we may not want to be making. What happens with weekly data released at a daily cadence, for example?
I do think we could probably put these utilities to use elsewhere, but I'm not sure this is the PR for them.
Definitely. But that's what we've already assumed in the archive construction. I was hoping to just make multiple time types work given our current assumptions and handle relaxing this assumption later. Maybe there are two options here:
Preferences / other options?
oh right, I forgot that typically when I'm working with weekly data I'm actually working with "daily" data with 7 day gaps. Given that, I don't see strong reasons to not include it.
Fix length -> vec_size. Combining (logical/generic) NA with Dates is apparently slow. Slice with NA_integer_ index instead. Fix docs: dplyr 1.1.0 should also have a generalized dplyr::lag. Removing some dplyr::lag features for speed might be another motivation, but we seem to be faster for some ptypes x sizes and slower for others. Also, don't export this function; we don't need to.
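A rough sketch of the "slice with an `NA_integer_` index" idea from the commit message above, assuming a hypothetical function name `lag_by_slice()` (this is not the PR's `vec_position_lag` implementation). It relies on `vctrs::vec_slice()` returning a missing value of the input's type for each `NA` index, so no generic `NA` has to be combined with the `Date` vector.

```r
library(vctrs)

# Hypothetical illustration; not the function from this PR.
lag_by_slice <- function(x, n = 1L) {
  size <- vec_size(x)
  n <- min(n, size)
  # Leading NA indices become missing values of x's own type;
  # the remaining indices shift the original elements back by n.
  idx <- c(rep(NA_integer_, n), seq_len(size - n))
  vec_slice(x, idx)
}

dates <- as.Date("2024-01-01") + 0:3
lagged <- lag_by_slice(dates)
```

The result keeps the `Date` class, with a missing first element followed by the first three original dates.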
Various iterations of vec_position_lag seem to be trading off performance and whether they beat dplyr::lag for different classes. dplyr::lag appears to be better-performing than many/all variants tested so far for lagging very long Date and character vectors, like we would do during compactification. We might try speeding up compactification by iterating on some of these variants, something inspired by `check_ukey_unique()`, etc., but that's not the present goal, so just use `dplyr::lag()` for now.
e.g., `distribution`s
Plus add some pluralization and capitalization features.
looks good to me! A few docs suggestions you can take or leave
Co-authored-by: David Weber <david.weber2@pm.me>
…ary-age_agg-updates' into lcb/update-key_colnames.epi_archive
81582cb
into
lcb/key_colnames-revision_summary-age_agg-updates
Checklist
Please:
PR).
brookslogan, nmdefries.
`DESCRIPTION`. Always increment the patch version number (the third number), unless you are making a
release PR from dev to main, in which case increment the minor version
number (the second number).
(backwards-incompatible changes to the documented interface) are noted.
Collect the changes under the next release number (e.g. if you are on
1.7.2, then write your changes under the 1.8 heading).
process.
Change explanations for reviewer
`revision_summary` to use new `key_colnames.epi_archive`.
`revision_summary`.
`revision_summary`.
Other work
`revision_summary()` adjustments
Magic GitHub syntax to mark associated Issue(s) as resolved when this is merged into the default branch
`epi_archive`'s compactify doesn't support distributions #541
`key_colnames.epi_archive`: double-check intention, fix implementation #539
`revision_summary` if there are multiple valid ones #571