Fix age group rate aggregation example #591

brookslogan · 2025-01-10T19:35:12Z

Checklist

This is part 1 of a 3-part PR; CHECK and lint failures tie it to the other parts. A recombined PR will be made against dev, and some of the process items below will be handled there, since I don't want to do more git surgery. In hindsight, I probably should have just assigned a couple of reviewers to the original PR and pointed each of them to different files.

Please:

[-] Make sure this PR is against "dev", not "main" (unless this is a release
PR).
[-] Request a review from one of the current main reviewers:
brookslogan, nmdefries.
[-] Make sure to bump the version number in DESCRIPTION. Always increment
the patch version number (the third number), unless you are making a
release PR from dev to main, in which case increment the minor version
number (the second number).
[-] Describe changes made in NEWS.md, making sure breaking changes
(backwards-incompatible changes to the documented interface) are noted.
Collect the changes under the next release number (e.g. if you are on
1.7.2, then write your changes under the 1.8 heading).
See DEVELOPMENT.md for more information on the development
process.

Change explanations for reviewer

See epi_df calls key_colnames incorrectly and aggregates rates incorrectly #587

Magic GitHub syntax to mark associated Issue(s) as resolved when this is merged into the default branch

Resolves epi_df calls key_colnames incorrectly and aggregates rates incorrectly #587

brookslogan · 2025-01-10T19:50:59Z

CHECK or the linters were yelling at me about something in these lines in another PR, but there were also issues with how we did rate aggregation. @nmdefries or @dshemetov, are you able to ~~relatively quickly~~ review this ~~so it can unblock the other PRs depending on it~~? ~~If not, I guess I could attempt some more git surgery, but I think I may have already done too much trying to isolate this + other parts of the original PR.~~ [The other PRs' reviews will probably be slower, so this probably isn't the blocker.]

brookslogan · 2025-01-10T19:55:25Z

After merging: open a PR here to hold the recombined PR with the intended changes and the various CHECK + lint + adjacent fixes that were split off into dependent PRs. ~~[just making this one merge to dev; it doesn't need to be part of the recombined PR]~~

nmdefries

I switched to summing population-weighted rates in b569131 because I think it's clearer. Please feel free to revert if you don't like it.

I also separated the code into more chunks to better indicate the spot where we actually use sum_groups_epi_df and where we do the check against the reported rate_overall. We should keep the more-separated format whichever calculation approach we use.

nmdefries · 2025-01-11T00:26:43Z

vignettes/epi_df.Rmd

+  group_by(geo_value, time_value) %>%
+  mutate(count = rate * pop / 100e3) %>%
+  ungroup() %>%
+  sum_groups_epi_df(c("count", "pop"), group_cols = group_cols) %>%


suggestion: Since the point of this section in the vignette is to give an example use of sum_groups_epi_df, I think this line should be at the beginning of a new code chunk so it's easier for the reader to see.

Second, this approach seems pretty roundabout. Why not weight each age group's rate by its fraction of the total population and then sum?
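For reference, here is a minimal base-R sketch of why the two approaches should give the same overall rate. The data frame `toy` and its numbers are made up for illustration (they are not the vignette's data), and the sketch assumes rates are per 100,000 population, as in the `100e3` factor in the diff above.

```r
# Hypothetical toy data: two age groups for a single geo_value and time_value.
# Rates are per 100,000 population; all numbers are invented for illustration.
toy <- data.frame(
  age_group = c("0-17", "18+"),
  rate = c(2, 5),
  pop = c(25000, 75000)
)

# Approach 1: convert rates to counts, sum counts and populations,
# then convert the totals back to a rate.
count <- toy$rate * toy$pop / 100e3
rate_via_counts <- sum(count) / sum(toy$pop) * 100e3

# Approach 2: weight each rate by its share of the total population, then sum.
rate_via_weights <- sum(toy$rate * toy$pop / sum(toy$pop))

# The two results agree up to floating-point error.
all.equal(rate_via_counts, rate_via_weights)
```

Both expressions reduce to `sum(rate * pop) / sum(pop)`, so the choice between them is about readability rather than the result.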

nmdefries · 2025-01-11T00:27:12Z

vignettes/epi_df.Rmd

+  # compare to published overall rates:
+  inner_join(
+    flu_data_api %>%
+      select(geo_value = location, time_value = epiweek, rate_overall),
+    by = c("geo_value", "time_value"),
+    relationship = "one-to-one", unmatched = "error"
+  )
+# What's our maximum error vs. the official overall estimates?
+max(abs(rate_overall_recalc_edf$rate_overall - rate_overall_recalc_edf$rate_overall_recalc))


suggestion: also separate this out into another chunk.

Fix age group rate aggregation example

c03221e

brookslogan changed the base branch from dev to lcb/key_colnames-revision_summary-age_agg-updates January 10, 2025 19:36

brookslogan requested review from nmdefries and dshemetov January 10, 2025 19:37

brookslogan changed the base branch from lcb/key_colnames-revision_summary-age_agg-updates to dev January 10, 2025 19:59

brookslogan mentioned this pull request Jan 10, 2025

Rework key_colnames #592

Open

7 tasks

brookslogan changed the base branch from dev to lcb/key_colnames-revision_summary-age_agg-updates January 10, 2025 20:10

switch to summing pop-weighted rates

b569131

nmdefries approved these changes Jan 11, 2025

style: styler (GHA)

ee04efa
