This repository has been archived by the owner on Oct 23, 2024. It is now read-only.

Variable grouping of results #95

Merged
merged 14 commits into from
Apr 24, 2024

jacobvjk
Member

@jacobvjk jacobvjk commented Apr 12, 2024

depends on RMI-PACTA/pacta.multi.loanbook.analysis#34
depends on RMI-PACTA/pacta.multi.loanbook.plot#30

  • updates calls to pacta.multi.loanbook.* functions to match the new variable interface
  • gains a parameter BY_GROUP that can be used to generate aggregate metrics by any user-defined variable, provided the variable exists in the matched_prioritized data set
  • in the given example, the aggregate alignment metric is calculated for:
    • an aggregate loan book across the six fake input loan books, giving a meta loan book view
    • grouped by loan book, as indicated by "group_id"
    • grouped by another arbitrary user-provided variable above the loan book level, as indicated by "foo"
    • grouped by a combination of "group_id" and "foo", giving a sub loan book view
  • at this point, aggregation across multiple combined dimensions is only supported for the calculation of results. Plots are currently limited to one level of aggregation, because combining variables in a plot is not straightforward in all cases
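
The four grouping levels listed above can be illustrated with a minimal, language-agnostic sketch. It is not the pacta.multi.loanbook implementation: the rows, the `exposure` and `alignment` columns, the function name `aggregate_by_group`, and the exposure-weighted mean used as the metric are all assumptions made for illustration. Only the column names "group_id" and "foo" and the requirement that grouping variables exist in the data come from this PR.

```python
from collections import defaultdict

# Hypothetical stand-in for the matched_prioritized data set: one row per loan.
MATCHED_PRIORITIZED = [
    {"group_id": "lbk_1", "foo": "A", "exposure": 100.0, "alignment": 0.10},
    {"group_id": "lbk_1", "foo": "B", "exposure": 300.0, "alignment": -0.20},
    {"group_id": "lbk_2", "foo": "A", "exposure": 200.0, "alignment": 0.40},
    {"group_id": "lbk_2", "foo": "B", "exposure": 400.0, "alignment": 0.00},
]


def aggregate_by_group(rows, by_group=None):
    """Exposure-weighted mean alignment per group (illustrative metric only).

    by_group: None for a single meta loan book, or a list of column names.
    """
    by_group = list(by_group or [])
    # Grouping is only possible on variables that exist in the data.
    missing = [col for col in by_group if any(col not in row for row in rows)]
    if missing:
        raise ValueError(f"by_group variables not found in data: {missing}")

    weighted_sum = defaultdict(float)
    total_exposure = defaultdict(float)
    for row in rows:
        key = tuple(row[col] for col in by_group)  # () means "all loans"
        weighted_sum[key] += row["exposure"] * row["alignment"]
        total_exposure[key] += row["exposure"]
    return {key: weighted_sum[key] / total_exposure[key] for key in weighted_sum}


meta = aggregate_by_group(MATCHED_PRIORITIZED)                       # meta loan book
per_loan_book = aggregate_by_group(MATCHED_PRIORITIZED, ["group_id"])
per_foo = aggregate_by_group(MATCHED_PRIORITIZED, ["foo"])
sub_loan_book = aggregate_by_group(MATCHED_PRIORITIZED, ["group_id", "foo"])
```

Passing no grouping variable collapses all loans into one group, a single variable gives one result per value, and a list of variables gives one result per combination, which corresponds to the sub loan book view.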

NOTE:

  • calculation of the aggregate metric for corporate benchmarks is temporarily suspended: it turns out to be too resource-intensive to run on standard machines, and we have not seen heavy use of the benchmark in the aggregate metric so far

EXAMPLE OUTPUTS with new grouping functionality

Sankey plot at the aggregate level:
[image: plot_sankey_sector]

Sankey plot calculated based on the foo split:
[image: plot_sankey_sector_foo]

Sankey plot calculated based on the group_id split:
[image: plot_sankey_sector_group_id]

Scatter plot alignment by exposure based on foo split:
[image: plot_scatter_alignment_exposure_foo]

Scatter plot alignment by exposure based on group_id split:
[image: plot_scatter_alignment_exposure_group_id]

@jacobvjk jacobvjk requested a review from jdhoffa April 17, 2024 11:19
@jacobvjk jacobvjk changed the title from "Variable groups" to "Variable grouping of results" Apr 17, 2024
@jacobvjk jacobvjk marked this pull request as ready for review April 17, 2024 11:20
Member

@jdhoffa jdhoffa left a comment

Consider renaming by_groups to by_group; otherwise LGTM (I think; it's a pretty mammoth PR, so I may have missed something).

plot_aggregate_loanbooks.R: two review threads (outdated, resolved)
@MonikaFu
Collaborator

I ran it with the test data provided, using the two versions of the supporting packages mentioned above. It seems to work correctly in general. For some plots the axis labels overlap with the title, but since this is only a demo I am not sure it is worth spending time on. I guess you would need to play around with the figure size when saving. Also, the sankey plot per company is rather unreadable, which is to be expected with a large number of companies.

@jacobvjk
Member Author

I agree, especially regarding the sankey plot with companies. In the end we need to decide what we want to maintain there. At the same time, we could just as well show examples using other variables with fewer categories. In any case, this seems like a topic for the discussion of the standardized P4S offering.

@jacobvjk jacobvjk merged commit b652a3e into main Apr 24, 2024
@jacobvjk jacobvjk deleted the variable-groups branch April 24, 2024 16:09