diff_package_list in Rex::Pkg::Redhat, for multiple pks with same name #1620

akarelas · 2024-08-25T23:08:05Z

This PR is a proposal to fix #1149 by creating a diff_package_list method for the Rex::Pkg::Redhat module.

Please review and merge, or let me know how to improve it further.

Checklist

ferki · 2024-08-26T06:10:41Z

Thanks for contributing, @akarelas!

Skimming over the content, and running the full test suite locally indicates this is a great starting point. Let's iterate towards merging together about any remaining details. I'll leave first review notes separately soon.

ferki

Apparently the GitHub workflow for full testing failed to trigger due to merge conflicts. It looks like the new code is based on top of an earlier version of the default branch, which had new commits since that point.

Due to this, I unticked the related review checklist items for now. We'll track those based on the actual state of PR and test results.

Please rebase your changes on top of the current state of the default branch, and (force) push. With the conflicts solved, I expect to get the automated full test suite results from the GitHub workflows.

Let me know if you need help with the rebase and pushing again. While technically GitHub allows me to do this on your PR branch, that would overwrite your history there, so I prefer do it more as a last resort than first.

akarelas · 2024-08-26T06:47:15Z

I based it on the latest default branch now.

akarelas · 2024-08-26T06:49:56Z

Reworded by Test commit

ferki

Thanks for rebasing!

Now the full workflow has finished, we are getting closer, and see that solely the extra tests failing. This is highly promising, as the functionality seems to pass everywhere without breaking anything we already know.

The extra tests are about the code itself instead of what it does: code convention, code formatting, minimum required perl syntax, pod syntax, etc. The currently established incremental improvement idea regarding code conventions is "each change should make the code base cleaner, or at least keep it the same". In other words, while we don't need to fix everything that is already present, we need to ensure all new changes are clean.

Here are the single test failure reported about increased number of perlcritic violations compared to the default branch: https://github.com/RexOps/Rex/actions/runs/10554972139/job/29237791411#step:12:17

# ControlStructures::ProhibitPostfixControls: Got 313 violation(s).  Expected no more than 309.
# References::ProhibitDoubleSigils: Got 84 violation(s).  Expected no more than 81.
# TooMuchCode::ProhibitDuplicateLiteral: Got 3449 violation(s).  Expected no more than 3428.
# Too many Perl::Critic violations...
# Got a total of 461. Expected no more than 19433.

Let's fix those perlcritic feedback by iterating on the commits. It's fine to send additional commits on top which we can collapse into the current ones later, or just edit the commits locally, and (force) push again until everything is green.

Please see the perlcritic docs for each policy for their description (perlcritic --doc $policy_name), and run perlcritic to get detailed line info about what it complains about and where (perlcritic --single-policy $policy_name $filename). You may need to install the perlcritic policies locally (see dzil listdeps --develop.)

Let me know how much help you may need with tracking down and fixing these perlcritic issues. I'm happy to provide feedback accordingly, for example some ideas:

you iterate on this PR on your own until everything is green
I propose how to fix each, optionally with a diff you may apply
I fix it completely on my own later

Redhat can have multiple packages installed with the same name (eg kernels), a fact that Rex::Pkg::Base::diff_package_list is not prepared to deal with, resulting in calling the on_change hook when you run the update_system command, even if nothing changed in the packages installed (see also bug RexOps#1149). This commit fixes that.

akarelas · 2024-08-26T14:06:38Z

Finished with extra tests. Can you please recheck?

ferki

Thanks for the patience, while I went through everything in a way to enable meaningful feedback. I carefully read the code, and played with different situations, or alternative approaches.

Overall, on top of a review, it seems we are already through the red-green (fail-pass, test-code) steps of a standard TDD cycle, and it's time to refactor (improve based on what we learned) too.

The key code change seems to be treating the version and release metadata of a package together as a concatenated string to identify unique instances of the package. On Debian the criteria seemed to be more like name+arch, not sure arch plays a role on Red Hat yet. Do you have any experience or info about that?

The chosen implementation may perform better too, though I did not made any benchmarks yet. Mainly due to the Base and Redhat implementations do different things, which makes them harder to compare. Let's focus on correctness first.

Apart from that, anything else I noted down was related to the test approach. Since I felt showing a proposed change would be easier instead of only describing it, I decided to push a proposed commit to this branch.

What I also learned is the related on_change failure only seems to happen when the new version of the package is listed before the old version of the package with the same name.

Let me know what you think, and whether would you like to change anything else.

ps.: I'm not yet sure about how we may be able to test the actual on_change hook instead of only the underlying diff_package_list() implementation 🤔 I'd like to think a bit more about that part, while everything else gets addressed.

ferki · 2024-09-02T11:19:08Z

t/package.t

@@ -62,4 +65,75 @@ subtest 'local package installation' => sub {
 lives_ok { $pkg->update('test_package') }, 'update test package';
 };

+subtest 'redhat package list diffs' => sub {
+ ## no critic (ProhibitDuplicateLiteral)


I find the scope of this ## no critic hint too wide. Comparing to the above text example, the exemptions also cover quite some code, instead of solely the data used to drive the test cases.

I'd like to move this to wrap only the data, and use the revealed perlcritic complaints to see how we may improve the implementation. I expect we could use this to simplify the test code.

ferki · 2024-09-02T11:22:18Z

t/package.t

+
+ plan tests => 4;
+
+ my $rh_pkg = Rex::Pkg::Redhat->new;


Nitpick: the other instantiation of a Rex::Pkg class is outside the test cases, and this introduces a different convention.

Long run I'm fine with both, depending where the evolution of t/package.t takes us over time. Normally I would lean towards consistency, while it's also fine for now to not know exactly which approach will serve us best later (so we allow ourselves to decide later.)

It may not be necessary to discover the answer or fix it now, I just thought it's worth mentioning while reading the changes carefully.

ferki · 2024-09-02T11:46:25Z

t/package.t

+ my @plist1 = @{ dclone( \@orig ) };
+ my @plist2 = @{ dclone( \@orig ) };
+ my @expected = ();
+
+ my @mods = $rh_pkg->diff_package_list( \@plist1, \@plist2 );
+ cmp_deeply( \@mods, \@expected,
+ 'expected package modifications when nothing changed' );
+
+ @plist1 = @{ dclone( \@orig ) };
+ pop @plist1;
+ @plist2 = @{ dclone( \@orig ) };
+ @expected = ( { %{ $orig[2] }, action => 'installed' } );
+
+ @mods = $rh_pkg->diff_package_list( \@plist1, \@plist2 );
+ cmp_deeply( \@mods, \@expected,
+ 'expected package modifications when new kernel release is installed' );
+
+ @plist1 = @{ dclone( \@orig ) };
+ pop @plist1;
+ @plist2 = @{ dclone( \@orig ) };
+ shift @plist2;
+ @expected = ( { %{ $orig[2] }, action => 'updated' } );
+
+ @mods = $rh_pkg->diff_package_list( \@plist1, \@plist2 );
+ cmp_deeply( \@mods, \@expected,
+ 'expected package modifications when new kernel release is installed and an old one removed'
+ );
+
+ @plist1 = @{ dclone( \@orig ) };
+ @plist2 = @{ dclone( \@orig ) };
+ shift @plist2;
+ @expected = ( { %{ $orig[0] }, action => 'removed' } );
+
+ @mods = $rh_pkg->diff_package_list( \@plist1, \@plist2 );
+ cmp_deeply( \@mods, \@expected,
+ 'expected package modifications when only a kernel release is removed' );


I noticed two things about this section:

It feels quite a lot of code compared to the above purely data-driven approach. I find it a lot harder to follow what exactly gets tested now, compared to "given these inputs, this is the expected output".

I'd like to check how that approach would fit us here too. I expect we could remove the Storable dependency, and also let ## no critic only cover the data then.

These tests seem to only test a single modification at a time, meaning 1 no-change, 1 removal, 1 update, 1 installation. The above tests follows a more complete "multiple changes at once, one of each".

While certainly fine to test single changes too, I'd prefer the complete one to make the comparison between current and proposed implementations easier.

For example the Base and Debian implementations of diff_package_list() strictly orders changes by their type: first all installs, then all removes, then all updates. The Redhat one seems to follow "either installs or updates mixed, then removes" (presumably trading strict ordering for performance.)

I believe the Base and Debian behavior is merely a side-effect, instead of intentionally implemented design criteria. I also doubt mixed ordering would break any use cases – though it's already part of the public interface for a long time, so Hyrum's law may apply 🙃

I'm fine to allow for flexibility in the ordering, with the note that we may need to come back to this later if we find out users do rely on the order of change types somehow. Currently I think nobody should depend on this, and unlocking efficient diff algorithms may warrant mixed orders too. The improved Redhat implementation sounds like a good opportunity to experiment more and gather new info.

In that sense, using cmp_bag instead of cmp_deeply may also fit better here to signify "we don't promise strict order of changes on Rex Hat package diffs".

ferki · 2024-09-02T11:52:08Z

t/package.t

+ 'expected package modifications when only a kernel release is removed' );
+
+ ## use critic
+};


One more note overall: we seem to aim fixing on_change for pkg calls, and I found no tests for that. I agree it may be tricky to add, though I'd like to think of a good way to include that too (=a test to prove we can detect on_change bugs in the future, and then add code to prove we fixed the known cases for now.)

ferki · 2024-09-02T13:37:19Z

Oh, ouch, yes, the test workflow currently fails on older Strawberry Perl versions due to having outdated Mozilla::CA preinstalled while actions-setup-perl switched to "secure by default" cpanm calls. Let's see if that gets resolved upstream, or it's better to apply a workaround downstream here (and then rebase this.)

Let's track that separately, just keep in mind here, as the rest still applies.

akarelas · 2024-09-02T14:40:32Z

I think dclone is needed, because the old diff_package_list subroutine (which we want to fail) edits the pck entries, and without dclone the test would pass on the old code. But let's check again (when I find some time).

akarelas force-pushed the pr-akarelas branch from d2dfaec to 4446d43 Compare August 25, 2024 23:10

ferki requested changes Aug 26, 2024

View reviewed changes

akarelas force-pushed the pr-akarelas branch 2 times, most recently from 2404749 to f3dff6d Compare August 26, 2024 06:45

ferki requested changes Aug 26, 2024

View reviewed changes

akarelas added 2 commits August 26, 2024 13:52

Test diff_package_list on real-world Redhat data

d09963f

akarelas force-pushed the pr-akarelas branch from 5025521 to 862e308 Compare August 26, 2024 14:01

Refine Red Hat package modification tests

7f45387

ferki reviewed Sep 2, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

diff_package_list in Rex::Pkg::Redhat, for multiple pks with same name #1620

diff_package_list in Rex::Pkg::Redhat, for multiple pks with same name #1620

akarelas commented Aug 25, 2024 •

edited by ferki

Loading

ferki commented Aug 26, 2024

ferki left a comment

akarelas commented Aug 26, 2024

akarelas commented Aug 26, 2024

ferki left a comment

akarelas commented Aug 26, 2024

ferki left a comment

ferki Sep 2, 2024

ferki Sep 2, 2024

ferki Sep 2, 2024

ferki Sep 2, 2024

ferki commented Sep 2, 2024

akarelas commented Sep 2, 2024

diff_package_list in Rex::Pkg::Redhat, for multiple pks with same name #1620

Are you sure you want to change the base?

diff_package_list in Rex::Pkg::Redhat, for multiple pks with same name #1620

Conversation

akarelas commented Aug 25, 2024 • edited by ferki Loading

Checklist

ferki commented Aug 26, 2024

ferki left a comment

Choose a reason for hiding this comment

akarelas commented Aug 26, 2024

akarelas commented Aug 26, 2024

ferki left a comment

Choose a reason for hiding this comment

akarelas commented Aug 26, 2024

ferki left a comment

Choose a reason for hiding this comment

ferki Sep 2, 2024

Choose a reason for hiding this comment

ferki Sep 2, 2024

Choose a reason for hiding this comment

ferki Sep 2, 2024

Choose a reason for hiding this comment

ferki Sep 2, 2024

Choose a reason for hiding this comment

ferki commented Sep 2, 2024

akarelas commented Sep 2, 2024

akarelas commented Aug 25, 2024 •

edited by ferki

Loading