Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

diff_package_list in Rex::Pkg::Redhat, for multiple pks with same name #1620

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

akarelas
Copy link

@akarelas akarelas commented Aug 25, 2024

This PR is a proposal to fix #1149 by creating a diff_package_list method for the Rex::Pkg::Redhat module.

Please review and merge, or let me know how to improve it further.

Checklist

  • based on top of latest source code
  • changelog entry included
  • automated tests pass
  • git history is clean
  • git commit messages are well-written

@ferki
Copy link
Member

ferki commented Aug 26, 2024

Thanks for contributing, @akarelas!

Skimming over the content, and running the full test suite locally indicates this is a great starting point. Let's iterate towards merging together about any remaining details. I'll leave first review notes separately soon.

Copy link
Member

@ferki ferki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently the GitHub workflow for full testing failed to trigger due to merge conflicts. It looks like the new code is based on top of an earlier version of the default branch, which had new commits since that point.

Due to this, I unticked the related review checklist items for now. We'll track those based on the actual state of PR and test results.

Please rebase your changes on top of the current state of the default branch, and (force) push. With the conflicts solved, I expect to get the automated full test suite results from the GitHub workflows.

Let me know if you need help with the rebase and pushing again. While technically GitHub allows me to do this on your PR branch, that would overwrite your history there, so I prefer do it more as a last resort than first.

@akarelas akarelas force-pushed the pr-akarelas branch 2 times, most recently from 2404749 to f3dff6d Compare August 26, 2024 06:45
@akarelas
Copy link
Author

I based it on the latest default branch now.

@akarelas
Copy link
Author

Reworded by Test commit

Copy link
Member

@ferki ferki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for rebasing!

Now the full workflow has finished, we are getting closer, and see that solely the extra tests failing. This is highly promising, as the functionality seems to pass everywhere without breaking anything we already know.

The extra tests are about the code itself instead of what it does: code convention, code formatting, minimum required perl syntax, pod syntax, etc. The currently established incremental improvement idea regarding code conventions is "each change should make the code base cleaner, or at least keep it the same". In other words, while we don't need to fix everything that is already present, we need to ensure all new changes are clean.

Here are the single test failure reported about increased number of perlcritic violations compared to the default branch: https://github.com/RexOps/Rex/actions/runs/10554972139/job/29237791411#step:12:17

# ControlStructures::ProhibitPostfixControls: Got 313 violation(s).  Expected no more than 309.
# References::ProhibitDoubleSigils: Got 84 violation(s).  Expected no more than 81.
# TooMuchCode::ProhibitDuplicateLiteral: Got 3449 violation(s).  Expected no more than 3428.
# Too many Perl::Critic violations...
# Got a total of 461. Expected no more than 19433.

Let's fix those perlcritic feedback by iterating on the commits. It's fine to send additional commits on top which we can collapse into the current ones later, or just edit the commits locally, and (force) push again until everything is green.

Please see the perlcritic docs for each policy for their description (perlcritic --doc $policy_name), and run perlcritic to get detailed line info about what it complains about and where (perlcritic --single-policy $policy_name $filename). You may need to install the perlcritic policies locally (see dzil listdeps --develop.)

Let me know how much help you may need with tracking down and fixing these perlcritic issues. I'm happy to provide feedback accordingly, for example some ideas:

  • you iterate on this PR on your own until everything is green
  • I propose how to fix each, optionally with a diff you may apply
  • I fix it completely on my own later

Redhat can have multiple packages installed with the same name (eg
kernels), a fact that Rex::Pkg::Base::diff_package_list is not
prepared to deal with, resulting in calling the on_change hook when
you run the update_system command, even if nothing changed in the
packages installed (see also bug RexOps#1149). This commit fixes that.
@akarelas
Copy link
Author

Finished with extra tests. Can you please recheck?

Copy link
Member

@ferki ferki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patience, while I went through everything in a way to enable meaningful feedback. I carefully read the code, and played with different situations, or alternative approaches.

Overall, on top of a review, it seems we are already through the red-green (fail-pass, test-code) steps of a standard TDD cycle, and it's time to refactor (improve based on what we learned) too.

The key code change seems to be treating the version and release metadata of a package together as a concatenated string to identify unique instances of the package. On Debian the criteria seemed to be more like name+arch, not sure arch plays a role on Red Hat yet. Do you have any experience or info about that?

The chosen implementation may perform better too, though I did not made any benchmarks yet. Mainly due to the Base and Redhat implementations do different things, which makes them harder to compare. Let's focus on correctness first.

Apart from that, anything else I noted down was related to the test approach. Since I felt showing a proposed change would be easier instead of only describing it, I decided to push a proposed commit to this branch.

What I also learned is the related on_change failure only seems to happen when the new version of the package is listed before the old version of the package with the same name.

Let me know what you think, and whether would you like to change anything else.

ps.: I'm not yet sure about how we may be able to test the actual on_change hook instead of only the underlying diff_package_list() implementation 🤔 I'd like to think a bit more about that part, while everything else gets addressed.

t/package.t Outdated
@@ -62,4 +65,75 @@ subtest 'local package installation' => sub {
lives_ok { $pkg->update('test_package') }, 'update test package';
};

subtest 'redhat package list diffs' => sub {
## no critic (ProhibitDuplicateLiteral)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the scope of this ## no critic hint too wide. Comparing to the above text example, the exemptions also cover quite some code, instead of solely the data used to drive the test cases.

I'd like to move this to wrap only the data, and use the revealed perlcritic complaints to see how we may improve the implementation. I expect we could use this to simplify the test code.


plan tests => 4;

my $rh_pkg = Rex::Pkg::Redhat->new;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: the other instantiation of a Rex::Pkg class is outside the test cases, and this introduces a different convention.

Long run I'm fine with both, depending where the evolution of t/package.t takes us over time. Normally I would lean towards consistency, while it's also fine for now to not know exactly which approach will serve us best later (so we allow ourselves to decide later.)

It may not be necessary to discover the answer or fix it now, I just thought it's worth mentioning while reading the changes carefully.

t/package.t Outdated
Comment on lines 99 to 134
my @plist1 = @{ dclone( \@orig ) };
my @plist2 = @{ dclone( \@orig ) };
my @expected = ();

my @mods = $rh_pkg->diff_package_list( \@plist1, \@plist2 );
cmp_deeply( \@mods, \@expected,
'expected package modifications when nothing changed' );

@plist1 = @{ dclone( \@orig ) };
pop @plist1;
@plist2 = @{ dclone( \@orig ) };
@expected = ( { %{ $orig[2] }, action => 'installed' } );

@mods = $rh_pkg->diff_package_list( \@plist1, \@plist2 );
cmp_deeply( \@mods, \@expected,
'expected package modifications when new kernel release is installed' );

@plist1 = @{ dclone( \@orig ) };
pop @plist1;
@plist2 = @{ dclone( \@orig ) };
shift @plist2;
@expected = ( { %{ $orig[2] }, action => 'updated' } );

@mods = $rh_pkg->diff_package_list( \@plist1, \@plist2 );
cmp_deeply( \@mods, \@expected,
'expected package modifications when new kernel release is installed and an old one removed'
);

@plist1 = @{ dclone( \@orig ) };
@plist2 = @{ dclone( \@orig ) };
shift @plist2;
@expected = ( { %{ $orig[0] }, action => 'removed' } );

@mods = $rh_pkg->diff_package_list( \@plist1, \@plist2 );
cmp_deeply( \@mods, \@expected,
'expected package modifications when only a kernel release is removed' );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed two things about this section:

  1. It feels quite a lot of code compared to the above purely data-driven approach. I find it a lot harder to follow what exactly gets tested now, compared to "given these inputs, this is the expected output".

    I'd like to check how that approach would fit us here too. I expect we could remove the Storable dependency, and also let ## no critic only cover the data then.

  2. These tests seem to only test a single modification at a time, meaning 1 no-change, 1 removal, 1 update, 1 installation. The above tests follows a more complete "multiple changes at once, one of each".

    While certainly fine to test single changes too, I'd prefer the complete one to make the comparison between current and proposed implementations easier.

    For example the Base and Debian implementations of diff_package_list() strictly orders changes by their type: first all installs, then all removes, then all updates. The Redhat one seems to follow "either installs or updates mixed, then removes" (presumably trading strict ordering for performance.)

    I believe the Base and Debian behavior is merely a side-effect, instead of intentionally implemented design criteria. I also doubt mixed ordering would break any use cases – though it's already part of the public interface for a long time, so Hyrum's law may apply 🙃

    I'm fine to allow for flexibility in the ordering, with the note that we may need to come back to this later if we find out users do rely on the order of change types somehow. Currently I think nobody should depend on this, and unlocking efficient diff algorithms may warrant mixed orders too. The improved Redhat implementation sounds like a good opportunity to experiment more and gather new info.

    In that sense, using cmp_bag instead of cmp_deeply may also fit better here to signify "we don't promise strict order of changes on Rex Hat package diffs".

'expected package modifications when only a kernel release is removed' );

## use critic
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more note overall: we seem to aim fixing on_change for pkg calls, and I found no tests for that. I agree it may be tricky to add, though I'd like to think of a good way to include that too (=a test to prove we can detect on_change bugs in the future, and then add code to prove we fixed the known cases for now.)

@ferki
Copy link
Member

ferki commented Sep 2, 2024

Oh, ouch, yes, the test workflow currently fails on older Strawberry Perl versions due to having outdated Mozilla::CA preinstalled while actions-setup-perl switched to "secure by default" cpanm calls. Let's see if that gets resolved upstream, or it's better to apply a workaround downstream here (and then rebase this.)

Let's track that separately, just keep in mind here, as the rest still applies.

@akarelas
Copy link
Author

akarelas commented Sep 2, 2024

I think dclone is needed, because the old diff_package_list subroutine (which we want to fail) edits the pck entries, and without dclone the test would pass on the old code. But let's check again (when I find some time).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pkg on_change runs even when package state did not change.
2 participants