Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: Use real mouse events in tests on Firefox #20889

Merged
merged 10 commits into from
Aug 20, 2024

Conversation

martinpitt
Copy link
Member

@martinpitt martinpitt commented Aug 14, 2024

Our years-old ph_mouse() helper is cheating: It completely side-steps the browser UI and directly synthesizes MouseEvents in JavaScript. This allowed funny things like clicking on a main page element while a dialog is open (which you can't normally do as the dialog is modal), clicking an element which is disabled in some non-standard way, or clicking through a tooltip.

Make this more realistic by using the BiDi API for synthesizing mouse events. This works fine with Firefox, but is unfortunately completely broken in iframes with Chromium:
https://issues.chromium.org/issues/359616812

Specifying precise coordinates is not currently implemented. ph_mouse() always clicks on the top left corner, while webdriver's pointerMove puts (0,0) in the center of the element. This can be done with a little extra calculation of the getBoundingRect(), but only very few tests like TestStorageLvm2.testMaxLayoutGrowth rely on that. For these tests, continue to use ph_mouse().

https://issues.redhat.com/browse/COCKPIT-1158


This spotted a few bugs in code and tests, see the additional commits. Given that, even with these initial restrictions this is IMHO worth the effort.

@martinpitt martinpitt added the no-test For doc/workflow changes, or experiments which don't need a full CI run, label Aug 14, 2024
@martinpitt
Copy link
Member Author

First round failures: storage, expensive, other (networking passed)

@martinpitt martinpitt force-pushed the bidi-mouse branch 3 times, most recently from ecabff6 to 7d6feed Compare August 16, 2024 08:49
@martinpitt
Copy link
Member Author

martinpitt commented Aug 16, 2024

Getting closer.. expensive and storage are green. The three TestServices failures are not related to mouse, but the test is not working repeatedly with nondestructive - the second time "No logs" is false as the service already ran previously. Need to check if that can be done with restore_dir()/restarting journald/vacuuming etc., or whether these need to become destructive.

TestJournal.testBasic and TestFirewall.testAddServices are the only real regressions from this mouse change now. They both reproduce fine locally.

Update: These were all unrelated bugs, fixed in separate commits.

@martinpitt martinpitt removed the no-test For doc/workflow changes, or experiments which don't need a full CI run, label Aug 16, 2024
@martinpitt
Copy link
Member Author

Eeeeeexcellent, all green! So now the full test suite.

@martinpitt martinpitt force-pushed the bidi-mouse branch 2 times, most recently from 34ce2d1 to 03bb1ab Compare August 18, 2024 07:43
@martinpitt
Copy link
Member Author

Still fighting with this flake -- the "Make TestServices repeatable" approach is still a bit wonky due to

umount: /var/log/journal: target is busy.

The TF failure is also interesting -- no persistent journal apparently. That's easier to fix.

@martinpitt martinpitt force-pushed the bidi-mouse branch 3 times, most recently from 9c6b00a to 2b9e45a Compare August 18, 2024 14:37
@martinpitt
Copy link
Member Author

Grrrrr TestServices -- one more try, then I give up and make the tests destructive.

@martinpitt
Copy link
Member Author

martinpitt commented Aug 18, 2024

This is an unmitigated disaster. Destructive it is. FTR, my most recent/most elaborate attempt was commit 703e000

# HACK: Chromium clicks don't work with iframes; use our old "synthesize MouseEvent" approach
# https://issues.chromium.org/issues/359616812
# TODO: x and y are not currently implemented: webdriver (0, 0) is the element's center, not top left corner
if self.browser == "chromium" or x != 0 or y != 0:
Copy link
Member

@jelly jelly Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kinda lean to making the X/Y API more logical in the future:

test/verify/check-storage-lvm2:            b.mouse(slider, "click", width, 0)
test/verify/check-storage-partitions:        b.mouse(slider, "click", about_half_way, 0)

So this usually relates to the thing you want to click. So once the chromium bug is resolved I'd be for changing this to the BiDi API as well and making it the x/y coordinate relative to the clicked element (this feels very logical).

Copy link
Member Author

@martinpitt martinpitt Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The x/y coords are relative to the clicked element of course -- but for MouseEvent, i.e. via ph_mouse(), (0, 0) is the top left, while with webdriver (0, 0) is the element's center.

Once we can run chromium with BiDi as well, then indeed I'm all for just redefining x/y here, and adjusting tests to specify the delta relative to the center. That's just impractical with this "half pm_mouse/half BiDi" state.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we are on the same page 👍


# For Firefox and regular clicks, use the BiDi API, which is more realistic -- it doesn't
# sidestep the browser
element = self.call_js_func('ph_find_scroll_into_view', selector)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this gives us a 200ms penalty per click, isn't that quite noticeable when running the tests?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also we don't always have to scroll into view I assume? If the element is visible we don't have to scroll do we?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right -- most tests should be fine without scrolling, but some (like the services or logs page) need it. There is https://developer.mozilla.org/en-US/docs/Web/API/Element/scrollIntoViewIfNeeded , but it's only implemented in Chromium. But this would not actually help with having to wait for it.

The extra time isn't noticeable when you look at how long a test runs. Most of the BiDi churn is actually waiting for stuff (and our ph_wait_cond() is super inefficient, plus we have a lot of redundancy in our abstract API (like calling wait_present multiple times).

I'm not proud of this, but I tried different approaches and nothing else was reliable 😢

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scrollIntoViewIfneeded sounds nice, should we add a future note about using that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already use it for pixel tests. But it would only give us a minor optimization (page jumping around less if you look at it in an interactive browser). The problem is knowing when it's done, and *IfNeeded() doesn't help us there in the slightest.

What I'd need would be some query if the element is currently visible. I didn't find one a few days ago, but I can do some more experiments and look harder. Then we can do this conditionally.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I experimented with .getBoundingClientRect() a bit in an interactive browser, on the hwinfo page:

  • If the element is visible, the x/y coordinates are positive.
  • If the element scrolled off to the top, y aka. top is negative.
  • If the element is scrolled off to the bottom, y is very large, e.g. "3600". We could compare .bottom to window.innerHeight. Like here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this approach in commit b0c1084 but it causes several failures. None of them fail due to "move target out of bounds" any more, so that proves that the approach generally works -- the extra sleep just papered over more bugs in our tests.

I'll try to track them down.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some failures are shallow (and I'll commit the fixes anyway), but this failure shows why this bounding rect approach doesn't work: Here the element is in a scrollable dialog pane, and hidden by defaullt. You have to scroll the dialog, not the page, to make it visible. This failure is robust (can be run interactively, with sleeps and sits).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find a way around that, and scrollIntoViewIfNeeded() wouldn't help us either as it does not have a return value. So this was an interesting experiment, but failed 😢 Back to the original.

These tests assume "No log entries" for test.service initially. This
isn't true when running `testLogs{,User}` or `testBasic{,User}` more
than once on the same machine. After an hour of effort trying various
tricks with overlaying /{run,var}/log/journal and restarting journald it
turns out that there just simply isn't any reliable way of resetting the
journal. So give in, and make these four tests destructive.
This was introduced in commit 66ff88e mostly for cockpit-machines.
Since then most of these hacks disappeared, but there's two left.

`StorageControl` avoids that problem by wrapping the `Tooltip` contents
in an extra span, so that tooltips work for disabled elements.

This avoids a race condition: When the `excuse` is empty, but there's a
spinner, then the button is not formally disabled, but clicks go into
the void. This fixes the flaky TestStorageLuksDestructive.testLuks1Slots.
A normal user can't click on "Logout" while the modal superuser dialog
is open. Close the dialog first.

This worked because our `ph_mouse()` approach is cheating, but it won't
work when we do real clicks.
This was previously doing a wild mix of closing the upper (file-ac)
selector while the lower (file-ac-preselected) was already open, which
isn't something that a user can do with a real mouse (only with our
`ph_mouse()` fake API).

It also clicked around wildly without waiting for the changes, which was
racy.

Fix that by closing the upper selector first to conclude that check, and
then extend the clicks into the lower selector with comments what they
are supposed to do, and assertions what should happen after them.

The multiple clicks paper over a bug in that widget: The first time you
open it after the directory got created it still says "no such file or
directory", and only reloads after closing and re-opening it. We should
fix that, but not in this commit -- just capture the status quo.
When moving the mouse into the On/Off switch for clicking, the tooltip
appears. That lies on top of the kebab, which hence becomes not directly
clickable. Avoid that by moving the mouse out of the On/Off switch again
after clicking.
Don't remove the pop3 service simultaneously via the CLI *and* Cockpit
-- that's a clear race condition.

We already check that the UI reacts to `firewall-cmd --remove-service`,
so just drop that bit.
Clicking the popover in the advanced search form closes the whole form
with a real human mouse interaction, as well as with a simulated click
through BiDi. This doesn't happen with `ph_mouse()` though (so that
doesn't test reality).

This is some PatternFly bug, and not entirely obvious how to fix it. For
now, close the popover the same way as opening it, with the (i) icon.
@martinpitt martinpitt marked this pull request as draft August 19, 2024 19:42
@martinpitt martinpitt added the no-test For doc/workflow changes, or experiments which don't need a full CI run, label Aug 19, 2024
@martinpitt
Copy link
Member Author

I'm doing a test run without any scrolling at all. I unfortunately don't remember which exact tests need it, let's get a reference on commit a0841e2

Our years-old `ph_mouse()` helper is cheating: It completely side-steps
the browser UI and directly synthesizes `MouseEvent`s in JavaScript.
This allowed funny things like clicking on a main page element while a
dialog is open (which you can't normally do as the dialog is modal),
clicking an element which is disabled in some non-standard way, or
clicking through a tooltip.

Make this more realistic by using the BiDi API for synthesizing mouse
events. This works fine with Firefox, but is unfortunately completely
broken in iframes with Chromium:
https://issues.chromium.org/issues/359616812

Specifying precise coordinates is not currently implemented.
`ph_mouse()` always clicks on the top left corner, while webdriver's
`pointerMove` puts (0,0) in the center of the element. This can be done
with a little extra calculation of the `getBoundingRect()`, but only
very few tests like `TestStorageLvm2.testMaxLayoutGrowth` rely on that.
For these tests, continue to use `ph_mouse()`.

https://issues.redhat.com/browse/COCKPIT-1158
@martinpitt martinpitt removed the no-test For doc/workflow changes, or experiments which don't need a full CI run, label Aug 20, 2024
@martinpitt martinpitt marked this pull request as ready for review August 20, 2024 01:50
@martinpitt martinpitt requested a review from jelly August 20, 2024 05:56
Copy link
Member

@allisonkarlitskaya allisonkarlitskaya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not the one to +1 this but I gave it a read-through and everything there looks fine.

My first thought is that you're trying to send the command to the wrong browsing context (ie: the iframe is a subcontext) but I guess that's not the case? ie: is this about targetting normal toplevel stuff in a cockpit page (ie: the page is an iframe of the shell)? Or is it about iframes inside of Cockpit pages themselves (ie: additional level)?

This could use a bit more clarity... some sort of statement like "...broken in iframes with Chromium... and since every Cockpit page is running in an iframe (under the shell) this means that we can't ever click on anything at all." Maybe that's what "completely broken" means...

I think an answer in the PR would be enough for future research purposes :)

Copy link
Member

@jelly jelly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, @allisonkarlitskaya still had a small question.

@martinpitt
Copy link
Member Author

My first thought is that you're trying to send the command to the wrong browsing context (ie: the iframe is a subcontext) but I guess that's not the case?

Yes, this isn't the case. The context (aka. frame) is correct. It finds the element, and the click event does go to the frame, but it never goes to the actual element. I spent 3 hours creating a standalone reproducer, all the details are in https://issues.chromium.org/issues/359616812.

ie: is this about targetting normal toplevel stuff in a cockpit page (ie: the page is an iframe of the shell)? Or is it about iframes inside of Cockpit pages themselves (ie: additional level)?

Clicks work fine on the top-level, but with that we could only test the login page and the Shell. And I didn't feel like introducing a "current frame == top frame" special case into mouse().

This could use a bit more clarity... some sort of statement like "...broken in iframes with Chromium... and since every Cockpit page is running in an iframe (under the shell) this means that we can't ever click on anything at all." Maybe that's what "completely broken" means...

Yes, that's pretty much what it means... Well, "almost" completely broken (shell and login page do work).

@martinpitt martinpitt merged commit c50bf49 into cockpit-project:main Aug 20, 2024
74 of 76 checks passed
@martinpitt martinpitt deleted the bidi-mouse branch August 20, 2024 08:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants