Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build and Test Windows workflows are timing out (OSOE-840) #358

Closed
DemeSzabolcs opened this issue Apr 27, 2024 · 9 comments · Fixed by #360
Closed

Build and Test Windows workflows are timing out (OSOE-840) #358

DemeSzabolcs opened this issue Apr 27, 2024 · 9 comments · Fixed by #360
Assignees
Labels
bug Something isn't working

Comments

@DemeSzabolcs
Copy link
Member

DemeSzabolcs commented Apr 27, 2024

See dev: https://github.com/Lombiq/Open-Source-Orchard-Core-Extensions/actions/runs/8846335566/job/24292006305

It seems like the first timeout happened here: https://github.com/Lombiq/Open-Source-Orchard-Core-Extensions/actions/runs/8824643963

The Linux workflow still works fine in every case.

On Windows increasing the timeout results in a lot of:

Exception: OpenQA.Selenium.WebDriverException: Cannot start the driver service on http://127.0.0.1:51661/

error.

Jira issue

@DemeSzabolcs DemeSzabolcs added the bug Something isn't working label Apr 27, 2024
@github-actions github-actions bot changed the title Build and Test Windows workflows are timing out Build and Test Windows workflows are timing out (OSOE-842) Apr 27, 2024
@Piedone
Copy link
Member

Piedone commented Apr 28, 2024

@dministro isn't this the return of #228? The workflow logs aren't really loading for me strangely (though I'm on my potato laptop now) so can't check where exactly they stopped.

@Piedone
Copy link
Member

Piedone commented Apr 29, 2024

"OpenQA.Selenium.WebDriverException: Cannot start the driver service on..." is not necessarily an issue though. Please link where you increased the timeout.

@sarahelsaig @wAsnk see this, don't just rerun Windows builds.

@dministro
Copy link
Member

@dministro isn't this the return of #228? The workflow logs aren't really loading for me strangely (though I'm on my potato laptop now) so can't check where exactly they stopped.

I'm not sure, a lot of exceptions in logs as below:

2024-04-27T00:48:36.1161605Z [xUnit.net 00:14:10.88]         2024-04-26 23:48:06.0632 - The Orchard Core instance was started.
2024-04-27T00:48:36.1165022Z [xUnit.net 00:14:10.88]         2024-04-26 23:48:29.6296 - While creating the web driver failed with the following exception, it'll be retried 1 more time(s). Exception: OpenQA.Selenium.WebDriverException: Cannot start the driver service on http://127.0.0.1:51661/
2024-04-27T00:48:36.1167315Z [xUnit.net 00:14:10.88]            at OpenQA.Selenium.DriverService.Start()
2024-04-27T00:48:36.1168782Z [xUnit.net 00:14:10.88]            at OpenQA.Selenium.Remote.DriverServiceCommandExecutor.Execute(Command commandToExecute)
2024-04-27T00:48:36.1171210Z [xUnit.net 00:14:10.88]            at OpenQA.Selenium.WebDriver.Execute(String driverCommandToExecute, Dictionary`2 parameters)
2024-04-27T00:48:36.1172870Z [xUnit.net 00:14:10.88]            at OpenQA.Selenium.WebDriver.StartSession(ICapabilities desiredCapabilities)
2024-04-27T00:48:36.1174507Z [xUnit.net 00:14:10.88]            at OpenQA.Selenium.WebDriver..ctor(ICommandExecutor executor, ICapabilities capabilities)
2024-04-27T00:48:36.1177211Z [xUnit.net 00:14:10.88]            at OpenQA.Selenium.Chromium.ChromiumDriver..ctor(ChromiumDriverService service, ChromiumOptions options, TimeSpan commandTimeout)
2024-04-27T00:48:36.1179471Z [xUnit.net 00:14:10.88]            at OpenQA.Selenium.Chrome.ChromeDriver..ctor(ChromeDriverService service, ChromeOptions options, TimeSpan commandTimeout)
2024-04-27T00:48:36.1184170Z [xUnit.net 00:14:10.88]            at Lombiq.Tests.UI.Services.WebDriverFactory.<>c__DisplayClass1_0.<CreateChromeDriverAsync>g__CreateDriverInnerAsync|0(ChromeDriverService service) in C:\a\Open-Source-Orchard-Core-Extensions\Open-Source-Orchard-Core-Extensions\test\Lombiq.UITestingToolbox\Lombiq.Tests.UI\Services\WebDriverFactory.cs:line 51
2024-04-27T00:48:36.1190272Z [xUnit.net 00:14:10.88]            at Lombiq.Tests.UI.Services.WebDriverFactory.CreateChromeDriverAsync(BrowserConfiguration configuration, TimeSpan pageLoadTimeout) in C:\a\Open-Source-Orchard-Core-Extensions\Open-Source-Orchard-Core-Extensions\test\Lombiq.UITestingToolbox\Lombiq.Tests.UI\Services\WebDriverFactory.cs:line 59
2024-04-27T00:48:36.1195964Z [xUnit.net 00:14:10.88]            at Lombiq.Tests.UI.Services.AtataFactory.<CreateDriverAsync>g__FromAsync|2_0[T](Func`3 factory, <>c__DisplayClass2_0&) in C:\a\Open-Source-Orchard-Core-Extensions\Open-Source-Orchard-Core-Extensions\test\Lombiq.UITestingToolbox\Lombiq.Tests.UI\Services\AtataFactory.cs:line 63
2024-04-27T00:48:36.1202045Z [xUnit.net 00:14:10.88]            at Lombiq.Tests.UI.Services.AtataFactory.CreateDriverAsync(BrowserConfiguration browserConfiguration, TimeoutConfiguration timeoutConfiguration, ITestOutputHelper testOutputHelper) in C:\a\Open-Source-Orchard-Core-Extensions\Open-Source-Orchard-Core-Extensions\test\Lombiq.UITestingToolbox\Lombiq.Tests.UI\Services\AtataFactory.cs:line 80
2024-04-27T00:48:36.1207711Z [xUnit.net 00:14:10.88]         2024-04-26 23:48:55.7539 - The test failed with the following exception: OpenQA.Selenium.WebDriverException: Cannot start the driver service on http://127.0.0.1:51668/
2024-04-27T00:48:36.1210234Z [xUnit.net 00:14:10.88]            at OpenQA.Selenium.DriverService.Start()
2024-04-27T00:48:36.1211692Z [xUnit.net 00:14:10.88]            at OpenQA.Selenium.Remote.DriverServiceCommandExecutor.Execute(Command commandToExecute)
2024-04-27T00:48:36.1213468Z [xUnit.net 00:14:10.88]            at OpenQA.Selenium.WebDriver.Execute(String driverCommandToExecute, Dictionary`2 parameters)
2024-04-27T00:48:36.1215740Z [xUnit.net 00:14:10.88]            at OpenQA.Selenium.WebDriver.StartSession(ICapabilities desiredCapabilities)
2024-04-27T00:48:36.1217393Z [xUnit.net 00:14:10.88]            at OpenQA.Selenium.WebDriver..ctor(ICommandExecutor executor, ICapabilities capabilities)
2024-04-27T00:48:36.1220691Z [xUnit.net 00:14:10.88]            at OpenQA.Selenium.Chromium.ChromiumDriver..ctor(ChromiumDriverService service, ChromiumOptions options, TimeSpan commandTimeout)
2024-04-27T00:48:36.1222983Z [xUnit.net 00:14:10.88]            at OpenQA.Selenium.Chrome.ChromeDriver..ctor(ChromeDriverService service, ChromeOptions options, TimeSpan commandTimeout)
2024-04-27T00:48:36.1227667Z [xUnit.net 00:14:10.88]            at Lombiq.Tests.UI.Services.WebDriverFactory.<>c__DisplayClass1_0.<CreateChromeDriverAsync>g__CreateDriverInnerAsync|0(ChromeDriverService service) in C:\a\Open-Source-Orchard-Core-Extensions\Open-Source-Orchard-Core-Extensions\test\Lombiq.UITestingToolbox\Lombiq.Tests.UI\Services\WebDriverFactory.cs:line 51
2024-04-27T00:48:36.1233736Z [xUnit.net 00:14:10.88]            at Lombiq.Tests.UI.Services.WebDriverFactory.CreateChromeDriverAsync(BrowserConfiguration configuration, TimeSpan pageLoadTimeout) in C:\a\Open-Source-Orchard-Core-Extensions\Open-Source-Orchard-Core-Extensions\test\Lombiq.UITestingToolbox\Lombiq.Tests.UI\Services\WebDriverFactory.cs:line 59
2024-04-27T00:48:36.1239523Z [xUnit.net 00:14:10.88]            at Lombiq.Tests.UI.Services.AtataFactory.<CreateDriverAsync>g__FromAsync|2_0[T](Func`3 factory, <>c__DisplayClass2_0&) in C:\a\Open-Source-Orchard-Core-Extensions\Open-Source-Orchard-Core-Extensions\test\Lombiq.UITestingToolbox\Lombiq.Tests.UI\Services\AtataFactory.cs:line 63
2024-04-27T00:48:36.1245603Z [xUnit.net 00:14:10.88]            at Lombiq.Tests.UI.Services.AtataFactory.CreateDriverAsync(BrowserConfiguration browserConfiguration, TimeoutConfiguration timeoutConfiguration, ITestOutputHelper testOutputHelper) in C:\a\Open-Source-Orchard-Core-Extensions\Open-Source-Orchard-Core-Extensions\test\Lombiq.UITestingToolbox\Lombiq.Tests.UI\Services\AtataFactory.cs:line 80
2024-04-27T00:48:44.4224663Z [xUnit.net 00:14:10.88]            at Lombiq.Tests.UI.Services.AtataFactory.StartAtataScopeAsync(ITestOutputHelper testOutputHelper, Uri baseUri, OrchardCoreUITestExecutorConfiguration configuration) in C:\a\Open-Source-Orchard-Core-Extensions\Open-Source-Orchard-Core-Extensions\test\Lombiq.UITestingToolbox\Lombiq.Tests.UI\Services\AtataFactory.cs:line 33
2024-04-27T00:48:49.1712525Z [xUnit.net 00:14:10.88]            at Lombiq.Tests.UI.Services.UITestExecutionSession.CreateContextAsync() in C:\a\Open-Source-Orchard-Core-Extensions\Open-Source-Orchard-Core-Extensions\test\Lombiq.UITestingToolbox\Lombiq.Tests.UI\Services\UITestExecutionSession.cs:line 659
2024-04-27T00:48:49.1717684Z [xUnit.net 00:14:10.88]            at Lombiq.Tests.UI.Services.UITestExecutionSession.SetupAsync() in C:\a\Open-Source-Orchard-Core-Extensions\Open-Source-Orchard-Core-Extensions\test\Lombiq.UITestingToolbox\Lombiq.Tests.UI\Services\UITestExecutionSession.cs:line 523
2024-04-27T00:48:49.1722549Z [xUnit.net 00:14:10.88]            at Lombiq.Tests.UI.Services.UITestExecutionSession.ExecuteAsync(Int32 retryCount, String dumpRootPath) in C:\a\Open-Source-Orchard-Core-Extensions\Open-Source-Orchard-Core-Extensions\test\Lombiq.UITestingToolbox\Lombiq.Tests.UI\Services\UITestExecutionSession.cs:line 100
2024-04-27T00:48:53.9168637Z [xUnit.net 00:14:10.88]         2024-04-26 23:48:58.1286 - The test was attempted 3 time(s) and won't be retried anymore. You can see more details on why it's failing in the FailureDumps folder: C:\a\Open-Source-Orchard-Core-Extensions\Open-Source-Orchard-Core-Extensions\test\Lombiq.OSOCE.Tests.UI\bin\Debug\net8.0\FailureDumps\Lombiq_OSOCE_Tests_UI_Tests_ThemeTests_BehaviorBaseThemeTests_ThemeFeaturesShouldWork

With edge, the exceptions above were not observable, but the runner was extremely slow.

@dministro
Copy link
Member

"OpenQA.Selenium.WebDriverException: Cannot start the driver service on..." is not necessarily an issue though. Please link where you increased the timeout.

@sarahelsaig @wAsnk see this, don't just rerun Windows builds.

Increasing timeouts was not helped
https://github.com/Lombiq/Open-Source-Orchard-Core-Extensions/actions/runs/8854340814/job/24317175955
https://github.com/Lombiq/Open-Source-Orchard-Core-Extensions/actions/runs/8855376611/job/24320109467

@Piedone Piedone changed the title Build and Test Windows workflows are timing out (OSOE-842) Build and Test Windows workflows are timing out (OSOE-840) Apr 29, 2024
@Piedone
Copy link
Member

Piedone commented Apr 29, 2024

Thanks!

@Piedone Piedone self-assigned this Apr 29, 2024
@Piedone
Copy link
Member

Piedone commented Apr 29, 2024

Some notes:

  • Last successful workflow, and first failing one.
  • This isn't related to any code changes: if we rerun the last succeeding dev workflow, then it also times out.
  • The default (4-core) and 8-core runners also exhibit this.
  • The runner version changed from 2.315.0 for the last passing workflow to 2.316.0 for today but the first failing workflow was still on 2.315.0.
  • The last good run was started on 23 April 4:31 PM CEST, the first bad one on the 25th at 1:29 AM.
  • There was an image release on the 22nd at 7:49 PM CEST. While this is before the last good run, this can still might be related, since it takes days until 100% of the runners have it. In it, Chrome was updated (from 123.0.6312.123 to 124.0.6367.61), as was Chrome Driver (though we don't rely on that, Atata downloads the matching one).
  • And indeed, the last good run used the 20240414.1.0 image, the first failing one 20240421.1.0.
  • The tests work fine for me locally under Windows with Chrome 124.0.6367.91, i.e. more recent than the runners'.
  • The "Cannot start the driver service on..." may be a red herring: while there are a couple of these in the output, the driver (or rather, the browser) not starting randomly is expected (that's why we have a retry for it), for this to be an actual issue, it should be there for every test.
  • The runners seem aren't in general slower (checking out the last 5 successful and 4 failed dev runs, though there was a fluke with a 21-minute Build and Static Code Analysis here).
  • Chrome processes remaining after concurrent headless tests finish (OSOE-839) #356 might be related or the same issue. However, not sure if that started recently or we just noticed now (I'm pretty sure it wasn't always the case though since I did work on similar things before).

@Piedone
Copy link
Member

Piedone commented Apr 30, 2024

@sarahelsaig could you please try v9.0.1-alpha.7.osoe-840 for OrchardCMS/OrchardCore.Commerce#433? This should fix the Windows timeout.

@Piedone
Copy link
Member

Piedone commented Apr 30, 2024

The performance of the tests doesn't seem to suffer.

Fixed the tests. Now both Ubuntu and Windows pass:

@Piedone
Copy link
Member

Piedone commented Apr 30, 2024

So I see this also fixed the timeout for OrchardCMS/OrchardCore.Commerce#434.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants