Test Host randomly crashing with message "##[error]The active test run was aborted. Reason: Test host process crashed" #4819
Comments
@jakubch1 this looks like a CodeCoverage issue. Testhost just crashes and there is no error reported. Please have a look. |
@ibiglari you are using dynamic native instrumentation which is not fully stable. After doing above you should not see in logs such errors:
|
@jakubch1 so if I add |
@jakubch1 I'm afraid adding Linker command line as extracted from build pipeline logs: |
Yes. Static instrumentation is better and more reliable so we have it enabled by default but if Please try to configure below flags in runsettings:
Full runsettings samples: https://github.com/microsoft/codecoverage/blob/main/docs/configuration.md |
@jakubch1 Even after updating
|
@ibiglari could you please share logs? |
@ibiglari something is wrong as I don't see
inside logs. Can you double check that this was added into Did you configure:
inside |
Added
New logs attached. |
@ibiglari currently you are instrumenting only 1 file:
(inside |
@jakubch1 No joy. Still seeing the crash. Logs attached. Also, not sure what you mean by instrumenting only 1 file? Our test project is
vstestlog.log |
@ibiglari could you please try to run it without code coverage enabled? We should check whether this is even coverage related. Based on the log, your build currently doesn't perform any instrumentation. |
@jakubch1 I find that hard to believe, as I can see code coverage in SonarCloud. However, I will try and get back to you. |
@jakubch1 Turning off code coverage leads to successful execution of the test. Logs attached |
@ibiglari thank you for info. Could you please experiment with below 4 flags:
I expect that with all flags set to False you won't see the issue. You could then enable them one by one and check. If you can share your binaries, I can also test this on my side. You can share binaries privately here: https://developercommunity.visualstudio.com/home |
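One reading of the suggestion above (all flags off first, then each flag on by itself) can be sketched as a small test plan. The flag names below are placeholders, since the four actual flags are not preserved in this thread; this is only an illustration, not the maintainer's procedure.

```python
def one_at_a_time(flags):
    """Yield a baseline with every flag off, then one config per flag with only that flag on."""
    baseline = {name: False for name in flags}
    yield dict(baseline)
    for name in flags:
        config = dict(baseline)
        config[name] = True  # enable exactly one flag in this configuration
        yield config


# Placeholder names (assumption): the real runsettings flag names are elided in the thread.
FLAGS = ["flag_1", "flag_2", "flag_3", "flag_4"]
plan = list(one_at_a_time(FLAGS))

for step, config in enumerate(plan):
    print(step, config)
```

Because the crash is intermittent, each configuration in such a plan would need several pipeline runs before it can be judged safe or unsafe.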
@jakubch1 I'm afraid I had a crash with all four set to false. |
@ibiglari any idea what is "Waiting for HQ Thread to Terminate..." ? |
@jakubch1 Yes we set a mutex and wait for a thread to terminate... That's one of the log messages from the test routine. Why? |
I saw this in Test Platform logs. Could you please try this: |
@jakubch1 Now this is funny. If I add The problem is, since this error happens randomly, it might be pure luck. I have to run the queue multiple times to ascertain whether a change in parameters causes a crash or not |
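To put a rough number on the "pure luck" concern: if the crash rate is about 50%, as the original report says, and pipeline runs are assumed to be independent (an assumption, not something this thread establishes), the chance that a change only appears to help can be estimated as follows. The function names are illustrative.

```python
import math


def chance_of_lucky_streak(crash_rate: float, clean_runs: int) -> float:
    """Probability that `clean_runs` consecutive runs pass although nothing was fixed.

    Assumes each run crashes independently with probability `crash_rate`.
    """
    return (1.0 - crash_rate) ** clean_runs


def runs_needed(crash_rate: float, max_luck: float = 0.05) -> int:
    """Smallest number of consecutive clean runs whose luck probability is <= max_luck."""
    if not 0.0 < crash_rate < 1.0:
        raise ValueError("crash_rate must be strictly between 0 and 1")
    return math.ceil(math.log(max_luck) / math.log(1.0 - crash_rate))


print(chance_of_lucky_streak(0.5, 3))
print(runs_needed(0.5))
```

Under those assumptions, five consecutive clean runs would leave about a 3% chance that nothing was actually fixed. A lower real crash rate requires many more runs for the same confidence.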
@jakubch1 Disregard previous comment. Test host crashed again, and this time this message was added to execution log:
How do I make this available to my DevOps build agent? Do I need to manually include the binary? |
@ibiglari please install proc dump (https://github.com/microsoft/vstest/blob/main/docs/troubleshooting.md#collect-process-dump-using-procdump-on-windows-ie-outofmemory) and set PROCDUMP_PATH when running your tests. |
@jakubch1 Found an easier way. However, this meant the test ran for 40 minutes instead of 9 minutes, and then the pipeline timed out. Is that to be expected when using |
@jakubch1 After adjusting the job's timeout, I can confirm the test host does not crash when proc dump is attached to it. However, once the test method is finished, there is a delay of about 30 minutes before the test host returns and the pipeline can continue. I can only surmise that proc dump somehow changes the internal state of the test host. Any advice? |
@ibiglari in that case please remove
and after VSTest@2 and step:
then please reproduce crash issue and provide us dumps. Please try to repro with:
|
@jakubch1 Will advise. I was installing procdump using a PowerShell task:
I guess I need to remove it, right? |
@jakubch1 Just an update, I have not been able to replicate the crash for the past week. The random nature of this crash makes it rather hard to investigate, but as soon as it happens, I will provide the logs here. |
Likely related to which is intermittent for me too (although it fails most of the time). Super frustrating that the vstest team does not seem to think the lack of diagnosability here is a priority. I've wasted hours and hours on this. |
@tig what tools do you lack to diagnose the problem? In the linked issue you are getting full call stack on screen and you have a memory dump. |
You're kidding right? Have you read all the reports where people are complaining about how hard these issues are to diagnose? This is not the fault of the customer but the vstest infrastructure and documentation. No matter what I tried I could not get a dump to be created on Linux. Only because I decided to try the mac runner did I get a dump. I have no experience reading crash dumps, like 99.99% of dotnet devs. It seems the team would rather tell customers "you're holding it wrong" than actually fix the infrastructure to make these problems easy to diagnose. |
The "Test host process crashed" error means that the hosting process crashed. In many cases it is the user's test code crashing the process for a reason we cannot control. If there were no testhost, just a plain .NET process, it would crash the same way. I was not kidding, just honestly asking: what would be required to make this easy to diagnose? Stack traces and memory dumps seem to be the closest we can get to interactively debugging the problem. |
Here's one obvious example: Making it so that crash dumps actually get generated when they should would be helpful too: #2952 (comment) Making it so that a sequence file can be generated regardless of whether the tests failed or a crash happened would help. |
Sequence file just tells you which tests did not finish (you can also find that in the diag logs), when all tests finish the file is barely useful.
You told me that memory dumps have no value to you because, like 99% of other developers, you cannot read them. Are there other tools that would help you diagnose the problem? We are building a new platform for running tests, and if you think there is a better way that is not memory dumps, we can take that into account. :) For creating the dumps (on .NET) we rely on the built-in .NET memory dump reporting, but because vstest_dump_path is set we don't upload any of the dumps from the run even when it is created. |
Closing in favor of #2952 I would still like to hear your opinion on how to make it easier for devs to debug crashes, when not using memory dumps or stack trace @tig I do agree that the blame collector is far from perfect, but at least on .NET we are offloading all the work to the standard .NET functionality. |
We have an MFC application and are using Microsoft::VisualStudio::CppUnitTestFramework for testing. The VSTest@2 step is failing almost 50% of the time with the message "##[error]The active test run was aborted. Reason: Test host process crashed" after the last step of the test routine is finished, as apparent from the log messages printed by Microsoft::VisualStudio::CppUnitTestFramework::Logger::WriteMessage(). The test finishes successfully when executed directly via Visual Studio 17.8.4.
Pipeline Step:
Steps to reproduce
Since this is a random occurrence, re-queuing the job usually resolves the issue.
Expected behavior
No crash during test execution
Actual behavior
Random crashes after last step of the test routine
Diagnostic logs
vstestlog.datacollector.24-01-11_08-49-42_80953_4.diag.log
vstestlog.diag.log
vstestlog.host.24-01-11_08-49-51_57026_4.diag.log
Environment