Closed
Bug 1024141
Opened 10 years ago
Closed 10 years ago
Firefox shutdown crash in testHomeButton.js
Categories
(Mozilla QA Graveyard :: Mozmill Tests, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: lizzard, Unassigned)
Details
Attachments
(3 files)
I sorted this list of test reports by failures and went through the failed runs:
http://mozmill-release.blargon7.com/#/functional/reports?app=All&branch=31.0&platform=All&from=2014-06-10&to=2014-06-11

Some of the individual reports for the failed tests are blank, like this one:
http://mozmill-release.blargon7.com/#/functional/report/3ed251269efbb25c65b0fc5d9d384e59

This may be caused by Firefox crashing. The console log shows a process crash here:
http://mm-ci-master.qa.scl3.mozilla.com:8080/job/ondemand_functional/5968/console

"10:32:33 PROCESS-CRASH | c:\jenkins\workspace\ondemand_functional\data\mozmill-tests\firefox\tests\functional\testToolbar\testHomeButton.js | application crashed [Unknown top frame]"

whimboo explained to me that the .extra file is missing, so we can't submit this to crash-stats. I'm not sure how to dig further into this right now, so this bug is a placeholder for the problem.
Reporter | ||
Comment 1•10 years ago
This appears to have crashed/failed in several other locales, across several platforms.
Reporter | ||
Updated•10 years ago
OS: Mac OS X → All
Priority: -- → P1
Hardware: x86 → All
Reporter | ||
Comment 2•10 years ago
Here are the process-crash tests that come out with blank reports.

Mozmill ondemand_functional testrun for Firefox 31.0 ms on mm-win-7-64-2 (2014-06-11_10-36-48) completed with 1 failures.
http://mm-ci-master.qa.scl3.mozilla.com:8080/job/ondemand_functional/6003/console

Mozmill ondemand_functional testrun for Firefox 31.0 zu on mm-win-7-64-3 (2014-06-11_10-36-10) completed with 1 failures.
http://mm-ci-master.qa.scl3.mozilla.com:8080/job/ondemand_functional/6002/

Mozmill ondemand_functional testrun for Firefox 31.0 lij on mm-win-7-64-4 (2014-06-11_10-22-04) completed with 1 failures.
http://mm-ci-master.qa.scl3.mozilla.com:8080/job/ondemand_functional/5971/

Mozmill ondemand_functional testrun for Firefox 31.0 ff on mm-win-7-32-1 (2014-06-11_10-21-43) completed with 1 failures.
http://mm-ci-master.qa.scl3.mozilla.com:8080/job/ondemand_functional/5966/

Mozmill ondemand_functional testrun for Firefox 31.0 kn on mm-win-7-32-2 (2014-06-11_10-21-46) completed with 1 failures.
http://mm-ci-master.qa.scl3.mozilla.com:8080/job/ondemand_functional/5967/
Comment 3•10 years ago
This bug should actually be about the crash when running our tests. It's not about the blank report in the dashboard. For that, an issue has to be filed on the github repository (http://github.com/whimboo/mozmill-dashboard).

Sadly, none of these crashes caused Firefox to create the .extra file for the minidump, so we will not be able to send the report to crashstats. :( That means we will have to analyze the minidump with WinDbg on Windows. Anthony also mentioned crashes on OS X, so gdb should be helpful there too. But Liz didn't add those in her last comment, so a look into the email archive might be necessary.

For how to work with WinDbg we have good docs on MDN:
https://developer.mozilla.org/en-US/docs/How_to_get_a_stacktrace_with_WinDbg

Andrei, can you have a look at this please? I would appreciate that. Thanks.
Flags: needinfo?(andrei.eftimie)
Summary: Blank report in mozmill from crash in testHomeButton.js → Firefox shutdown crash in testHomeButton.js
Version: Version 2 → unspecified
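To see which pending dumps are missing their .extra file, something like the following could be run against the "Crash Reports\pending" folder. This is only a minimal sketch: the helper name `split_dumps_by_extra` is made up for illustration, and it assumes the metadata file sits next to the dump and shares its ID, as the mozcrash log lines in this bug suggest.

```python
from pathlib import Path


def split_dumps_by_extra(pending_dir):
    """Return (with_extra, without_extra) lists of .dmp paths in pending_dir."""
    with_extra, without_extra = [], []
    for dump in sorted(Path(pending_dir).glob("*.dmp")):
        # Assumption: the metadata is stored next to the dump as <id>.extra.
        if dump.with_suffix(".extra").is_file():
            with_extra.append(dump)
        else:
            without_extra.append(dump)
    return with_extra, without_extra
```

Dumps in the second list are the ones that would need to be inspected locally with a debugger.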
Comment 4•10 years ago
The MDN docs are for hooking up to a process and reproducing the crash.

I've set up Visual Studio, installed the correct symbols and WinDBG, and tried investigating the minidump itself, but it fails with the following message:

> Failure when opening dump file
> 'C:\Users\mozauto\AppData\Roaming\Mozilla\Firefox\Crash Reports\pending\d55ccfe6-5a92-4c5f-b926-e52f89e3416b.dmp',
> HRESULT 0x80004005
> It may be corrupt or in a format not understood by the debugger.
>
> Unspecified error

I can open up older dmp files, even those without the .extra file.
It might be that the recent crash dumps are corrupt.

I don't know how to investigate this further...
Flags: needinfo?(andrei.eftimie)
Comment 5•10 years ago
(In reply to Andrei Eftimie from comment #4)
> I've set up Visual Studio, installed the correct symbols and WinDBG, and
> tried investigating the minidump itself, but it fails with the following
> message:
>
> > Failure when opening dump file
> > 'C:\Users\mozauto\AppData\Roaming\Mozilla\Firefox\Crash Reports\pending\d55ccfe6-5a92-4c5f-b926-e52f89e3416b.dmp',
> > HRESULT 0x80004005
> > It may be corrupt or in a format not understood by the debugger.
> >
> > Unspecified error
>
> I can open up older dmp files, even those without the .extra file.
> It might be that the recent crash dumps are corrupt.

That's very strange. Maybe we produce invalid minidumps sometimes? Benjamin or Ted, do you have an idea what could be the problem here? Would it help to give you the minidump?

Andrei, while we are waiting for feedback from Benjamin and Ted, I would suggest that you upload the minidump as an attachment to this bug.
Comment 6•10 years ago
There were 2 crashes yesterday. I'm uploading both .dmp files.
Comment 7•10 years ago
Comment 8•10 years ago
Also a crash on staging:
http://mm-ci-staging.qa.scl3.mozilla.com:8080/job/mozilla-central_functional/2316/console

Why is the "WIFI GEO: shutdown called" message here? It also appears earlier (a couple of tests _after_ the Geolocation test).

> 07:52:49 TEST-START | testToolbar\testHomeButton.js | teardownModule
> 07:52:49 TEST-END | testToolbar\testHomeButton.js | finished in 260ms
> 07:52:52 *** WIFI GEO: shutdown called
> 07:52:52
> 07:52:52 ###!!! [Child][DispatchAsyncMessage] Error: Route error: message sent to unknown actor ID
> 07:52:52
> 07:52:52
> [...] // 12 other messages like this
> 07:52:52
> 07:52:52
> 07:52:52 ###!!! [Child][DispatchAsyncMessage] Error: Route error: message sent to unknown actor ID
> 07:52:52
> 07:52:53 PROCESS-CRASH | c:\jenkins\workspace\mozilla-central_functional\data\mozmill-tests\firefox\tests\functional\testToolbar\testHomeButton.js | application crashed [Unknown top frame]
> 07:52:53 Crash dump filename: c:\jenkins\workspace\mozilla-central_functional\data\profile\minidumps\77cd8f88-2e08-4756-83ea-70005144343a.dmp
> 07:52:53 No symbols path given, can't process dump.
> 07:52:53 MINIDUMP_STACKWALK not set, can't process dump.
> 07:52:53 mozcrash INFO | Saved minidump as C:\Users\mozauto\AppData\Roaming\Mozilla\Firefox\Crash Reports\pending\77cd8f88-2e08-4756-83ea-70005144343a.dmp
Comment 9•10 years ago
Both of those minidumps are malformed. They have a valid header but are missing the data directory which actually points to the contents. I'm not sure why this would be happening. If it's crashing during shutdown, it's possible that it's a race condition of some sort.
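For anyone who wants to check a dump for this kind of damage without a debugger, here is a rough sketch. It relies on the documented MINIDUMP_HEADER layout (a 32-byte header whose signature is "MDMP", followed elsewhere in the file by a directory of 12-byte stream entries). The function name `check_minidump` and the wording of the messages are made up for illustration, and this is not how the dumps above were actually analyzed.

```python
import struct

MINIDUMP_SIGNATURE = 0x504D444D          # b"MDMP" read as a little-endian uint32
HEADER = struct.Struct("<IIIIIIQ")       # MINIDUMP_HEADER, 32 bytes
DIRECTORY_ENTRY = struct.Struct("<III")  # StreamType, DataSize, Rva


def check_minidump(data):
    """Return a list of problems found in the raw bytes of a minidump."""
    if len(data) < HEADER.size:
        return ["file is shorter than a minidump header"]
    signature, _version, stream_count, directory_rva, *_ = HEADER.unpack_from(data)
    if signature != MINIDUMP_SIGNATURE:
        return ["missing MDMP signature"]
    problems = []
    if stream_count == 0:
        problems.append("header lists no streams")
    directory_end = directory_rva + stream_count * DIRECTORY_ENTRY.size
    if directory_end > len(data):
        # Header is present, but the directory it points to is not in the file.
        problems.append("stream directory lies outside the file")
        return problems
    for index in range(stream_count):
        stream_type, size, rva = DIRECTORY_ENTRY.unpack_from(
            data, directory_rva + index * DIRECTORY_ENTRY.size)
        if rva + size > len(data):
            problems.append(
                f"stream {index} (type {stream_type}) points outside the file")
    return problems
```

An empty list only means the header and directory are structurally consistent. It says nothing about whether the stream contents themselves are usable.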
Comment 10•10 years ago
This also fails with esr24, so it's not a bug specific to beta. It also failed only on nodes where we installed Flash and Java updates. I ran the testrun 10 times on a Win 7 node with only the Flash update, and then with the Java update too, but I couldn't reproduce the crash.
Comment 11•10 years ago
Comment on attachment 8439233 [details]
jenkins_esr24.txt

>04:59:49 Crash dump filename: c:\jenkins\workspace\mozilla-esr24_functional\data\profile\minidumps\55933401-5ca0-402b-b4ad-1cb49cbe32d9.dmp
>04:59:49 No symbols path given, can't process dump.
>04:59:49 MINIDUMP_STACKWALK not set, can't process dump.
>04:59:49 mozcrash INFO | Saved minidump as C:\Users\mozauto\AppData\Roaming\Mozilla\Firefox\Crash Reports\pending\55933401-5ca0-402b-b4ad-1cb49cbe32d9.dmp
>04:59:49 mozcrash INFO | Saved app info as C:\Users\mozauto\AppData\Roaming\Mozilla\Firefox\Crash Reports\pending\55933401-5ca0-402b-b4ad-1cb49cbe32d9.extra

This log shows an .extra file. Have you done any investigation for that crash, Cosmin?
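Spotting this by eye in a long Jenkins console is easy to miss, so a small script could list which crashes got an .extra file. This is a sketch only: it assumes the "mozcrash INFO | Saved minidump as" and "Saved app info as" lines look exactly like the ones quoted in this bug, and `saved_crash_files` is a made-up helper name.

```python
import re

# Matches the two mozcrash lines quoted in this bug and captures the saved path.
SAVED = re.compile(r"mozcrash INFO \| Saved (?:minidump|app info) as (.+?)\s*$")


def saved_crash_files(console_lines):
    """Map each crash ID to the set of file extensions the log reports as saved."""
    saved = {}
    for line in console_lines:
        match = SAVED.search(line)
        if not match:
            continue
        # The log uses Windows paths; keep only the last path component.
        filename = match.group(1).replace("\\", "/").rsplit("/", 1)[-1]
        crash_id, _, extension = filename.rpartition(".")
        saved.setdefault(crash_id, set()).add(extension)
    return saved
```

A crash ID whose set contains "dmp" but not "extra" is one that cannot be submitted to crash-stats as it stands.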
Comment 12•10 years ago
Here is the crash report: bp-fe4a1171-4af8-45a6-aa1d-63f412140612

So this is actually bug 980938. Cosmin, have you installed the release version of Flash on all those Windows boxes? This is not what we should do! You should know that from all the trouble we had with those Flash crashes!
Flags: needinfo?(cosmin.malutan)
Comment 13•10 years ago
(In reply to Andrei Eftimie from comment #4)
> The MDN docs are for hooking up to a process and reproducing the crash.
>
> I've set up Visual Studio, installed the correct symbols and WinDBG, and
> tried investigating the minidump itself, but it fails with the following
> message:

Can I ask where you have installed Visual Studio? I see that mm-win-7-32-3 got massive amounts of new packages installed today. I really hope you haven't done this on the production machine!
Flags: needinfo?(cosmin.malutan) → needinfo?(andrei.eftimie)
Comment 14•10 years ago
Liz, it would be great if you could re-build all the crashed jobs. I downgraded Flash on all the boxes to 13.0.0.124 debug, so we should not see those crashes anymore. Please let us know about the results. Thanks.
Reporter | ||
Comment 15•10 years ago
whimboo, OK, I will do that this afternoon!
Reporter | ||
Comment 16•10 years ago
OK, I rebuilt them all, including the Mac failures, and they aren't failing now! So I think you found the culprit.
Comment 17•10 years ago
That's great to hear. So I'm going to close this bug now given that the debug version of Flash fixed the issue. Ted, regarding the broken minidumps, where should I file that bug? I think it would be good to get this investigated and fixed if it is a core issue.
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: needinfo?(ted)
Resolution: --- → FIXED
Comment 18•10 years ago
Here's the thing: on Windows we use a Microsoft API to write minidumps, so we don't have any control over that code. It's possible that the bug here is in the way we inject Breakpad into Adobe's Flash processes, bsmedberg could comment on that.
Flags: needinfo?(ted)
Comment 19•10 years ago
(In reply to Henrik Skupin (:whimboo) from comment #13)
> (In reply to Andrei Eftimie from comment #4)
> > The MDN docs are for hooking up to a process and reproducing the crash.
> >
> > I've set up Visual Studio, installed the correct symbols and WinDBG, and
> > tried investigating the minidump itself, but it fails with the following
> > message:
>
> Can I ask where you have installed Visual Studio? I see that mm-win-7-32-3
> got massive amounts of new packages installed today. I really hope you
> haven't done this on the production machine!

Of course I've done it on mm-win-7-32-3.
We had some crashes which we didn't know if we could reproduce, and a couple of dmp files.
If those dmp files led to nothing, we might have needed to actually use windbg _while_ generating the crash. Where else would be the best place to have everything set up than the machine where the crash happened in the first place?

Now that we have sorted things out, I've cleaned the machine (removed Visual Studio and all its dependencies).
I've left the symbols on FS1-QA since they can be linked from a network drive, so if we need them later, they are there.
Flags: needinfo?(andrei.eftimie)
Comment 20•10 years ago
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #18)
> Here's the thing: on Windows we use a Microsoft API to write minidumps, so
> we don't have any control over that code. It's possible that the bug here is
> in the way we inject Breakpad into Adobe's Flash processes, bsmedberg could
> comment on that.

Benjamin or Georg, what do you both think about that? Is this something we can improve and that I should file a bug for? Or is there already one around, and the problem is known? Hm, maybe this could be the underlying issue for bug 981641, and we can cover it there?
Flags: needinfo?(georg.fritzsche)
Flags: needinfo?(benjamin)
Comment 21•10 years ago
(In reply to Andrei Eftimie from comment #19)
> Of course I've done it on mm-win-7-32-3.
> We had some crashes which we didn't know if we could reproduce and a couple
> of dmp files.

When it comes to analyzing crashes, you don't have to do this on the exact same machine. Any other installation will be sufficient. Also, in the case of windbg you do not have to install the whole of Visual Studio; the debugging tools are totally enough. So next time please think about whether you can do stuff locally first, and if there is no other way, use one of the CI machines, preferably a staging one rather than a production one.

> If those dmp files led to nothing we might have needed to actually use
> windbg _while_ generating the crash. Where else would be the best place to
> have everything set up than the machine where the crash happened in the
> first place.

I'm not worried about the debugging tools being installed, but Visual Studio can leave traces behind. So I don't feel good about doing those things on production machines.

> Now that we sorted things out, I've cleaned the machine (removed Visual
> Studio and all its dependencies).
> I've left the symbols on FS1-QA since they can be linked from a network
> drive, so if we'll later need them they are there.

Thanks
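As a rough illustration of the "debugging tools only" route, a dump can be opened from the command line with cdb, the console debugger that ships with the Debugging Tools for Windows. The sketch below only builds the argument list. The helper name `cdb_command` is invented, and the paths in the usage note are hypothetical (the share name on FS1-QA is not given in this bug). The `-z`, `-y` and `-c` options are cdb's documented switches for opening a dump file, setting the symbol path and running an initial command.

```python
def cdb_command(dump_path, symbol_path, cdb="cdb.exe"):
    """Build an argument list that opens a minidump in cdb and prints an analysis."""
    return [
        cdb,
        "-z", dump_path,          # open a crash dump instead of a live process
        "-y", symbol_path,        # where the debugger should look for symbols
        "-c", "!analyze -v; q",   # run the automatic analysis, then quit
    ]
```

It could then be run on any Windows box that has the debugging tools, for example with `subprocess.run(cdb_command(r"C:\dumps\crash.dmp", r"\\FS1-QA\symbols"), check=True)`, where both paths are placeholders.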
Comment 22•10 years ago
(In reply to Henrik Skupin (:whimboo) from comment #20)
> (In reply to Ted Mielczarek [:ted.mielczarek] from comment #18)
> > Here's the thing: on Windows we use a Microsoft API to write minidumps, so
> > we don't have any control over that code. It's possible that the bug here is
> > in the way we inject Breakpad into Adobe's Flash processes, bsmedberg could
> > comment on that.
>
> Benjamin, or Georg what do you both think about that? Something we can
> improve and I should file a bug for? Or is there already one around, and the
> problem known. Hm, maybe this could be the underlying issue for bug 981641,
> and we can cover it there?

Are there clear STR? Can you file a bug on it with steps?
Comment 23•10 years ago
Given the nature of the bug here, I don't think this is high value, and debugging/fixing it will probably require significant effort. So if you have specific STR and want to file, that's fine, but expect it to sit.
Flags: needinfo?(georg.fritzsche)
Flags: needinfo?(benjamin)
Comment 24•10 years ago
(In reply to Georg Fritzsche [:gfritzsche] from comment #22)
> > Benjamin, or Georg what do you both think about that? Something we can
> > improve and I should file a bug for? Or is there already one around, and the
> > problem known. Hm, maybe this could be the underlying issue for bug 981641,
> > and we can cover it there?
>
> Are there clear STR? Can you file a bug on it with steps?

It happens on and off. There are no clear steps yet, but we have a high reproduction rate for it. But let's move this discussion over to bug 981641.
Updated•5 years ago
Product: Mozilla QA → Mozilla QA Graveyard