Closed
Bug 545195
Opened 14 years ago
Closed 14 years ago
topcrash [@ @0x0 | nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal*, int, unsigned int) ]
Categories
(Core :: General, defect)
Tracking
()
RESOLVED
FIXED
mozilla1.9.3a2
People
(Reporter: dbaron, Assigned: bent.mozilla)
References
Details
(Keywords: crash, topcrash)
Crash Data
Attachments
(1 file)
649 bytes, patch | sicking: review+, sicking: superreview+
There's a new #1 topcrash (over 400 crashes for yesterday's builds, which is huge for nightlies) in mozilla-central nightlies starting in yesterday's builds:
http://crash-stats.mozilla.com/report/list?product=Firefox&branch=1.9.3&platform=windows&query_search=signature&query_type=exact&query=&date=&range_value=4&range_unit=weeks&process_type=all&plugin_field=&plugin_query_type=&plugin_query=&do_query=1&signature=%400x0%20|%20nsBaseAppShell%3A%3AOnProcessNextEvent%28nsIThreadInternal*%2C%20int%2C%20unsigned%20int%29
Since there were no Windows nightlies on Sunday (the build crashed?), the regression range is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=173248959f01&tochange=05c983938253
Reporter | ||
Comment 1•14 years ago
|
||
It's hard for me to see what might have caused this. My top guesses would probably be bug 527659 / bug 535649, bug 542318, or bug 517553.
Comment 2•14 years ago
|
||
All crashes are on Windows, none on Linux, none on Mac. Bug 542318 is a Windows-specific bug, which seems to make it more likely to be the cause.
Comment 3•14 years ago
|
||
If I download yesterday's trunk nightly, click Help > Check for Updates, close the Update window, close Firefox, and then repeat this one more time, I get this crash on close.
Reporter | ||
Comment 4•14 years ago
|
||
(In reply to comment #3)
> If I download the yesterday's trunk nightly, click Help > Check for Updates,
> close the Update window, close Firefox, and then repeat this one more time, I
> get this crash on close.
Did you install any updates, or close before the update finished downloading?
Comment 5•14 years ago
|
||
I just opened and then closed it with the close button. I didn't download or install updates.
Comment 6•14 years ago
|
||
I can confirm Ria's info and have a reliable (at least for me) STR:
1. Download Minefield 20100209 zip distribution and extract it: http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2010-02-09-07-mozilla-central/firefox-3.7a2pre.en-US.win32.zip
2. Create a new profile.
3. Launch Minefield build from Step 1 using the new profile from Step 2
4. Click Help --> Check for Updates...
5. Wait for the update to be listed then close the "Software Update" dialog *by clicking "Ask Later"*
6. File --> Exit to close Minefield
7. Repeat steps 3 to 5 two more times.
8. Firefox crashes
http://crash-stats.mozilla.com/report/index/9c87296b-a12b-4f02-9477-789ab2100210
Crashing Thread
Frame Module Signature Source
0 @0x0
1 xul.dll nsBaseAppShell::OnProcessNextEvent widget/src/xpwidgets/nsBaseAppShell.cpp:293
2 nspr4.dll _MD_CURRENT_THREAD nsprpub/pr/src/md/windows/w95thred.c:308
3 nspr4.dll nspr4.dll@0xcccf
4 xul.dll NS_ProcessPendingEvents_P obj-firefox/xpcom/build/nsThreadUtils.cpp:200
5 xul.dll mozilla::ShutdownXPCOM xpcom/build/nsXPComInit.cpp:769
6 xul.dll ScopedXPCOMStartup::~ScopedXPCOMStartup toolkit/xre/nsAppRunner.cpp:1042
7 xul.dll XRE_main toolkit/xre/nsAppRunner.cpp:3521
8 firefox.exe wmain toolkit/xre/nsWindowsWMain.cpp:120
9 firefox.exe __tmainCRTStartup obj-firefox/memory/jemalloc/crtsrc/crtexe.c:591
10 kernel32.dll BaseProcessStart
Comment 7•14 years ago
|
||
Looks like DoProcessNextNativeEvent triggers an event that releases some reference that also releases the nsBaseAppShell instance. On next loop we crash. I would not say bug 542318 is the culprit. Just updating the tree and building to check the provided STR.
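The hypothesis in this comment (an event handled inside the loop drops the last reference to the object running the loop) can be illustrated with a small standalone sketch. It is not the real nsBaseAppShell and says nothing about the actual cause of this bug: ToyAppShell, SetNativeEvent, and ProcessEvents are hypothetical names, and std::shared_ptr stands in for XPCOM reference counting. It only shows how holding a local strong reference for the duration of the call keeps the object alive across such a callback.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for an app shell; NOT the real nsBaseAppShell.
// Instances must be owned by a std::shared_ptr for shared_from_this() to work.
class ToyAppShell : public std::enable_shared_from_this<ToyAppShell> {
public:
    // A "native event" is modelled as a callback that may release the
    // owner's reference to this object.
    void SetNativeEvent(std::function<void()> aEvent) {
        mNativeEvent = std::move(aEvent);
    }

    // Runs aMaxIterations loop turns and returns how many completed.
    int ProcessEvents(int aMaxIterations) {
        // Local strong reference: if the callback drops the last external
        // reference, the object still survives until this function returns.
        std::shared_ptr<ToyAppShell> deathGrip = shared_from_this();
        int completed = 0;
        for (int i = 0; i < aMaxIterations; ++i) {
            if (mNativeEvent) {
                mNativeEvent();  // may release the owner's reference
            }
            ++mTurns;  // member access after the callback; safe due to deathGrip
            ++completed;
        }
        return completed;
    }

private:
    std::function<void()> mNativeEvent;
    int mTurns = 0;
};
```

To try it, create the object with std::make_shared, install a callback that resets the owning pointer, and call ProcessEvents; the loop finishes all its turns and the object is destroyed only once the call returns. Without the local strong reference, the second loop turn would touch a destroyed object.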
Comment 9•14 years ago
|
||
It's probably best to start testing each hourly in the range to narrow it down. Also, checking the stats, it looks like they are all definitely crashing at line 293 inside nsBaseAppShell::OnProcessNextEvent:
http://hg.mozilla.org/mozilla-central/annotate/19dbabe331ad/widget/src/xpwidgets/nsBaseAppShell.cpp#l293
--> keepGoing = DoProcessNextNativeEvent(PR_FALSE);
Some reports also confirm that shutting down was involved in the crash.
FYI: The changeset for bug 517553 is quite large: http://hg.mozilla.org/mozilla-central/rev/53308118abed (checked in at Sun Feb 07 10:52:43 2010 -0500)
Reporter | ||
Comment 10•14 years ago
|
||
(In reply to comment #9)
> Its probably best to start testing each hourly in the range to narrow it down.
Do we have hourly builds archived somewhere I don't know about?
Reporter | ||
Comment 11•14 years ago
|
||
Another possibility: this code is quite close to the DLL blocklisting code, so it could be related to http://hg.mozilla.org/mozilla-central/rev/0ddf975663a0
Comment 12•14 years ago
|
||
I can reproduce it in a debugger, UI, thanks for STR. However, it's not that simple to figure out at which moment 'this' pointer dies (if it is that). If vksaver.dll is an installable plug-in, then I wouldn't say it's related to this bug. I don't have this file on my machine.
Comment 13•14 years ago
|
||
Sorry, UI -> IU ;)
Reporter | ||
Comment 14•14 years ago
|
||
If you can reproduce it in a debugger (presumably in your own build?), can you bisect to figure out what changeset caused it?
Comment 15•14 years ago
|
||
I am using the nightly build. I don't think I can reproduce in a debug build, as it seems to be somehow related to updates, but I can try to spoof the updater somehow; I did that once in the past. BTW it doesn't seem that nsBaseAppShell is released prior to the crash: I am not getting its destructor call before the crash. On a normal non-crashing exit I do.
Reporter | ||
Comment 16•14 years ago
|
||
Using the hourly archive at: http://hourly-archive.localgho.st/hourly-archive2/mozilla-central-win32/ (which is not reflected on the hourly-archive homepage!), I reduced the range to: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=be94483da3b4&tochange=b21188a34531 which makes no sense at all.
Reporter | ||
Comment 17•14 years ago
|
||
One way I can make it make sense: if the cause was actually an earlier change, but there was a dependency bug, and the change to widget/public/Makefile.in caused it to actually get built for the first time. It would be good if someone else could confirm that range, though.
Reporter | ||
Comment 18•14 years ago
|
||
[2010-02-10 16:39:13] <philor> along the lines of your dependency thought, another good question to ask would be whether your first bad build was a clobber, and if so, how many builds before that were not clobbers
[2010-02-10 16:58:09] <philor> dbaron: and if I'm right about telling the difference, yours was, and that would take your range back to 8c84037f3ad9
Comment 19•14 years ago
|
||
If I'm right about how to tell clobbers from depends ("does the compile step start off running configure or not?"), your first-bad was a clobber, and the two before (plus the one that was red, since it doesn't count) were not, which opens your range for something that didn't actually take effect until a clobber up to http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=872dcf34dab3&tochange=b21188a34531
Reporter | ||
Comment 20•14 years ago
|
||
Except that's forgetting one thing: there are multiple build slaves and they clobber at different times. I pulled the build from the earlier changeset that you identified as a clobber: 872dcf34dab3 -- and it had the crash too. So the range we want is earlier. Any chance you could figure out which builds prior to that one were also clobbers? (Beware that there could be more than one build for some changesets.)
Reporter | ||
Comment 21•14 years ago
|
||
Prior to that build, I tested builds from 943afcbad1ac, 2caefaaa7d77, ed857569fabf, b234c7370793, 21d980d9b3a4, fc3d32011d31, and 62ade428367b, and none of them crashed.
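The narrowing described here (a run of builds that do not crash followed by the first one that does) can be written down as a short sketch. FindRegressionRange, BuildResult, and PushlogUrl are hypothetical names for illustration; the only data assumed is an oldest-to-newest list of changesets with a crashed/not-crashed flag, and the pushlog URL shape is the one used elsewhere in this bug.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One tested build: changeset hash plus whether the STR crashed it.
struct BuildResult {
    std::string changeset;
    bool crashed;
};

// Result of the search; |found| is false when no good-to-bad step exists.
struct RegressionRange {
    bool found;
    std::string lastGood;
    std::string firstBad;
};

// Scans builds ordered oldest to newest and returns the first place where a
// non-crashing build is directly followed by a crashing one.
RegressionRange FindRegressionRange(const std::vector<BuildResult>& aBuilds) {
    RegressionRange range;
    range.found = false;
    for (std::size_t i = 1; i < aBuilds.size(); ++i) {
        if (!aBuilds[i - 1].crashed && aBuilds[i].crashed) {
            range.found = true;
            range.lastGood = aBuilds[i - 1].changeset;
            range.firstBad = aBuilds[i].changeset;
            return range;
        }
    }
    return range;
}

// Builds a pushlog query in the same shape as the URLs quoted in this bug.
std::string PushlogUrl(const RegressionRange& aRange) {
    return "http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=" +
           aRange.lastGood + "&tochange=" + aRange.firstBad;
}
```

Feeding it the seven non-crashing changesets above in oldest-first order (assuming the comment lists them newest first), followed by the crashing 872dcf34dab3, yields the range 943afcbad1ac to 872dcf34dab3.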
Reporter | ||
Comment 22•14 years ago
|
||
Given where this is crashing, that range makes me want to stick to my theory that this is somehow a regression from http://hg.mozilla.org/mozilla-central/rev/0ddf975663a0 .
Blocks: 540692
Reporter | ||
Comment 23•14 years ago
|
||
I backed bug 540692 out: http://hg.mozilla.org/mozilla-central/rev/83adba230467 http://hg.mozilla.org/mozilla-central/rev/096332cd6d39 to test the theory that it's the cause of this bug. If this bug doesn't go away in tomorrow's nightly, we should reland it.
Comment 24•14 years ago
|
||
As usual, my first thought was crap: while a clobber will certainly run configure, a dep will too, if it happens to feel like it. However, it not happening to feel like it when it should have would be another way of expanding the range, so I went back and looked up which dep builds did and didn't. The clobber information is coming out of the buildbot json that nthomas was kind enough to give me, so I think I'm not mixing up which are what (assuming as seems to be the case that the first number in the filename on localgho.st is the directory where it was on stage.m.o):
1265607751-20100207214231-b21188a34531 - dep, configure
1265601266-20100207195426-be94483da3b4 - dep, no configure
1265600287-20100207193807-b76ad6cdd76e - dep, no configure
1265579422-20100207135022-872dcf34dab3 - forced clobber
1265562604-20100207091004-943afcbad1ac - dep, configure
1265562201-20100207090321-2caefaaa7d77 - dep, configure
1265558645-20100207080405-ed857569fabf - dep, no configure
1265549605-20100207053325-b234c7370793 - dep, no configure
1265549412-20100207053012-21d980d9b3a4 - dep, no configure
1265521871-20100206215111-fc3d32011d31 - dep, no configure
1265499628-20100206154028-62ade428367b - purged clobber
1265488028-20100206122708-16d4bba25a84 - dep, configure
1265485828-20100206115028-e544343970b4 - dep, configure
1265471973-20100206075933-bc6f2b598ff9 - forced clobber
1265463227-20100206053347-2e9d8868efc6 - forced clobber
1265461349-20100206050229-173248959f01 - nightly
1265454003-20100206030003-ada1992ccacb - forced clobber
1265453052-20100206024412-d72639947a60 - forced clobber
1265446534-20100206005534-2d873df39b6a - purged clobber
1265428968-20100205200248-72d91445b838 - purged clobber
So for the "it takes a clobber" theory, not crashing in 62ade428367b puts the range at
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=62ade428367b&tochange=872dcf34dab3
and for the configure theory, not crashing in 943afcbad1ac puts the range at the same "last build that doesn't crash to first that does":
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=943afcbad1ac&tochange=872dcf34dab3
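For anyone repeating this kind of analysis, the entries in the list above can be split mechanically. The sketch below assumes each entry has the shape "number-buildid-changeset - kind" as shown in this comment; ArchivedBuild, ParseArchiveLine, and IsClobber are hypothetical names, and the meaning given to each field is an assumption based only on what the comment says.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Fields of one archive entry; the interpretation of each is an assumption.
struct ArchivedBuild {
    bool valid;
    std::string directoryStamp;  // first number (directory name on the archive)
    std::string buildId;         // second number (looks like YYYYMMDDHHMMSS)
    std::string changeset;       // short changeset hash
    std::string kind;            // e.g. "dep, no configure", "forced clobber"
};

// Splits "number-buildid-changeset - kind" into its parts.
// Returns an entry with valid == false if the line does not have that shape.
ArchivedBuild ParseArchiveLine(const std::string& aLine) {
    ArchivedBuild result;
    result.valid = false;
    const std::string separator = " - ";
    const std::size_t sep = aLine.find(separator);
    if (sep == std::string::npos) {
        return result;
    }
    const std::string name = aLine.substr(0, sep);
    const std::size_t firstDash = name.find('-');
    if (firstDash == std::string::npos) {
        return result;
    }
    const std::size_t secondDash = name.find('-', firstDash + 1);
    if (secondDash == std::string::npos) {
        return result;
    }
    result.directoryStamp = name.substr(0, firstDash);
    result.buildId = name.substr(firstDash + 1, secondDash - firstDash - 1);
    result.changeset = name.substr(secondDash + 1);
    result.kind = aLine.substr(sep + separator.size());
    result.valid = true;
    return result;
}

// True when the label itself says "clobber". Entries labelled only
// "nightly" are not classified either way by this helper.
bool IsClobber(const ArchivedBuild& aBuild) {
    return aBuild.valid && aBuild.kind.find("clobber") != std::string::npos;
}
```

With the entries parsed this way, filtering for clobber builds or for dep builds that ran configure becomes a simple loop over the list.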
Comment 25•14 years ago
|
||
Yes, Phil is correct. And apologies for being unclear in comment 3; I didn't realize that there was room for misunderstanding. I had been busy with it for at least 15 minutes before I realized where the crash came from (a crash in a new profile is rare, which is why I kept searching), and spending one more minute on more detailed STR wouldn't have been too much effort to finish this properly. Next time, just ask for detailed STR: if I know how to reproduce it, no one needs to guess, which is a waste of time.
Reporter | ||
Comment 26•14 years ago
|
||
So it seems like backing out the vksaver.dll change didn't fix this. My next most likely explanation is that something in the NSS upgrade has a bad interaction with our DLL blocklisting code. Did NSS or PSM change anything about the loading or unloading of shared libraries? That said, I think I have an idea about how to repro in a debug build, but that might not be practical over the Internet connection I have right now.
Comment 27•14 years ago
|
||
Bug 527659 asks to upgrade mozilla-central from beta to a release candidate of NSS 3.12.6. However, before I do that, I am considering backing out my recent NSS/PSM landings for a period of 2-3 hours. I'd hope this gives us a sufficient period of time to have hourly builds that allow us to confirm whether NSS is the culprit of this bug.
Comment 28•14 years ago
|
||
I have not yet done what I proposed in comment 27. Philor suggested on IRC that a 2-3 hour period during European timezone might not give me what I want. I decided to reproduce it myself using nightly builds. I used a fresh profile, but I disabled check-for-default-browser and, to minimize activity, changed the startup page to about:blank. I used the build mentioned in comment 6 (20100209) and the most recent I could find (20100211). I have two Desktop shortcuts, one per build, both using args -P thatprofile.
Here are my results:
- I can often reproduce using build 0209
- using 0209 I sometimes crash on exit, even without having checked for update
- even after running "check-updates" several times, I can never reproduce using build 0211
I conclude:
- either the bug has to do with status "updates are available"
- or the bug is gone in 0211
Reporter | ||
Comment 29•14 years ago
|
||
The number of crashes seems to have gone down dramatically in the builds of Feb. 10 and 11, and then disappeared in the builds of Feb. 12 (assuming there were any; not necessarily a reliable assumption). The current histogram of this crash is:
Feb. 8 703
Feb. 9 618
Feb. 10 30
Feb. 11 16
It's not clear why this would have happened, though.
Comment 30•14 years ago
|
||
There were some Windows updates deployed on 2/10/2010 (at least what is on my xp machine):
http://support.microsoft.com/?kbid=978251
http://support.microsoft.com/?kbid=978037
http://support.microsoft.com/?kbid=977914
http://support.microsoft.com/?kbid=975713
http://support.microsoft.com/?kbid=975560
http://support.microsoft.com/?kbid=978262
http://support.microsoft.com/?kbid=978706
http://support.microsoft.com/?kbid=977165
http://support.microsoft.com/?kbid=971468
Comment 31•14 years ago
|
||
The following STR also produce the bug:
- start a new profile, permanently deny the "default browser" dialog
- once the start page http://www.mozilla.org/projects/minefield/ has loaded, wait until the processor is quiet and close the browser (choose Quit in the "quit browser" dialog)
Repeat these steps. Between the second and the fourth close it will crash. This is a typical "very clean profile" crash. My default profile does not crash.
I discovered accidentally that if I put this pref in the profile:
user_pref("browser.bookmarks.autoExportHTML", true);
it will not crash.
There has been a temporary stop in these crashes. With the latest STR:
1eb1668ed9f6 crash
4ba8ccb0cadc no crash
Query: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=1eb1668ed9f6&tochange=4ba8ccb0cadc
Comment 32•14 years ago
|
||
And also: with the latest hourly 92a84cecf4f1 and the STR from comment 31 it is still crashing.
The last build without crash (here): d43741a452c8
And the crash started again with: 92a84cecf4f1
Query: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=d43741a452c8&tochange=92a84cecf4f1
Comment 33•14 years ago
|
||
Phil Ringnalda's range http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=943afcbad1ac&tochange=872dcf34dab3 is still correct also with the steps from comment 31.
Comment 34•14 years ago
|
||
I found some tiny ranges in between! I hope that someone still understands it:
Not crashing: 46663814d764
Crashing: eafd8a60dfd8
Crashing: c492fb6295d1
Not crashing: 11006dbfb80e
I re-checked every non-crasher thoroughly a couple of times.
Comment 35•14 years ago
|
||
(In reply to comment #34)
> Not crashing: 46663814d764
> Crashing: eafd8a60dfd8
Query: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=46663814d764&tochange=eafd8a60dfd8
> Crashing: c492fb6295d1
> Not crashing: 11006dbfb80e
Query: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=c492fb6295d1&tochange=11006dbfb80e
Reporter | ||
Comment 36•14 years ago
|
||
This appears to be back in today's nightly.
Comment 37•14 years ago
|
||
Would be cool to figure out which slave (vm) was making which build (non-crashing and crashing). Maybe it is a hw failure or config flaw of a particular build machine...?
Comment 38•14 years ago
|
||
Tryserver builds don't crash, at least not the ones I tried.
Comment 39•14 years ago
|
||
But all 7 latest hourlies are crashing.
Comment 40•14 years ago
|
||
crashed, fresh install, into test profile with no addons except the ridiculous MS .net assistant. bp-cb0811a9-6df3-482a-bb93-c7e712100217
Comment 41•14 years ago
|
||
I have the impression that it is the speed. If Firefox can close very fast, it crashes. If it has more tasks to do while closing, it does not crash. If I open and close, open and close, open and close, in quick succession, it does not crash, because the processor is very busy with more tasks. Only if the processor has not much to do at the moment, it crashes.
Comment 42•14 years ago
|
||
Can't reproduce it anymore with the latest nightly on Windows Vista.
Comment 43•14 years ago
|
||
Crash still exists. Seems this crash is related to more than just the updater. Just crashed today simply restarting. I had installed an extension about an hour before and continued surfing. When I finally restarted, it crashed. http://crash-stats.mozilla.com/report/index/8d8b4b55-4086-48b8-a15f-5e82b2100222
Comment 44•14 years ago
|
||
crash today too. bp-cb0811a9-6df3-482a-bb93-c7e712100217 fwiw, I have FF updates disabled, but not add-ons
Comment 45•14 years ago
|
||
ignore comment 44. crash reporter helper tricked me with an old crash. my last one was rather, MirrorWrappedNativeParent bp-d0c54c42-53b4-4425-8f7b-cf9e42100222
Comment 46•14 years ago
|
||
If it helps at all, this appears to be only a shutdown crash. I'm trying to load a minidump to see if I get a better stack (we're skipping one frame).
Comment 47•14 years ago
|
||
bent, can you record with the steps in comment #31 and such?
Assignee: nobody → bent.mozilla
Assignee | ||
Comment 48•14 years ago
|
||
Oy. Score another for record and replay!
Attachment #428587 -
Flags: superreview?(jonas)
Attachment #428587 -
Flags: review?(jonas)
Comment on attachment 428587 [details] [diff] [review] Patch Sorry!
Attachment #428587 -
Flags: superreview?(jonas)
Attachment #428587 -
Flags: superreview+
Attachment #428587 -
Flags: review?(jonas)
Attachment #428587 -
Flags: review+
Assignee | ||
Comment 50•14 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/e2fe146316cf
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Reporter | ||
Comment 51•14 years ago
|
||
This presumably also fixes bug 537417.
Updated•14 years ago
|
Target Milestone: --- → mozilla1.9.3a2
Updated•13 years ago
|
Crash Signature: [@ @0x0 | nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal*, int, unsigned int) ]