Brian, this is another webconsole intermittent that is pretty high in the lists. Seems to have gotten worse starting around Jan 6: https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1236316&startday=2016-01-04&endday=2016-01-11&tree=trunk
Sample stack trace from the crash (from https://treeherder.mozilla.org/logviewer.html#?repo=fx-team&job_id=6409821)
Talked with :jandem about this, and it sounds like it's possibly a GC-related bug somewhere (likely debugger-related). Given that, and since this started failing more frequently around Jan 6, it could well be related to bug 1132501. I haven't been able to reproduce yet with: `./mach mochitest devtools/client/webconsole/test/browser_webconsole_split.js --run-until-failure` but I'll keep trying.
Flags: needinfo?(bgrinstead) → needinfo?(jdemooij)
See Also: → 1132501
FYI I wasn't able to reproduce locally on Win8 opt or debug builds with multiple tries of --run-until-failure
I just got this backtrace, but I don't think it is the one we are hitting here:

#0 arena_bin_nonfull_run_get (bin=0x7ffff6c00598, arena=0x7ffff6c00040) at /mnt/desktop/gecko/memory/mozjemalloc/jemalloc.c:3992
#1 arena_bin_malloc_hard (bin=0x7ffff6c00598, arena=0x7ffff6c00040) at /mnt/desktop/gecko/memory/mozjemalloc/jemalloc.c:4042
#2 arena_malloc_small (zero=false, size=160, arena=0x7ffff6c00040) at /mnt/desktop/gecko/memory/mozjemalloc/jemalloc.c:4232
#3 arena_malloc (arena=0x7ffff6c00040, size=<optimized out>, zero=<optimized out>) at /mnt/desktop/gecko/memory/mozjemalloc/jemalloc.c:4304
#4 0x00000000004142a3 in je_malloc (size=160) at /mnt/desktop/gecko/memory/mozjemalloc/jemalloc.c:6183
#5 0x0000000000405484 in moz_xmalloc (size=160) at /mnt/desktop/gecko/memory/mozalloc/mozalloc.cpp:83
#6 0x00007fffeb6cf0a6 in operator new (size=160) at /mnt/desktop/gecko/obj-firefox-opt/dist/include/mozilla/mozalloc.h:186
#7 nsCSSFrameConstructor::FrameConstructionItemList::AppendItem(nsCSSFrameConstructor::FrameConstructionData const*, nsIContent*, nsIAtom*, int, PendingBinding*, already_AddRefed<nsStyleContext>&&, bool, nsTArray<nsIAnonymousContentCreator::ContentInfo>*) (this=this@entry=0x7fffffff7970, aFCData=0x7fffee70b3f8 <nsCSSFrameConstructor::FindXULTagData(mozilla::dom::Element*, nsIAtom*, int, nsStyleContext*)::sXULTagData+8>, aContent=aContent@entry=0x7fffda0048f0, aTag=aTag@entry=0x7fffe1a6c2e0, aNameSpaceID=9, aPendingBinding=aPendingBinding@entry=0x7fffc8d24fe0, aStyleContext=aStyleContext@entry=<unknown type in /mnt/desktop/gecko/obj-firefox-opt/dist/bin/libxul.so, CU 0x15342b29, DIE 0x15552d8b>, aSuppressWhiteSpaceOptimizations=aSuppressWhiteSpaceOptimizations@entry=false, aAnonChildren=aAnonChildren@entry=0x0) at /mnt/desktop/gecko/layout/base/nsCSSFrameConstructor.h:840
#8 0x00007fffeb6ac522 in nsCSSFrameConstructor::AddFrameConstructionItemsInternal (this=this@entry=0x7fffbe15a050, aState=..., aContent=aContent@entry=0x7fffda0048f0, aParentFrame=aParentFrame@entry=0x7fffc1e2f5d8, aTag=0x7fffe1a6c2e0, aNameSpaceID=9, aSuppressWhiteSpaceOptimizations=aSuppressWhiteSpaceOptimizations@entry=false, aStyleContext=aStyleContext@entry=0x7fffc1ba6b20, aFlags=aFlags@entry=3, aAnonChildren=aAnonChildren@entry=0x0, aItems=...) at /mnt/desktop/gecko/layout/base/nsCSSFrameConstructor.cpp:5687

arena_bin_nonfull_run_get (bin=0x7ffff6c00598, arena=0x7ffff6c00040) at /mnt/desktop/gecko/memory/mozjemalloc/jemalloc.c:3992
3992 run->bin = bin;
Terrence, Steve, can either of you help identify what's going on in this stack trace? https://treeherder.mozilla.org/logviewer.html#?repo=fx-team&job_id=6409821#L12294 It seems GC or JIT related. Thanks!
As I said on IRC yesterday, we're likely crashing because we have a poisoned GC pointer (all crashes have values like 0xfffc2b2b2b2b2b2b in a register).

(In reply to Brian Grinstead [:bgrins] from comment #7)
> I haven't been able to reproduce yet with: `./mach mochitest
> devtools/client/webconsole/test/browser_webconsole_split.js
> --run-until-failure` but I'll keep trying.

You probably have to run more tests; these GC-related crashes often show up some time after the actual bug.
I'm doing some Try pushes to see if this fails frequently enough to do Try debugging.
(In reply to Jan de Mooij [:jandem] from comment #15)
> I'm doing some Try pushes to see if this fails frequently enough to do Try
> debugging.

Ugh. OS X 10.10 opt dt2 - 5 out of 8 runs are orange. Of those 5 oranges, 3 are this DoTypeMonitorFallback signature and 2 are in Interpret. As I mentioned elsewhere, I think these are caused by the same bug.
A try run with some debug logs to try to narrow that down within the devtools codebase... https://treeherder.mozilla.org/#/jobs?repo=try&revision=5750aab5f318
OK, after a lot of debugging and hair pulling, I think I know what's going on. Try servering a patch right now. Unfortunately, this margin is too small to describe the problem. More seriously, ActivationEntryMonitor's constructor should not trigger GC.
(In reply to Jan de Mooij [:jandem] from comment #18)
> Try servering a patch right now.

Without the patch (5 orange, 3 green):
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2ad1393022d1&group_state=expanded&filter-searchStr=OS%20X%2010.10%20opt%20Mochitest%20Mochitest%20DevTools%20Browser%20Chrome%20M%28dt2%29

With the patch (0 orange, 12 green):
https://treeherder.mozilla.org/#/jobs?repo=try&revision=33ddd6727bfd&group_state=expanded&filter-searchStr=OS%20X%2010.10%20opt%20Mochitest%20Mochitest%20DevTools%20Browser%20Chrome%20M%28dt2%29
For now, suppress GC in the ActivationEntryMonitor constructors. The comments explain why we should not GC there.
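For readers unfamiliar with the approach, suppressing GC for the duration of a constructor is usually done with a scoped (RAII) guard. The following is only a minimal sketch of that idea, not the attached patch: `ToyRuntime`, `AutoSuppressGCSketch`, and `EntryMonitorSketch` are hypothetical stand-ins invented for illustration and are not SpiderMonkey's real classes.

```cpp
#include <cassert>

// Hypothetical stand-in for a JS runtime; NOT SpiderMonkey's real API.
struct ToyRuntime {
    int suppressGC = 0;  // > 0 means collections are currently suppressed
    int gcCount = 0;     // number of collections that actually ran

    // Stands in for any allocation path that might decide to collect.
    void maybeGC() {
        if (suppressGC == 0)
            ++gcCount;
    }
};

// Scoped guard: GC is suppressed from construction until destruction.
class AutoSuppressGCSketch {
    ToyRuntime& rt_;

  public:
    explicit AutoSuppressGCSketch(ToyRuntime& rt) : rt_(rt) { ++rt_.suppressGC; }
    ~AutoSuppressGCSketch() { --rt_.suppressGC; }
    AutoSuppressGCSketch(const AutoSuppressGCSketch&) = delete;
    AutoSuppressGCSketch& operator=(const AutoSuppressGCSketch&) = delete;
};

// Hypothetical monitor whose constructor runs code that may allocate.
struct EntryMonitorSketch {
    explicit EntryMonitorSketch(ToyRuntime& rt) {
        AutoSuppressGCSketch suppress(rt);  // no collection inside this scope
        rt.maybeGC();                       // stands in for an allocating callback
    }
};

// Returns how many collections ran while the monitor was being constructed.
int collectionsDuringConstruction() {
    ToyRuntime rt;
    EntryMonitorSketch monitor(rt);
    return rt.gcCount;
}
```

Because the guard restores the counter in its destructor, collections are allowed again as soon as the constructor returns, so callers do not have to re-enable anything by hand.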
Assignee: nobody → jdemooij
Status: NEW → ASSIGNED
Attachment #8709415 - Flags: review?(nfitzgerald)
Product: Firefox → Core
It's not clear to me if we need this on Aurora or not, but please set the bug to affected if we do since Gecko 45 is our next ESR version as well.
I think we want this on Aurora as well. FWIW, this bug (and the similar bug 1225176) shows we should not add code to SpiderMonkey that is not tested in the shell. If we had exposed ActivationEntryMonitor to the shell, somehow, the fuzzers would have found this a long time ago.
Comment on attachment 8709415 [details] [diff] [review]
Patch

Review of attachment 8709415 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, jandem!

::: js/src/jit/BaselineJIT.cpp
@@ +138,5 @@
> if (data.osrFrame)
> data.osrFrame->setRunningInJit();
>
> +#ifdef DEBUG
> + nogc.reset();

Is `nogc` a `Maybe` so that you can reset it right here? I don't find this pattern particularly lucid; however, the alternatives that come to mind don't seem that much better:

* Add an `AutoAssertOnGC::done()` method that ends its observation?
* Move `nogc` into a new scope-restricted instance that runs the dtor here and makes the outer dtor a no-op?

Perhaps best just to add a comment explaining why `nogc` is a `Maybe` up above.
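The pattern under discussion, holding a scoped guard in an optional wrapper so that it can be ended before the enclosing scope exits, can be sketched as follows. This is an illustration only, using `std::optional` in place of `mozilla::Maybe`; `ToyRuntime`, `AutoAssertNoGCSketch`, and `enterJitSketch` are hypothetical names and do not reproduce the patch.

```cpp
#include <cassert>
#include <optional>

// Hypothetical runtime that records whether a GC happened inside a no-GC region.
struct ToyRuntime {
    int noGCDepth = 0;
    bool violated = false;

    void gc() {
        if (noGCDepth > 0)
            violated = true;  // a real debug build would assert here instead
    }
};

// Scoped guard marking a region in which GC must not happen.
class AutoAssertNoGCSketch {
    ToyRuntime& rt_;

  public:
    explicit AutoAssertNoGCSketch(ToyRuntime& rt) : rt_(rt) { ++rt_.noGCDepth; }
    ~AutoAssertNoGCSketch() { --rt_.noGCDepth; }
    AutoAssertNoGCSketch(const AutoAssertNoGCSketch&) = delete;
    AutoAssertNoGCSketch& operator=(const AutoAssertNoGCSketch&) = delete;
};

// Returns true if a GC was observed inside the no-GC region.
bool enterJitSketch(ToyRuntime& rt) {
    // Wrapped in an optional so the guard can be destroyed mid-function.
    std::optional<AutoAssertNoGCSketch> nogc;
    nogc.emplace(rt);

    // ... setup that must not GC would go here ...

    nogc.reset();  // runs the guard's destructor now, ending the no-GC region
    rt.gc();       // from this point on, collecting is allowed again
    return rt.violated;
}
```

The trade-off raised in the review applies here too: the optional makes the early `reset()` possible, but a reader has to know why the guard is not a plain local, which is why a comment at the declaration helps.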
Attachment #8709415 - Flags: review?(nfitzgerald) → review+
Comment on attachment 8709415 [details] [diff] [review]
Patch

Approval Request Comment
[Feature/regressing bug #]: Bug 1160307.
[User impact if declined]: Crashes when using devtools.
[Describe test coverage new/current, TreeHerder]: Treeherder confirms it fixes one of our top oranges.
[Risks and why]: Very low risk; should only affect devtools-related code.
[String/UUID change made/needed]: None.
Attachment #8709415 - Flags: approval-mozilla-aurora?
Comment on attachment 8709415 [details] [diff] [review]
Patch

Fix a crash, taking it.
Attachment #8709415 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+