Closed
Bug 778460
Opened 13 years ago
Closed 13 years ago
Tinderbox spidermonkey-rootanalysis builds broken since 2012-07-17
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
mozilla18
People
(Reporter: emorley, Assigned: sfink)
References
Details
(Whiteboard: [js:p2])
Attachments
(1 file)
Tinderbox mozilla-inbound_linux64-debug_spidermonkey-rootanalysis builds on inbound tip are segfaulting:
https://tbpl.mozilla.org/php/getParsedLog.php?id=13941541&tree=Mozilla-Inbound
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&noignore=1&jobname=spidermonkey&rev=29bff59d3bbe
Tracing this back, the seg faults started in this range:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=56a19b273c2b&tochange=b6291bb72294
Prior to that, they were still busted, but with a variety of jit_test.py failures:
https://tbpl.mozilla.org/php/getParsedLog.php?id=13846927&tree=Mozilla-Inbound
The range for these failures starting is:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=4a25a6c994fd&tochange=2e8d41e9f8ef
Given that these builds have been failing for 11 days without anyone noticing (and our current infra-load contained state of late), do we even need these builds?
Reporter | ||
Comment 1•13 years ago
|
||
> segfaulting:
> https://tbpl.mozilla.org/php/getParsedLog.php?id=13941541&tree=Mozilla-
{
make -C jsapi-tests check
make[1]: Entering directory `/builds/slave/m-in-lnx64-dbg-spidermonkey-rootanalysis/objdir/js/jsapi-tests'
/builds/slave/m-in-lnx64-dbg-spidermonkey-rootanalysis/objdir/dist/bin/run-mozilla.sh /builds/slave/m-in-lnx64-dbg-spidermonkey-rootanalysis/objdir/dist/bin/jsapi-tests
make[1]: Leaving directory `/builds/slave/m-in-lnx64-dbg-spidermonkey-rootanalysis/objdir/js/jsapi-tests'
make[1]: *** [check] Segmentation fault
}
> Prior to that, they were still busted, but with a variety of jit_test.py
> failures:
> https://tbpl.mozilla.org/php/getParsedLog.php?id=13846927&tree=Mozilla-
> Inbound
(Too many to paste here)
Comment 2•13 years ago
|
||
The rooting analysis builds work by causing jit-tests to crash or assert when someone introduces new rooting hazards (which aren't a problem for the product we are currently shipping, but will become a big problem when we start doing generational/compacting GC). A couple weeks ago I got them green, then noticed a week or so ago they were orange again. I didn't do anything, because keeping the builds green is the job of the person doing the pushes, and I do not have time to watch for / track down bustage.
Turning these builds orange is not currently cause for backout / hotfix, as the builds are not shown by default and are not run on try server. My understanding is the latter is a prerequisite for the former. Steve has a patch in bug 775355 to fix this, but it's been sitting for a week with no activity.
Reporter | ||
Comment 3•13 years ago
|
||
Thank you for clarification as to what these builds are for - I had no idea up until now :-)
> Turning these builds orange is not currently cause for backout / hotfix, as
> the builds are not shown by default and are not run on try server. My
> understanding is the latter is a prerequisite for the former. Steve has a
> patch in bug 775355 to fix this, but it's been sitting for a week with no
> activity.
It's ok, I wasn't proposing backing anything out :-)
I know the SM hidden builds often get busted for a few days, so don't tend to give them a second thought. I was just surprised to keep on pressing the down arrow on TBPL and seeing some of them busted for 2-4 weeks (latter being the other bugs filed at the same time as this one) - so thought it would be worth filing so it was on someone's radar.
Blocks: 773059
Assignee | ||
Updated•13 years ago
|
Assignee: general → sphink
Reporter | ||
Updated•13 years ago
|
Summary: Tinderbox spidermonkey-rootanalysis builds busted for 11 days → Tinderbox spidermonkey-rootanalysis builds broken since 2012-07-17
Assignee | ||
Comment 4•13 years ago
|
||
I think I've finally tracked down part of what's going on.
I could not reproduce locally. Still not sure why. I got access to a slave and could reproduce there, although the tree I built crashed in a different place.
The gdb installed on the slaves is old and gives internal errors all over the place. Upgrading to a newer gdb fixes that, but only function names are given. No locals or line numbers are available. I created something similar to an --enable-shared-js build, and was able to get full debug info, but then the bug did not reproduce. I ended up debugging without debug info.
I believe the root problem is that we have not added the --enable-rooting-analysis flag to the toplevel configure.in, so gc/Root.h is getting compiled with different flags. Without JSGC_ROOT_ANALYSIS, the Rooted<T> structure is different (it contains no 'stack' or 'prev' fields.) An inlined operator= will change the 'ptr' field, which is at offset zero without JSGC_ROOT_ANALYSIS. With it, 'stack' sits at that offset, and the debug check will dereference *stack. For pointer types, that's not a big deal, and should lead to an assert failure. For Rooted<Value>, this will dereference a Value interpreted as a pointer, and will probably crash.
That's clear, but what I'm confused about is that my non-JSGC_ROOT_ANALYSIS operator= is getting called from within js::Interpret, which should see JSGC_ROOT_ANALYSIS and therefore have the 'stack' field. I'm wondering if that operator= was emitted into libxul.so and the linker is choosing it or something? Very confusing. (gdb only sees the |stack|-less one for Rooted<Value>, while it sees the |stack|-ful one for Rooted<JSObject*>.)
It is possible that differing amounts of inlining explains why I can't reproduce it locally. Or different effective weakness of various symbols. Or something.
Anyway, this will hopefully fix it all so I don't have to unravel everything that's going on.
Assignee | ||
Comment 5•13 years ago
|
||
Unless I've fooled myself again by doing a build that isn't really doing rooting analysis, which is not only possible but likely, http://tbpl.mozilla.org/?tree=Try&rev=1748cefd2cf8 says this change makes it green again.
Attachment #658910 -
Flags: review?(terrence)
Assignee | ||
Updated•13 years ago
|
Attachment #658910 -
Flags: review?(terrence) → review?(ted.mielczarek)
Comment 6•13 years ago
|
||
Comment on attachment 658910 [details] [diff] [review]
Copy the various GC-related --enable-* configure options into toplevel configure.in to get correct defines
Review of attachment 658910 [details] [diff] [review]:
-----------------------------------------------------------------
Bleh, I hate the explosion of configure options here. Are we going to get rid of these in the near future?
Attachment #658910 -
Flags: review?(ted.mielczarek) → review+
Assignee | ||
Comment 7•13 years ago
|
||
(In reply to Ted Mielczarek [:ted] from comment #6)
> Bleh, I hate the explosion of configure options here. Are we going to get
> rid of these in the near future?
"Near future"? Hm, not unless your definition is pretty loose.
But yes, exact rooting will become the default once it works 'n stuff. Dunno if we'll switch it to --enable-conservative-rooting then, but even if so, we'll probably break it before too long and have to remove it anyway. --enable-rooting-analysis is probably removable once exact rooting works. I would think that both incremental and conservative GC would eventually be compiled in all the time and controllable via preferences.
I'll file a bug now for incremental, since I don't know what point there is in turning it off now: bug 789577.
Comment 8•13 years ago
|
||
Okay, just in my experience every single configure option is a sweet sweet honeypot for random people building Firefox to fiddle with, which inevitably results in broken builds. I much prefer that we don't add more knobs to twiddle unless we absolutely need them.
Updated•13 years ago
|
Whiteboard: [js:p2]
Comment 9•13 years ago
|
||
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
You need to log in
before you can comment on or make changes to this bug.
Description
•