Closed Bug 933882 Opened 6 years ago Closed 6 years ago

Invalidate scripts and discard JIT code instead of doing AutoDebugModeGC

Categories

(Core :: JavaScript Engine, defect)

x86_64
Linux
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla28
Tracking Status
firefox26 --- wontfix
firefox27 + fixed
firefox28 --- fixed

People

(Reporter: billm, Assigned: shu)

References

Details

(Whiteboard: [qa-])

Attachments

(3 files, 4 obsolete files)

We're having a problem in bug 815603 where people who have Firebug enabled are seeing long pauses when switching tabs. Firebug changes the debug mode during tab switch, which causes a full AutoDebugModeGC in these cases. These GCs can be ~500ms, which is a long time to wait to switch tabs.

Shu thinks we don't need a GC here anymore. Instead, we can get away with a CellIter over JSScripts and something to discard JIT code. That should mostly fix the Firebug issue. In the long term, switching Firebug to JSD2 should make things even better by allowing us to focus on a single zone. And bug 716647 will allow us to get away without discarding JIT code. But we should do this short-term thing now.
(In reply to Bill McCloskey (:billm) from comment #0)

> should make things even better by allowing us to focus on a single zone. And
> bug 716647 will allow us to get away without discarding JIT code.

Not true unfortunately. Bug 716647 will still invalidate and recompile code, it'll just try to do it lazily + with live frames on the stack. I plan to add another mode to Debugger with different guarantees that requires the user to explicitly throw away code if he wants to do slow things like onStep.
It would be great if the fix for this made it already into the next upcoming Firefox release.

Sebastian
Assignee: nobody → shu
Status: NEW → ASSIGNED
Comment on attachment 826280 [details] [diff] [review]
wip Invalidate JIT code instead of doing full GC on debug mode toggle. (r=?)

Given how we don't really depend on jsanalyze for much anymore, seems like we can just throw away JIT code for now instead of doing a full GC on debug mode toggle.

Brian, could you take a quick look and tell me if this makes sense?

Note that this is meant as a stopgap for Firebug until bug 716647 is fixed.
Attachment #826280 - Flags: feedback?(bhackett1024)
Shu, would that patch apply safely to Aurora or Beta?  Which of them would you be confident it can be backported to?
Interestingly, turning off debug mode is much faster than turning it on, with this patch.  Is that expected?
(In reply to Boris Zbarsky [:bz] from comment #5)
> Shu, would that patch apply safely to Aurora or Beta?  Which of them would
> you be confident it can be backported to?

Should work for Aurora. I'm not sure about Beta.
Brian, seems like you're the best candidate for r? here.
Attachment #827253 - Flags: review?(bhackett1024)
Attachment #826280 - Attachment is obsolete: true
Attachment #826280 - Flags: feedback?(bhackett1024)
Comment on attachment 827253 [details] [diff] [review]
Invalidate JIT code instead of doing full GC on debug mode toggle. (r=?)

Cancelling r? Still crashy.
Attachment #827253 - Flags: review?(bhackett1024)
Comment on attachment 827253 [details] [diff] [review]
Invalidate JIT code instead of doing full GC on debug mode toggle. (r=?)

Review of attachment 827253 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/Ion.cpp
@@ +2723,5 @@
> +        return;
> +
> +    // Invalidate the stack if any compartments toggled from on->off, because
> +    // we allow scripts to be on stack when turning off debug mode.
> +    bool invalidateStack = needInvalidation_ == ToggledOff;

Since you unconditionally destroy the jitcode at the end of this function, I think you always need to invalidate the stack.  Otherwise FinishInvalidation and FinishDiscardBaselineScript will destroy code that is still on the stack.
(In reply to Brian Hackett (:bhackett) from comment #10)
> Comment on attachment 827253 [details] [diff] [review]
> Invalidate JIT code instead of doing full GC on debug mode toggle. (r=?)
> 
> Review of attachment 827253 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/jit/Ion.cpp
> @@ +2723,5 @@
> > +        return;
> > +
> > +    // Invalidate the stack if any compartments toggled from on->off, because
> > +    // we allow scripts to be on stack when turning off debug mode.
> > +    bool invalidateStack = needInvalidation_ == ToggledOff;
> 
> Since you unconditionally destroy the jitcode at the end of this function, I
> think you always need to invalidate the stack.  Otherwise FinishInvalidation
> and FinishDiscardBaselineScript will destroy code that is still on the stack.

It's not possible to turn *on* debug mode for compartments with live JS frames on the stack at the moment (it'll just fail), but it's possible to turn *off* debug mode for compartments with live frames. Since the discarding is filtered by compartment or zone, that's why I only walk the stack if we're toggling on -> off.
Attachment #827822 - Flags: review?(bhackett1024)
Attachment #827253 - Attachment is obsolete: true
See also comment 81 on bug 815603 for my results of testing Boris' try-build.

Sebastian
Had a condition wrong.
Attachment #827986 - Flags: review?(bhackett1024)
Attachment #827822 - Attachment is obsolete: true
Attachment #827822 - Flags: review?(bhackett1024)
Depends on: 935470
Attachment #827986 - Flags: review?(bhackett1024) → review+
Christian,

ASAN was failing due to OOM, and it's highly likely it's because this patched changed GC behavior. What's the memory requirements for ASAN? This whole point of this patch is to *put off* doing a massive GC on debug mode toggle, but it seems without it we OOM on ASAN tests. Can the tests themselves request a GC somehow? Or what am I supposed to do to fix the orange?
Flags: needinfo?(choller)
You could add some explicit GC calls to these tests.  In browser-chrome, how many tests toggle debug mode and how many times?

But I'd prefer if you figured out why the GC heuristics are failing to notice "we're almost out of memory, we should GC now!".  Might this patch have introduced a leak or poor accounting of memory held alive by GC objects?
It is unlikely this introduced leaks. This only discards jit code. It is more likely the existing GCs were masking leaks.
https://hg.mozilla.org/integration/mozilla-inbound/rev/2abeb02c4777
https://hg.mozilla.org/integration/mozilla-inbound/rev/c6981912ff87

I papered over the OOMs by requesting GCs from the tests. Try looked cleanish, all oranges seem to be existing intermittents: https://tbpl.mozilla.org/?tree=Try&rev=0d73890e342c

Also opened bug 937726 for investigating the OOMs.
See Also: → 937726
https://hg.mozilla.org/mozilla-central/rev/2abeb02c4777
https://hg.mozilla.org/mozilla-central/rev/c6981912ff87
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
Shu: as per comment 5 and comment 7, do you plan to uplift this fix to Aurora 28?
Flags: needinfo?(shu)
(In reply to Chris Peterson (:cpeterson) from comment #22)
> Shu: as per comment 5 and comment 7, do you plan to uplift this fix to
> Aurora 28?

When we put out the OOM fire, I'll look. Busy with that atm.
Flags: needinfo?(shu)
Depends on: 939036
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 941500, 937913, 941549
Sheriffs report 3 shutdown leaks have started showing up, or showing up more frequently.  These are all cases where we leak a pair of nsGlobalWindows (inner and outer presumably?) associated with a debugger test, on 10.8 opt only.  Pretty weird, but it does sound like it could be a regression from this.
Retriggers clearly point the shu's push as the cause of the spike.
https://hg.mozilla.org/mozilla-central/rev/9ee5c2664d78
https://hg.mozilla.org/mozilla-central/rev/b060467f52d6
Status: REOPENED → RESOLVED
Closed: 6 years ago6 years ago
Flags: in-testsuite-
Resolution: --- → FIXED
It seems this (well, a recent commit to jscompartment.cpp & Debugger.cpp, so likely https://hg.mozilla.org/mozilla-central/rev/9ee5c2664d78) broke non-ion builds: http://buildbot.rhaalovely.net/builders/mozilla-central-sparc64/builds/620/steps/build/logs/stdio (fails when linking js shell)

../libjs_static.a(jscompartment.o)(.text+0x3ed4): In function `JSCompartment::removeDebuggee(js::FreeOp*, js::GlobalObject*, js::detail::HashTable<js::GlobalObject* const, js::HashSet<js::GlobalObject*, js::DefaultHasher<js::GlobalObject*>, js::SystemAllocPolicy>::SetOps, js::SystemAllocPolicy>::Enum*)':
/home/buildslave/mozilla-central-sparc64/build/js/src/jscompartment.cpp:824: undefined reference to `js::AutoDebugModeInvalidation::~AutoDebugModeInvalidation()'
../libjs_static.a(jscompartment.o)(.text+0x53a4): In function `JSCompartment::addDebuggee(JSContext*, js::GlobalObject*)':
/home/buildslave/mozilla-central-sparc64/build/js/src/jscompartment.cpp:795: undefined reference to `js::AutoDebugModeInvalidation::~AutoDebugModeInvalidation()'
../libjs_static.a(Debugger.o)(.text+0xaf0c): In function `js::Debugger::addAllGlobalsAsDebuggees(JSContext*, unsigned int, JS::Value*)':
/home/buildslave/mozilla-central-sparc64/build/js/src/vm/Debugger.cpp:1965: undefined reference to `js::AutoDebugModeInvalidation::~AutoDebugModeInvalidation()'
../libjs_static.a(Debugger.o)(.text+0xaf30):/home/buildslave/mozilla-central-sparc64/build/js/src/vm/Debugger.cpp:1969: undefined reference to `js::AutoDebugModeInvalidation::~AutoDebugModeInvalidation()'
../libjs_static.a(Debugger.o)(.text+0xafc0): In function `js::Debugger::addDebuggeeGlobal(JSContext*, JS::Handle<js::GlobalObject*>)':
/home/buildslave/mozilla-central-sparc64/build/js/src/vm/Debugger.cpp:2127: undefined reference to `js::AutoDebugModeInvalidation::~AutoDebugModeInvalidation()'
shot in the dark, but maybe adding ~AutoDebugModeInvalidation(){}; for !JS_ION in jscompartment.h would help, but i'm not sure its' the best way.
(In reply to Landry Breuil (:gaston) from comment #30)
> shot in the dark, but maybe adding ~AutoDebugModeInvalidation(){}; for
> !JS_ION in jscompartment.h would help, but i'm not sure its' the best way.

Landry, could you file a separate bug, marked as blocking this one, for this problem? It makes the work easier to keep track of.
Depends on: 942346
Flags: needinfo?(choller)
qtip: e8b497a9d658
top: bug934799-part2-uplift.patch
Attachment #8341332 - Attachment is obsolete: true
Attachment #8341332 - Attachment is patch: false
Comment on attachment 8341336 [details] [diff] [review]
Invalidate JIT code instead of doing full GC on debug mode toggle. (

[Approval Request Comment]
Bug caused by (feature/regressing bug #):
  Bug 754201 most recently. Not sure what introduced AutoDebugModeGC initially.

User impact if declined: 
  Firebug unusable with many tabs. Firebug has worked around this by disabling JITs permanently if it's on, which is in some ways worse since it slows down *all* tabs.

Testing completed (on m-c, etc.): 
  On m-c.

Risk to taking this patch (and alternatives if risky): 
  GC behavior changes surfacing existing leaks/OOMs. When landing on m-c new leaks were surfaced (and eventually fixed). I am nominating those fixes for uplift also.

String or IDL/UUID changes made by this patch:
  None

The Firebug slowness metabug is bug 815603. List of bugs I'm nominating for uplift in this batch:

 - bug 936143
 - bug 935228
 - bug 933882
 - bug 935470 
 - bug 942346
 - bug 934799
Attachment #8341336 - Flags: approval-mozilla-beta?
Attachment #8341336 - Flags: approval-mozilla-aurora?
Comment on attachment 8341339 [details] [diff] [review]
Force GC in Debugger mochitests for ASan. (

[Approval Request Comment]
Bug caused by (feature/regressing bug #):
  Bug 754201 most recently. Not sure what introduced AutoDebugModeGC initially.

User impact if declined: 
  Firebug unusable with many tabs. Firebug has worked around this by disabling JITs permanently if it's on, which is in some ways worse since it slows down *all* tabs.

Testing completed (on m-c, etc.): 
  On m-c.

Risk to taking this patch (and alternatives if risky): 
  GC behavior changes surfacing existing leaks/OOMs. When landing on m-c new leaks were surfaced (and eventually fixed). I am nominating those fixes for uplift also.

String or IDL/UUID changes made by this patch:
  None

The Firebug slowness metabug is bug 815603. List of bugs I'm nominating for uplift in this batch:

 - bug 936143
 - bug 935228
 - bug 933882
 - bug 935470 
 - bug 942346
 - bug 934799
Attachment #8341339 - Flags: approval-mozilla-beta?
Attachment #8341339 - Flags: approval-mozilla-aurora?
Comment on attachment 8341336 [details] [diff] [review]
Invalidate JIT code instead of doing full GC on debug mode toggle. (

We can take this on Aurora and thus it will be on Beta in a week, but we're way past the point of taking untracked, major change to Firefox 26 as we are going to build our final candidate today as well as the RC.
Attachment #8341336 - Flags: approval-mozilla-beta?
Attachment #8341336 - Flags: approval-mozilla-beta-
Attachment #8341336 - Flags: approval-mozilla-aurora?
Attachment #8341336 - Flags: approval-mozilla-aurora+
Attachment #8341339 - Flags: approval-mozilla-beta?
Attachment #8341339 - Flags: approval-mozilla-beta-
Attachment #8341339 - Flags: approval-mozilla-aurora?
Attachment #8341339 - Flags: approval-mozilla-aurora+
> but we're way past the point of taking untracked, major change to Firefox 26
That's pretty sad, since this change represents a fix until Firebug is JSD2 ready, which will probably be with Firefox 27 or 28.

Sebastian
(In reply to Sebastian Zartner from comment #38)
> That's pretty sad, since this change represents a fix until Firebug is JSD2
> ready, which will probably be with Firefox 27 or 28.
> 
> Sebastian

Doesn't that depend on bug 637572 as well? What's landed there so far will make 28, but I don't know if there's any intent to uplift.
Lukas, if we don't get this in Firefox 26, then Firebug will continue severely regressing the performance of Firefox until we ship Firefox 27.

If we're OK with that, that's one thing, but I just want us to understand the impact here.
Flags: needinfo?(lsblakk)
(In reply to Boris Zbarsky [:bz] from comment #40)
> Lukas, if we don't get this in Firefox 26, then Firebug will continue
> severely regressing the performance of Firefox until we ship Firefox 27.
> 
> If we're OK with that, that's one thing, but I just want us to understand
> the impact here.

I don't like that impact but also cannot take the risk of landing this much change right before shipping with barely any time to have our Beta audience evaluate and find regressions. These bugs weren't tracked and thus weren't on our radar.  Given your statement, I'll track for 27 so we ensure this is watched more closely for that release.
Flags: needinfo?(lsblakk)
That's fair.  The only other real option is to slip the 26 release while we gain confidence in these patches or to do a point release with these patches before the actual 27 release, I guess...
Whiteboard: [qa-]
Depends on: 958980
You need to log in before you can comment on or make changes to this bug.