The default bug view has changed. See this FAQ.

Baldr: optimize compile time

RESOLVED FIXED in Firefox 52

Status

()

Core
JavaScript Engine
RESOLVED FIXED
4 months ago
4 months ago

People

(Reporter: luke, Assigned: luke)

Tracking

(Blocks: 1 bug)

Trunk
mozilla52
Points:
---

Firefox Tracking Flags

(firefox52 fixed)

Details

Attachments

(4 attachments)

(Assignee)

Description

4 months ago
Doing some profiling, it looks like a major regression crept in from bug 1273462 and I also noticed a few other improvements.  Together these take compile time of Unity down from 3420ms to 1492ms on my machine using --thread-count=8.
(Assignee)

Comment 1

4 months ago
Created attachment 8810028 [details] [diff] [review]
inline-vector

There are apparently very many tiny functions.  Amazingly, this reduces pure validation time from 290ms to 134ms for Unity.
Attachment #8810028 - Flags: review?(sunfish)
(Assignee)

Comment 2

4 months ago
Created attachment 8810029 [details] [diff] [review]
double-num-tasks

This ensures worker threads remain saturated with work and reduces wait time (validation thread waiting on workers) by 100ms on Unity.
Attachment #8810029 - Flags: review?(sunfish)
(Assignee)

Comment 3

4 months ago
Created attachment 8810030 [details] [diff] [review]
optimize-protection

Calling mprotect() twice in a loop costs 1.7s out of 3.4s.  This patch hoists it out so it happens once.  There's a bit of trickiness because, while the code is unprotected, we add some instructions which previously was asserted against.  Fortunately, looking through the logic, there is no problem.
Attachment #8810030 - Flags: review?(sunfish)
(Assignee)

Updated

4 months ago
Attachment #8810030 - Flags: feedback?(emanuel.hoogeveen)
Comment on attachment 8810030 [details] [diff] [review]
optimize-protection

Review of attachment 8810030 [details] [diff] [review]:
-----------------------------------------------------------------

That's a huge difference, thanks for tracking it down. I have some patches up in bug 1306972 that rejigger things somewhat and should also help performance (though probably not as much). I would be curious to know how much those help on this particular benchmark without this patch.
Attachment #8810030 - Flags: feedback?(emanuel.hoogeveen) → feedback+

Comment 5

4 months ago
Comment on attachment 8810028 [details] [diff] [review]
inline-vector

Review of attachment 8810028 [details] [diff] [review]:
-----------------------------------------------------------------

Nice and simple :).
Attachment #8810028 - Flags: review?(sunfish) → review+

Comment 6

4 months ago
Comment on attachment 8810030 [details] [diff] [review]
optimize-protection

Review of attachment 8810030 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ds/PageProtectingVector.h
@@ -77,5 @@
>                              vector.capacity() >= pageSize + offsetToPage;
>      }
>  
>      void protect() {
> -        MOZ_ASSERT(!regionUnprotected);

This may change the invariant of PageProtectingVector in a subtle way. Consider the following sequence, which I believe is what happens in patchCallSites:

 - start with a normal populated PageProtectingVector
 - call unprotectRegion on some region of it
 - append some elements
 - call reprotectRegion

Previously, this sequence would trigger the assert in protect(). With this patch, the assert is removed, and this sequence silently succeeds, however it leaves the PageProtectingVector in a special state where pages outside the region that was reprotected are left unprotected.

I guess PageProtectingVector is sort of best-effort anyway, so perhaps this is ok, but it's surprising and I'd like to make sure I understand.
Attachment #8810030 - Flags: review?(sunfish)
Hmmm, yes that's true. Like this, any bytes appended while a region is unprotected won't trigger the page protection. Could you add a modified version of AutoUnprotectAssemblerBufferRegion that calls unprotectDataRegion(0, assembler->size()) in its constructor and reprotectDataRegion(0, assembler->size()) in its destructor instead?

This is a bit hairy because appending bytes could cause the underlying vector to reallocate - but it should work because we're unprotecting the whole thing in advance anyway.
Actually, that won't even work because PageProtectingVector::setContainingRegion() uses protectedBytes for its upper limit. Ugh, I guess we can live with it for now.
(Assignee)

Comment 9

4 months ago
(In reply to Dan Gohman [:sunfish] from comment #6)
You're right and indeed I did consider this case but I think it's not a big problem because there are always some dangling bytes beyond the last protected page and since protect() always reprotects the entire region from scratch the next append() of a byte after the reprotectRegion() will maximally protect the entire Vector.  So we're really just adding a tiny window of time where some maybe some bytes stay unprotected a little longer, but I think it's a pretty small window.
(Assignee)

Updated

4 months ago
Attachment #8810030 - Flags: review?(sunfish)
Comment on attachment 8810030 [details] [diff] [review]
optimize-protection

Review of attachment 8810030 [details] [diff] [review]:
-----------------------------------------------------------------

Makes sense. This seems fine for now.
Attachment #8810030 - Flags: review?(sunfish) → review+
Comment on attachment 8810029 [details] [diff] [review]
double-num-tasks

Review of attachment 8810029 [details] [diff] [review]:
-----------------------------------------------------------------

I don't understand why this produces a big speedup yet, but it seems safe.
Attachment #8810029 - Flags: review?(sunfish) → review+
(Assignee)

Comment 12

4 months ago
(In reply to Dan Gohman [:sunfish] from comment #11)
> I don't understand why this produces a big speedup yet, but it seems safe.

The idea is that if there are N cores/threads running N tasks, as soon as one of those cores/threads finishes a task, it should be able to *immediately* move on to the next task without waiting for the validation thread to wake up (b/c it waits after dispatching N tasks for one of them to finish), generate a new task, and dispatch it the queue.  Since the finish times of multiple tasks can coincide, this means having one pending task available per thread.  So the speedup is just less wasted time on all the worker cores.

Comment 13

4 months ago
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ee685bfbb7d5
Baldr: give OpIter Vectors an inline capacity (r=sunfish)
https://hg.mozilla.org/integration/mozilla-inbound/rev/8c158c99496d
Baldr: allow 2*num-cores outstanding tasks (r=sunfish)
https://hg.mozilla.org/integration/mozilla-inbound/rev/71825cbd0e25
Baldr: hoist mprotect out of MacroAssembler::patchCall/FarJump (r=sunfish)

Comment 14

4 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/ee685bfbb7d5
https://hg.mozilla.org/mozilla-central/rev/8c158c99496d
https://hg.mozilla.org/mozilla-central/rev/71825cbd0e25
Status: ASSIGNED → RESOLVED
Last Resolved: 4 months ago
status-firefox52: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla52

Comment 15

4 months ago
Last commit breaks JS_CODEGEN_NONE builds.

> /tmp/mozilla-central-79feeed42933/js/src/wasm/WasmGenerator.cpp: In member function 'bool js::wasm::ModuleGenerator::patchCallSites(js::wasm::ModuleGenerator::TrapExitOffsetArray*)':
> /tmp/mozilla-central-79feeed42933/js/src/wasm/WasmGenerator.cpp:253:5: error: 'AutoPrepareForPatching' is not a member of 'js::jit::MacroAssembler'
>      MacroAssembler::AutoPrepareForPatching patching(masm_);
>      ^~~~~~~~~~~~~~
> /tmp/mozilla-central-79feeed42933/js/src/wasm/WasmGenerator.cpp: In member function 'bool js::wasm::ModuleGenerator::patchFarJumps(const TrapExitOffsetArray&)':
> /tmp/mozilla-central-79feeed42933/js/src/wasm/WasmGenerator.cpp:350:5: error: 'AutoPrepareForPatching' is not a member of 'js::jit::MacroAssembler'
>      MacroAssembler::AutoPrepareForPatching patching(masm_);
>      ^~~~~~~~~~~~~~
Created attachment 8811252 [details] [diff] [review]
fix-no-ion.patch
Attachment #8811252 - Flags: review?(luke)
(Assignee)

Comment 17

4 months ago
Comment on attachment 8811252 [details] [diff] [review]
fix-no-ion.patch

Review of attachment 8811252 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry, thanks!
Attachment #8811252 - Flags: review?(luke) → review+

Comment 18

4 months ago
(In reply to sorear2 from comment #15)
> Last commit breaks JS_CODEGEN_NONE builds.
> 
> > /tmp/mozilla-central-79feeed42933/js/src/wasm/WasmGenerator.cpp: In member function 'bool js::wasm::ModuleGenerator::patchCallSites(js::wasm::ModuleGenerator::TrapExitOffsetArray*)':
> > /tmp/mozilla-central-79feeed42933/js/src/wasm/WasmGenerator.cpp:253:5: error: 'AutoPrepareForPatching' is not a member of 'js::jit::MacroAssembler'
> >      MacroAssembler::AutoPrepareForPatching patching(masm_);
> >      ^~~~~~~~~~~~~~
> > /tmp/mozilla-central-79feeed42933/js/src/wasm/WasmGenerator.cpp: In member function 'bool js::wasm::ModuleGenerator::patchFarJumps(const TrapExitOffsetArray&)':
> > /tmp/mozilla-central-79feeed42933/js/src/wasm/WasmGenerator.cpp:350:5: error: 'AutoPrepareForPatching' is not a member of 'js::jit::MacroAssembler'
> >      MacroAssembler::AutoPrepareForPatching patching(masm_);
> >      ^~~~~~~~~~~~~~
Flags: needinfo?(luke)
I've made a fix for this that I am about to land.
Flags: needinfo?(luke)

Comment 20

4 months ago
Pushed by bbouvier@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/243ff1a92bbf
Fix no-ion builds; r=luke

Comment 21

4 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/243ff1a92bbf
Comment on attachment 8811252 [details] [diff] [review]
fix-no-ion.patch

Approval Request Comment
[Feature/regressing bug #]: Regression from this very bug. The bug landed during the 52 cycle, but the fixup landed on central during the 53 cycle.
[User impact if declined]: Tier-3 build failure
[Describe test coverage new/current, TreeHerder]: Build spidermonkey with --disable-ion
[Risks and why]: NPOTB
[String/UUID change made/needed]: None
Attachment #8811252 - Flags: approval-mozilla-aurora?
Comment on attachment 8811252 [details] [diff] [review]
fix-no-ion.patch

fix a build regression in aurora52
Attachment #8811252 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
I don't think this landed on autora yet. Should the checkin-needed keyword be set, or how does that work for uplift?
status-firefox52: fixed → affected
(Assignee)

Comment 25

4 months ago
It landed right before the branch, actually.
status-firefox52: affected → fixed
Sorry, I should have specified I was talking about the bustage fix for the None backend, which didn't make it into 52 [1]. IIUC status-firefox52 needs to be set to affected for the uplift request to appear in the sheriffs' queue.

[1] https://hg.mozilla.org/releases/mozilla-aurora/file/86eaf21cfe06/js/src/jit/none/MacroAssembler-none.h#l429
status-firefox52: fixed → affected
Comment 26 is exactly correct. FWIW, the follow-up fix probably should have landed in a different bug specifically to avoid these tracking issues and delays in uplift.

Comment 28

4 months ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-aurora/rev/8b0e2637bfd6
status-firefox52: affected → fixed
You need to log in before you can comment on or make changes to this bug.