Closed
Bug 1273462
Opened 8 years ago
Closed 8 years ago
Consider marking fully used pages of AssemblerBuffer as read-only to detect heap corruption
Categories
(Core :: JavaScript Engine: JIT, defect)
Tracking
Status: RESOLVED FIXED
Target Milestone: mozilla49
firefox49: fixed
People
(Reporter: ehoogeveen, Assigned: ehoogeveen)
References
(Blocks 1 open bug)
Details
Attachments
(2 files, 4 obsolete files)
Part 1 v2: Simplify and refactor AssemblerBuffer a bit to make it easier to replace mozilla::Vector.
6.55 KB, patch (ehoogeveen: review+)
21.55 KB, patch (ehoogeveen: review+)
Since bug 1271165 is still waiting for review, I decided to try Jan's original suggestion from bug 1124397 and make as many pages of AssemblerBuffer non-writable as possible. The patches in this bug attempt to do that by interposing a new class, PageProtectingVector, which duplicates a small set of functions from mozilla::Vector. A PageProtectingVector can only grow, and every time a full page of its elements is used, that page is marked as read-only. PageProtectingVector takes care of unprotecting and reprotecting its pages when the underlying vector needs to grow, and allows a minimal set of pages to be briefly unprotected so they can be modified.
Assignee
Comment 1•8 years ago
I split this out because I think it's a nice simplification regardless. I used a couple of templates to get around type casts, since in the end we need reinterpret_cast<>s anyway. Using append instead of growByUninitialized + memcpy makes the bookkeeping in the next patch simpler, and I think it's less of a footgun anyway.
Attachment #8753319 - Flags: review?(jdemooij)
Assignee
Comment 2•8 years ago
The meat of this bug, as described in the first comment. This protects as aggressively as possible, only unprotecting the set of pages it absolutely needs to. I'm only asking for feedback because performance comparisons on Talos show that this consistently regresses Kraken on OSX by 3-4%, probably because it's unprotecting and reprotecting for every write that happens out of order. Oddly, Windows seems mostly unaffected, and even Linux (which uses the same mprotect calls as OSX) seems less affected. I tried moving the unprotect-reprotect calls in patchCall, patchThunk, setNextJump and linkJump up to their callers, unprotecting all pages outside of for loops, but this didn't seem to help performance at all. Some of the callers are deeply embedded inside codegen, and I just don't know enough about how the JITs are structured to move the unprotect calls further up the call chain. As a result, PageProtectingVector::disableProtection and PageProtectingVector::enableProtection are currently unused. These disable or re-enable protection for the entire buffer; they just need a good place to be called from so they can replace the fine-grained unprotect-reprotect dance.
Attachment #8753323 - Flags: feedback?(jdemooij)
Assignee
Comment 3•8 years ago
The actual performance comparison: https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=cb6e6f2bb2e5&newProject=try&newRevision=c31d95b5cc01&framework=1&showOnlyImportant=0&showOnlyConfident=1 The tabpaint and tp5o regressions may be real as well, but the Kraken regression seems more consistent (though it was larger on Linux in an earlier comparison, oddly).
Assignee
Comment 4•8 years ago
Fixed up PageProtectingVector::setContainingRegion to work with larger regions, and fixed up some outdated comments. Currently the patch only calls unprotectRegion and reprotectRegion on 4-byte regions, so this doesn't change behavior, but the logic was still wrong (and it's actually somewhat simpler now).
Attachment #8753323 - Attachment is obsolete: true
Attachment #8753323 - Flags: feedback?(jdemooij)
Attachment #8753364 - Flags: feedback?(jdemooij)
Assignee
Comment 5•8 years ago
Since we discussed only enabling this for certain kinds of scripts on IRC, I figured I should confirm that this has negligible overhead when disabled. The Windows testing machines aren't playing ball, but Linux and OSX seem to confirm that we could land this infrastructure disabled by default without regressing anything: https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=cb6e6f2bb2e5&newProject=try&newRevision=b2c1b85327ad&framework=1 (that ts_paint opt e10s regression on osx-10-10 *must* be spurious, it wasn't even that big with the mprotect calls *enabled*)
Assignee
Comment 6•8 years ago
This version disables the protection by default so we can experiment with enabling it for certain kinds of compilations. Since it just adds the infrastructure, I'm putting it up for review this time (I also reordered things so it can land before bug 1271165).
Attachment #8753364 - Attachment is obsolete: true
Attachment #8753364 - Flags: feedback?(jdemooij)
Attachment #8753790 - Flags: review?(jdemooij)
Comment 7•8 years ago
Comment on attachment 8753319 [details] [diff] [review]
Part 1: Simplify and refactor AssemblerBuffer a bit to make it easier to replace mozilla::Vector.

Review of attachment 8753319 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/x86-shared/AssemblerBuffer-x86-shared.h
@@ +72,5 @@
> +    {
> +        template<typename T>
> +        void sizedAppendUnchecked(T value, size_t size)
> +        {
> +            m_buffer.infallibleAppend(reinterpret_cast<unsigned char*>(&value), size);

Can you make sizedAppendUnchecked MOZ_ALWAYS_INLINE? I'm a bit afraid some compilers may use a loop now (IIRC all our compilers optimized the memcpy calls to a single move instruction).

Maybe you can check what MSVC emits?
Attachment #8753319 - Flags: review?(jdemooij) → review+
Comment 8•8 years ago
Comment on attachment 8753790 [details] [diff] [review]
Part 2 v1.3: Add infrastructure to mark all fully used pages of AssemblerBuffer's vector as read-only (disabled by default).

Review of attachment 8753790 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for working on this and sorry for the delay. I'm a bit torn: it might help us track down this bug, but it also adds a bunch of overhead/complexity compared to a plain Vector.

Maybe we can use this temporarily on Nightly and after a few weeks we can change the AssemblerBuffer to use a plain Vector again?

::: js/src/ds/PageProtectingVector.h
@@ +32,5 @@
> +{
> +    size_t pageSize;
> +    size_t pageMask;
> +
> +    size_t offsetToPage;

Please add a brief description here. Unlike the other members it wasn't immediately obvious.

@@ +74,5 @@
> +            protectedBytes = 0;
> +        }
> +    }
> +
> +    MOZ_MUST_USE bool anyProtected(size_t first, size_t last) {

Nit: as this is just a bool query, MOZ_MUST_USE seems unnecessary/confusing.

@@ +78,5 @@
> +    MOZ_MUST_USE bool anyProtected(size_t first, size_t last) {
> +        return protectedBytes && last >= offsetToPage && first < offsetToPage + protectedBytes;
> +    }
> +
> +    void setContainingRegion(size_t first, size_t last, uintptr_t& addr, size_t& size) {

Convention is to use pointers for out params, because it's clearer at the call site.

@@ +120,5 @@
> +    };
> +
> +  public:
> +    explicit PageProtectingVector(AllocPolicy policy = AllocPolicy())
> +      : pageSize(gc::SystemPageSize())

Nit: these lines should be indented with 2 spaces, and comma usually goes at the end.

@@ +197,5 @@
> +        guard.emplace(this);
> +        return vector.reserve(size);
> +    }
> +
> +    template<typename U> void infallibleAppend(const U* values, size_t size) {

Nit: MOZ_ALWAYS_INLINE?

::: js/src/gc/Memory.h
@@ +43,5 @@
>
> void* TestMapAlignedPagesLastDitch(size_t size, size_t alignment);
>
> void ProtectPages(void* p, size_t size);
> +void ReadOnlyPages(void* p, size_t size);

Maybe use a verb, ProtectPagesReadOnly?
Attachment #8753790 - Flags: review?(jdemooij) → review+
Assignee
Comment 9•8 years ago
(In reply to Jan de Mooij [:jandem] from comment #8)
> Thanks for working on this and sorry for the delay. I'm a bit torn: it might
> help us track down this bug, but it also adds a bunch of overhead/complexity
> compared to a plain Vector.
>
> Maybe we can use this temporarily on Nightly and after a few weeks we can
> change the AssemblerBuffer to use a plain Vector again?

I'll address the review comments later, but to just quickly reply: I think the actual overhead added by this is low (I'll check the code that gets generated, as with part 1), but in general I agree. The only things preventing this from truly being a drop-in replacement (letting you enable/disable it with a one-liner) are the enableBufferProtection/disableBufferProtection/unprotectDataRegion/reprotectDataRegion definitions that all the classes need. Do you think we should leave those in (but make the underlying ones in AssemblerBuffer do nothing), or take them out again too? It's unfortunate that we can't just add them where they're actually needed, but I can't see a way around that.

The only other option I see would be to instead use a single function to expose the underlying vector on each level (or change the data() functions to do that), and then operate on those. Do you think that would be preferable? We could leave that in without lying about what it does.
Assignee
Comment 10•8 years ago
(In reply to Jan de Mooij [:jandem] from comment #7)
> Can you make sizedAppendUnchecked MOZ_ALWAYS_INLINE? I'm a bit afraid some
> compilers may use a loop now (IIRC all our compilers optimized the memcpy
> calls to a single move instruction).
>
> Maybe you can check what MSVC emits?

It turns out everything gets inlined, so I checked 4 callsites. Before the changes, the code MSVC emits is:

In movb_im(int32_t imm, int32_t offset, RegisterID base):
mov rax,qword ptr [rbx+10h]
mov rcx,qword ptr [rbx+8]
mov byte ptr [rcx+rax],bpl
inc qword ptr [rbx+10h]

In movw_im(int32_t imm, int32_t offset, RegisterID base):
add qword ptr [rbx+8],2
mov rcx,qword ptr [rbx+8]
mov rax,qword ptr [rbx]
mov word ptr [rcx+rax-2],bp

In movl_i32m(int32_t imm, int32_t offset, RegisterID base):
add qword ptr [rbx+10h],4
mov rcx,qword ptr [rbx+10h]
mov rax,qword ptr [rbx+8]
mov dword ptr [rcx+rax-4],ebp

In movq_i64r(int64_t imm, RegisterID dst):
add qword ptr [rbx+10h],8
mov rcx,qword ptr [rbx+10h]
mov rax,qword ptr [rbx+8]
mov qword ptr [rax+rcx-8],rsi

After part 1, everything is still inlined, and movb_im, movl_i32m and movq_i64r use the same number of instructions (albeit reordered slightly), but movw_im uses:

mov rcx,qword ptr [rbx+8]
movzx eax,bp
add rcx,qword ptr [rbx]
mov dword ptr [rsp+48h],ebp
mov word ptr [rcx],ax
add qword ptr [rbx+8],2

I'm not sure why it adds the movzx; it doesn't do anything special for the 8-bit case. I tried putting the size parameter into the template, but that made no difference (I do think it looks a bit nicer though).

After part 2, things no longer get inlined.
Forcing it to inline by sprinkling MOZ_ALWAYS_INLINE about, it generates the following code for movl_i32m:

mov rax,qword ptr [r14+28h]
add rax,qword ptr [r14+30h]
mov dword ptr [rax],ebp
add qword ptr [r14+30h],4
add qword ptr [r14+20h],4    # Bump PageProtectingVector::unprotectedBytes
cmp byte ptr [r14+140h],0    # See if protection is enabled
mov rax,qword ptr [r14+20h]  # Load PageProtectingVector::unprotectedBytes
je 07FF654C3FDD3h            # Move on if protection is disabled
cmp rax,qword ptr [r14]      # Compare against PageProtectingVector::pageSize
jl 07FF654C3FDD3h            # Move on if unprotectedBytes < pageSize
mov rbx,qword ptr [r14+8]    # Load PageProtectingVector::pageMask
mov rcx,qword ptr [r14+10h]  # Add PageProtectingVector::offsetToPage
not rbx                      # Invert the page mask
add rcx,qword ptr [r14+18h]  # Add PageProtectingVector::protectedBytes
and rbx,rax                  # Set rbx to the number of bytes to protect
add rcx,qword ptr [r14+28h]  # Add PageProtectingVector::vector.begin()
mov rdx,rbx
call js::gc::MakePagesReadOnly
sub qword ptr [r14+20h],rbx  # Adjust PageProtectingVector::unprotectedBytes
add qword ptr [r14+18h],rbx  # Adjust PageProtectingVector::protectedBytes

Maybe the last block would be nicer out-of-line, but that seems to require a separate function marked as MOZ_NEVER_INLINE. The generated code looks good to me though.
Assignee
Comment 11•8 years ago
Carrying forward r+.

(In reply to Jan de Mooij [:jandem] from comment #7)
> Can you make sizedAppendUnchecked MOZ_ALWAYS_INLINE?

Done, more details in comment #10. I assume we can overlook that single extra instruction in the 16-bit case ;)
Attachment #8753319 - Attachment is obsolete: true
Attachment #8757487 - Flags: review+
Assignee
Comment 12•8 years ago
Thanks for the reviews! Carrying forward r+.

(In reply to Jan de Mooij [:jandem] from comment #8)
> > +    size_t offsetToPage;
>
> Please add a brief description here. Unlike the other members it wasn't
> immediately obvious.

Done. I also added an even longer comment for unprotectedBytes, explaining why it's signed. I think that particular bit of cleverness was warranted, since it simplified things a fair bit, but it deserves an explanation.

> > +    MOZ_MUST_USE bool anyProtected(size_t first, size_t last) {
>
> Nit: as this is just a bool query, MOZ_MUST_USE seems unnecessary/confusing.

OK, makes sense. I wasn't sure if it was going to be policy to use MOZ_MUST_USE everywhere, but this is just a private method anyway.

> > +    void setContainingRegion(size_t first, size_t last, uintptr_t& addr, size_t& size) {
>
> Convention is to use pointers for out params, because it's clearer at the
> call site.

OK, done. Having to use *ptr = value for outparams always looks a bit C-y to me, but I suppose it's better to be explicit.

> > +    explicit PageProtectingVector(AllocPolicy policy = AllocPolicy())
> > +      : pageSize(gc::SystemPageSize())
>
> Nit: these lines should be indented with 2 spaces, and comma usually goes at
> the end.

Oh, hmm, the place I based this on originally must have used a different style. Fixed.

> > +    template<typename U> void infallibleAppend(const U* values, size_t size) {
>
> Nit: MOZ_ALWAYS_INLINE?

Done; needed too, at least for MSVC. I had to sprinkle MOZ_ALWAYS_INLINE around a few other places as well. I doubt the increase in code size will be a concern (the next best thing requires MOZ_NEVER_INLINE on an out-of-line definition and only saves 9 instructions, so it didn't seem worth it).

> > +void ReadOnlyPages(void* p, size_t size);
>
> Maybe use a verb, ProtectPagesReadOnly?

Ended up going with MakePagesReadOnly. I was trying for something that fit in with the existing function names, but it didn't really work out :)
Attachment #8753790 - Attachment is obsolete: true
Attachment #8757491 - Flags: review+
Assignee
Comment 13•8 years ago
Try run with a one-liner on top of part 2 to actually enable the protection everywhere: https://treeherder.mozilla.org/#/jobs?repo=try&revision=7ce356935214

This will actually land with the protection disabled (and shouldn't add much overhead), but seeing it work on try gives me confidence that I didn't miss anything.
Status: NEW → ASSIGNED
Keywords: checkin-needed
Comment 14•8 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/dbd9ceba3964
https://hg.mozilla.org/integration/mozilla-inbound/rev/9a76ad279eaa
Keywords: checkin-needed
Comment 15•8 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/dbd9ceba3964
https://hg.mozilla.org/mozilla-central/rev/9a76ad279eaa
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
Comment 16•8 years ago
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #2)
> I tried moving the unprotect-reprotect calls in patchCall, patchThunk,
> setNextJump and linkJump up to their callers, unprotecting all pages outside
> of for loops, but this didn't seem to help performance at all. Some of the
> callers are deeply embedded inside codegen, and I just don't know enough
> about how the JITs are structured to move the unprotect calls further up the
> call chain.

Incidentally, I just hit these two unprotect-reprotect calls; they happen inside a loop with millions of elements, so they are doubling overall wasm compilation time for Unity-sized apps. I understand the code well enough that I should be able to hoist them outside the loop, but next time, feel free to ping me :)