Closed
Bug 1332594
Opened 9 years ago
Closed 9 years ago
Detect poison in AssemblerBuffer resulting from calling realloc
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla54
People
(Reporter: ehoogeveen, Assigned: ehoogeveen)
References
(Blocks 1 open bug)
Details
Attachments
(2 files, 6 obsolete files)
|
26.02 KB,
patch
|
jandem
:
review+
|
Details | Diff | Splinter Review |
|
4.41 KB,
patch
|
ehoogeveen
:
review+
|
Details | Diff | Splinter Review |
From bug 1124397, it looks like realloc itself is causing part of our buffer to be poisoned. Let's check and make sure.
| Assignee | ||
Comment 1•9 years ago
|
||
This changes ProtectedReallocPolicy to check for poison and other shenanigans resulting from realloc, and also turns off the other protections of PageProtectingVector (aside from the buffer overflow assertion). I made it so that we fall back to normal realloc in case of OOM since these extra allocations can be pretty big.
It'd be nice to get this into 53 since the crash rate on Aurora is a lot higher (though I suppose having the current accidental workaround would be kind of nice for Aurora users!), but it doesn't matter that much as crash stats won't tell us exactly what's going wrong.
Attachment #8828763 -
Flags: review?(jdemooij)
| Assignee | ||
Comment 2•9 years ago
|
||
The previous patch was too slow. This removes the poison check after realloc (which didn't add much) and makes the check before it less extreme. We still compare the new buffer to the original. This patch also switches AssemblerBuffer back to using plain mozilla::Vector, but with the new alloc policy.
With this, performance more or less matches the status quo (only doing this for large buffers didn't affect performance much - the bigger the buffer, the larger the overhead).
Attachment #8828763 -
Attachment is obsolete: true
Attachment #8828763 -
Flags: review?(jdemooij)
Attachment #8828784 -
Flags: review?(jdemooij)
Comment 3•9 years ago
|
||
Comment on attachment 8828784 [details] [diff] [review]
v2 - Check for poison and other corruption during realloc.
Review of attachment 8828784 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good. I wonder if it would be useful to store the buffer size etc somewhere on the stack, so we can get it from crash dumps, but we can always do that later when we know we're crashing here.
::: js/src/ds/PageProtectingVector.h
@@ +559,5 @@
> +
> + // Check for the poison pattern every so often.
> + const uint8_t* oldAddrBytes = reinterpret_cast<const uint8_t*>(oldAddr);
> + for (size_t j, i = 0; i < bytes; i += 1024) {
> + for (j = 0; j < 16 && oldAddrBytes[i + j] == PoisonPattern; ++j);
We need to ensure i + j stays in bounds right? Could change |j < 16| to |j < 16 && i + j < bytes|. Or change |i < bytes| in the outer loop to |i + 16 < bytes|. Or something.
@@ +579,5 @@
> +
> + const uint8_t* newAddrBytes = reinterpret_cast<const uint8_t*>(newAddr);
> + const uint8_t* tmpAddrBytes = reinterpret_cast<const uint8_t*>(tmpAddr);
> + for (size_t i = 0; i < bytes; ++i) {
> + if (MOZ_UNLIKELY(newAddrBytes[i] != tmpAddrBytes[i])) {
Maybe use PodEqual here:
if (!mozilla::PodEqual(tmpAddrBytes, newAddrBytes, bytes)) {
if (oldAddr == newAddr)
...
}
PodEqual will use memcmp for larger buffers so it might also be a bit faster (assuming the compiler doesn't auto-vectorize this loop).
Attachment #8828784 -
Flags: review?(jdemooij) → review+
| Assignee | ||
Comment 4•9 years ago
|
||
Thanks!
(In reply to Jan de Mooij [:jandem] from comment #3)
> Looks good. I wonder if it would be useful to store the buffer size etc
> somewhere on the stack, so we can get it from crash dumps, but we can always
> do that later when we know we're crashing here.
If this *is* corruption in realloc I'm not sure how much more that would tell us. But we can worry about that once we know for sure.
> We need to ensure i + j stays in bounds right? Could change |j < 16| to |j <
> 16 && i + j < bytes|. Or change |i < bytes| in the outer loop to |i + 16 <
> bytes|. Or something.
Oops, wrote that a little too quickly. Fixed.
> Maybe use PodEqual here:
Done. Looks a little cleaner too :)
Attachment #8828784 -
Attachment is obsolete: true
Attachment #8828812 -
Flags: review+
| Assignee | ||
Comment 5•9 years ago
|
||
Hmm, getting some failures on try with the poison pattern scan. I think that's because we only set mozjemalloc's opt_junk on debug builds, so if we reserve a larger buffer before using most of the previous one there might be random poison and other stuff in the buffer we're using.
The poison detection was just a sanity check anyway, so let's just get rid of it.
Attachment #8828812 -
Attachment is obsolete: true
Attachment #8828855 -
Flags: review+
| Assignee | ||
Comment 6•9 years ago
|
||
I don't know what's up with the TraceLogger assertion failure, but I can't imagine it's related to this patch. Otherwise green: https://treeherder.mozilla.org/#/jobs?repo=try&revision=6eae3a3f83072e0273e137e32b69357dea4e7886
Status: NEW → ASSIGNED
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/209d492c3ec5
Check for poison and other corruption during realloc. r=jandem
Keywords: checkin-needed
Comment 8•9 years ago
|
||
| Assignee | ||
Comment 9•9 years ago
|
||
How strange! Okay, this requires a debug build and --tbpl to reproduce, specifically "--ion-eager --ion-offthread-compile=off". With that the assertion at [1] fails.
The text for id is "script self-hosted:3481:0"
The text for prev is "script c:\\Users\\Emanuel\\firefox\\mozilla-central\\js\\src\\jit-test\\tests\\tracelogger\\bug1302417.js:13:24"
I have no idea what this is testing or why it would fail with this patch.
Hannes, this patch adds extra allocations (using js_pod_malloc), but all we use them for is to compare two buffers. In the event of OOM we should fall back to the normal logic, so I wouldn't expect it to make a difference there, although it might temporarily increase the total amount of memory used. Any idea why it could cause this assertion to fail?
[1] https://dxr.mozilla.org/mozilla-central/rev/3cedab21a7e65e6a1c4c2294ecfb5502575a46e3/js/src/vm/TraceLogging.cpp#576
Flags: needinfo?(hv1989)
Comment 10•9 years ago
|
||
If we fail to create the events for ion, we need to forcefully disable it and fail the tracelogger. We cannot recover TL anymore. We bake nullptr in. As a result when executing that script we will not report the start and stop events. As a result our log cannot be correct. Lets be obvious about it and fail Tracelogger (with message if TLOPTIONS=Errors is set).
Flags: needinfo?(hv1989)
Attachment #8831109 -
Flags: review?(bbouvier)
Comment 11•9 years ago
|
||
Comment on attachment 8831109 [details] [diff] [review]
Silently fail the logger when not-recoverable
Review of attachment 8831109 [details] [diff] [review]:
-----------------------------------------------------------------
r=me with the nit addressed, but please let's open a new bug (or reuse the refactoring one) and let's do not derail this one (since it makes tracking flags and assignee totally inconsistent).
::: js/src/jit/CodeGenerator.cpp
@@ +9857,5 @@
> Assembler::PatchDataWithValueCheck(CodeLocationLabel(code, patchableTraceLoggers_[i]),
> ImmPtr(logger),
> ImmPtr(nullptr));
> }
> + TraceLoggerSilentFail(logger, "OOM when trying to patch ion log events.");
Shouldn't this be in the else branch?
Attachment #8831109 -
Flags: review?(bbouvier) → review+
Comment 12•9 years ago
|
||
Updated patch.
Attachment #8831109 -
Attachment is obsolete: true
Attachment #8831123 -
Flags: review+
| Assignee | ||
Comment 13•9 years ago
|
||
Thanks! I ran all the jit-tests locally with that patch applied and --tbpl, and all of them pass now (some tests fail with --slow, but the same test fails with and without the patches in this bug).
If you'd like to land this that would be great (I need to request something higher than level 1 access sometime..), but note that bbouvier asked for attachment 8831109 [details] [diff] [review] to go in its own bug.
Comment 14•9 years ago
|
||
Pushed by hv1989@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/740e1005d4b6
TraceLogger: Get --disable-trace-logging working again, r=bbouvier
Comment 15•9 years ago
|
||
Comment on attachment 8831123 [details] [diff] [review]
Silently fail the logger when not-recoverable
Landed in bug 1334426
Attachment #8831123 -
Attachment is obsolete: true
Comment 16•9 years ago
|
||
(In reply to Pulsebot from comment #14)
> Pushed by hv1989@gmail.com:
> https://hg.mozilla.org/integration/mozilla-inbound/rev/740e1005d4b6
> TraceLogger: Get --disable-trace-logging working again, r=bbouvier
Hmmm. Somehow I managed to add the wrong bug id on that patch. Very sorry!
Comment 17•9 years ago
|
||
Pushed by hv1989@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ecd98c460060
Check for poison and other corruption during realloc. r=jandem
I had to back this out for valgrind failures like https://treeherder.mozilla.org/logviewer.html#?job_id=72632918&repo=mozilla-inbound
https://hg.mozilla.org/integration/mozilla-inbound/rev/f00b9a00eef5b7a92611e7926d7dac9f8b3ea5fb
Flags: needinfo?(emanuel.hoogeveen)
| Assignee | ||
Comment 19•9 years ago
|
||
I don't know if this is real or if we have to add a suppression. Is there a way to coax Valgrind into telling us how long it thinks the allocated region was? Getting a jemalloc callstack might be too much to ask since the realloc has already happened at this point.
Flags: needinfo?(emanuel.hoogeveen)
| Assignee | ||
Updated•9 years ago
|
Keywords: leave-open
Comment 20•9 years ago
|
||
| bugherder | ||
Comment 21•9 years ago
|
||
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #19)
> I don't know if this is real or if we have to add a suppression. Is there a
> way to coax Valgrind into telling us how long it thinks the allocated region
> was? Getting a jemalloc callstack might be too much to ask since the realloc
> has already happened at this point.
Maybe the problem is that not all of the Vector's memory has been initialized when we resize it (because length < capacity), so there will be some uninitialized bytes at the end...
| Assignee | ||
Comment 22•9 years ago
|
||
Ah, good point. So the same problem as what I described in comment #5, except we don't actually care that the bytes are uninitialized (so long as they match before and after). Still, probably better to just initialize them than to just add a suppression and potentially miss other problems.
| Assignee | ||
Comment 23•9 years ago
|
||
Sorry about the size of this patch, but this is mostly code removal. PageProtectingVector seemed the logical place to add this initialization, but it was bothering me how bloated PageProtectingVector was becoming, so I tore out the reentrance guard and the poison check. The actual code to add initialization is very small.
Attachment #8831711 -
Flags: review?(jdemooij)
| Assignee | ||
Comment 24•9 years ago
|
||
This patch didn't really change, so carrying forward r+. With part 1 the Valgrind issue seems to be gone, as expected: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5ca86f3673c7978f332e6afbb1ece6dc3bcc0615
Attachment #8828855 -
Attachment is obsolete: true
Attachment #8831712 -
Flags: review+
Updated•9 years ago
|
Attachment #8831711 -
Flags: review?(jdemooij) → review+
| Assignee | ||
Comment 25•9 years ago
|
||
Wow, that was fast :) Thanks! Oh, I forgot to mention, I also fixed up some small style issues in AssemblerBuffer-x86.h that were bothering me. It still uses a nonstandard style, but at least it's consistent now.
Keywords: leave-open → checkin-needed
| Assignee | ||
Comment 26•9 years ago
|
||
By the way, locally this improves single threaded WASM baseline compilation performance on the test I've been using from ~710ms to ~650ms, where just part 1 improves it to ~630ms (so the check adds ~20ms).
Comment 27•9 years ago
|
||
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1064fae59e5d
Part 1: Simplify PageProtectingVector and make it initialize new buffers. r=jandem
https://hg.mozilla.org/integration/mozilla-inbound/rev/51901165b259
Part 2: Check AssemblerBuffer for corruption during realloc. r=jandem
Keywords: checkin-needed
Comment 28•9 years ago
|
||
| bugherder | ||
https://hg.mozilla.org/mozilla-central/rev/1064fae59e5d
https://hg.mozilla.org/mozilla-central/rev/51901165b259
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
status-firefox54:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
You need to log in
before you can comment on or make changes to this bug.
Description
•