Closed Bug 1358073 Opened 3 years ago Closed 2 years ago

Crash in js::TenuringTracer::traverse<T>


(Core :: JavaScript: GC, defect, critical)

Windows 10
Not set



Tracking Status
firefox-esr45 --- unaffected
firefox-esr52 55+ fixed
firefox53 --- wontfix
firefox54 + fixed
firefox55 - ---
firefox56 - ---


(Reporter: jonco, Assigned: jonco)


(Blocks 1 open bug)


(4 keywords, Whiteboard: [adv-main54-][adv-esr52.3-])

Crash Data


(4 files)

This bug was filed from the Socorro interface and is 
report bp-171e9a25-2ff4-4330-8fd6-1b55f0170419.

This has been getting ~500 crashes a day for a long time.
These crashes all happen while checking if a pointer is inside the nursery, which is done by checking the chunk trailer.  

Many of these show a crash at address 0xffff0 which indicates a null pointer or pointer into the first 1MiB of address space - an invalid pointer. So these would probably be bad RAM / heap corruption and not directly actionable.

A few of them show traces of 'e5' in the crash address which suggests that these could be use after free though.  The stacks for these show one of:
 - StoreBuffer::MonoTypeBuffer<js::gc::StoreBuffer::CellPtrEdge>::trace
 - jit::MRootList::trace
 - OrderedHashTableRef<js::MapObject>::trace
The e5 crashes are use-after-frees (almost any crash with a ...e5e5... address is)
Group: core-security
Group: core-security → javascript-core-security
Sec-high triage: Jon, we need somebody to be assigned to investigate and fix this bug.
Flags: needinfo?(jcoppeard)
Looking at the crashes that have e5 in the address, the ones associated with the most recent builds are all during tracing the cell pointer store buffer.

This store buffer is used by NativeObject::privateWriteBarrierPost and JSObject::writeBarrierPost (which is used by GCPtr, HeapPtr, JS::Heap).  So things like constructing a HeapPtr / JS::Heap and failing to destroy it properly, or memcpy'ing the memory could be to blame here.

Of course, this is pretty hard to track down without STR.
One thing I found is that we can free scope data in error cases without going via JS::DeletePolicy, which is specialised for some Scope::Data classes to free the data later at a safe time (using GCManagedDeletePolicy).  This could cause crashes like these.

Also we copy Scope data around with PodCopy, which bypasses constructors.  Calling constructors for embedded GCPtrs is necessary to trigger the postbarrier so this is a potential source of crashes if this data contains nursery objects.
Assignee: nobody → jcoppeard
Attachment #8868172 - Flags: review?(shu)
Comment on attachment 8868172 [details] [diff] [review]

Review of attachment 8868172 [details] [diff] [review]:

Thanks for the patch, and yikes, this was pretty scary oversight on my part. We really should backport this.
Attachment #8868172 - Flags: review?(shu) → review+
Blocks: 1263355
Flags: needinfo?(jcoppeard)
Keywords: leave-open
Comment on attachment 8868172 [details] [diff] [review]

[Security approval request comment]
How easily could an exploit be constructed based on the patch? Very hard.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? No.

Which older supported branches are affected by this flaw? Everything back to 51.

If not all supported branches, which bug introduced the flaw? Bug 1263355.

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? Should be trivial.

How likely is this patch to cause regressions; how much testing does it need? Unlikely.
Attachment #8868172 - Flags: sec-approval?
sec-approval+ for trunk.
Once it lands, we'll want ESR52 and Beta patches nominated so we can uplift this.
Attachment #8868172 - Flags: sec-approval? → sec-approval+
Sorry, I messed up and missed a couple of places where I should have called DeleteScopeData.  This actually makes no difference in practice because these are not for scope kinds that use GCManagedDeletePolicy but for we should do this for consistency with the everything else.
Attachment #8869375 - Flags: review?(shu)
Attachment #8869375 - Flags: review?(shu) → review+
Attached patch bug1358073-betaSplinter Review
Approval Request Comment
[Feature/Bug causing the regression]: Bug 1263355
[User impact if declined]: Possible crash / security vulnerability
[Is this code covered by automated tests?]: Yes
[Has the fix been verified in Nightly?]: Yes
[Needs manual test from QE? If yes, steps to reproduce]: No
[List of other uplifts needed for the feature/fix]: None
[Is the change risky?]: This isn't a trivial change but I think the risk is low.
[Why is the change risky/not risky?]: It splits moving scope data objects into two parts and so involves pointer arithmetic which can be risky.  But conceptually it's quite simple and it's a small change.
[String changes made/needed]: None
Attachment #8870113 - Flags: approval-mozilla-beta?
Attached patch bug1358073-esr52Splinter Review
[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: sec-high bug
Fix Landed on Version: 55
Risk to taking this patch (and alternatives if risky): Low
String or UUID changes made by this patch: None
Attachment #8870115 - Flags: approval-mozilla-esr52?
(In reply to Wes Kocher (:KWierso) from comment #15)
> Is this still leave-open?

Yes, that patch won't fix all of these crashes.
Flags: needinfo?(jcoppeard)
Comment on attachment 8870113 [details] [diff] [review]

sec-high fix for beta54, should be in 54.0b12
Attachment #8870113 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment on attachment 8870115 [details] [diff] [review]

Sec-high that was taken in 54, let's uplift to ESR52
Attachment #8870115 - Flags: approval-mozilla-esr52? → approval-mozilla-esr52+
Bug 1370869 will probably fix this.
Depends on: 1370869
Flags: needinfo?(jcoppeard)
Sounds like we should punt on this for ESR52 until 52.3 then (and change tracking to 55+). From an IRC conversation with Jon, it doesn't sound like we need to panic for 54, but we should keep an eye out for any uptick in OOM crashes that might result.
I'm not going to disclose this in 54 since we didn't take it (yet) in 52.2.
Whiteboard: [adv-main54-]
Can we get an ESR52 patch landed now?
Flags: needinfo?(jcoppeard)
Bug 1370869 needs to be approved first.
Flags: needinfo?(jcoppeard)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #26)
> Bug 1370869 needs to be approved first.

Is this fixed in 55 and 56? Or is there still something left to land for beta and nightly?
Flags: needinfo?(jcoppeard)
There's nothing waiting to land, but the patch that did land will only fix some of the crashes.
Flags: needinfo?(jcoppeard)
OK, I'll drop tracking if this isn't something we're expecting to change for 55/56.
The patch seems to have halved weekday crashes.  55/56 have 84 crashes in the last week (random ptr/UAFs).  Mixture of writes and reads.

What's needed to get info that would help us diagnose this?  Assertions elsewhere?
Flags: needinfo?(jcoppeard)
Whiteboard: [adv-main54-] → [adv-main54-][adv-esr52.3-]
This isn't actionable as it is, but I'm going to investigate how we can track down crashes like this in bug 1399944.
Flags: needinfo?(jcoppeard)
Blocks: GCCrashes
In the remaining crashes I see only one (ESR 52.4) crash that looks like an obvious UAF. Whatever remains is something different from what was fixed. Would it make more sense to clone this into a new bug and take a fresh look at the remaining, and call this one "FIXED" because we did in fact fix a bunch of crashes?
Flags: needinfo?(jcoppeard)
Closed: 2 years ago
Flags: needinfo?(jcoppeard)
Resolution: --- → FIXED
Group: javascript-core-security → core-security-release
Group: core-security-release
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
You need to log in before you can comment on or make changes to this bug.