Closed Bug 1330687 (CVE-2017-5410) Opened 7 years ago Closed 7 years ago

More memory corruption issues in nightly build from 01/10/17

Categories

(Core :: JavaScript: GC, defect)

53 Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla54
Tracking Status
firefox-esr45 52+ fixed
firefox50 --- wontfix
firefox51 + wontfix
firefox52 + fixed
firefox-esr52 --- fixed
firefox53 + fixed
firefox54 + fixed

People

(Reporter: jerri.rice.001, Assigned: jonco)

References

Details

(4 keywords, Whiteboard: [post-critsmash-triage][adv-main52+][adv-esr45.8+])

Attachments

(6 files)

This is even more interesting.  Following a memory corruption crash of a tab involving gfx rendering(which I will follow up on) once the tab(s) were recovered, I left ff setting a minute and came to the attached screenshot.

It's worrisome because not only does it show memory corruption and issues involving threading(see bug 1330173) it also shows what seems be a use after free issue in play.

I have to say this is a major headache because I just triggered the tab crashes for the first time.  Right now I have to try to figure out STR for that and see if it just happens to turn into this again.  Not to mention I'm supposed to doing some work google right now =/

See the attached screenshot for the goods.
Nick you have worked in this code. Do you have any ideas or forward to someone who does?
Flags: needinfo?(nfitzgerald)
These are crashes in the GC, so I'll move it there to start with. Obviously, we need more to go on than this. It maybe sounds similar to some other issues we have seen.
Group: firefox-core-security → javascript-core-security
Component: Untriaged → JavaScript: GC
Product: Firefox → Core
Garbage Collection crashes are pretty generic. From the screenshot we can see a poisoned address so there's a Use-after-free involved (potentially exploitable!) but that's about it. Knowing what object it was might give us a clue as to what feature to test deeper, but there's really not much to go on here
Flags: needinfo?(jerri.rice.001)
I'll provide it as I can, this involves launching the pdf viewer from a post style form, dragging that tab to its on window, then dragging it back as a tab.  That *sometimes* triggers a gfx crash, which afterwards led to this.

I'll see if I can narrow down the needed parts.  Leaving the NI for now.
I'm going to make a video of doing this as I was just doing it for some time and could not trigger the crash.  I did find many many problems which could be very security sensitive though.  Attaching a screenshot which shows something very interesting.

In this screenshot I have clicked on a link in the global console console to open the source of a chrome level file, which the console should have permission and ownership of doing so yet notice the security error issued.

The loading of the file continues no problem though.  There's many many more issues like this, so the video will be needed.
I forgot to mention too, that even though there it lists the opening page as a resource:// page, I did see the same error for 127.0.0.1/localhost which is my local test server.

Mmm e10s.
This just shows my local web server being listed as having ownership of the successfully opened tab, with a few other(of the very many) juicy errors.
Was able to reproduce the tab crashes three times today, but not one was followed by the secondary crash that was the origin of this bug.

Still have more to narrow down for STR obviously.  Stay tuned.
Have 2 versions of STR maybe not at exact, but seemingly related.  One version triggers the exact same tab crash and break point that I originally encountered.  It is not followed up by the threading issue with the use after free though.

There is one thing I was doing that isn't in STR, and I'm hoping it's the part I'm missing.  Either way I'll have plenty for you guys to go on uploaded by sometime tomorrow.  Going to try to that last little detail between now and then because GC is hard to get to anyway and it seems like something that would trigger it.

Stay tuned
I now have four alterations, but I believe I have nailed down exactly what happens here.  I have to wait to test on a better system for a minute, but I'll get it done.

The rundown:

Since compacting was introduce into gc, movement of objects allocated in the head and the updating of pointers done by gc happen triggered by 3 events if gc is occurring.  It's the third event that I believe is of focus here.

So when a thread is trying to access the runtime as in the stacktrace, before this you clearly see there GCisMarking is called.  This is marking objects and the associated pointers for being moved as happens in the new compacting model.

IIRC When a tab representing a thread which correlates to a zone has crashed then the memory of that thread is going to be freed.  This is where the third event is of interest.  I first had a memory corruption crash of a tab, I did in fact look at it and realized that it seemed to be a null pointer deref.

Let's assume that I closed it out trying to reproduce it rapidly and that happened to be in approximately 20 seconds.  So when I returned ff and proceeded to try to recover the tab, compacting had taken place.

The platformData class deref for thread id in the first part of js::thread::Id::operator==(const js::Thread::Id & aOther) is trying to find the thread id using an instance of platformData for which the memory has been freed.



Ok I know, a lot to take in.  That's also why It's taken me longer than anticipated to get this to work out.  Sorry about the overload of information but please tell me if that seems plausible or if I missed something because reproducing this has proved next to impossible before I had this understanding(will test soon).

Clearing the NI from me but will flag myself again once I get feedback.  Let's get this taken care of despite how hard it may be to reproduce(might be easy for some ;-) ).
Flags: needinfo?(jerri.rice.001)
Throwing the ni to jonco. Looks like jemalloc freed junk.
Flags: needinfo?(nfitzgerald) → needinfo?(jcoppeard)
(In reply to Jerri Rice (rehash pending) from comment #11)
Thanks for the bug report.  This is reminiscent of bug 1280154 which might be the same issue.

The stacktrace in your screenshot is very useful.  That particular section of the code is not related to compacting GC but to incremental sweeping, and how we work out the grouping and order of zones to sweep.  It maybe there is an issue with how we do that.
The problem is in ObjectValueMap::findZoneEdges where we add a key's zone to its delegate's zone's gcZoneGroupEdges set.  At the moment we do this even for delegate zones which are not currently being marked.  The gcZoneGroupEdges set is not cleared for these zones so they can continue to hold zone pointers past the point where the zone can be destroyed.

I wasn't able to come up with a testcase that reproduced this, but a try run with the assertions and without the fix hit the assertions in several places so I'm confident that this is what's happening.

Hopefully this will fix our other related bugs too.
Assignee: nobody → jcoppeard
Flags: needinfo?(jcoppeard)
Attachment #8827896 - Flags: review?(sphink)
Caused by bug 982561.
Blocks: 982561
(In reply to Jon Coppeard (:jonco) from comment #14)
> 
> I wasn't able to come up with a testcase that reproduced this, but a try run
> with the assertions and without the fix hit the assertions in several places
> so I'm confident that this is what's happening.

This doesn't surprise me, I've probably succeeding in crashing out tabs close to 100 times since I filed this report.  Funny that when I first triggered this I had just started to pursue memory corruption issues that may pose a security threat.

Thanks so much Jon, now I can get back to my comfort zone(internal engine manipulation) without feeling bad.

Flagging myself NI so I can follow this and learn more.
Flags: needinfo?(jerri.rice.001)
Succeeding meh s/ing/ed/

I truly mean it when I say thanks Jon.  This one had me a bit exhausted :-)
Jon, can you give this a security rating when you get a chance?
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Flags: needinfo?(jcoppeard)
Flags: sec-bounty?
Comment on attachment 8827896 [details] [diff] [review]
bug1330687-delegate-zone-edges

Review of attachment 8827896 [details] [diff] [review]:
-----------------------------------------------------------------

Ouch. Nice find. And I'm happy to have that assert!
Attachment #8827896 - Flags: review?(sphink) → review+
Comment on attachment 8827896 [details] [diff] [review]
bug1330687-delegate-zone-edges

[Security approval request comment]
How easily could an exploit be constructed based on the patch?

Very difficult.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?

No.

Which older supported branches are affected by this flaw?

Everything back to FF 32.

If not all supported branches, which bug introduced the flaw?

Bug 982561.

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?

Same patch should apply as is or after trivial merge.

How likely is this patch to cause regressions; how much testing does it need?

Unlikely.
Flags: needinfo?(jcoppeard)
Attachment #8827896 - Flags: sec-approval?
Too late for a fix for 51 release, but we can track it anyway in case we end up with a dot release later.
Flags: needinfo?(jerri.rice.001)
sec-approval for checkin on February 7, two weeks into the new cycle.
Whiteboard: [checkin on 2/7]
Attachment #8827896 - Flags: sec-approval? → sec-approval+
The patch needs updating as there have been quite a lot of changes in SM recently.
Flags: needinfo?(jcoppeard)
Jerri: thanks for being so dogged in tracking this down.  You're awesome for doing that.
https://hg.mozilla.org/mozilla-central/rev/9f2ed9ef1d89
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Please request Aurora/Beta/ESR45 approval on this when you get a chance.
Flags: needinfo?(jcoppeard)
Group: javascript-core-security → core-security-release
Approval Request Comment
[Feature/Bug causing the regression]: Bug 982561.
[User impact if declined]: Possible crash / security vulnerability.
[Is this code covered by automated tests?]: Yes.
[Has the fix been verified in Nightly?]: Yes.
[Needs manual test from QE? If yes, steps to reproduce]: No.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: No.
[Why is the change risky/not risky?]: It's a simple fix and adds a bunch of assertions to ensure that this problem doesn't recur.
[String changes made/needed]: None.
Flags: needinfo?(jcoppeard)
Attachment #8835450 - Flags: approval-mozilla-beta?
Attachment #8835450 - Flags: approval-mozilla-aurora?
Attached patch bug1330687-esr45Splinter Review
[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration: sec-high bug.
User impact if declined: Possible crash / security vulnerability.
Fix Landed on Version: 54.
Risk to taking this patch (and alternatives if risky): Low.
String or UUID changes made by this patch: None.

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
Attachment #8835451 - Flags: approval-mozilla-esr45?
Comment on attachment 8835450 [details] [diff] [review]
bug1330687-backport

sec-high fix for aurora53, beta52
Attachment #8835450 - Flags: approval-mozilla-beta?
Attachment #8835450 - Flags: approval-mozilla-beta+
Attachment #8835450 - Flags: approval-mozilla-aurora?
Attachment #8835450 - Flags: approval-mozilla-aurora+
Comment on attachment 8835451 [details] [diff] [review]
bug1330687-esr45

Fix a sec-high. ESR45+.
Attachment #8835451 - Flags: approval-mozilla-esr45? → approval-mozilla-esr45+
No longer depends on: 1338383
Flags: sec-bounty? → sec-bounty+
Flags: qe-verify-
Whiteboard: [post-critsmash-triage]
Whiteboard: [post-critsmash-triage] → [post-critsmash-triage][adv-main52+][adv-esr45.8+]
Alias: CVE-2017-5410
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: