55.61 KB, application/pdf
5.63 KB, text/html
1.51 KB, patch
|Details | Diff | Splinter Review|
138 bytes, text/plain
We received the 2018 Hack2Win "Firefox RCE" exploit from Noam Rathaus of Beyond Security. Please use the following for the acknowledgement. "Three independent security researcher, Niklas Baumstark, Samuel Groß and Bruno Keith, have reported this vulnerability to Beyond Security’s SecuriTeam Secure Disclosure program."
I can't reproduce the assertion failure with the PoC in the PDF (with a recent m-c debug build), but the HTML testcase turned into a JS shell test asserts.
autobisectjs shows this is probably related to the following changeset: The first bad revision is: changeset: https://hg.mozilla.org/mozilla-central/rev/b7adf3986079 user: Sander Mathijs van Veen date: Tue Feb 07 07:18:00 2017 +0100 summary: Bug 1337367 - Postpone spilling bundles till after regalloc main loop r=bhackett Jan, is bug 1337367 a likely regressor? (not setting Blocks relationship yet)
As described in the PDF, the bug is in BacktrackingAllocator::resolveControlFlow. I need to investigate more, but I think the problem is this inconsistency: A) When we resolve phis: - We add the move at predecessor exit B) When we resolve graph edges: - We add the move at successor entry if there's exactly 1 predecessor - We add the move at predecessor exit if there are multiple predecessors (The assumption may be that a phi requires there to be multiple predecessors, but that's clearly not true here. Likely because we are eliminating blocks.) Making A behave like B fixes the testcase.
(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #4) > Jan, is bug 1337367 a likely regressor? I'm not 100% sure yet but I think that just changes regalloc enough to hide the bug.
Shell flags that cause the issue *not* to assert: --no-baseline --no-ion --baseline-eager --ion-eager $ ./js-dbg-64-dm-linux-8db8a228536d --fuzzing-safe --no-threads 1493900.js Assertion failure: *def->output() != alloc, at jit/RegisterAllocator.cpp:240 Segmentation fault (core dumped)
See comment 5. Requesting review + fuzzing while investigating more.
(In reply to Jan de Mooij [:jandem] from comment #8) > Created attachment 9011730 [details] [diff] [review] > Initial patch I confirm the patch fixes the assert.
One thing we need to investigate here is whether it's expected to have a phi for a block with 1 predecessor. I'll look at the code elimination pass after lunch.
(In reply to Jan de Mooij [:jandem] from comment #11) > One thing we need to investigate here is whether it's expected to have a phi > for a block with 1 predecessor. I'll look at the code elimination pass after > lunch. Looking at the unreachable block elimination code in ValueNumbering.cpp I don't think we try to prevent this situation so then it is expected. sunfish, correct me if I'm wrong.
Attachment #9011703 - Attachment mime type: text/plain → text/html
Attachment #9011730 - Flags: review?(bhackett1024) → review+
If all goes as planned, ETA to land this is 10/1 am Paris time in order to gtb 10/1 afternoon Paris time.
Comment on attachment 9011730 [details] [diff] [review] Initial patch I didn't see any new fuzzing regressions with this patch within the last 24 hours.
Attachment #9011730 - Flags: feedback?(choller) → feedback+
Comment on attachment 9011730 [details] [diff] [review] Initial patch Review of attachment 9011730 [details] [diff] [review]: ----------------------------------------------------------------- We usually do remove single-predecessor phis, however we indeed don't guarantee that. In this case we don't do it because tryFoldEqualOperands earlier set GuardRangeBailouts on the phi, which makes GVN leave the phi in place even though it's known to be dead. The GuardRangeBailouts logic seems over-conservative, however that just means we're missing some optimizations in some cases, which isn't something we need to fix here. I was concerned about why in a single-predecessor case, putting the copy in the exit of the predecessor should lead to an inconsistent state, as it seems like that should be valid, even if potentially suboptimal. I investigated this, and the problem is indeed the inconsistency with the edge resolution logic. If edge resolution puts its moves in the MoveGroup at the entry to the successor, and phi resolution puts its moves in the MoveGroup at the exit of the predecessor, then the move resolver in the predecessor could clobber the inputs for the moves in the successor. So making phi resolution use the same logic for determining where to put its moves makes sense.
Attachment #9011730 - Flags: review?(sunfish) → review+
Comment on attachment 9011730 [details] [diff] [review] Initial patch This also didn't immediately blow up anything on my side.
Attachment #9011730 - Flags: feedback?(nth10sd) → feedback+
(In reply to Dan Gohman [:sunfish] from comment #15) Thanks for the thorough review!
Comment on attachment 9011730 [details] [diff] [review] Initial patch Approval Request Comment [Feature/Bug causing the regression]: Old bug. [User impact if declined]: Security bug. [Is this code covered by automated tests?]: Yes. [Has the fix been verified in Nightly?]: Not yet. [Needs manual test from QE? If yes, steps to reproduce]: No. [List of other uplifts needed for the feature/fix]: None. [Is the change risky?]: Low/medium risk. [Why is the change risky/not risky?]: Low because the change is straight-forward and I don't expect regressions. Medium risk because it's hard to reason about regalloc behavior. [String changes made/needed]: None.
Comment on attachment 9011730 [details] [diff] [review] Initial patch sec-approval+ and a=dveditz
Attachment #9011730 - Flags: sec-approval?
Attachment #9011730 - Flags: sec-approval+
Attachment #9011730 - Flags: approval-mozilla-release?
Attachment #9011730 - Flags: approval-mozilla-release+
Attachment #9011730 - Flags: approval-mozilla-esr60?
Attachment #9011730 - Flags: approval-mozilla-esr60+
Attachment #9011730 - Flags: approval-mozilla-beta?
Attachment #9011730 - Flags: approval-mozilla-beta+
(In reply to Jan de Mooij [:jandem] from comment #18) > [Is this code covered by automated tests?]: Yes. > [Needs manual test from QE? If yes, steps to reproduce]: No. Marking as qe-verify- based on the fact that this is covered by automated test and no longer requires manual confirmation.
The _area_ is covered by automated tests, but there was no specific test checked in with this patch (because it's a security bug). I believe "needs manual test from QE" was interpreted by Jan as meaning regression testing. We -DO- want to do independent verification of the security fix which would be done by using the testcase in this bug in an current Release (to demonstrate the crash conditions are correct) and then testing a fixed Release Candidate. re-nominating for qe-verify
Flags: qe-verify- → qe-verify?
Got it, let's give it a shot.
Flags: qe-verify? → qe-verify+
NI myself for follow-up tests/documentation/cleanup.
I've managed to reproduce the tab crash (https://crash-stats.mozilla.com/report/index/e976a42f-344f-43d0-bf1b-b12500181002) on 62.0.2 using the testcase from comment 1. I can no longer reproduce this bug using the latest builds: 64.0a1 (2018-10-02), 63.0b11 (20181001131022), 62.0.3 (20181001155545) and 60.2.2esr (20181001135620). Tested across platforms - Windows 7 x64, macOS 10.13 and Ubuntu 16.04 x64.
dveditz, can we land follow-up tests/docs/changes now or do you want to wait? It has been only two weeks but SecuriTeam already released details: https://blogs.securiteam.com/index.php/archives/3765
Yeah, go ahead and land the tests, etc.
Might as well un-hide the bug since they've published everything.
(In reply to Jan de Mooij [:jandem] from comment #28) > NI myself for follow-up tests/documentation/cleanup. Filed bug 1500978.
You need to log in before you can comment on or make changes to this bug.