Closed
Bug 1368268
Opened 9 years ago
Closed 8 years ago
Crash in `anonymous namespace''::ActiveVerifier::StartTracking
Categories
(Core :: Security: Process Sandboxing, defect, P1)
Tracking
()
RESOLVED
FIXED
People
(Reporter: jesup, Assigned: bobowen)
Details
(Keywords: crash, csectype-uaf, sec-high, Whiteboard: [post-critsmash-triage][adv-main61+] sb+)
Crash Data
This bug was filed from the Socorro interface and is
report bp-0523557b-76a6-4075-8847-5e6e00170527.
=============================================================
This is crashing in the sandbox code with a UAF. Frequency is high.
Tempted by sec-critical since it's crashing the sandbox with a UAF; this might be expoitable into a sandbox escape (just guessing, but sandbox UAFs make me wary.)
URLs are all over the place
| Reporter | ||
Comment 1•9 years ago
|
||
NI to abillings for sec level assessment, jimm for triage
Flags: needinfo?(jmathies)
Flags: needinfo?(abillings)
Updated•9 years ago
|
Flags: needinfo?(abillings)
Keywords: sec-high → sec-critical
Comment 2•9 years ago
|
||
I'm going to default to "critical" here to be safe.
Updated•9 years ago
|
Flags: needinfo?(jmathies)
Whiteboard: sb?
Comment 3•9 years ago
|
||
Bob, any thoughts here? Odd that the crash happens off main thread.
Flags: needinfo?(bobowencode)
| Assignee | ||
Comment 4•9 years ago
|
||
I wonder what happened in Fx53 which has made this spike up.
It seems to have dropped right back down again in 54 beta (when compared to 53 beta) and there are no instances in 55.
Fx55 has an update to the chromium sandbox code, so that might explain why it has disappeared.
The only likely changeset appears to be:
https://chromium.googlesource.com/chromium/src/+/678736a103c3e64acdce5e76910ff077b17cc6c3%5E%21/
Although it's not clear it would have resulted in a UAF.
This still doesn't explain why 53 appears to be particularly bad for this, I wonder if we've improved our shared memory usage in 54.
Could the spike in 53 have been the GPU process triggering the underlying problem?
Flags: needinfo?(bobowencode)
Updated•9 years ago
|
status-firefox53:
--- → affected
status-firefox54:
--- → affected
status-firefox55:
--- → ?
status-firefox-esr52:
--- → affected
Updated•9 years ago
|
Whiteboard: sb? → sblc5, sbwc3, sbmc3
| Assignee | ||
Comment 6•9 years ago
|
||
A superficially modified version of the chromium patch I mentioned in comment 4, applies, builds and seems to run without any issues for Beta 54.
Not sure what we want to do here given that it is late in Beta, appears to be very rare (in 54) and we don't really know if that patch will fix the issue anyway.
dveditz - what do you think?
Flags: needinfo?(dveditz)
Comment 7•9 years ago
|
||
Wontfix for 53.
There are about 300 crashes per week on release.
We're going to build RC 54 today. So if we do end up wanting this in 54 it would take spinning a second RC. Maybe it's best to wait one more cycle.
Comment 8•9 years ago
|
||
The patch fixes a race--perhaps fx54 changes things around such that the bad timing happens less often? I don't think we need to respin 54 to pick this up, though we should keep an eye on the numbers as we release to see if we need to get this into a point-release ride-along. I'm guessing we'll be fine until 55.
Flags: needinfo?(dveditz)
Keywords: sec-critical → sec-high
| Assignee | ||
Comment 9•9 years ago
|
||
(In reply to Daniel Veditz [:dveditz] from comment #8)
> The patch fixes a race--perhaps fx54 changes things around such that the bad
> timing happens less often?
Yes it should stop the race, but as far as I can tell the issue would have resulted in an extra "leaked" ActiveVerifier, although I don't think the other one ever gets deleted either.
I'll keep an eye on this as we release.
Updated•9 years ago
|
Group: core-security → dom-core-security
| Assignee | ||
Comment 10•9 years ago
|
||
We've only had 4 of these reported in Fx54 so far, so this has definitely dropped back down to a very low level.
Unfortunately, we've also had one in 56, so it hasn't gone away completely.
Comment 11•8 years ago
|
||
(In reply to Bob Owen (:bobowen) from comment #10)
> We've only had 4 of these reported in Fx54 so far, so this has definitely
> dropped back down to a very low level.
>
> Unfortunately, we've also had one in 56, so it hasn't gone away completely.
What do you recommend to happen next?
Flags: needinfo?(bobowencode)
| Assignee | ||
Comment 12•8 years ago
|
||
(In reply to Frederik Braun [:freddyb] from comment #11)
> (In reply to Bob Owen (:bobowen) from comment #10)
> > We've only had 4 of these reported in Fx54 so far, so this has definitely
> > dropped back down to a very low level.
> >
> > Unfortunately, we've also had one in 56, so it hasn't gone away completely.
>
> What do you recommend to happen next?
The problem is I don't really see how the crash is occurring.
There is a change to the way the lock is created in an update to the chromium code I am working on, although there's nothing in the comments to suggest it was being done because of this sort of problem.
I don't think I'll land that update until 57 now.
Flags: needinfo?(bobowencode)
Updated•8 years ago
|
Priority: -- → P3
Whiteboard: sbwc3 → sb+
Updated•8 years ago
|
Component: Security → Security: Process Sandboxing
| Reporter | ||
Updated•8 years ago
|
status-firefox55:
? → ---
status-firefox56:
--- → affected
status-firefox57:
--- → affected
status-firefox58:
--- → ?
Comment 13•8 years ago
|
||
This is still happening, though I only see a handful of crashes in the last week:
bp-a472925f-b7e3-4bf7-ad17-875f40171018
Comment 14•8 years ago
|
||
Haven't seen any in 58.0 nightly, but then again I only saw one in 57.0 nightly. There are several in 57 beta but not high volume.
Comment 15•8 years ago
|
||
Hi Jim:
Please reassign this bug to appropriate developer.
Thanks!
Wennie
Assignee: nobody → jmathies
| Reporter | ||
Updated•8 years ago
|
Flags: needinfo?(jmathies)
| Assignee | ||
Comment 16•8 years ago
|
||
I'll take another look at this, but I couldn't see how this could happen before.
I'm also working on an update to the sandbox code, which could help in theory.
Assignee: jmathies → bobowencode
Flags: needinfo?(jmathies) → needinfo?(bobowencode)
| Assignee | ||
Comment 17•8 years ago
|
||
I re-looked at this, I thought I'd found something to do with our pre-compile options, but turned out to be confusion caused by inlining.
Bug 1366701, which I plan to land early in Fx59 has a small change to this code (which I mentioned in Comment 12, but have only just sorted out all the issues), so there's a chance it might resolve this.
As Dan says this hasn't happened in 58a1 at all, be interesting to see what happens on Beta.
I'll leave the NI for now.
Comment 18•8 years ago
|
||
This is still happening in 58: https://crash-stats.mozilla.com/report/index/ea4ac1ea-3f3d-469d-97d3-d4fd50171220
None so far in 59 but of course less users.
Comment 19•8 years ago
|
||
I think I may have a bead on this. If so, this is a gnarly bug caused by the fact that modules mean there are multiple "global" namespaces. This would be a problem upstream.
Before the bug, I should point out that the underlying Chromium code has changed again -- they changed the static lock (that I'm conjecturing is at the root of the issue) from a LazyInstance to a proper singleton [1]. The move appears to be inspired by a desire to remove LazyInstance, not by a bug, but we also still don't see _any_ crashes from this in 59, which is about the time it should have hit Firefox (Bob merged it in November). So maybe all of this will be wrong and it was just some esoteric C++ crap that got resolved by the cosmetic change. But I don't think so.
The deal is, the crash happens when accessing a map that is shared concurrently. The map is guarded by this lock [2]. Again, previously, it was a LazyInstance, which, for our purposes, can just be viewed as a static value like the current one. Now, the code is supposed to be robust to the fact that different modules have global namespaces. It tries to make sure that the same instance of the ActiveVerifier is shared across modules in the current process -- the main_module_verifier in [3]. This looks correct. They forgot, however, to share the _lock_, which isn't part of the ActiveVerifier instance (maybe for good reason -- it may not be safe to get the instance so that we can fetch the lock from it... without already owning the lock). So each module in the process will operate on the same ActiveVerifier, but will "lock" its contents with the module's own lock. Which isn't "locking" anything.
Here's the part I'm unsure of -- we need multiple modules working on this object to run into a problem. Without fully understanding the usage of this code, I'm conjecturing that firefox.exe (which should be the module returned by GetModuleHandle(NULL)) and xul.dll must both be using this, inspiring a race. I think the answer is just to share the lock across modules the same way it currently shares the ActiveVerifier. I can hack that up pretty quick -- I'm still looking at validating some of my assumptions tho.
---------------------------
[1] https://codereview.chromium.org/2667513003/
[2] https://searchfox.org/mozilla-central/rev/9f2721b19b10a9a441fa5ca7116720d2fcebcc4a/security/sandbox/chromium/base/win/scoped_handle.cc#45
[3] https://searchfox.org/mozilla-central/rev/9f2721b19b10a9a441fa5ca7116720d2fcebcc4a/security/sandbox/chromium/base/win/scoped_handle.cc#139
Assignee: bobowencode → davidp99
Comment 20•8 years ago
|
||
Damn. Missed this.
https://searchfox.org/mozilla-central/rev/9f2721b19b10a9a441fa5ca7116720d2fcebcc4a/security/sandbox/chromium/base/win/scoped_handle.cc#80
// The methods required by HandleTraits. They are virtual because we need to
// forward the call execution to another module, instead of letting the
// compiler call the version that is linked in the current module.
Comment 21•8 years ago
|
||
We still have no sightings of this in 59. Maybe it _was_ fixed by the upstream change somehow. Until we see it again, we're going to continue to monitor but operate on the assumption that things are ok.
| Assignee | ||
Comment 22•8 years ago
|
||
Bad news here, looks like this is still happening in Beta 59 and Release 58.
The signature had changed slightly, so I've removed the anonymous namespace part, which also caused an issue on crash-stats because of the space.
Crash Signature: [@ `anonymous namespace''::ActiveVerifier::StartTracking] → [@ ActiveVerifier::StartTracking]
Updated•8 years ago
|
Assignee: davidp99 → nobody
Comment 23•8 years ago
|
||
Assigning unowned critical/high security bugs to triage owner. Please find an appropriate assignee for this bug.
Assignee: nobody → jmathies
| Reporter | ||
Updated•8 years ago
|
status-firefox59:
--- → affected
status-firefox60:
--- → ?
| Assignee | ||
Comment 24•8 years ago
|
||
I think I've finally worked out what might be going on here.
After comparing some dumps with the assembly of a working version, it looked like the internal _Myhead element of the internal list of ActiveVerifier::map_ [1] was getting deleted.
This element is for a null handle, so I couldn't see how until I realised that _Myhead is the dummy element used for map_.end().
So what I think is happening is either we're getting an untracked handle or more likely a double closed handle at [2].
The element isn't found and so it returns end(), an iterator pointing to _Myhead.
Those CHECKs should crash the process but they don't because we've never defined OFFICIAL_BUILD and our logging is stubbed.
So, we hit the map_.erase() and delete the head of map_'s interal list.
Which means operations on map_ are going to crash pretty soon.
We should probably define OFFICIAL_BUILD as MOZILLA_OFFICIAL and at least make the logging crash when the severity is LOG_FATAL.
[1] https://searchfox.org/mozilla-central/rev/efce4ceddfbe5fe4e2d74f1aed93894bada9b273/security/sandbox/chromium/base/win/scoped_handle.cc#101
[2] https://searchfox.org/mozilla-central/rev/efce4ceddfbe5fe4e2d74f1aed93894bada9b273/security/sandbox/chromium/base/win/scoped_handle.cc#210
Assignee: jmathies → bobowencode
Status: NEW → ASSIGNED
Flags: needinfo?(bobowencode)
| Assignee | ||
Updated•8 years ago
|
Priority: P3 → P1
| Assignee | ||
Comment 25•8 years ago
|
||
If comment 24 is correct, then the initial fix (to at least make the crash safe) is to make those CHECKs crash.
I plan to do that as a non-sec bug blocker to bug 1055227, as it is a perfectly normal patch to land and then means it's not obviously linked to a sec bug, does that sound OK to you?
Flags: needinfo?(dveditz)
Comment 27•8 years ago
|
||
(In reply to Bob Owen (:bobowen) from comment #25)
> If comment 24 is correct, then the initial fix (to at least make the crash
> safe) is to make those CHECKs crash.
> I plan to do that as a non-sec bug blocker to bug 1055227, as it is a
> perfectly normal patch to land and then means it's not obviously linked to a
> sec bug, does that sound OK to you?
Which bug is it? :)
Flags: needinfo?(bobowencode)
| Assignee | ||
Comment 28•8 years ago
|
||
Flags: needinfo?(bobowencode)
Comment 29•8 years ago
|
||
i.e., we can close this bug here?
| Assignee | ||
Comment 30•8 years ago
|
||
We've not had it since 60b5 (I think the fix landed in 60b7), so it's looking hopeful.
I was going to wait until release, but I guess we can close it.
This change will only have turned it into a safe crash, probably in StopTracking.
Bug 1452090 landed a change that hooks CloseHandle and DuplicateHandle to try and catch where the real problem.
That's only on Nightly for now, so we may not see it and might need to turn it on for beta for a couple of cycles.
| Assignee | ||
Comment 31•8 years ago
|
||
We've still had 4 of these so far in Fx60, so it hasn't gone away completely.
However, the ActiveVerifier is now turned off in late Beta and Release (in Fx61), so we certainly shouldn't get any then.
As this is so rare in Nightly and (early) Beta, where the Verifier is still on, I think we can close this.
Just realised that I should get that change (bug 1452090) uplifted to ESR.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
status-firefox61:
--- → fixed
status-firefox62:
--- → fixed
Resolution: --- → FIXED
Updated•8 years ago
|
Group: dom-core-security → core-security-release
Updated•8 years ago
|
Updated•8 years ago
|
Whiteboard: sb+ → [adv-main61+] sb+
Updated•8 years ago
|
Flags: qe-verify-
Whiteboard: [adv-main61+] sb+ → [post-critsmash-triage][adv-main61+] sb+
Updated•6 years ago
|
Group: core-security-release
You need to log in
before you can comment on or make changes to this bug.
Description
•