Closed
Bug 1342567
Opened 7 years ago
Closed 7 years ago
Crash in nsExpirationTracker<T>::RemoveObject
Categories
(Core :: Graphics: ImageLib, defect)
Tracking
()
People
(Reporter: philipp, Assigned: tnikkel)
References
Details
(Keywords: crash, sec-high, testcase-wanted, Whiteboard: [post-critsmash-triage][adv-main54+][adv-esr52.2+])
Crash Data
Attachments
(8 files, 1 obsolete file)
1.35 KB,
patch
|
aosmond
:
review+
dveditz
:
sec-approval+
|
Details | Diff | Splinter Review |
5.29 KB,
patch
|
aosmond
:
review+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
6.55 KB,
patch
|
aosmond
:
review+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
1.42 KB,
patch
|
aosmond
:
review+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
881 bytes,
patch
|
aosmond
:
review+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
7.56 KB,
patch
|
gchang
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
7.62 KB,
patch
|
gchang
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
1.50 KB,
patch
|
aosmond
:
review+
jcristau
:
approval-mozilla-beta+
jcristau
:
approval-mozilla-esr52+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is report bp-8ce8fb39-5515-4de9-98d4-4ccfa2170224. ============================================================= Crashing Thread (0) Frame Module Signature Source 0 xul.dll nsExpirationTracker<imgCacheEntry, 3>::RemoveObject(imgCacheEntry*) obj-firefox/dist/include/nsExpirationTracker.h:147 1 xul.dll nsExpirationTracker<imgCacheEntry, 3>::MarkUsed(imgCacheEntry*) obj-firefox/dist/include/nsExpirationTracker.h:170 2 xul.dll nsContentUtils::LoadImage(nsIURI*, nsINode*, nsIDocument*, nsIPrincipal*, nsIURI*, mozilla::net::ReferrerPolicy, imgINotificationObserver*, int, nsAString_internal const&, imgRequestProxy**, unsigned int) dom/base/nsContentUtils.cpp:3297 3 xul.dll nsImageBoxFrame::UpdateImage() layout/xul/nsImageBoxFrame.cpp:260 4 xul.dll nsRuleNode::WalkRuleTree(nsStyleStructID, nsStyleContext*) layout/style/nsRuleNode.cpp:2422 5 xul.dll nsStyleContext::DoGetStyleList<1>() obj-firefox/layout/style/nsStyleStructList.h:51 6 xul.dll nsImageBoxFrame::DidSetStyleContext(nsStyleContext*) layout/xul/nsImageBoxFrame.cpp:533 7 xul.dll mozilla::dom::Element::FindAttrValueIn(int, nsIAtom*, nsIAtom* const* const*, nsCaseTreatment) dom/base/Element.cpp:2668 8 xul.dll nsImageBoxFrame::UpdateLoadFlags() layout/xul/nsImageBoxFrame.cpp:307 9 xul.dll nsImageBoxFrame::Init(nsIContent*, nsContainerFrame*, nsIFrame*) layout/xul/nsImageBoxFrame.cpp:206 10 xul.dll nsCSSFrameConstructor::InitAndRestoreFrame(nsFrameConstructorState const&, nsIContent*, nsContainerFrame*, nsIFrame*, bool) layout/base/nsCSSFrameConstructor.cpp:4971 this signature has been around for a while - a portion of it seems to be uaf though, and if you focus just on them they seem to have gotten more frequent in the 52 beta cycle: https://crash-stats.mozilla.com/search/?signature=%3DnsExpirationTracker%3CT%3E%3A%3ARemoveObject&address=%3D0xffffffffe5e5e5e5&release_channel=beta&date=%3E%3D2016-04-01T00%3A00%3A00.000Z&_sort=-date&_facets=signature&_facets=version&_facets=user_comments&_facets=adapter_vendor_id&_facets=build_id&_facets=install_time&_facets=platform_pretty_version&_facets=useragent_locale&_facets=cpu_arch&_facets=process_type&_facets=proto_signature&_facets=addons&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=address#facet-version around half of them have the mozilla_cc2@internetdownloadmanager.com addon present - not sure if that would be related
Comment 1•7 years ago
|
||
Half of all the crashes have that addon, or half of the ones that look like a UAF type crash?
Component: Untriaged → ImageLib
Flags: needinfo?(madperson)
Reporter | ||
Comment 2•7 years ago
|
||
half (or by now closer to 40%) of the UAF-type reports show that the IDM extension was also present (you can see that in the long crash stats link in comment #0 when switching to the addons facet tab). if you turn it around and specifically exclude the uaf crashes from a socorro search, IDM is installed in 8% of the remaining [@ nsExpirationTracker<T>::RemoveObject] crashes.
Flags: needinfo?(madperson)
Assignee | ||
Comment 3•7 years ago
|
||
Lots like we can get these kind of crashes if we MarkUsed on an entry that isn't in the nsExpirationTracker.
Updated•7 years ago
|
Group: core-security → gfx-core-security
Comment 4•7 years ago
|
||
Timothy, Andrew, do you have any idea to investigate this?
Flags: needinfo?(tnikkel)
Flags: needinfo?(aosmond)
Comment 5•7 years ago
|
||
Without any detailed analysis, my gut says is that it is related to the problem identified in bug 1297111. nsExpirationTracker is not thread safe but SurfaceCache can interact with it on many threads, and the extensions simply alter the probability of these thread collisions.
Assignee | ||
Comment 7•7 years ago
|
||
(In reply to Andrew Osmond [:aosmond] from comment #5) > Without any detailed analysis, my gut says is that it is related to the > problem identified in bug 1297111. nsExpirationTracker is not thread safe > but SurfaceCache can interact with it on many threads, and the extensions > simply alter the probability of these thread collisions. This is a different expiration tracker, no the surface cache one. I think the imgLoader surface cache tracker is main thread only. I think the mechanism we use to track what is in the tracker is buggy.
Flags: needinfo?(tnikkel)
Comment 8•7 years ago
|
||
Too late for firefox 52, mass-wontfix.
Assignee | ||
Comment 9•7 years ago
|
||
(In reply to [:philipp] from comment #0) > around half of them have the mozilla_cc2@internetdownloadmanager.com addon > present - not sure if that would be related Does anyone have a copy of this addon? I don't want to install this program and I don't have a Windows VM handy. Might be useful to know how it interacts with the image cache.
Assignee | ||
Comment 10•7 years ago
|
||
I've been through this code up and down, left and right, with a fine tooth comb. I don't know how this can happen. The only things I can see possible that are happening are: 1) the entry isn't in the cache at all, so it's not in the cache tracker either (the cache tracker being a subset of the cache). You'd be right to expect that the entry is in the cache as we just retrieved it a few lines before: https://dxr.mozilla.org/mozilla-central/rev/10ea10d9993c9701e5525928257a589dea2c05d8/image/imgLoader.cpp#2155 However there is a ValidateEntry call which can do a lot of things. But I went through ValidateEntry and everything it can call in detail to see if it could possible result in entries being removed from the cache and I couldn't find anything. It would be good to either rule this out (and confirm my analysis). Or if my analysis is wrong then we have a strong lead of where to look. 2) the HasNoProxies flag on the cache entry is out of sync with the imgRequest (whether the imgRequest has proxies or not). If this is the case it still wouldn't explain the crashes because we use the HasNoProxies flag on the cache entry as the source of truth for whether to put the entry in the cache tracker or not. And as far as I can tell if they got out of sync and then the HasNoProxies was updated to be correct we would handle that change correctly (adding or removing from the cache tracker).
Attachment #8855962 -
Flags: review?(aosmond)
Updated•7 years ago
|
Attachment #8855962 -
Flags: review?(aosmond) → review+
Assignee | ||
Comment 11•7 years ago
|
||
Comment on attachment 8855962 [details] [diff] [review] diagnostic patch [Security approval request comment] How easily could an exploit be constructed based on the patch? the patch points out where a problem happens, but that is already public on crash-stats, although landing this patch would call more attention to it, an attacker would need to figure out how to trigger the condition first Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? see previous answer Which older supported branches are affected by this flaw? not sure If not all supported branches, which bug introduced the flaw? not sure Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? don't need to backport, diagnostic patch only How likely is this patch to cause regressions; how much testing does it need? just adds a note to crashreports if we are likely about to crash
Attachment #8855962 -
Flags: sec-approval?
Updated•7 years ago
|
Flags: needinfo?(aosmond)
Comment 12•7 years ago
|
||
If these crashes are mostly from a particular addon, maybe it is somehow interacting with the image cache off the main thread? We see that occasionally with the observer service, such as bug 1339083 and bug 1295600. You could try adding some release assert main thread checks to the entry points to this cache and see if that turns up anything.
Comment 13•7 years ago
|
||
Comment on attachment 8855962 [details] [diff] [review] diagnostic patch sec-approval=dveditz Might want to add the "leave-open" keyword so the sheriffs don't resolve it when they merge to mozilla-central.
Attachment #8855962 -
Flags: sec-approval? → sec-approval+
Assignee | ||
Updated•7 years ago
|
Keywords: leave-open
Assignee | ||
Comment 14•7 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #12) > If these crashes are mostly from a particular addon, maybe it is somehow > interacting with the image cache off the main thread? We see that > occasionally with the observer service, such as bug 1339083 and bug 1295600. > You could try adding some release assert main thread checks to the entry > points to this cache and see if that turns up anything. Interesting! I'll write a patch for that next.
Assignee | ||
Comment 15•7 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/445330fd12114f3f201a16771cd4fec74b878a66
Assignee | ||
Comment 16•7 years ago
|
||
For the record, six days ago... https://hg.mozilla.org/mozilla-central/rev/445330fd1211
Assignee | ||
Comment 17•7 years ago
|
||
Attachment #8858987 -
Flags: review?(aosmond)
Comment 18•7 years ago
|
||
Comment on attachment 8858987 [details] [diff] [review] mainthread asserts Review of attachment 8858987 [details] [diff] [review]: ----------------------------------------------------------------- What threads does imgLoader::mUncachedImagesMutex protect us from contention with? Most of the call trees into imgLoader::AddToUncachedImages and imgLoader::RemoveFromUncachedImages appear to be on the main thread (mostly called directly or indirectly from the methods we are adding the asserts to). Is it imgRequest::~imgRequest which can be called off main thread (given it has a threadsafe destructor)?
Attachment #8858987 -
Flags: review?(aosmond) → review+
Comment 19•7 years ago
|
||
> (given it has a threadsafe destructor)?
*Has threadsafe *refcounting*
Assignee | ||
Comment 20•7 years ago
|
||
(In reply to Andrew Osmond [:aosmond] from comment #18) > What threads does imgLoader::mUncachedImagesMutex protect us from contention > with? Most of the call trees into imgLoader::AddToUncachedImages and > imgLoader::RemoveFromUncachedImages appear to be on the main thread (mostly > called directly or indirectly from the methods we are adding the asserts > to). Is it imgRequest::~imgRequest which can be called off main thread > (given it has a threadsafe destructor)? Exactly. The comment for mUncachedImagesMutex spells it out: https://dxr.mozilla.org/mozilla-central/rev/a374c35469935a874fefe64d3e07003fc5bc8884/image/imgLoader.h#454 I don't expect the uncached images to be a problem here because they are basically only tracked so we can report them in about:memory. Once an image goes into the uncached images it shouldn't be in any of the cache data structures, and it should be impossible for it to return to the cache at a later time.
Assignee | ||
Comment 21•7 years ago
|
||
We had our first crash with the annotation https://crash-stats.mozilla.com/report/index/67d0af19-481a-41b2-9925-35ff90170418 the gfx critical note says "request->HasConsumers() false inCache false". So the cache entry is somehow getting removed. Either some illegal off main thread access to the img cache is happening, or the ValidateEntry is somehow able to re-entry and modify the imgcache. I've got the patch (which I'll land shortly) to detect the off main thread case. I'll attach a patch to detect the other case.
Assignee | ||
Comment 22•7 years ago
|
||
Attachment #8859388 -
Flags: review?(aosmond)
Assignee | ||
Comment 23•7 years ago
|
||
Attachment #8859390 -
Flags: review?(aosmond)
Assignee | ||
Comment 24•7 years ago
|
||
Comment on attachment 8858987 [details] [diff] [review] mainthread asserts [Security approval request comment] How easily could an exploit be constructed based on the patch? if the imgcache is indeed being used off main thread, and if the attacker sees the patch, looks at crash stats, finds a crash, sees the stack, and then is able to use the imgcache off main thread in the same way it might not be too hard to make this happen on a build without this diagnostic patch. We need to get this data ourselves before we can do anything about this problem though, not really any other way to get this data. Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? see previous answer Which older supported branches are affected by this flaw? not sure If not all supported branches, which bug introduced the flaw? not sure Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? don't need to backport, diagnostic patch only How likely is this patch to cause regressions; how much testing does it need? purposely adds a crash if we use the imgcache off main thread to see if that is happening, i'll back this out on the first crash we see
Attachment #8858987 -
Flags: sec-approval?
Comment 25•7 years ago
|
||
Comment on attachment 8858987 [details] [diff] [review] mainthread asserts sec-approval+
Attachment #8858987 -
Flags: sec-approval? → sec-approval+
Assignee | ||
Comment 26•7 years ago
|
||
Comment on attachment 8859388 [details] [diff] [review] check for imgloader reentry I'll remove the MOZ_RELEASE_ASSERT before landing, that was just for tryserver to verify that it was not happening.
Assignee | ||
Comment 27•7 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/600f9d60d76ffa8498a32fcf6ddf5b2856a6fbb0
Updated•7 years ago
|
Attachment #8859388 -
Flags: review?(aosmond) → review+
Updated•7 years ago
|
Attachment #8859390 -
Flags: review?(aosmond) → review+
Assignee | ||
Comment 28•7 years ago
|
||
Comment on attachment 8859388 [details] [diff] [review] check for imgloader reentry [Security approval request comment] How easily could an exploit be constructed based on the patch? the patch points out that we think it might be possible for the imgcache to reenter itself. if the attacker figures out how then they could maybe do something with that. I'm not landing the MOZ_RELEASE_ASSERTS so neither we nor the attackers will get the stacks of re-entrancy (at this point, depending on the results of this patch we might need the stacks later) Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? same as answer to the previous question Which older supported branches are affected by this flaw? not sure If not all supported branches, which bug introduced the flaw? not sure Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? don't need to backport, diagnostic patch only How likely is this patch to cause regressions; how much testing does it need? the patch only adds a crash report annotation in a bad situation
Attachment #8859388 -
Flags: sec-approval?
Assignee | ||
Comment 29•7 years ago
|
||
Comment on attachment 8859390 [details] [diff] [review] check if our entry was replaced or removed from the cache [Security approval request comment] How easily could an exploit be constructed based on the patch? this patch just adds a tiny bit more info to the diagnostic patch we already landed, no new information is exposed Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? not anymore then is already landed Which older supported branches are affected by this flaw? not sure If not all supported branches, which bug introduced the flaw? not sure Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? don't need to backport, diagnostic patch only How likely is this patch to cause regressions; how much testing does it need? just changes what we put in a crash report annotation, not risky, no testing needed
Attachment #8859390 -
Flags: sec-approval?
https://hg.mozilla.org/mozilla-central/rev/600f9d60d76f
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox55:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Assignee | ||
Comment 31•7 years ago
|
||
Looks like this was closed accidentally despite the leave-open keyword.
Status: RESOLVED → REOPENED
status-firefox55:
fixed → ---
Resolution: FIXED → ---
Target Milestone: mozilla55 → ---
Updated•7 years ago
|
Attachment #8859388 -
Flags: sec-approval? → sec-approval+
Updated•7 years ago
|
Attachment #8859390 -
Flags: sec-approval? → sec-approval+
Assignee | ||
Comment 32•7 years ago
|
||
The addons list from one of the crashes was very vanilla, nothing suspect in there, I think only adblock plus beyond the default stuff we ship. So the addon lead might be a dead end.
https://hg.mozilla.org/mozilla-central/rev/289645ac65c1 https://hg.mozilla.org/mozilla-central/rev/f577512b1a29
Assignee | ||
Comment 34•7 years ago
|
||
We've had a few crashes in builds that have the main thread fatal asserts but there are no occurrences of hitting the fatal asserts. So doesn't look like it's off main thread access. Stilling waiting for crashes in a build that contains the last two patches.
Updated•7 years ago
|
Keywords: testcase-wanted
Assignee | ||
Comment 35•7 years ago
|
||
No crashes on nightly in about two weeks, looks like we'll need some data from beta (which has multiple crashes a day) to make any progress here. This rolls up all the diagnostic patches for beta except the main thread asserts because that doesn't seem to be the problem.
Assignee | ||
Updated•7 years ago
|
Attachment #8865261 -
Attachment description: rollout patch of diagnostics for beta → rollup patch of diagnostics for beta
Assignee | ||
Comment 36•7 years ago
|
||
Comment on attachment 8865261 [details] [diff] [review] rollup patch of diagnostics for beta We have gotten zero crashes on nightly in about two weeks, to make any progress here we need more data. So I'm suggesting uplifting this to beta. There should be no downside (no extra crashes are added, only crash report notes). We get enough crashes on beta that we only need one day on beta to get enough data. So can be landed for only one beta build. Approval Request Comment [Feature/Bug causing the regression]: diagnostic patch to get information from crash stats [User impact if declined]: diagnostic patch to get information from crash stats [Is this code covered by automated tests?]: n/a [Has the fix been verified in Nightly?]: n/a [Needs manual test from QE? If yes, steps to reproduce]: no [List of other uplifts needed for the feature/fix]: none [Is the change risky?]: no [Why is the change risky/not risky?]: only adds a note to crash reports [String changes made/needed]: none
Attachment #8865261 -
Flags: approval-mozilla-beta?
Assignee | ||
Updated•7 years ago
|
Attachment #8865261 -
Attachment is obsolete: true
Attachment #8865261 -
Flags: approval-mozilla-beta?
Assignee | ||
Comment 37•7 years ago
|
||
We finally got a crash with the latest diagnostic patches https://crash-stats.mozilla.com/report/index/ea212243-7678-4f5f-9beb-5aff40170509
Assignee | ||
Comment 38•7 years ago
|
||
The crash report annotation shows that there is no entry in the cache at all for our key. And that we call RemoveFromCache (the version that passes an entry and not a key) while that entry is "in use". Still not sure how that can happen. So I'll land a patch that release asserts when we get the interesting RemoveFromCache call so we can get a stack. I'll request uplift to beta so we don't have to wait two weeks to get data from this.
Assignee | ||
Comment 39•7 years ago
|
||
Attachment #8866035 -
Flags: review?(aosmond)
Assignee | ||
Comment 40•7 years ago
|
||
Updated•7 years ago
|
Attachment #8866035 -
Flags: review?(aosmond) → review+
Assignee | ||
Comment 41•7 years ago
|
||
Comment on attachment 8866035 [details] [diff] [review] release assert if we remove an "in use" entry from the cache [Security approval request comment] How easily could an exploit be constructed based on the patch? no new information is being disclosed by this patch, just a rel assert where we can a crash note before Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? no new information is being disclosed by this patch, just a rel assert where we can a crash note before Which older supported branches are affected by this flaw? diagnostic patch only If not all supported branches, which bug introduced the flaw? diagnostic patch only Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? diagnostic patch only How likely is this patch to cause regressions; how much testing does it need? we might crash on a rel assert a little more often then we would hit a uaf crash
Attachment #8866035 -
Flags: sec-approval?
Comment 42•7 years ago
|
||
Comment on attachment 8866035 [details] [diff] [review] release assert if we remove an "in use" entry from the cache <insert comment here> sec-approval+
Attachment #8866035 -
Flags: sec-approval? → sec-approval+
Assignee | ||
Comment 43•7 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/06d6f928ed649f4d2978025a9c3535432dbb5ab8
Comment 44•7 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/06d6f928ed64
Assignee | ||
Comment 45•7 years ago
|
||
Comment on attachment 8866037 [details] [diff] [review] rollup patch for beta Instead of waiting another two weeks to get another crash lets uplift to beta. We get enough crashes on beta that we only need one day on beta to get enough data. So can be landed for only one beta build and then backed out. Approval Request Comment [Feature/Bug causing the regression]: diagnostic patch to get information from crash stats [User impact if declined]: diagnostic patch to get information from crash stats [Is this code covered by automated tests?]: n/a [Has the fix been verified in Nightly?]: n/a [Needs manual test from QE? If yes, steps to reproduce]: no [List of other uplifts needed for the feature/fix]: none [Is the change risky?]: no [Why is the change risky/not risky?]: crashes slightly earlier where we are likely to crash anyways [String changes made/needed]: none
Attachment #8866037 -
Flags: approval-mozilla-beta?
Comment 46•7 years ago
|
||
Comment on attachment 8866037 [details] [diff] [review] rollup patch for beta This is a diagnostic patch to get information. Beta54+. Should be in 54 beta 8.
Attachment #8866037 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 47•7 years ago
|
||
uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/0f4b47c9fab1
Assignee | ||
Comment 48•7 years ago
|
||
https://crash-stats.mozilla.com/report/index/36e15fd1-9962-4cb3-a0ba-a8ab00170517 Looks like one of the content policy calls we make in ShouldLoadCachedImage https://dxr.mozilla.org/mozilla-central/rev/b8e9b674033bcd1f3a4c59b9d0ee7619c1a17cc5/image/imgLoader.cpp#559 calls into JS because content policies can be implemented in JS it seems. That's already bad, but it looks like the JS does a sync XHR which goes ahead and spins it's own event loop which can do anything, in this case it modifies the image cache entry we are looking at. I saw ehsan in the blame of this code so I'll pick on you. Is this expected? Is there anything we can do here? Is it even reasonable to make sure every place that calls a content policy ShouldLoad method can handle running a nested event loop there?
Flags: needinfo?(ehsan)
Comment 49•7 years ago
|
||
the rollup patch has issues like Hunk #7 FAILED at 2417 7 out of 7 hunks FAILED -- saving rejects to file image/imgLoader.cpp.rej patching file image/imgLoader.h Hunk #1 FAILED at 138 Hunk #2 FAILED at 172 2 out of 2 hunks FAILED -- saving rejects to file image/imgLoader.h.rej could you take a look, thanks!
Flags: needinfo?(tnikkel)
Assignee | ||
Comment 50•7 years ago
|
||
The rollup patch has already been uplifted.
Flags: needinfo?(tnikkel)
Comment 51•7 years ago
|
||
(In reply to Timothy Nikkel (:tnikkel) from comment #48) > I saw ehsan in the blame of this code so I'll pick on you. Is this expected? Certainly (at least the part about content policies being implemented in JS). This is how extensions like ad blockers work. (And I have seen the IDM add-on use the content policy API in other bugs FWIW so no surprises there.) > Is there anything we can do here? Is it even reasonable to make sure every > place that calls a content policy ShouldLoad method can handle running a > nested event loop there? You need to protect your code against this. :-) The simplest way to do that would be to check nsIDocument::IsInSyncOperation() <https://searchfox.org/mozilla-central/rev/9a7fbdee1d54f99cd548af95b81231d80e5f9ad1/dom/base/nsIDocument.h#2541>. That function would return true if a nested event loop such as this one for sync XHR is currently running somewhere underneath you on the stack. If you need to defer some operation to the end of any such sync loops, you may want to modify the counter management in nsIDocument::SetIsInSyncOperation() (called from nsAutoSyncOperation::~nsAutoSyncOperation) to do something when the counter hits zero, which would be when the last one of these sync loops returns.
Flags: needinfo?(ehsan)
Assignee | ||
Comment 52•7 years ago
|
||
The nested event loop doesn't matter here. Just running js can do plenty of things to screw us up here. So basically whenever we do a content policy check we have to assume that the entire world can change out from under us and not save any state across content policy calls. Surely this is a more widespread problem in code calling content policy methods and can't just be limited to imagelib? We need to audit all calls. It's quite frustrating because I spent hours pouring over the code that imgLoader::ValidateEntry calls to try to see if any of the calls could "escape" like this. If this is the expected behaviour then it needs to be very clear so that this same mistake isn't repeated over and over again in our code. Apparently, this comment https://dxr.mozilla.org/mozilla-central/rev/41958333867b0f537271dbd4cb4ba9e8a67a85a8/dom/base/nsIContentPolicy.idl#64 is completely wrong then? We should remove it.
Flags: needinfo?(ehsan)
Assignee | ||
Comment 53•7 years ago
|
||
I just had a look at some of the places where this code is called, this seems really bad. It seems like it adds JS calls and nested event loops to tons of very deeply nested code that isn't expecting it.
Assignee | ||
Comment 54•7 years ago
|
||
We got the data we needed, we can back this out. Approval Request Comment [Feature/Bug causing the regression]: n/a, back out a diagnostic patch [User impact if declined]: n/a, back out a diagnostic patch [Is this code covered by automated tests?]: n/a, back out a diagnostic patch [Has the fix been verified in Nightly?]: n/a, back out a diagnostic patch [Needs manual test from QE? If yes, steps to reproduce]: no [List of other uplifts needed for the feature/fix]: none [Is the change risky?]: no, back out a diagnostic patch [Why is the change risky/not risky?]: back out a diagnostic patch [String changes made/needed]: none
Attachment #8868754 -
Flags: approval-mozilla-beta?
Assignee | ||
Comment 55•7 years ago
|
||
Just for fun here is a stack without a nested event loop https://crash-stats.mozilla.com/report/index/ab33de34-e187-4bdc-9ce2-5d5720170517
Assignee | ||
Comment 56•7 years ago
|
||
Backed out all diagnostic changes landed on central/inbound https://hg.mozilla.org/integration/mozilla-inbound/rev/641fa04790771eff942517d200348a72501c09e7
Comment 57•7 years ago
|
||
(In reply to Timothy Nikkel (:tnikkel) from comment #56) > Backed out all diagnostic changes landed on central/inbound > https://hg.mozilla.org/integration/mozilla-inbound/rev/ > 641fa04790771eff942517d200348a72501c09e7 This got merged to m-c as well. https://hg.mozilla.org/mozilla-central/rev/641fa04790771eff942517d200348a72501c09e7
Comment 58•7 years ago
|
||
Comment on attachment 8868754 [details] [diff] [review] backout patch for beta Backout diagnostic patch. Beta54+. Should be in 54 beta 9.
Attachment #8868754 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 59•7 years ago
|
||
(In reply to Timothy Nikkel (:tnikkel) from comment #52) > The nested event loop doesn't matter here. Just running js can do plenty of > things to screw us up here. Right. > So basically whenever we do a content policy check we have to assume that > the entire world can change out from under us and not save any state across > content policy calls. Surely this is a more widespread problem in code > calling content policy methods and can't just be limited to imagelib? We > need to audit all calls. Yeah, although I think most callers are already aware of this. I'm surprised that imagelib assumed something else all these years. :-( And how that didn't blow up earlier. > It's quite frustrating because I spent hours pouring over the code that > imgLoader::ValidateEntry calls to try to see if any of the calls could > "escape" like this. If this is the expected behaviour then it needs to be > very clear so that this same mistake isn't repeated over and over again in > our code. Firstly, I'm sorry you had to waste a lot of time and energy on this. :-( But in general, please never assume that *any* XPCOM add-on that isn't builtinclass has any sane behavior, that is a very dangerous assumption and can indeed backfire in all sorts of ways. In general, extensions are free to implement any and all such interfaces and make their implementations do absolutely anything. After 57 when release builds won't load anything but web extension based add-ons, this problem will mostly fix itself... > Apparently, this comment > > https://dxr.mozilla.org/mozilla-central/rev/ > 41958333867b0f537271dbd4cb4ba9e8a67a85a8/dom/base/nsIContentPolicy.idl#64 > > is completely wrong then? We should remove it. Yes, that's definitely wrong. That was done ~17 years ago by Boris: https://github.com/mozilla/gecko-dev/commit/6d2d10784c084733cdbcad5bf647d7d36c16e965. OK to remove it?
Flags: needinfo?(ehsan) → needinfo?(bzbarsky)
Comment 60•7 years ago
|
||
backout uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/ad4fa51c539f
Comment 61•7 years ago
|
||
That comment is true in the sense that if an implementor of ShouldLoad does any of those things really bad stuff will happen. If an implementor of ShouldLoad does a sync XHR, really bad things can also happen, and not just for the imagelib callsite... Like comment 52 says, we do content policy checks in all sorts of places that are deeply nested under various callstacks right now. I don't think any of them are really hardened against either malice (what's the point?) or incompetence on the part of the content policy implementors. Yes, this is all pretty terrible. Spot-fixing things for stability purposes probably makes sense, of course. Bug 893916 tracks one of the proposed mitigation strategies for this specifically for images, though it would not necessarily help the specific case discussed in this bug, even if we managed to do it. When we drop support for XPCOM addons, that might help some. I don't know whether anyone has done a really careful audit of the webextension content policy codepaths. We should really do that.
Updated•7 years ago
|
Flags: needinfo?(bzbarsky)
Assignee | ||
Comment 62•7 years ago
|
||
(In reply to :Ehsan Akhgari (super long backlog, slow to respond) from comment #59) > But in general, please never assume that *any* XPCOM add-on that isn't > builtinclass has any sane behavior, that is a very dangerous assumption and > can indeed backfire in all sorts of ways. In general, extensions are free > to implement any and all such interfaces and make their implementations do > absolutely anything. After 57 when release builds won't load anything but > web extension based add-ons, this problem will mostly fix itself... I didn't know that the code involved addons in any way. If I knew then of course I would assume the addon was doing the worst thing possible.
Assignee | ||
Comment 63•7 years ago
|
||
I think this should make things better, hopefully it doesn't mess up our accounting of entries in the cache somehow.
Attachment #8872110 -
Flags: review?(aosmond)
Updated•7 years ago
|
Attachment #8872110 -
Flags: review?(aosmond) → review+
Assignee | ||
Updated•7 years ago
|
Keywords: leave-open
Assignee | ||
Comment 64•7 years ago
|
||
Comment on attachment 8872110 [details] [diff] [review] spotfix imgloader [Security approval request comment] How easily could an exploit be constructed based on the patch? an attacker would need to have control of an extension installed into Firefox on the targets computer to exploit this. this patch doesn't reveal anything that the diagnostic patches we already landed didn't already reveal Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? probably not. I scoured the code to try to find this problem (using more info than is in the patch) and I didn't find it. Which older supported branches are affected by this flaw? all If not all supported branches, which bug introduced the flaw? not sure, likely around for a very long time Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? trivial How likely is this patch to cause regressions; how much testing does it need? since we haven't been able to reproduce this bug the only way to test this is to check crashstats to confirm that the crashes go away, and that we haven't made other crashes worse
Attachment #8872110 -
Flags: sec-approval?
Comment 65•7 years ago
|
||
NI? release management to see if we have time to take this on Beta and ESR52 given the current schedule. This will determine if we take it on trunk now.
Flags: needinfo?(rkothari)
Flags: needinfo?(lhenry)
We will gtb b13 on Thursday so yes we can take it.
Flags: needinfo?(rkothari)
Comment 67•7 years ago
|
||
Comment on attachment 8872110 [details] [diff] [review] spotfix imgloader Please nominate patches for Beta and ESR52 approval, to be checked in after mozilla-central. sec-approval+
Attachment #8872110 -
Flags: sec-approval? → sec-approval+
Assignee | ||
Comment 68•7 years ago
|
||
Comment on attachment 8872110 [details] [diff] [review] spotfix imgloader [Approval Request Comment] If this is not a sec:{high,crit} bug, please state case for ESR consideration: it's marked sec-high, but thats probably too high since it requires an attacker to control an extension. it should fix crashes. User impact if declined: some crashes Fix Landed on Version: 55 Risk to taking this patch (and alternatives if risky): not risky String or UUID changes made by this patch: none See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info. Approval Request Comment [Feature/Bug causing the regression]: around for a long time [User impact if declined]: some crashes [Is this code covered by automated tests?]: no [Has the fix been verified in Nightly?]: not yet, it just landed [Needs manual test from QE? If yes, steps to reproduce]: no, we don't know how to reproduce [List of other uplifts needed for the feature/fix]: none [Is the change risky?]: no [Why is the change risky/not risky?]: if we reach the condition changed in the patch we are likely crashing on the next line [String changes made/needed]: none
Attachment #8872110 -
Flags: approval-mozilla-esr52?
Attachment #8872110 -
Flags: approval-mozilla-beta?
Assignee | ||
Comment 69•7 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/39f174dc2f917c2846e856f53015eba646cda76d
Comment 70•7 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/39f174dc2f91
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
tracking-firefox54:
--- → ?
tracking-firefox55:
--- → ?
tracking-firefox-esr52:
--- → ?
Flags: needinfo?(lhenry)
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Comment 71•7 years ago
|
||
Comment on attachment 8872110 [details] [diff] [review] spotfix imgloader imgloader crash fix, sec-high, beta54+ and esr52.2+
Attachment #8872110 -
Flags: approval-mozilla-esr52?
Attachment #8872110 -
Flags: approval-mozilla-esr52+
Attachment #8872110 -
Flags: approval-mozilla-beta?
Attachment #8872110 -
Flags: approval-mozilla-beta+
Comment 72•7 years ago
|
||
uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/30ba0234b771 https://hg.mozilla.org/releases/mozilla-esr52/rev/6e644bc1a57f
Updated•7 years ago
|
Flags: qe-verify-
Whiteboard: [post-critsmash-triage]
Updated•7 years ago
|
Group: gfx-core-security → core-security-release
Updated•7 years ago
|
Whiteboard: [post-critsmash-triage] → [post-critsmash-triage][adv-main54+][adv-esr52.2+]
Updated•7 years ago
|
Group: core-security-release
You need to log in
before you can comment on or make changes to this bug.
Description
•