Closed
Bug 916128
Opened 11 years ago
Closed 11 years ago
Canvas2D crash [@nsXPCWrappedJS::Release]
Categories
(Core :: Graphics: Canvas2D, defect)
Tracking
()
VERIFIED
FIXED
mozilla27
Tracking | Status | |
---|---|---|
firefox26 | --- | unaffected |
firefox27 | --- | verified |
firefox-esr24 | --- | unaffected |
b2g18 | --- | unaffected |
People
(Reporter: posidron, Assigned: spohl)
References
Details
(Keywords: crash, sec-high, testcase)
Attachments
(4 files, 5 obsolete files)
458 bytes,
text/html
|
Details | |
3.42 KB,
text/plain
|
Details | |
4.69 KB,
patch
|
spohl
:
review+
|
Details | Diff | Splinter Review |
1.27 KB,
patch
|
spohl
:
review+
|
Details | Diff | Splinter Review |
I believe this depends on bug 817700 as well.
Tested with https://hg.mozilla.org/integration/mozilla-inbound/rev/3c7913e853b8
Reporter | ||
Comment 1•11 years ago
|
||
Assignee | ||
Updated•11 years ago
|
Assignee: nobody → spohl.mozilla.bugs
Comment 2•11 years ago
|
||
Regression window(m-i)
Good:
http://hg.mozilla.org/integration/mozilla-inbound/rev/5b89a959713b
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:26.0) Gecko/20100101 Firefox/26.0 ID:20130912125238
Bad:
http://hg.mozilla.org/integration/mozilla-inbound/rev/e375ecffbc98
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:26.0) Gecko/20100101 Firefox/26.0 ID:20130912140538
Pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=5b89a959713b&tochange=e375ecffbc98
Assignee | ||
Comment 3•11 years ago
|
||
Looks like this is caused by FileCallback not being thread-safe:
Hit MOZ_CRASH(CallbackObject not thread-safe) at /Users/Stephen/Documents/mozilla-central/dom/bindings/CallbackObject.cpp:31
Thread 45 Crashed:
0 XUL 0x00000001046ad840 mozilla::dom::CallbackObject::Release() + 224 (CallbackObject.cpp:31)
1 XUL 0x0000000102405534 nsRefPtr<mozilla::dom::FileCallback>::~nsRefPtr() + 52 (nsAutoPtr.h:887)
2 XUL 0x00000001024054f5 nsRefPtr<mozilla::dom::FileCallback>::~nsRefPtr() + 21 (nsAutoPtr.h:888)
3 XUL 0x0000000102405495 mozilla::dom::EncodingCompleteEvent::~EncodingCompleteEvent() + 53 (ImageEncoder.cpp:28)
4 XUL 0x0000000102404aa5 mozilla::dom::EncodingCompleteEvent::~EncodingCompleteEvent() + 21 (ImageEncoder.cpp:28)
5 XUL 0x0000000102404ac9 mozilla::dom::EncodingCompleteEvent::~EncodingCompleteEvent() + 25 (ImageEncoder.cpp:28)
6 XUL 0x0000000102403415 mozilla::dom::EncodingCompleteEvent::Release() + 485 (ImageEncoder.cpp:68)
7 XUL 0x00000001024052c1 nsRefPtr<mozilla::dom::EncodingCompleteEvent>::~nsRefPtr() + 49 (nsAutoPtr.h:887)
8 XUL 0x0000000102404745 nsRefPtr<mozilla::dom::EncodingCompleteEvent>::~nsRefPtr() + 21 (nsAutoPtr.h:888)
9 XUL 0x0000000102405355 mozilla::dom::EncodingRunnable::~EncodingRunnable() + 53 (ImageEncoder.cpp:92)
10 XUL 0x00000001024050f5 mozilla::dom::EncodingRunnable::~EncodingRunnable() + 21 (ImageEncoder.cpp:92)
11 XUL 0x0000000102405119 mozilla::dom::EncodingRunnable::~EncodingRunnable() + 25 (ImageEncoder.cpp:92)
12 XUL 0x00000001024037d5 mozilla::dom::EncodingRunnable::Release() + 485 (ImageEncoder.cpp:147)
13 XUL 0x0000000101563f1b nsCOMPtr<nsIRunnable>::~nsCOMPtr() + 91 (nsCOMPtr.h:514)
14 XUL 0x00000001015623b5 nsCOMPtr<nsIRunnable>::~nsCOMPtr() + 21 (nsCOMPtr.h:515)
15 XUL 0x0000000104779333 nsThread::ProcessNextEvent(bool, bool*) + 1875 (nsThread.cpp:630)
16 XUL 0x00000001046d1519 NS_ProcessNextEvent(nsIThread*, bool) + 169 (nsThreadUtils.cpp:238)
17 XUL 0x0000000104777c17 nsThread::ThreadFunc(void*) + 295 (nsThread.cpp:250)
18 libnss3.dylib 0x00000001012383f5 _pt_root + 357
19 libsystem_c.dylib 0x00007fff968687a2 _pthread_start + 327
20 libsystem_c.dylib 0x00007fff968551e1 thread_start + 13
Reporter | ||
Comment 4•11 years ago
|
||
On MacOS with a non-debug/opt build this produces now a stack-buffer-underflow.
Two days ago it only returned a null-deref, yesterday it was not reproducible at all.
Will mark this as s-s.
Group: core-security
Assignee | ||
Comment 5•11 years ago
|
||
It seems right to back out bug 817700 until this issue (and bug 914966) is resolved. If I don't hear otherwise, I'll go ahead with the backout today.
Yeah, EncodingCompleteEvent can't assume that it will be destroyed on the main thread since the main thread could run it before NS_DispatchToMainThread returns on the background thread.
The solution is to clear mCallback (and mScriptContext) once we're done with them at http://hg.mozilla.org/mozilla-central/annotate/87c4a21ed158/content/canvas/src/ImageEncoder.cpp#l45.
Happy to review here.
Assignee | ||
Comment 7•11 years ago
|
||
Thanks Kyle! I get the impression that we shouldn't be calling Release() on mCallback from the encoding thread at all. Unfortunately, we do this indirectly via the EncodingRunnable destructor (see stack in comment 3). Clearing mCallback and mScriptContext like you suggested wasn't sufficient in my local run.
Should I try to make CallbackObject thread-safe (not sure how difficult that would be), store a raw FileCallback* instead of an nsRefPtr in EncodingCompleteEvent, or should I try something different? Before bug 911575 we used to store an nsCOMPtr<nsIFileCallback> which did not exhibit this issue. Bug 911575 replaced the use of nsIFileCallback with FileCallback.
Flags: needinfo?(khuey)
(In reply to Stephen Pohl [:spohl] from comment #7)
> Thanks Kyle! I get the impression that we shouldn't be calling Release() on
> mCallback from the encoding thread at all.
That's correct.
> Unfortunately, we do this
> indirectly via the EncodingRunnable destructor (see stack in comment 3).
> Clearing mCallback and mScriptContext like you suggested wasn't sufficient
> in my local run.
Er, why not? What went wrong after that?
> Should I try to make CallbackObject thread-safe (not sure how difficult that
> would be)
No. That's not possible
> , store a raw FileCallback* instead of an nsRefPtr in EncodingCompleteEvent,
That'll be a different sg:crit.
> or should I try something different? Before bug
> 911575 we used to store an nsCOMPtr<nsIFileCallback> which did not exhibit
> this issue. Bug 911575 replaced the use of nsIFileCallback with FileCallback.
It just didn't exhibit the problem yet. We're going to make XPCWrappedJS main-thread-only at some point.
Flags: needinfo?(khuey)
Comment 9•11 years ago
|
||
> Clearing mCallback and mScriptContext like you suggested wasn't sufficient
Hmm. Why not?
> Should I try to make CallbackObject thread-safe
You can't: it's cycle-collected.
> store a raw FileCallback* instead of an nsRefPtr in EncodingCompleteEvent
And proxy-release on mainthread? Better to use nsMainThreadPtrHolder if we have to.
> Before bug 911575 we used to store an nsCOMPtr<nsIFileCallback> which did not exhibit
> this issue
Maybe. See bug 771074: it could trigger data races which would corrupt memory. The new code just does a MOZ_CRASH instead of silently scribbling on non-threadsafe data structures.
Comment 10•11 years ago
|
||
Note that nsMainThreadPtrHolder might be the simple-and-principled way to do this...
Assignee | ||
Comment 11•11 years ago
|
||
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #8)
> (In reply to Stephen Pohl [:spohl] from comment #7)
> > Unfortunately, we do this
> > indirectly via the EncodingRunnable destructor (see stack in comment 3).
> > Clearing mCallback and mScriptContext like you suggested wasn't sufficient
> > in my local run.
>
> Er, why not? What went wrong after that?
We continue to call Release() on mCallback via the EncodingRunnable destructor (same stack as comment 3). I'd say this causes problems when the EncodingRunnable object is done and gets destroyed before the EncodingCompleteEvent's Run() method could get as far as clearing mCallback. The console prints:
Hit MOZ_CRASH(CallbackObject not thread-safe) at /Users/Stephen/Documents/mozilla-central/dom/bindings/CallbackObject.cpp:31
> > , store a raw FileCallback* instead of an nsRefPtr in EncodingCompleteEvent,
>
> That'll be a different sg:crit.
I figured you'd say that...
(In reply to Boris Zbarsky [:bz] from comment #9)
> > store a raw FileCallback* instead of an nsRefPtr in EncodingCompleteEvent
>
> And proxy-release on mainthread? Better to use nsMainThreadPtrHolder if we
> have to.
I shall try this route.
> > Before bug 911575 we used to store an nsCOMPtr<nsIFileCallback> which did not exhibit
> > this issue
>
> Maybe. See bug 771074: it could trigger data races which would corrupt
> memory. The new code just does a MOZ_CRASH instead of silently scribbling
> on non-threadsafe data structures.
Ah, I see. Thanks!
Comment 12•11 years ago
|
||
> EncodingRunnable object is done and gets destroyed before the EncodingCompleteEvent's
> Run() method could ge
The EncodingRunnable should transfer ownership to the EncodingCompleteEvent, I'd think.
Assignee | ||
Comment 13•11 years ago
|
||
It looks like I've been misunderstanding the meaning of NS_ENSURE_SUCCESS. I thought this was a debug-only check, similar to an assertion and didn't realize it would return early when encountering a failure. In this particular case, we return early after a (seemingly expected) failed call to ImageEncoder::ExtractDataInternal here:
http://hg.mozilla.org/mozilla-central/annotate/87c4a21ed158/content/canvas/src/ImageEncoder.cpp#l118
We end up never dispatching the complete event to the main thread and therefore release mCallback from the encoder thread.
This was one of the tougher ways for me to learn the meaning of NS_ENSURE_SUCCESS... :-(
Assignee | ||
Comment 14•11 years ago
|
||
This fixes the issue for me.
Attachment #804686 -
Flags: review?(khuey)
Assignee | ||
Comment 15•11 years ago
|
||
Comment on attachment 804686 [details] [diff] [review]
Patch
Ah, snap. Forgot to shut down the encoder thread in the failure case.
Attachment #804686 -
Flags: review?(khuey)
Assignee | ||
Comment 16•11 years ago
|
||
Made sure to shut down the worker thread on failure as well.
Attachment #804686 -
Attachment is obsolete: true
Attachment #804695 -
Flags: review?(khuey)
Comment on attachment 804695 [details] [diff] [review]
Patch
Review of attachment 804695 [details] [diff] [review]:
-----------------------------------------------------------------
::: content/canvas/src/ImageEncoder.cpp
@@ +130,5 @@
> + if (NS_FAILED(rv)) {
> + mEncodingCompleteEvent->SetFailed();
> + NS_DispatchToMainThread(mEncodingCompleteEvent, NS_DISPATCH_NORMAL);
> + return rv;
> + }
You should wrap this entire function with something like.
nsresult InnerDoWork(args);
nsresult DoWork(args) {
rv = InnerDoWorker(args);
if (NS_FAILED(rv)) {
mEncodingCompleteEvent->SetFailed();
}
NS_DispatchToMainThread(mEncodingCompleteEvent);
return rv;
}
@@ +158,5 @@
> rv = NS_DispatchToMainThread(mEncodingCompleteEvent, NS_DISPATCH_NORMAL);
> + if (NS_FAILED(rv)) {
> + mEncodingCompleteEvent->SetFailed();
> + NS_DispatchToMainThread(mEncodingCompleteEvent, NS_DISPATCH_NORMAL);
> + return rv;
You definitely don't want to try to dispatch again ...
Attachment #804695 -
Flags: review?(khuey) → review-
Assignee | ||
Comment 18•11 years ago
|
||
Thanks Kyle! Addressed feedback.
Attachment #804695 -
Attachment is obsolete: true
Attachment #804785 -
Flags: review?(khuey)
Comment on attachment 804785 [details] [diff] [review]
Patch
Review of attachment 804785 [details] [diff] [review]:
-----------------------------------------------------------------
::: content/canvas/src/ImageEncoder.cpp
@@ +49,4 @@
> }
>
> + mScriptContext = nullptr;
> + mCallback = nullptr;
Add a comment about why you are explicitly setting these to null.
@@ +153,2 @@
> rv = NS_DispatchToMainThread(mEncodingCompleteEvent, NS_DISPATCH_NORMAL);
> NS_ENSURE_SUCCESS(rv, rv);
Not that NS_DispatchToMainThread really fails, but this should be
if (NS_FAILED(rv)) {
// Better to leak than to crash.
mEncodingCompleteEvent.forget();
return rv;
}
Attachment #804785 -
Flags: review?(khuey) → review+
Assignee | ||
Comment 20•11 years ago
|
||
Thanks, Kyle! Addressed feedback. Carrying over r+.
I will land this at the same time as bug 817700 once we're ready to land again.
Attachment #804785 -
Attachment is obsolete: true
Attachment #805307 -
Flags: review+
Comment on attachment 810047 [details] [diff] [review]
Test
Review of attachment 810047 [details] [diff] [review]:
-----------------------------------------------------------------
Please name the test 916128-1.html. Also please credit Christoph Diehl as the author of this commit.
Attachment #810047 -
Flags: review?(khuey) → review+
Assignee | ||
Comment 23•11 years ago
|
||
Addressed feedback, carrying over r+.
Sorry, didn't mean to take credit for the test. Thanks for catching it.
I chose the descriptive file name because of the discussion in firefox-dev[1] and dev-planning[2], but actually prefer a name with just a bug number myself. I'm fine either way.
[1] https://groups.google.com/forum/#!topic/firefox-dev/RIGHJHpxcIY
[2] https://groups.google.com/forum/#!msg/mozilla.dev.planning/HlOyIYwWeb8/NUH1cUa2cgYJ
Attachment #810047 -
Attachment is obsolete: true
Attachment #810535 -
Flags: review+
Assignee | ||
Comment 24•11 years ago
|
||
Assignee | ||
Comment 25•11 years ago
|
||
Assignee | ||
Comment 26•11 years ago
|
||
The test (attachment 804478 [details]) is now failing at http://mxr.mozilla.org/mozilla-central/source/content/html/content/src/HTMLCanvasElement.cpp#600, an assert introduced with bug 916322. The context's and the canvas' sizes are mismatched again.
At the time of failure we have:
elementSize = { width = 0, height = 150 }
mCurrentContext->GetWidth() = 1
mCurrentContext->GetHeight() = 1
Milan, are you able to tell what's going on here?
Flags: needinfo?(milan)
Comment 27•11 years ago
|
||
Even when we pass width (or height) zero to the context, we insist on having a minimum 1x1 size, as 0 size "causes problems" according to the code comments. This would explain the 0 vs. 1. I'm checking on the 150 vs. 1.
Comment 28•11 years ago
|
||
Not sure this is a right thing to do, but let me know if this actually fixes the problem.
Updated•11 years ago
|
Attachment #813597 -
Flags: review?(bas)
Updated•11 years ago
|
Attachment #813597 -
Attachment description: WIP: Since the contexts don't allow <=0 size, don't allow that in the canvas element either. → Since the contexts don't allow <=0 size, don't allow that in the canvas element either.
Assignee | ||
Comment 29•11 years ago
|
||
Milan, this fixes the issue in my testing! mCurrentContext->GetHeight() now also correctly returns 150. Thanks!
Flags: needinfo?(milan)
Comment 30•11 years ago
|
||
Great. Let's wait for the review to see if this is the right approach. May take a few days because of the summit.
Comment 31•11 years ago
|
||
Comment on attachment 813597 [details] [diff] [review]
Since the contexts don't allow <=0 size, don't allow that in the canvas element either.
Review of attachment 813597 [details] [diff] [review]:
-----------------------------------------------------------------
If this is in line with what the specs wants us to do (which I don't know), this looks fine to me.
Attachment #813597 -
Flags: review?(bas) → review+
Comment 32•11 years ago
|
||
(In reply to Bas Schouten (:bas.schouten) from comment #31)
> Comment on attachment 813597 [details] [diff] [review]
> Since the contexts don't allow <=0 size, don't allow that in the canvas
> element either.
>
> Review of attachment 813597 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> If this is in line with what the specs wants us to do (which I don't know),
> this looks fine to me.
The spec calls for "valid non-negative integer", so we should be covered.
Comment 33•11 years ago
|
||
Stephen, looks like you should be able to try again - just make sure you grab my patch along the way.
Flags: needinfo?(spohl.mozilla.bugs)
Assignee | ||
Comment 34•11 years ago
|
||
Great! Let's try this again.
https://hg.mozilla.org/integration/mozilla-inbound/rev/6a3b0b83d3f6
https://hg.mozilla.org/integration/mozilla-inbound/rev/584f639cdb03
https://hg.mozilla.org/integration/mozilla-inbound/rev/0beb30c3e1af
Flags: needinfo?(spohl.mozilla.bugs)
Comment 35•11 years ago
|
||
Looks like this will get back out. Let me spin off the patch I put in into a separate bug and make sure that lands, as it did cause some issues in the tests that were expecting it to fail.
Comment 36•11 years ago
|
||
Backed out because bug 817700 was backed out.
https://hg.mozilla.org/integration/mozilla-inbound/rev/4026ca2897a0
Comment 37•11 years ago
|
||
(In reply to Milan Sreckovic [:milan] from comment #32)
> (In reply to Bas Schouten (:bas.schouten) from comment #31)
> > Comment on attachment 813597 [details] [diff] [review]
> > Since the contexts don't allow <=0 size, don't allow that in the canvas
> > element either.
> >
> > Review of attachment 813597 [details] [diff] [review]:
> > -----------------------------------------------------------------
> >
> > If this is in line with what the specs wants us to do (which I don't know),
> > this looks fine to me.
>
> The spec calls for "valid non-negative integer", so we should be covered.
I have no idea how "valid non-negative integer" suggested to me that it's OK to exclude zero.
Comment 38•11 years ago
|
||
Green try with the patch from bug 924573 and the test above.
Comment 39•11 years ago
|
||
Comment on attachment 813597 [details] [diff] [review]
Since the contexts don't allow <=0 size, don't allow that in the canvas element either.
Different patch in bug 924573.
Attachment #813597 -
Flags: review+ → review-
Assignee | ||
Comment 40•11 years ago
|
||
Try run with all patches here, plus patches in bug 817700 and bug 924573 applied. Bug 924573 has already landed on inbound. This bug plus bug 817700 will land simultaneously, assuming try will be green:
https://tbpl.mozilla.org/?tree=Try&rev=8246ac236c59
Comment 41•11 years ago
|
||
If bug 924573 sticks, it'll be worth doing a try run (do not take my r- patch above.)
Assignee | ||
Comment 42•11 years ago
|
||
(In reply to Milan Sreckovic [:milan] from comment #41)
> If bug 924573 sticks, it'll be worth doing a try run (do not take my r-
> patch above.)
Yup. I've manually applied the patch from bug 924573 instead of the one here. Try run is referenced in comment 40. I'll push bug 817700 and this bug here once bug 924573 is green on inbound and the try run in comment 40 succeeded.
Assignee | ||
Comment 43•11 years ago
|
||
Comment on attachment 813597 [details] [diff] [review]
Since the contexts don't allow <=0 size, don't allow that in the canvas element either.
Marking this patch as obsolete since a new patch landed with bug 924573 and to avoid confusion.
Attachment #813597 -
Attachment is obsolete: true
Assignee | ||
Comment 44•11 years ago
|
||
Try run with opt & debug:
https://tbpl.mozilla.org/?tree=Try&rev=8f08a238adb3
Assignee | ||
Comment 45•11 years ago
|
||
Comment 46•11 years ago
|
||
fixed on mozilla-central
https://hg.mozilla.org/mozilla-central/rev/18d72b851345
https://hg.mozilla.org/mozilla-central/rev/3b3eb94009e6
Status: NEW → RESOLVED
Closed: 11 years ago
status-firefox27:
--- → fixed
Flags: in-testsuite+
Resolution: --- → FIXED
Updated•11 years ago
|
Target Milestone: --- → mozilla27
Comment 47•11 years ago
|
||
Cleaning up list of security bugs for b2g18. This bug doesn't need to be backported either due to it affecting a later version of Fx or another reason.
status-b2g18:
--- → unaffected
Comment 48•11 years ago
|
||
verified with Nightly build 20131024030204
Updated•11 years ago
|
status-firefox26:
--- → unaffected
status-firefox-esr24:
--- → unaffected
Updated•10 years ago
|
Group: core-security
You need to log in
before you can comment on or make changes to this bug.
Description
•