Closed Bug 637957 Opened 13 years ago Closed 13 years ago

Make notifiers strong owners of mCallback, to prevent crashes due to unexpected events ordering

Categories

(Toolkit :: Storage, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla2.0
Tracking Status
blocking2.0 --- final+
blocking1.9.2 --- .17+
status1.9.2 --- .17-fixed
blocking1.9.1 --- .19+
status1.9.1 --- .19-fixed

People

(Reporter: mak, Assigned: mak)

References

Details

(Whiteboard: [sg:moderate][hardblocker][tests in bug 638123][has patch])

Attachments

(3 files, 4 obsolete files)

See https://bugzilla.mozilla.org/show_bug.cgi?id=634796#c22

While spinnging the events loop in a HandleResult is not nice at all, owenrship should be changed so that it won't make us crash at random addresses.
Changing ownership here, should pay particularly attention that releases will happen on the right thread.
Summary: Made notifiers strong owners of mCallback, to prevent crashes due to unexpected events ordering → Make notifiers strong owners of mCallback, to prevent crashes due to unexpected events ordering
(In reply to comment #1)
> Changing ownership here, should pay particularly attention that releases will
> happen on the right thread.
Releases will absolutely happen on the correct thread; the callback is being run on the thread it was called on (which is the one the releases should happen on).
(In reply to comment #2)
> (In reply to comment #1)
> > Changing ownership here, should pay particularly attention that releases will
> > happen on the right thread.
> Releases will absolutely happen on the correct thread; the callback is being
> run on the thread it was called on (which is the one the releases should happen
> on).

Yes it runs on the correct thread, but the runnable is not created on the same thread, so it could still release mCallback on the thread where the runnable was created rather than the thread where mCallback runs.
For example AsyncGetBookmarksForURI should be destroyed on the main-thread, the results notifier runs on the main thread, but is created and destroyed on the async thread afaict.
So, we have to release the ownership explicitely in Run(), this is fine, there is still the AddRef problem... Actually Run() could just:
AddRef
->HandleResult
Release
Attached patch patch v1.0 (obsolete) — Splinter Review
The test is not perfect, but works. It is practically partially running on destruction, thus I use NS_RUNTIMEABORT, looks like I'm unable to make it properly fake-async (my tries with spinning ended up making it hang).
Attachment #516261 - Flags: review?(sdwilsh)
Attached patch patch v1.0 (8 lines context) (obsolete) — Splinter Review
better for review
Attachment #516261 - Attachment is obsolete: true
Attachment #516261 - Flags: review?(sdwilsh)
Attachment #516262 - Flags: review?(sdwilsh)
OS: Windows 7 → All
Hardware: x86 → All
This fixes bug 634796, so I'm promoting this to hardblocker.  Review coming shortly...
Assignee: nobody → mak77
Status: NEW → ASSIGNED
blocking2.0: --- → final+
Whiteboard: [hardblocker]
Comment on attachment 516262 [details] [diff] [review]
patch v1.0 (8 lines context)

>+++ b/storage/src/mozStorageAsyncStatementExecution.cpp
>+    // Hold a strong reference to the callback while notifying it, so that if it
>+    // should spin the events loop, won't be able to release it in spite of us.
>+    nsCOMPtr<mozIStorageStatementCallback> callback = mCallback;
nit: "...spin the event loop, the callback won't be released and freed out from under us."
Also, this needs to be done for ErrorCallback, with a test as well (easiest way to get that callback is to trigger a constraint error by trying to inset a duplicate primary key).

>+    // Ensure to release our reference on this thread.
>+    callback = 0;
You don't need to do this.  nsCOMPtr's destructor will run the moment this method returns, which will happen on this thread.

>+++ b/storage/test/test_HandleResult_spinEventsLoop.cpp
How about we go with test_async_callbacks_with_spun_event_loops.cpp?

I don't think the test actually tests this, however.  With that said, I'm OK with getting the fix in, and then spending time out of the release critical path to get the test right.  What we really want to test is this:
1) Fire off a query that would get a handleResult callback (ideally, just one).
2) When we get the handleResult call, spin the event loop until we get a handleCompletion callback
3) Set a heap variable that isn't a member of the callback object to true indicating that the object is currently alive.  In the callbacks destructor, we'll want to set this to false.  (we may need to create this heap variable before step 4, and may set it earlier too)
4) At this point, our handleResult spinning will be complete, and we should ensure that our heap variable that is set to false when our destructor runs is still true.

The above test should fail without your patch.  You can replace handleResult with handleError for the additional testcase I asked for.

r=sdwilsh on the code changes.  We need to work on the test more though, so file a follow-up please.
Attachment #516262 - Flags: review?(sdwilsh) → review+
> 1) Fire off a query that would get a handleResult callback (ideally, just one).

the test does this

> 2) When we get the handleResult call, spin the event loop until we get a
> handleCompletion callback

and does this

> 3) Set a heap variable that isn't a member of the callback object to true
> indicating that the object is currently alive.  In the callbacks destructor,
> we'll want to set this to false.  (we may need to create this heap variable
> before step 4, and may set it earlier too)

it's done through a local property of the callback, I like the idea of a global var though, or maybe a static property.

> 4) At this point, our handleResult spinning will be complete, and we should
> ensure that our heap variable that is set to false when our destructor runs is
> still true.

and does this

> The above test should fail without your patch.

and does this!

My test does exactly what you said, apart using a global variable, could be made more readable maybe.

 You can replace handleResult
> with handleError for the additional testcase I asked for.

will do
Depends on: 638123
I'm splitting the tests to bug 638123, since Shawn currently cannot do a full review of it and we can come with a more polished test.

The fix is pretty much straight-forward and I'll run the tests locally before pushing since they already work fine, we can land the tests in the next days on tinderboxes once properly reviewed.
Whiteboard: [hardblocker] → [hardblocker][tests in bug 638123]
Attached patch patch v2.0 (obsolete) — Splinter Review
Addressed comments.

Before pushing I'll fix and run tests locally, then will attach them to the followup for review (so this won't block the release).
Attachment #516262 - Attachment is obsolete: true
Attached patch patch v2.1Splinter Review
Per IRC discussion.
Attachment #516289 - Attachment is obsolete: true
Whiteboard: [hardblocker][tests in bug 638123] → [hardblocker][tests in bug 638123][has patch]
I attached tests I used locally to bug 638123, after double checking they fail before the patch and pass after.  Those tests will land apart per comment 10.

http://hg.mozilla.org/mozilla-central/rev/290403971149
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.0
pushed a follow-up, I forgot to properly QI and luckily a Places test complained!
http://hg.mozilla.org/mozilla-central/rev/38028bf13fb2
(In reply to comment #14)
> pushed a follow-up, I forgot to properly QI and luckily a Places test
> complained!
> http://hg.mozilla.org/mozilla-central/rev/38028bf13fb2
Er, what?  mCallback in both methods is a  mozIStorageStatementCallback*, so that QI doesn't do anything useful.
...and yet that is very clearly what that assertion is saying.  Can somebody explain to me why?  I'm guessing XPConnect doesn't create the QI until it's needed, but would love a confirmation.
That fixed the error locally, i preferred avoiding a permaorange.
Looking at the thing, looks like there is some wrong iSupports inheritance in nsNavHistoryFolderResultNode, when test_async.js runs it probably doesn't cast it properly due to a bogus NS_IMPL_ISUPPORTS_INHERITED
or most likely ambiguous inheritance
(In reply to comment #17)
> That fixed the error locally, i preferred avoiding a permaorange.
> Looking at the thing, looks like there is some wrong iSupports inheritance in
> nsNavHistoryFolderResultNode, when test_async.js runs it probably doesn't cast
> it properly due to a bogus NS_IMPL_ISUPPORTS_INHERITED
I'm still pretty sure we shouldn't have to do that.  Can you file a bug about it so we can get the right fix in eventually please?
sure, fwiw Neil suggested we could use a RefPtr to save the QI
Depends on: 638405
This can fix some crashes on branch (especially when Sync is installed).
blocking1.9.2: --- → ?
blocking1.9.1: --- → ?
blocking1.9.2: ? → .16+
status1.9.1: --- → ?
Does this apply to the 1.9.1 branch also?

We'll need roll-up patches to approve for the branches. Looks like there were at least two chunks checked in to fix this.
(In reply to comment #23)
> Does this apply to the 1.9.1 branch also?
It should, yeah.

> We'll need roll-up patches to approve for the branches. Looks like there were
> at least two chunks checked in to fix this.
I'll do that later today.
Attached patch 1.9.2 rollup (obsolete) — Splinter Review
Attachment #517502 - Flags: approval1.9.2.16?
Comment on attachment 517502 [details] [diff] [review]
1.9.2 rollup

This is very clearly not what I intended to upload...
Attachment #517502 - Attachment is obsolete: true
Attachment #517502 - Flags: approval1.9.2.16?
Somehow, my 1.9.2 tree was totally hosed.  Repulling and trying again...
blocking1.9.2: .15+ → .16+
Attached patch 1.9.2 rollupSplinter Review
Attachment #517555 - Flags: approval1.9.2.16?
Attached patch 1.9.1 rollupSplinter Review
Filename difference here and we use a different method name for checking if we should run (runEvent vs shouldNotify).  Otherwise, it's all the same.
Attachment #517556 - Flags: approval1.9.1.18?
blocking1.9.1: ? → .18+
Attachment #517555 - Flags: approval1.9.2.16? → approval1.9.2.16+
Attachment #517556 - Flags: approval1.9.1.18? → approval1.9.1.18+
Group: core-security
Whiteboard: [hardblocker][tests in bug 638123][has patch] → [sg:moderate][hardblocker][tests in bug 638123][has patch]
Group: core-security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: