perma? leak on osx and windows debug while running mochitest-chrome toolkit/components/extensions/test

RESOLVED FIXED in Firefox 52

Status

defect
P1
normal
RESOLVED FIXED
3 years ago
Last year

People

(Reporter: jmaher, Assigned: kmag)

Tracking

unspecified
mozilla53
Dependency tree / graph

Firefox Tracking Flags

(firefox52+ fixed, firefox53 fixed)

Details

(Whiteboard: [MemShrink:P1]triaged)

Attachments

(5 attachments)

we haven't been reporting leaks as failures, and we recently realized this.  In this mochitest-other log, we can see a failure:
https://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-central-macosx64-debug/1482335753/mozilla-central_yosemite_r7-debug_test-mochitest-other-bm108-tests1-macosx-build30.txt.gz

10:10:25  WARNING - TEST-UNEXPECTED-FAIL | leakcheck | default process: 4384765 bytes leaked (AbstractThread, AbstractWatcher, Animation, AnimationEffectReadOnly, AnimationEffectTimingReadOnly, ...)

there is a lot of information above that, in fact this could be similar or a duplicate of bug 1325149.  I am not sure the best way to prove that.
I think this same leak is happening intermittently across the board (linux/windows/mac in mochitest plain, browser and chrome).

It will be whitelisted in bug 1325148 so I can get the leak fix landed and prevent further damage, but this will still need to be prioritized.
Hm. This seems to be another one caused by an XMLHttpRequest with JS notificationCallbacks.
It turns out there were two major issues here:

1) XMLHttpRequest's onVerifyRedirectCallback returns the error value it was passed in its arguments, which causes JavaScript callers to re-throw that exception, which leads asyncOnChannelRedirect handlers which call it synchronously to likewise throw that error, and attempt to cancel the redirect twice. Combined with the fact that nsAsyncRedirectVerifyHelper doesn't correctly handle extra calls (and winds up with a negative callback count, rather than the 0 it's expecting), that means that the actual underlying onVerifyRedirectCallback is never called, and the redirect chain never completes.

2) We were creating a reference cycle through HTTP channels that we were monitoring, through listener objects we stored in their property bags. Since HTTP channels aren't cycle collection aware, those channels wound up being leaked.

There was also a smaller issue, where our test helpers sometimes failed to remove a global event listener, which made the leak a bit bigger.
Assignee: nobody → kmaglione+bmo
Comment on attachment 8821793 [details]
Bug 1325158: Part 3 - Don't create a reference cycle via HTTP channel property bag.

https://reviewboard.mozilla.org/r/100914/#review101700
Attachment #8821793 - Flags: review?(mixedpuppy) → review+
Blocks: 1325947
Comment on attachment 8821791 [details]
Bug 1325158: Part 1 - Don't return error result from OnRedirectVerifyCallback.

https://reviewboard.mozilla.org/r/100910/#review101750

::: dom/xhr/XMLHttpRequestMainThread.cpp:3384
(Diff revision 1)
>    mNewRedirectChannel = nullptr;
>  
>    mRedirectCallback->OnRedirectVerifyCallback(result);
>    mRedirectCallback = nullptr;
>  
> -  return result;
> +  return NS_OK;

What a footgun.  :(

Can you please add a comment stating the importance of returning NS_OK here?
Attachment #8821791 - Flags: review?(ehsan) → review+
Comment on attachment 8821792 [details]
Bug 1325158: Part 2 - Correctly handle extra OnRedirectVerifyCallback calls.

https://reviewboard.mozilla.org/r/100912/#review101752

I prefer if a Necko peer would give a thumbs up on this instead of me, even though the patch is definitely good and needed as it is IMO.  I'm mainly thinking about whether it is doing enough...

::: netwerk/base/nsAsyncRedirectVerifyHelper.cpp:109
(Diff revision 1)
>  {
>      LOG(("nsAsyncRedirectVerifyHelper::OnRedirectVerifyCallback() "
>           "result=%x expectedCBs=%u mResult=%x",
>           result, mExpectedCallbacks, mResult));
>  
> +    if (mExpectedCallbacks <= 0) {

We may want to do a diagnostic assert here.  It would be nice if we can somehow detect more of these issues if we have them, since as far as I can tell with this change we still have a silent failure which may go undetected.
Attachment #8821792 - Flags: review?(ehsan)
Comment on attachment 8821794 [details]
Bug 1325158: Part 4 - Fix cases where extension startup listener is not removed.

https://reviewboard.mozilla.org/r/100916/#review101898
Attachment #8821794 - Flags: review?(aswan) → review+
Which branches should this be tracked on and backported to?
Flags: needinfo?(kmaglione+bmo)
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #12)
> Which branches should this be tracked on and backported to?

That's a good question. Part 3, at least, needs to be uplifted to 52, which is when that reference cycle was introduced.

The leak fixed by parts 1 and 2 was introduced in 53, by bug 1305421, but those patches should probably be uplifted anyway, since they almost certainly affect other code.
Flags: needinfo?(kmaglione+bmo)
Boris, can you review part 2? There aren't many contributors to that module available at the moment.
Flags: needinfo?(bzbarsky)
[Tracking Requested - why for this release]:

I don't understand the new redirect callback setup well enough to review changes to it, sorry.  I could do it if I spent some time studying that stuff, I guess...  It would be better if someone like Honza reviewed that bit.
Flags: needinfo?(bzbarsky)
Honza is away until the 4th, so I guess it will have to wait until the new year, then.

I'll land the other parts in the mean time. They should be enough to fix the immediate problem.
Keywords: leave-open
This bug also needs a part 5, removing this dir from the "don't report leaks" list, right?  Specifically from https://hg.mozilla.org/mozilla-central/file/cf567de3d614/testing/mochitest/runtests.py#l2219 I think.
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #17)
> This bug also needs a part 5, removing this dir from the "don't report
> leaks" list, right?  Specifically from
> https://hg.mozilla.org/mozilla-central/file/cf567de3d614/testing/mochitest/
> runtests.py#l2219 I think.

Yeah, I already pushed to try with that change. That whitelist hadn't landed when I wrote the original patches.
https://hg.mozilla.org/integration/mozilla-inbound/rev/c470c3365f40a87fd5247df5500bc31435be41dd
Bug 1325158: Part 1 - Don't return error result from OnRedirectVerifyCallback. r=ehsan

https://hg.mozilla.org/integration/mozilla-inbound/rev/12d17dd55f5c0be6daf472fec6f1d613c122c157
Bug 1325158: Part 3 - Don't create a reference cycle via HTTP channel property bag. r=mixedpuppy

https://hg.mozilla.org/integration/mozilla-inbound/rev/933325344e21a683e2a08f01fd7892ec835ea992
Bug 1325158: Part 4 - Fix cases where extension startup listener is not removed. r=aswan
Comment on attachment 8822740 [details]
Bug 1325158: Part 5 - Re-enable leak checks for extension tests.

https://reviewboard.mozilla.org/r/101544/#review102054

thanks, this gets us much more leak checking.  Have you tested on ASAN?  If so, I have no concerns.
Attachment #8822740 - Flags: review?(jmaher) → review+
(In reply to Joel Maher ( :jmaher) from comment #25)
> Have you tested on ASAN?  If so, I have no concerns.

Nope. Will do before landing. Thanks.
Hmm, I'm not sure if the review request from me is intentional or not...  I don't have much to add to comment 10.  Did you really want more input from me, Kris?
Flags: needinfo?(kmaglione+bmo)
(In reply to :Ehsan Akhgari from comment #28)
> Hmm, I'm not sure if the review request from me is intentional or not...  I
> don't have much to add to comment 10.  Did you really want more input from
> me, Kris?

No, sorry, mozreview fail. I removed you from the summary, but I guess it re-requested review anyway.
Flags: needinfo?(kmaglione+bmo)
Attachment #8821792 - Flags: review?(ehsan)
Tracking 52+.

Is part 2 intentionally not flagged for review?
Attachment #8821792 - Flags: review?(honzab.moz)
Whiteboard: [MemShrink]
No longer blocks: 1325947
Priority: -- → P1
Whiteboard: [MemShrink] → [MemShrink]triaged
Whiteboard: [MemShrink]triaged → [MemShrink:P1]triaged
Comment on attachment 8821791 [details]
Bug 1325158: Part 1 - Don't return error result from OnRedirectVerifyCallback.

https://reviewboard.mozilla.org/r/100910/#review105152

::: dom/xhr/XMLHttpRequestMainThread.cpp:3387
(Diff revision 2)
>    mRedirectCallback = nullptr;
>  
> -  return result;
> +  // It's important that we return success here. If we return the result code
> +  // that we were passed, JavaScript callers who cancel the redirect will wind
> +  // up throwing an exception in the process.
> +  return NS_OK;

auch...  OTOH, it's clearly documented how the async redir callback has to be handled.  But true is there is absolutely no way to catch this mistake runtime :(
Comment on attachment 8821792 [details]
Bug 1325158: Part 2 - Correctly handle extra OnRedirectVerifyCallback calls.

https://reviewboard.mozilla.org/r/100912/#review105150

r+ with comments.

::: netwerk/base/nsAsyncRedirectVerifyHelper.cpp:111
(Diff revision 2)
>           "result=%x expectedCBs=%u mResult=%x",
>           result, mExpectedCallbacks, mResult));
>  
> +    if (mExpectedCallbacks <= 0) {
> +      NS_ERROR("OnRedirectVerifyCallback called more times than expected");
> +      return NS_ERROR_UNEXPECTED;

this should be a RELEASE_ASSERT or DIAGNOSTIC_ASSERT...  if we get the call more than expected times, it's then actually impossible to decide on the result!

in some simple cases the stack trace would at least give a hint where from the doubled call comes.
Attachment #8821792 - Flags: review?(honzab.moz) → review+
Keywords: leave-open
https://hg.mozilla.org/integration/mozilla-inbound/rev/8241272efd6a432caa36ca9b2da4bd8c8cb7e9f8
Bug 1325158: Part 2 - Correctly handle extra OnRedirectVerifyCallback calls. r=honzab
https://hg.mozilla.org/mozilla-central/rev/8241272efd6a
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Hey Kris, I assume we're going to want to uplift this to 52 for the next ESR still?
Flags: needinfo?(kmaglione+bmo)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #37)
> Hey Kris, I assume we're going to want to uplift this to 52 for the next ESR
> still?

Should we also consider uplifting this fix to Beta52 and Aurora53? I noticed it while reviewing 52+ tracked bugs.
Hm. Yes, I thought this had already been uplifted to 52. I'll file uplift requests.
Flags: needinfo?(kmaglione+bmo)
Comment on attachment 8821793 [details]
Bug 1325158: Part 3 - Don't create a reference cycle via HTTP channel property bag.

Approval Request Comment
[Feature/Bug causing the regression]: Bug 1305217
[User impact if declined]: This HTTP requests to leak when certain webRequest events are being monitored by extensions.
[Is this code covered by automated tests?]: Yes.
[Has the fix been verified in Nightly?]: Automated tests should be enough to verify this.
[Needs manual test from QE? If yes, steps to reproduce]: N/A
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: Very low-risk.
[Why is the change risky/not risky?]: This change simply avoids creating a reference cycle by storing a boolean, rather than an object instance, in a property which was only used as a flag.
[String changes made/needed]: None.
Attachment #8821793 - Flags: approval-mozilla-beta?
Comment on attachment 8821791 [details]
Bug 1325158: Part 1 - Don't return error result from OnRedirectVerifyCallback.

Approval Request Comment
[Feature/Bug causing the regression]: Unknown.
[User impact if declined]: This issue causes HTTP requests to leak under certain circumstances where async redirect verification handlers are used. The particular in-tree code which causes the leak does not exist prior to Firefox 52, but the issue may be triggered by add-on code, or other in-tree code that we don't know about.
[Is this code covered by automated tests?]: Yes.
[Has the fix been verified in Nightly?]: Only in automated tests.
[Needs manual test from QE? If yes, steps to reproduce]: Not necessary. Automated tests should be sufficient here.
[List of other uplifts needed for the feature/fix]:
[Is the change risky?]: Low-risk.
[Why is the change risky/not risky?]: The changes in these two patches simply prevent the unexpected propagation of an error code, and fix the handling of extra redirect verify calls. A new assertion was added in the second patch, but it is a diagnostic assert which does not affect beta or release builds.
[String changes made/needed]: None.
Attachment #8821791 - Flags: approval-mozilla-beta?
Comment on attachment 8821792 [details]
Bug 1325158: Part 2 - Correctly handle extra OnRedirectVerifyCallback calls.

Approval Request Comment: See comment 41
Attachment #8821792 - Flags: approval-mozilla-aurora?
Attachment #8821792 - Flags: approval-mozilla-aurora? → approval-mozilla-beta?
Comment on attachment 8821791 [details]
Bug 1325158: Part 1 - Don't return error result from OnRedirectVerifyCallback.

fix http request leak, beta52+
Attachment #8821791 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment on attachment 8821792 [details]
Bug 1325158: Part 2 - Correctly handle extra OnRedirectVerifyCallback calls.

early return on extra OnRedirectVerifyCallback, beta52+
Attachment #8821792 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment on attachment 8821793 [details]
Bug 1325158: Part 3 - Don't create a reference cycle via HTTP channel property bag.

http request leak, beta52+
Attachment #8821793 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Product: Toolkit → WebExtensions
You need to log in before you can comment on or make changes to this bug.