Closed Bug 924565 Opened 8 years ago Closed 7 years ago

strong observer ref prevents GC of DOMRequestHelper objects

Categories

(Core :: DOM: Core & HTML, defect, P1)

Tracking

RESOLVED FIXED
mozilla28
blocking-b2g koi+
Tracking Status
firefox26 --- wontfix
firefox27 --- wontfix
firefox28 --- fixed
b2g-v1.2 --- fixed

People

(Reporter: bkelly, Assigned: bkelly)

References

Details

(Keywords: perf, regression, Whiteboard: [c=progress p=4 s= u=1.2])

Attachments

(8 files, 5 obsolete files)

58.29 KB, image/png
Details
62.38 KB, image/png
Details
10.22 KB, patch
Details | Diff | Splinter Review
46 bytes, text/x-github-pull-request
davehunt
: review+
Details | Review
13.04 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
1.76 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
6.92 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
13.22 KB, patch
Details | Diff | Splinter Review
Recent results on datazilla show a sequentially increasing launch time for the browser app.  See this link and click on one of the datapoints to view per launch times below:

  https://datazilla.mozilla.org/b2g/?branch=master&device=unagi&range=7&test=cold_load_time&app_list=browser&app=browser&gaia_rev=2ced1b05bec1f554&gecko_rev=fb530eed0484e067

I am able to reproduce this locally on my buri and did some preliminary bisecting.  This is not caused by a change in gaia revision, but does seem to change with gecko revision.

To quantify the problem I took the results from running:

  b2gperf --delay=10 --no-restart Browser

I put them in a spreadsheet and then used LINEST() to perform a linear regression.  Random results should show a line coefficient of around 0, while increasing results should show a positive coefficient.  The results I got:

  Gecko hg rev fe7961bba227:  -0.01 line coefficient (GOOD)
  Gecko hg rev 64b497e6f593:   7.68 line coefficient (BAD)

Further bisecting is needed to narrow the regression range.
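The LINEST() check above amounts to fitting a least-squares line to the per-launch times.  A minimal sketch of that computation (the sample data here is invented for illustration; the real input would be the b2gperf per-launch times):

```javascript
// Least-squares slope over successive launch times, indexed 0..n-1.
function slope(samples) {
  const n = samples.length;
  const meanX = (n - 1) / 2;
  const meanY = samples.reduce((a, b) => a + b, 0) / n;
  let num = 0, den = 0;
  for (let i = 0; i < n; i++) {
    num += (i - meanX) * (samples[i] - meanY);
    den += (i - meanX) * (i - meanX);
  }
  return num / den;
}

// A flat series gives a slope near 0 (GOOD); a steadily increasing
// series gives a clearly positive slope (BAD).
console.log(slope([1000, 1002, 998, 1001, 999]));   // → -0.3
console.log(slope([1000, 1010, 1020, 1030, 1040])); // → 10
```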

This is a core problem, but putting in gaia:browser for now until we identify the underlying cause.
Screenshot of increasing launch times in datazilla since we can't link to specific results.
Here is the spreadsheet I used to calculate the results in comment 0 with the full b2gperf data.

https://docs.google.com/spreadsheet/ccc?key=0Ai4XIgrcTvUQdFZ3STlTeEt5czlpMUtKM2o0TjYyMGc&usp=sharing
Whiteboard: [c=progress p= s= u=]
Note, this appears to affect v1.2 as well:

  https://datazilla.mozilla.org/b2g/?branch=v1.2&device=hamachi&range=7&test=cold_load_time&app_list=browser&app=browser&gaia_rev=85c4af3f48a91878&gecko_rev=72011db807e6a5e4

The attachment is a screenshot.

So whatever caused this must have been uplifted from m-c to m-a.
blocking-b2g: --- → koi?
moving to 1.3? per triage with team.
blocking-b2g: koi? → 1.3?
Ben - Is this a regression from 1.1? If it is, feel free to renom.
Jason, I believe it is a regression, but I was part of the triage that koi-'d this.  The impact seems small in these tests and it's unclear that it will be visible to the user.  I guess I wasn't sure if it was worth blocking on given the other blockers we have.

What do you think?
Flags: needinfo?(jsmith)
How much slower is the launch time over time? That can help us assess whether we need to consider blocking on this or not.
Flags: needinfo?(jsmith)
For Browser it's about 100ms slower at the end of the test.  The big unknown in my mind, though, is what happens if we kept launching Browser.  Would it continue to worsen or stabilize?  I don't know the answer to that at the moment.
100ms is noticeable enough that it's worth finding the regressing patch and potentially backing it out or fixing it. Let me get a second opinion from Dietrich on this, as he's worked a lot on triaging performance regressions as a developer.

Dietrich - What's your take on this? Should we live with this or push for a back out on the regressing patch?
Flags: needinfo?(dietrich)
Ben, can you do a build and test with a backout of bug 915598?
Sure, will do.  Ni? to myself to remember.
Flags: needinfo?(bkelly)
This is definitely something we can't live with. 100ms is huge. Not really a question that a perf regression should be backed out, IMO.
Flags: needinfo?(dietrich)
Putting back to koi? per the discussion above - we need to get the regressing patch backed out that caused this regression.
blocking-b2g: 1.3? → koi?
Initial test with bug 915598 backed out looks good.  I added the data to the spreadsheet in comment 2.

Regression coefficients are:

  Gecko hg 558cf02e6b9b:                   4.55
  Gecko hg 558cf02e6b9b + 915598 revert:  -1.35

I'll see if I can do some more runs to make sure the results are stable.  Just takes a long time, unfortunately.
Flags: needinfo?(bkelly)
Yeah, I'd say this is caused by bug 915598.  I did two more runs with and without the revert.  So, including the run from comment 14, the linear regression coefficients were:

  Gecko hg 558cf02e6b9b:                   4.55, 8.29, 4.85
  Gecko hg 558cf02e6b9b + 915598 revert:  -1.35, 0.35, 0.27

Again, values close to 0 are good and values greater than zero indicate the launch is getting slower with progressive launches.

I got profiles of the b2g process when the problem is occurring.  Profile from early in the run before things get slow:

  http://people.mozilla.org/~bgirard/cleopatra/#report=b84b76f76d3c2c8a348d5221c55e914ea619fba0

Profile from late in the run after things get slow:

  http://people.mozilla.org/~bgirard/cleopatra/#report=8c6d32d41fe6134bd22588075438183e304f81f7

Looks like a lot more cycle collection is going on in the second case.

Unfortunately I didn't have any luck profiling the Browser app itself.  Since b2gperf starts and stops the app so quickly I didn't have time to signal the process and collect the data.  This might be possible to do by changing b2gperf to wait a while before killing the app, but I don't know if that would interfere with the conditions that trigger the problem.

Fabrice, does this help at all?
Blocks: 915598
Flags: needinfo?(fabrice)
Hmm, but if I'm reading the profile right most of that cycle collection is happening after the app is killed.  So that extra work should not be directly contributing to the launch time metric.
Ben, the browser app runs in process for now. So unless you need a tab to load content to expose the issue, profiling the b2g process gives you browser app profiling.

Do we have the same regression in load times with other apps?

I'd like to give us a chance to understand and fix that before backing out bug 915598 though. Fernando, it may be that we don't remove message listeners and then we end up on a slow path in the frame message manager (I tried to fix some of these in bug 920804). If you don't have time I'll investigate that tomorrow.
Flags: needinfo?(fabrice) → needinfo?(ferjmoreno)
(In reply to Fabrice Desré [:fabrice] from comment #17)
> Do we have the same regression in load times with other apps?

From looking at datazilla I have suspected that we might, but it's not as pronounced as for Browser.  That could just be because there is more going on in most apps and it's obscured by the noise.

Looking at the recently added Template app, though, I do see an increasing launch time:

  http://mzl.la/17p5WE3

So yes, I do think this probably is affecting all apps.
Target Milestone: --- → 1.2 C3(Oct25)
(In reply to Fabrice Desré [:fabrice] from comment #17) 
> I'd like to let us a chance to understand and fix that before backing out
> bug 915598 though. Fernando, it may be that we don't remove message
> listeners and then we end up in a slow path in the frame message manager (I
> try to fix some of these in bug 920804). If you don't have time I'll
> investigate that tomorrow.

I did some investigation on this and it seems that for the Browser app (in-process) we are not removing the SystemMessageManager listeners added at [1]:

== 1st run ==
I/GeckoMM (24187): RemoveWeakMessageListener = Listeners count 344 
I/GeckoMM (24187): RemoveWeakMessageListener = Listeners count 343
I/GeckoMM (24083): RemoveMessageListener = Listeners count 971 

== 2nd run ==
I/GeckoMM (24187): RemoveWeakMessageListener = Listeners count 344 
I/GeckoMM (24187): RemoveWeakMessageListener = Listeners count 343 
I/GeckoMM (24083): RemoveMessageListener = Listeners count 973

== 3rd run ==
I/GeckoMM (24187): RemoveWeakMessageListener = Listeners count 344 
I/GeckoMM (24187): RemoveWeakMessageListener = Listeners count 343 
I/GeckoMM (24083): RemoveMessageListener = Listeners count 975

== 4th run ==
I/GeckoMM (24187): RemoveWeakMessageListener = Listeners count 344 
I/GeckoMM (24187): RemoveWeakMessageListener = Listeners count 343 
I/GeckoMM (24083): RemoveMessageListener = Listeners count 977

(24083 is the b2g process and 24187 the Homescreen app)

Which means that we will have a bigger message listener list on the parent side to iterate on each run.
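The cost implied by that growing list can be sketched with a toy message manager (the class, names, and per-run leak counts here are illustrative, not the real nsFrameMessageManager):

```javascript
// Toy message manager: delivery walks the whole listener list, so a
// list that grows by a few leaked entries per app launch makes every
// later delivery a little slower.
class ToyMessageManager {
  constructor() { this.listeners = []; }
  addMessageListener(name, cb) { this.listeners.push({ name, cb }); }
  receiveMessage(name, data) {
    let delivered = 0;
    for (const l of this.listeners) {   // linear scan on every message
      if (l.name === name) { l.cb(data); delivered++; }
    }
    return delivered;
  }
}

const mm = new ToyMessageManager();
// Simulate four launches that each leak two listeners:
for (let run = 0; run < 4; run++) {
  mm.addMessageListener("SystemMessageManager:GetPendingMessages:Return", () => {});
  mm.addMessageListener("Webapps:Launch:Return:OK", () => {});
}
console.log(mm.listeners.length); // → 8 entries to scan on each delivery
```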

For other apps (OOP) we are not explicitly removing six message listeners:

I/GeckoMM ( 2861): RemoveMessageListener = Listeners count 7 
I/GeckoMM ( 2861): Listener -> Activities:RegisterContentTypes
I/GeckoMM ( 2861): Listener -> Activities:UnregisterContentTypes
I/GeckoMM ( 2861): Listener -> Webapps:AddApp
I/GeckoMM ( 2861): Listener -> Webapps:RemoveApp
I/GeckoMM ( 2861): Listener -> Settings:Change:Return:OK
I/GeckoMM ( 2861): RemoveMessageListener Settings:Change:Return:OK 
I/GeckoMM ( 2861): Listener -> SystemMessageManager:GetPendingMessages:Return
I/GeckoMM ( 2861): Listener -> SystemMessageManager:GetPendingMessages:Return

But I guess that's actually not a problem, because the process is gone and the listeners die with it.

We definitely need bug 920804 and bug 918479.

[1] https://mxr.mozilla.org/mozilla-central/source/dom/messages/SystemMessageManager.js#251
Flags: needinfo?(ferjmoreno)
We should remove the SystemMessageManager messages though, no? Why are we keeping them around?
That's what I am trying to figure out.
Fernando,

Assigning this to you since you appear to be working on this. If you're not and can't commit to this for 1.2 please comment. Our option at that point would be to back out bug 915598.

FxOS Perf Triage
Assignee: nobody → ferjmoreno
Status: NEW → ASSIGNED
blocking-b2g: koi? → koi+
Priority: -- → P1
Whiteboard: [c=progress p= s= u=] → [c=progress p= s= u=1.2]
It seems like this code lives in DOM, is that right?
Component: Gaia::Browser → DOM
Product: Boot2Gecko → Core
(In reply to Fabrice Desré [:fabrice] from comment #20)
> We should remove the SystemMessageManager messages though no? Why are we
> keeping them around?

After bug 924971 we remove these messages, but it doesn't help with the perf regression.
Flags: needinfo?(gsvelto)
Fernando, what's your question for Gabriele?

Also, at what point should we back out bug 915598? We can't ship 1.2 with this regression, so we should consider what an alternative to a fix looks like.
(In reply to Dietrich Ayala (:dietrich) from comment #25)
> Fernando, what's your question for Gabriele?
> 

I've been talking to him via IRC. He is going to help me analyze the differences between the profiles in comment 15.

> Also, at what point should we back out bug bug 915598? We can't ship 1.2
> with this regression, so we should consider what an alternative to a fix
> looks like.

I would like to have Gabriele's feedback before backing out anything, if that is possible.
I did a bunch of tests on an actual device and I've taken profiles of the 1st, 10th, 20th and 30th run:

http://people.mozilla.org/~bgirard/cleopatra/#report=22d99a36addf2064e46688fa12b7d91b604ded2e
http://people.mozilla.org/~bgirard/cleopatra/#report=108d98f9c3ad39c447ea7bcf2de7415372e43d9c
http://people.mozilla.org/~bgirard/cleopatra/#report=185351d0d36c689459ea94c6379ecc0f222776d5
http://people.mozilla.org/~bgirard/cleopatra/#report=fb22773d329aa796896bea6d918257d8eca7eaf5

The 10th run appears longer than the first, though the profile is a bit murky, but the 20th and 30th runs are identical to the first. I've tried this a few times and I always end up with the same result, so I get the feeling that the slow-down might be caused by the test harness.
Flags: needinfo?(gsvelto)
Gabriele, were you profiling the browser app itself or the b2g parent process?  I did see some differences in the parent process in the profiles in comment 15.  In particular a lot more cycle collection.

Also, it seems like you're only catching the end of the browser launch and missing the beginning?  I understand this, since it's hard to catch the beginning with a repeated test like b2gperf.  Just wondering if we are missing something there.
Flags: needinfo?(gsvelto)
Thanks Gabriele!

I added a few logs to get the number of listeners that we add to the nsFrameMessageManager and the time to deliver IPC messages and I found out that the number of listeners keeps growing and growing when running b2gperf.

I/GeckoMM (  108): RemoveMessageListener = Listeners count 2570
I/GeckoMM (  108): RemoveMessageListener = Listeners count 2569
I/GeckoMM (  108): RemoveMessageListener = Listeners count 2571
I/GeckoMM (  108): RemoveMessageListener = Listeners count 2570
I/GeckoMM (  108): RemoveMessageListener = Listeners count 2569
I/GeckoMM (  108): RemoveMessageListener = Listeners count 2842
I/GeckoMM (  108): RemoveMessageListener = Listeners count 2841
I/GeckoMM (  108): RemoveMessageListener = Listeners count 2843
I/GeckoMM (  108): RemoveMessageListener = Listeners count 2842
I/GeckoMM (  108): RemoveMessageListener = Listeners count 2841
I/GeckoMM (  108): RemoveMessageListener = Listeners count 3114
I/GeckoMM (  108): RemoveMessageListener = Listeners count 3113
I/GeckoMM (  108): RemoveMessageListener = Listeners count 3115
I/GeckoMM (  108): RemoveMessageListener = Listeners count 3114
I/GeckoMM (  108): RemoveMessageListener = Listeners count 3113
I/GeckoMM (  108): RemoveMessageListener = Listeners count 3386
I/GeckoMM (  108): RemoveMessageListener = Listeners count 3385

But fortunately it stays the same when running manual tests on the device:

I/GeckoMM (  108): RemoveMessageListener = Listeners count 882 
I/GeckoMM (  108): RemoveMessageListener = Listeners count 881 
I/GeckoMM (  108): RemoveMessageListener = Listeners count 880 
I/GeckoMM (  108): RemoveMessageListener = Listeners count 882 
I/GeckoMM (  108): RemoveMessageListener = Listeners count 881 
I/GeckoMM (  108): RemoveMessageListener = Listeners count 880 
I/GeckoMM (  108): RemoveMessageListener = Listeners count 882 
I/GeckoMM (  108): RemoveMessageListener = Listeners count 881 
I/GeckoMM (  108): RemoveMessageListener = Listeners count 880 
I/GeckoMM (  108): RemoveMessageListener = Listeners count 882 
I/GeckoMM (  108): RemoveMessageListener = Listeners count 881 
I/GeckoMM (  108): RemoveMessageListener = Listeners count 880 
I/GeckoMM (  108): RemoveMessageListener = Listeners count 882 
I/GeckoMM (  108): RemoveMessageListener = Listeners count 881 
I/GeckoMM (  108): RemoveMessageListener = Listeners count 880 
I/GeckoMM (  108): RemoveMessageListener = Listeners count 882 
I/GeckoMM (  108): RemoveMessageListener = Listeners count 881 
I/GeckoMM (  108): RemoveMessageListener = Listeners count 880 


I also noticed that Webapps:Launch:Return:OK takes longer with each run via b2gperf:

I/GeckoMM (  108): AsyncMessageToSameProcessParent Webapps:Launch took 98358
I/GeckoMM (  108): AsyncMessageToSameProcessChild Webapps:Launch:Return:OK took 30609
I/GeckoMM (  108): AsyncMessageToSameProcessParent Webapps:Launch took 104278
I/GeckoMM (  108): AsyncMessageToSameProcessChild Webapps:Launch:Return:OK took 32044
I/GeckoMM (  108): AsyncMessageToSameProcessParent Webapps:Launch took 96436
I/GeckoMM (  108): AsyncMessageToSameProcessChild Webapps:Launch:Return:OK took 33509
I/GeckoMM (  108): AsyncMessageToSameProcessParent Webapps:Launch took 97839
I/GeckoMM (  108): AsyncMessageToSameProcessChild Webapps:Launch:Return:OK took 34668
I/GeckoMM (  108): AsyncMessageToSameProcessParent Webapps:Launch took 95764
I/GeckoMM (  108): AsyncMessageToSameProcessChild Webapps:Launch:Return:OK took 35553
I/GeckoMM (  108): AsyncMessageToSameProcessParent Webapps:Launch took 102935
I/GeckoMM (  108): AsyncMessageToSameProcessChild Webapps:Launch:Return:OK took 36895
I/GeckoMM (  108): AsyncMessageToSameProcessParent Webapps:Launch took 95795
I/GeckoMM (  108): AsyncMessageToSameProcessChild Webapps:Launch:Return:OK took 37751
I/GeckoMM (  108): AsyncMessageToSameProcessParent Webapps:Launch took 102784
I/GeckoMM (  108): AsyncMessageToSameProcessChild Webapps:Launch:Return:OK took 39001
I/GeckoMM (  108): AsyncMessageToSameProcessParent Webapps:Launch took 95947
I/GeckoMM (  108): AsyncMessageToSameProcessChild Webapps:Launch:Return:OK took 40008
This is the patch that I used to get the log messages
Attachment #822397 - Attachment description: domrequesthelper → Debug nsFrameMessageManager and DOMRequestHelper
(In reply to Ben Kelly [:bkelly] from comment #28)
> Gabriele, were you profiling the browser app itself or the b2g parent
> process?  I did see some differences in the parent process in the profiles
> in comment 15.  In particular a lot more cycle collection.

I tested only the browser app. Looking at what :ferjm found in comment 29, the extra garbage collections in the main process must have been triggered by the leaked listeners.

> Also, it seems like you're only catching the end of the browser launch and
> missing the beginning?

It looks this way because on the devices the browser app does not start from the pre-launch process but is instead started from scratch.
Flags: needinfo?(gsvelto)
Regarding the listeners in comment 29, does that mean this is an instance of bug 915220 which appeared to have been caused/related to bug 900221?
IMHO this should not be a blocker anymore since it only affects the tests.

(In reply to Ben Kelly [:bkelly] (PTO, Back on Tuesday Oct 29) from comment #32)
> Regarding the listeners in comment 29, does that mean this is an instance of
> bug 915220 which appeared to have been caused/related to bug 900221?

Might be. We'll need to check what's going on with b2gperf.
blocking-b2g: koi+ → koi?
Summary: b2g browser launch times worsen over time → b2g browser launch times worsen over time when running b2gperf tests
(In reply to Gabriele Svelto [:gsvelto] from comment #27)
> The 10th run appears longer than the first though the profile is a bit murky
> but the 20th and 30th run are identical to the first. I've actually tried
> this a few times and I always end up with the same result so I get the
> feeling that the slow-down might be caused by the test-harness.

Gabriele, did you maintain the timing during the test or did you pause things to take measurements?

The profiles I did in comment 15 were with b2gperf running at its normal timing.
Flags: needinfo?(gsvelto)
Fernando, do you have a plausible explanation for why b2gperf would run without causing this issue prior to your commit, but start causing the problem after it?  I'm trying to understand how we can say it's a test harness issue when the test harness didn't change.
Flags: needinfo?(ferjmoreno)
(In reply to Ben Kelly [:bkelly] from comment #34)
> Gabriele, did you maintain the timing during the test or did you pause
> things to take measurements?

I'm not sure I understand your question but I'll describe what I did. For each run I would wait for the phone to quiet down (~10s) then open the browser, wait a few seconds, take the profile, then kill the browser and start again. I did multiple tries of 30 runs each, rebooting the phone in-between. Once I tried 60 runs but the results did not change either.
Flags: needinfo?(gsvelto)
(In reply to Fernando Jiménez Moreno [:ferjm] (FxAccounts Work Week until Friday) from comment #33)
> IMHO this should not be a blocker anymore since it only affects the tests.

Okay.  Feel free to re-nom if you disagree.
blocking-b2g: koi? → ---
Priority: P1 → --
Whiteboard: [c=progress p= s= u=1.2] → [c=progress p= s= u=]
Target Milestone: 1.2 C3(Oct25) → ---
With Fernando's help I think I understand what's happening here now.

It appears that b2gperf calls mozApps.mgmt.getAll() every time it launches an app.  This creates a new WebappsApplication object for each installed app, which in turn registers a bunch of listeners as a DOMRequestHelper.

Previously these listeners were getting GC'd as weak references.

As of bug 915598 the listeners stopped getting GC'd, because each listener object was registered as an observer using a strong reference.

Changing this line to pass true as the third argument fixed the slow down for me:

  http://mxr.mozilla.org/mozilla-central/source/dom/base/DOMRequestHelper.jsm#160
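The strong-vs-weak distinction can be modeled with a toy observer service (ToyObserverService and its collect() step are invented for illustration; the real API is nsIObserverService.addObserver(observer, topic, ownsWeak)):

```javascript
// Toy model of strong vs. weak observer registrations.
class ToyObserverService {
  constructor() { this.observers = []; }
  addObserver(obs, weak) { this.observers.push({ obs, weak }); }
  // Simulated GC pass: weak entries whose observer was released go
  // away; strong entries keep their observer alive regardless.
  collect(released) {
    this.observers = this.observers.filter(
      e => !(e.weak && released.has(e.obs)));
  }
}

const svc = new ToyObserverService();
const helperA = { name: "strong helper" };
const helperB = { name: "weak helper" };
svc.addObserver(helperA, /* weak */ false);
svc.addObserver(helperB, /* weak */ true);

// The page that owned both helpers goes away...
svc.collect(new Set([helperA, helperB]));

// ...but only the weak registration is gone; the strong one leaks.
console.log(svc.observers.map(e => e.obs.name)); // → [ 'strong helper' ]
```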
Flags: needinfo?(ferjmoreno)
Attached patch domrequesthelper-lazy-obs.patch (obsolete) — Splinter Review
This patch changes the Observer reference to the DOMRequestHelper to be weak instead of strong.  This seems to avoid the progressive slowdown issue for me.
Attachment #825458 - Flags: review?(fabrice)
This seems like a legitimate platform issue, so re-nom'ing for koi.
blocking-b2g: --- → koi?
Hmm, dom/apps tests failing:

0:06.94 191 ERROR TEST-UNEXPECTED-FAIL | /tests/dom/apps/tests/test_packaged_app_update.html | Check manifestURL - got "http://test/tests/dom/apps/tests/file_packaged_app.sjs?getManifest=true&appToUpdate", expected "http://test/tests/dom/apps/tests/file_packaged_app.sjs?getManifest=true"
 0:06.94 200 ERROR TEST-UNEXPECTED-FAIL | /tests/dom/apps/tests/test_packaged_app_update.html | Shouldn't get here!
 0:06.94 217 ERROR TEST-UNEXPECTED-FAIL | /tests/dom/apps/tests/test_packaged_app_update.html | Download error NO_DOWNLOAD_AVAILABLE
 0:06.94 218 ERROR TEST-UNEXPECTED-FAIL | /tests/dom/apps/tests/test_packaged_app_update.html | [SimpleTest.finish()] this test already called finish!
 0:06.94 222 ERROR TEST-UNEXPECTED-FAIL | /tests/dom/apps/tests/test_packaged_app_update.html | XHR error loading http://test/tests/dom/apps/tests/file_packaged_app.sjs?setVersion=3: 0 -
 0:06.94 223 ERROR TEST-UNEXPECTED-FAIL | /tests/dom/apps/tests/test_packaged_app_update.html | called finish() multiple times
(In reply to Ben Kelly [:bkelly] from comment #42)
> 0:06.94 191 ERROR TEST-UNEXPECTED-FAIL |
> /tests/dom/apps/tests/test_packaged_app_update.html | Check manifestURL -
> got
> "http://test/tests/dom/apps/tests/file_packaged_app.
> sjs?getManifest=true&appToUpdate", expected
> "http://test/tests/dom/apps/tests/file_packaged_app.sjs?getManifest=true"
>  0:06.94 200 ERROR TEST-UNEXPECTED-FAIL |
> /tests/dom/apps/tests/test_packaged_app_update.html | Shouldn't get here!
>  0:06.94 217 ERROR TEST-UNEXPECTED-FAIL |
> /tests/dom/apps/tests/test_packaged_app_update.html | Download error
> NO_DOWNLOAD_AVAILABLE
>  0:06.94 218 ERROR TEST-UNEXPECTED-FAIL |
> /tests/dom/apps/tests/test_packaged_app_update.html | [SimpleTest.finish()]
> this test already called finish!
>  0:06.94 222 ERROR TEST-UNEXPECTED-FAIL |
> /tests/dom/apps/tests/test_packaged_app_update.html | XHR error loading
> http://test/tests/dom/apps/tests/file_packaged_app.sjs?setVersion=3: 0 -
>  0:06.94 223 ERROR TEST-UNEXPECTED-FAIL |
> /tests/dom/apps/tests/test_packaged_app_update.html | called finish()
> multiple times

It seems the mozApps.mgmt.oninstall handler set at the end of test_packaged_app_update.html is running during the next test suite.  Since the next test suite overwrites the oninstall event target, this implies to me that we are getting a separate management object.  Does this make sense?
It seems WebappsRegistry is getting GC'd now while the underlying WebappsApplicationMgmt persists due to its strong ref listeners.

I'll keep investigating tomorrow.  Please let me know if I am on the wrong track.  Thanks!
Wow! Thanks Ben! :) I totally missed the Observer thing :(

(In reply to Ben Kelly [:bkelly] from comment #44)
> It seems WebappsRegistry is getting GC'd now while the underlying
> WebappsApplicationMgmt persists due to its strong ref listeners.
> 
> I'll keep investigating tomorrow.  Please let me know if I am on the wrong
> track.  Thanks!

It makes sense to add the WebappsRegistry "Webapps:Install:Return:OK" listener as a strong one. We will be defaulting to strong listeners in bug 928389 anyway, as the WebApps object leaks (bug 889984) that caused all this weak-listener work will likely be solved by bug 903291.
Taking this while I track down the test failure.
Assignee: ferjmoreno → bkelly
Whiteboard: [c=progress p= s= u=] → [c=progress p=3 s= u=]
So it turns out the test failures had a different cause.

When we use a strong observer ref, the WebappsApplicationMgmt object gets notified about the inner-window-destroyed event and cleans itself up between the test cases.

With a weak observer ref this notification never occurs.  It turns out this is due to most of the nsDOMRequestHelper child classes overriding QueryInterface but not specifying nsIObserver.  The nsIObserver QI is necessary, or this line will always fail to resolve the weak ref:

  http://mxr.mozilla.org/mozilla-central/source/xpcom/ds/nsObserverList.cpp#72
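A toy model of why the QI list matters (makeQI is invented for illustration; the real code declares interfaces via XPCOMUtils, and nsObserverList QIs the weak reference to nsIObserver before notifying):

```javascript
// Model a QueryInterface that only answers for declared interfaces.
function makeQI(declared) {
  return iface => declared.includes(iface) ? "ok" : null;
}

// nsObserverList resolves the weak ref by QI'ing to nsIObserver; if a
// helper subclass forgot to declare it, the resolution always fails
// and the notification is silently dropped.
const brokenQI = makeQI(["nsISupportsWeakReference"]);
const fixedQI  = makeQI(["nsISupportsWeakReference", "nsIObserver"]);

console.log(brokenQI("nsIObserver")); // → null: observe() never fires
console.log(fixedQI("nsIObserver"));  // → ok: weak observer works
```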

I have some patches that seem to work.  In try now:

  https://tbpl.mozilla.org/?tree=Try&rev=74726cf35102

I also want to write some test cases around this since it seems easy to regress here.  I probably won't have tests ready until next week.
Part 1: Add nsIObserver to all DOMRequestHelper QI statements
Work in progress patch.

Note, I think we should explicitly QueryInterface() to nsIObserver and nsISupportsWeakReference in initDOMRequestHelper() to force a fast fail if they are missing.  The behavior is subtle and non-obvious when they are missing.
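A sketch of that fast-fail idea, as a toy rather than the actual initDOMRequestHelper() change (REQUIRED and the helper objects are invented):

```javascript
// Verify up front that a helper declares the interfaces the
// weak-observer path needs, instead of failing silently later.
const REQUIRED = ["nsIObserver", "nsISupportsWeakReference"];

function initDOMRequestHelper(helper) {
  for (const iface of REQUIRED) {
    if (!helper.interfaces.includes(iface)) {
      throw new Error("helper must QI to " + iface);
    }
  }
}

const goodHelper = { interfaces: ["nsIObserver", "nsISupportsWeakReference"] };
const badHelper  = { interfaces: ["nsISupportsWeakReference"] };

initDOMRequestHelper(goodHelper); // passes
try {
  initDOMRequestHelper(badHelper);
} catch (e) {
  console.log(e.message); // → helper must QI to nsIObserver
}
```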
Attachment #825458 - Attachment is obsolete: true
Attachment #825458 - Flags: review?(fabrice)
Attachment #826097 - Attachment description: domrequesthelper-qi-observer.patch → Part 1: Add nsIObserver to all DOMRequestHelper QI statements
Missed a DOMRequestHelper in the first patch.  The fast fail logic helped me catch this when running b2gperf.
Attachment #826097 - Attachment is obsolete: true
Unfortunately it appears that adding nsIObserver to the QI for WebappsApplication re-introduces the progressive slowdown.

It seems the previous code depended on the observer weak refs getting immediately flushed due to the missing QI to avoid the impact from b2gperf creating so many WebappsApplication objects.

This also explains why bug 915220 only appeared on b2g18 and not m-c.  The patch for b2g18 in bug 900221 only added calls to addWeakMessageListener(), but never enabled the weak observer reference.  Therefore b2g18 suffered from the growing WebappsApplication observer problem caused by calling mozApps.mgmt.getAll() repeatedly.  In contrast, the fix in bug 889984, which landed on m-c, enabled the observer weak refs with the mismatched nsIObserver QI.  So m-c avoided the growing observer problem.

What I don't understand is why we only saw the problem for Clock on v1.1 and now mainly for Browser on v1.2/m-c.  Also, why wasn't this an issue prior to bug 900221 and bug 889984?

Anyway, the question now is what to do at this point?  It seems to me:

1) We need to enable the observer weak ref flag to avoid leaking DOMRequestHelper objects.
2) We need to add the nsIObserver QI to all DOMRequestHelper objects so that the weak observer ref works correctly.
3) We need to change b2gperf to not call mozApps.mgmt.getAll() repeatedly or change mozApps.mgmt.getAll() so that it does not create new objects over and over for the same application.
4) We might want to investigate optimizing the observer notification path.  It seems to copy the array of observer refs twice for every notification.

My current patches handle (1) and (2).  Should I write a new bug for this since it doesn't actually fix the progressive slowdown?  I can talk to Dave Hunt about (3).  I'm not sure if (4) is worth pursuing.

Fabrice, Fernando, what do you think?  Sorry if this bug has gotten a bit messy.
Flags: needinfo?(ferjmoreno)
Flags: needinfo?(fabrice)
We can probably ignore (4) from comment 52.  Looking back at the profile from comment 15:

  http://people.mozilla.org/~bgirard/cleopatra/#report=8c6d32d41fe6134bd22588075438183e304f81f7&search=Observer

We can see that we are indeed spending a ton of time in the observer notification for the inner-window-destroyed event.  The extra time doesn't appear to be from copying the reference arrays around, though.  It's mostly in the execution of nsXPCWrappedJSClass::CallMethod() calling into the DOMRequestHelper.prototype.observe() function.

Interestingly, this profile might also explain why we mainly see the progressive slowdown in Browser, but not other apps.  You can see in the profile that we actually get two window-destroyed events for Browser: one during launch and one later when the app is closed.  The launch event takes about 200ms, while the app-closing event takes about 750ms.  (The latter takes longer because it triggers a GC.  This makes sense given that an app was just closed.)

From previous debugging it seems we only create and destroy this window during launch for Browser.  I assume this is due to the fact that it runs in the parent process instead of OOP.  Note, I also previously commented out the logic that creates a new tab, and this window is still created.

Given that b2gperf takes its launch time measurement before closing the app, it seems likely that the window closing during launch and its observer notifications (the 200ms in this profile) are what's slowing down Browser launch.  So:

(5) We might want to open a bug to investigate why an extra window gets created/destroyed when launching Browser.

It seems most of those 200ms are spent in:

  nsXPCWrappedJSClass::CallMethod
  GetContextFromObjectOrDefault
  ExposeGCThingToActiveJS
Debugging suggests that the additional 'inner-window-destroyed' event fired during launch is due to the 'about:blank' URI getting overwritten with the app URI.  I suppose this is expected, but it would be nice if we could avoid the extra work related to about:blank during launch.
(In reply to Ben Kelly [:bkelly] from comment #52)
> Unfortunately it appears that adding nsIObserver to the QI for
> WebappsApplication re-introduces the progressive slowdown.

So I did some additional testing this morning and it appears this statement is only partially true.  While the progressive slowdown occurs again, the effect is much reduced.

Here are the results of running three "b2gperf --delay=10 Browser" runs before and after the attached patches.  The number is the coefficient of a linear regression.  Lower numbers are better.

  Before patch:  5.9, 7.8, 6.4
  After patch:   4.0, 0.9, 2.9

Also, from profiling I don't see any other obvious issues at work here.  I think we should move forward with this fix, change b2gperf to avoid calling mozApps.mgmt.getAll() repeatedly, and mark this bug resolved.
Attachment #826122 - Flags: review?(fabrice)
Comment on attachment 826102 [details] [diff] [review]
Part 2: Enable weak observer ref and enforce expected QI interfaces

Try from comment 51 looks pretty green.  Re-requesting review on these patches.
Attachment #826102 - Flags: review?(fabrice)
Attachment #826122 - Flags: review?(fabrice)
Comment on attachment 826102 [details] [diff] [review]
Part 2: Enable weak observer ref and enforce expected QI interfaces

Dropping review until I can write some tests.
Attachment #826102 - Flags: review?(fabrice)
The perf impact here seems real (if I'm reading the comments correctly :) so koi+ again.
blocking-b2g: koi? → koi+
This patch adds test cases verifying that the DOMRequestHelper uninit() function is called as expected in these cases:

  - no message listeners, no GC
  - weak message listeners, no GC
  - strong message listeners, no GC
  - strong message listeners, forced GC

And is not called in these cases where we expect it to be garbage collected:

  - no message listeners, forced GC
  - weak message listeners, forced GC

Updated try with all three patches:

  https://tbpl.mozilla.org/?tree=Try&rev=bae1c08bdcc4
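The behavior the test matrix above verifies can be illustrated with a small mock (the helper and notification plumbing below are stand-ins, not the real mochitest or DOMRequestHelper): uninit() should fire when the helper is still reachable at "inner-window-destroyed" time, and a helper held only weakly should already be gone after GC, so uninit is never observed.

```javascript
// Mock helper tracking whether uninit() ran.
let uninitCalled = false;
const helper = {
  innerWindowID: 42,
  uninit() { uninitCalled = true; }
};

// Mirrors the idea of DOMRequestHelper's observe(): only react to the
// destruction of our own inner window, and only if the helper still exists.
function notifyInnerWindowDestroyed(target, destroyedID) {
  if (target && destroyedID === target.innerWindowID) {
    target.uninit();
  }
}

// Strong ref, no GC: the helper is reachable, so uninit fires.
notifyInnerWindowDestroyed(helper, 42);
console.log(uninitCalled); // true

// Weak ref after GC: deref() would yield undefined, so there is nothing to
// uninit (modeled here by passing undefined as the target).
notifyInnerWindowDestroyed(undefined, 42);
```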
Attachment #827492 - Flags: review?(fabrice)
Flags: needinfo?(ferjmoreno)
Flags: needinfo?(fabrice)
Comment on attachment 826122 [details] [diff] [review]
Part 1: Declare nsIObserver QI in DOMRequestHelper child classes.

Add back review flag now that we have tests.
Attachment #826122 - Flags: review?(fabrice)
Comment on attachment 826102 [details] [diff] [review]
Part 2: Enable weak observer ref and enforce expected QI interfaces

Add back review flag now that we have unit tests.
Attachment #826102 - Flags: review?(fabrice)
Priority: -- → P1
Whiteboard: [c=progress p=3 s= u=] → [c=progress p=3 s= u=1.2]
This patch to gaiatest avoids repeated calls to mozApps.mgmt.getAll().  This is desirable because getAll() allocates new Application objects on each call.

Many thanks to :ferjm for identifying the problem and essentially writing this patch!

Dave, should I also make this change in gaia/tests/atoms/gaia_test.js?  That file is slightly different.  I don't know how to test it since I'm not sure what uses it.
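The caching idea can be sketched as follows. This is a hypothetical illustration, not the gaiatest patch itself; the `mgmt` object below is a mock standing in for navigator.mozApps.mgmt, but the point is the same: call getAll() once and reuse the result instead of allocating new Application objects on every call.

```javascript
// Mock of navigator.mozApps.mgmt that counts how many times getAll() runs.
let allocations = 0;
const mgmt = {
  getAll(onsuccess) {
    allocations++;
    // Each real call would allocate fresh Application objects.
    onsuccess({ target: { result: ["app1", "app2"] } });
  }
};

// Cache the first getAll() result and serve it on subsequent requests.
let cachedApps = null;
function getAllApps(callback) {
  if (cachedApps) {
    callback(cachedApps);
    return;
  }
  mgmt.getAll(evt => {
    cachedApps = evt.target.result;
    callback(cachedApps);
  });
}

getAllApps(apps => console.log(apps.length)); // 2, from getAll()
getAllApps(apps => console.log(apps.length)); // 2, served from the cache
console.log(allocations); // 1
```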
Attachment #827591 - Flags: review?(dave.hunt)
Whiteboard: [c=progress p=3 s= u=1.2] → [c=progress p=4 s= u=1.2]
Comment on attachment 827591 [details] [review]
Pull request at https://github.com/mozilla-b2g/gaia/pull/13397

(In reply to Ben Kelly [:bkelly] from comment #62)
> Created attachment 827591 [details] [review]
> Pull request at https://github.com/mozilla-b2g/gaia/pull/13397
> 
> This patch to gaiatest avoids repeated calls to mozApps.mgmt.getAll().  This
> is desirable because getAll() allocates new Application objects on each call.
> 
> Many thanks to :ferjm for identifying the problem and essentially writing
> this patch!
> 
> Dave, should I also make this change in gaia/tests/atoms/gaia_test.js?  That
> file is slightly different.  I don't know how to test it since I'm not sure
> what uses it.

That file is left over from the previous JS integration tests. I don't believe it's used any more and will be merged/removed by bug 909839.

Landed as:
https://github.com/mozilla-b2g/gaia/commit/c1204e00cff95820b902c1b41dee229402c6a156
Attachment #827591 - Flags: review?(dave.hunt) → review+
(In reply to Dave Hunt (:davehunt) from comment #63)
> ...
> 
> Landed as:
> https://github.com/mozilla-b2g/gaia/commit/
> c1204e00cff95820b902c1b41dee229402c6a156

Ben should this bug be RESOLVED FIXED?
Flags: needinfo?(bkelly)
(In reply to Mike Lee [:mlee] from comment #64)
> (In reply to Dave Hunt (:davehunt) from comment #63)
> > ...
> > 
> > Landed as:
> > https://github.com/mozilla-b2g/gaia/commit/
> > c1204e00cff95820b902c1b41dee229402c6a156
> 
> Ben should this bug be RESOLVED FIXED?

No, this needs to stay open until the platform changes get review and land.  The b2gperf change was needed, but masks the very real platform problem.  Without the platform fix we cannot free DOMRequestHelper objects under memory pressure.
Flags: needinfo?(bkelly)
Updating the summary to reflect the current issue that needs to be fixed.
Summary: b2g browser launch times worsen over time when running b2gperf tests → strong observer ref prevents GC of DOMRequestHelper objects
Blocks: 915138
It would be interesting, btw, to know if this could be causing bug 935273 to reproduce.
How does this one relate to Bug 928389?
I don't know about causing bug 935273, but fixing the strong ref here so we can free the DOMRequestHelpers under memory pressure might help that bug.
Attachment #826102 - Flags: feedback?(gene.lian)
Attachment #826102 - Flags: feedback?(ferjmoreno)
Attachment #826122 - Flags: feedback?(gene.lian)
Attachment #826122 - Flags: feedback?(ferjmoreno)
Attachment #827492 - Flags: feedback?(gene.lian)
Attachment #827492 - Flags: feedback?(ferjmoreno)
Attachment #826122 - Flags: feedback?(gene.lian) → feedback+
Comment on attachment 826102 [details] [diff] [review]
Part 2: Enable weak observer ref and enforce expected QI interfaces

Review of attachment 826102 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/DOMRequestHelper.jsm
@@ +141,5 @@
>     *    will be added as a weak referred one by default.
>     */
>    initDOMRequestHelper: function(aWindow, aMessages) {
> +    // Query our required interfaces to force a fast fail if they are not
> +    // provided.  These calls will throw if the interface is not available.

Nice! This will force the callers to use it correctly. :)

@@ +161,5 @@
>      }
>  
>      this._destroyed = false;
>  
> +    Services.obs.addObserver(this, "inner-window-destroyed", true);

So you are forcing the "inner-window-destroyed" observer to be added as a strong reference, which means the object won't be GC'ed unless the window is closed.

However, in addMessageListeners(), users can choose how to associate the listeners with the window (strongly or weakly). If the listeners are added weakly, that is in vain because you always call addObserver strongly.

Please correct me if I misunderstood. Thanks!
Attachment #826102 - Flags: feedback?(gene.lian)
Comment on attachment 827492 [details] [diff] [review]
Part 3: Add tests verifying DOMRequestHelper cleanup and weak ref behavior.

Review of attachment 827492 [details] [diff] [review]:
-----------------------------------------------------------------

Temporarily clearing my feedback? until my concern about the part-2 patch is resolved.
Attachment #827492 - Flags: feedback?(gene.lian)
Comment on attachment 826102 [details] [diff] [review]
Part 2: Enable weak observer ref and enforce expected QI interfaces

Review of attachment 826102 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/DOMRequestHelper.jsm
@@ +161,5 @@
>      }
>  
>      this._destroyed = false;
>  
> +    Services.obs.addObserver(this, "inner-window-destroyed", true);

Sorry, I think I misunderstood the use of addObserver(...) the other way around; "true" means the observer is added as a weak reference.

So addMessageListeners() can still associate the listeners with the window via a strong reference, even if the "inner-window-destroyed" observer is added as a weak reference (with at least one strong reference, the object is effectively strongly referenced). Adding the listeners as weak references also works, since then all the listeners/observers are weakly referenced.
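The weak-vs-strong semantics discussed here can be illustrated with a minimal mock (this is not the real nsIObserverService; WeakRef stands in for XPCOM's nsIWeakReference): a strong registration keeps the observer alive, while a weak one, the `true` third argument to addObserver, stores only a weak handle and lets GC reclaim the observer once nothing else holds it.

```javascript
// Minimal mock of an observer service distinguishing strong and weak
// registrations. Weak entries hold only a WeakRef, so a collected observer
// is silently skipped at notification time.
class MockObserverService {
  constructor() { this.observers = []; }
  addObserver(observer, topic, ownsWeak = false) {
    this.observers.push({
      topic,
      ref: ownsWeak ? new WeakRef(observer) : observer,
      weak: ownsWeak
    });
  }
  notify(topic) {
    for (const entry of this.observers) {
      if (entry.topic !== topic) continue;
      const obs = entry.weak ? entry.ref.deref() : entry.ref;
      if (obs) obs.observe(topic); // weak entries may already be gone
    }
  }
}

const svc = new MockObserverService();
const seen = [];
svc.addObserver({ observe: t => seen.push("strong:" + t) },
                "inner-window-destroyed", false);
svc.addObserver({ observe: t => seen.push("weak:" + t) },
                "inner-window-destroyed", /* ownsWeak */ true);
svc.notify("inner-window-destroyed");
console.log(seen); // both fire here, since neither observer has been GC'ed yet
```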
Attachment #826102 - Flags: feedback+
(In reply to Gene Lian [:gene] (needinfo? encouraged) from comment #72)
> > +    Services.obs.addObserver(this, "inner-window-destroyed", true);
> 
> Sorry I think I misunderstand the use of addObserver(...) the other way
> around, where "true" means adds the observer as weak reference.

Correct.  I think previous incarnations of this code used |/* weak-ref */ true| for that argument.  I should probably add that back since it's easy to miss this.
Comment on attachment 827492 [details] [diff] [review]
Part 3: Add tests verifying DOMRequestHelper cleanup and weak ref behavior.

Thanks for the feedback on the first two patches!

Adding back the f? on the tests now that the observer weak ref question was answered.  Just wanted to make sure this didn't fall through the cracks.
Attachment #827492 - Flags: feedback?(gene.lian)
Duplicate of this bug: 940198
Attachment #826122 - Flags: review?(fabrice) → review+
Comment on attachment 826102 [details] [diff] [review]
Part 2: Enable weak observer ref and enforce expected QI interfaces

Review of attachment 826102 [details] [diff] [review]:
-----------------------------------------------------------------

that looks ok to me.

::: dom/base/DOMRequestHelper.jsm
@@ +141,5 @@
>     *    will be added as a weak referred one by default.
>     */
>    initDOMRequestHelper: function(aWindow, aMessages) {
> +    // Query our required interfaces to force a fast fail if they are not
> +    // provided.  These calls will throw if the interface is not available.

nit: I think there's an extra space before 'These'.
Attachment #826102 - Flags: review?(fabrice) → review+
Comment on attachment 827492 [details] [diff] [review]
Part 3: Add tests verifying DOMRequestHelper cleanup and weak ref behavior.

Review of attachment 827492 [details] [diff] [review]:
-----------------------------------------------------------------

looks good to me, thanks!

::: dom/base/test/test_domrequesthelper.xul
@@ +107,5 @@
> +     *                           function(uninitCalled) {
> +     *      // expect uninitCalled === false since GC with only weak refs
> +     *    });
> +     */
> +    const CWD_TOPIC = 'inner-window-destroyed'

nit: missing ';'
Also, use either single or double quotes consistently throughout the file (it seems double quotes were used previously in this file).

Also, what does this 'CWD' prefix mean?
Attachment #827492 - Flags: review?(fabrice) → review+
Update part 1 to indicate r=fabrice.  Carry r+ forward.
Attachment #826122 - Attachment is obsolete: true
Attachment #826122 - Flags: feedback?(ferjmoreno)
Attachment #8335020 - Flags: review+
Update part 2 with following changes carrying r+ forward:

1) Add comment indicating boolean third arg to addObserver() means to use a weak-ref.
2) Remove space from comment.
3) Update subject to indicate reviewer.
Attachment #8335023 - Flags: review+
Attachment #826102 - Attachment is obsolete: true
Attachment #826102 - Flags: feedback?(ferjmoreno)
Update part 3 with following changes carrying forward r+:

1) Use double quotes instead of single quotes.
2) Remove the "cwd" prefix on variables except for the dummy object.  We need it there to avoid shadowing a previous variable.
3) Add comment indicating |cwdDummy| is prefixed to be specific to |checkWindowDestruction()|.
4) Update commit message to include reviewer.
Attachment #827492 - Attachment is obsolete: true
Attachment #827492 - Flags: feedback?(gene.lian)
Attachment #827492 - Flags: feedback?(ferjmoreno)
Attachment #8335026 - Flags: review+
Add checkin-needed for following patches:

  attachment 8335020 [details] [diff] [review]
  attachment 8335023 [details] [diff] [review]
  attachment 8335026 [details] [diff] [review]

Safe to resolve after these are committed.
Keywords: checkin-needed
Comment on attachment 8335023 [details] [diff] [review]
Part 2: Use weak observer reference for DOMRequestHelper.

Review of attachment 8335023 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the amazing work Ben! And sorry for the feedback delay.

Just one quick comment

::: dom/base/DOMRequestHelper.jsm
@@ +142,5 @@
>     */
>    initDOMRequestHelper: function(aWindow, aMessages) {
> +    // Query our required interfaces to force a fast fail if they are not
> +    // provided. These calls will throw if the interface is not available.
> +    this.QueryInterface(Ci.nsISupportsWeakReference);

It's ok to do this check for now, but we'll need to remove it once we default to strong refs, as weak refs will be rarely used then.
(In reply to Dave Hunt (:davehunt) from comment #63)
> Landed as:
> https://github.com/mozilla-b2g/gaia/commit/
> c1204e00cff95820b902c1b41dee229402c6a156

Uplifted to v1.2:
https://github.com/mozilla-b2g/gaia/commit/fe96c5969c9feb0fee9524492a1aad7929aa716e
(In reply to Fernando Jiménez Moreno [:ferjm] (use needinfo instead of CC, please) from comment #83)
> ::: dom/base/DOMRequestHelper.jsm
> @@ +142,5 @@
> >     */
> >    initDOMRequestHelper: function(aWindow, aMessages) {
> > +    // Query our required interfaces to force a fast fail if they are not
> > +    // provided. These calls will throw if the interface is not available.
> > +    this.QueryInterface(Ci.nsISupportsWeakReference);
> 
> It's ok to do this check for now, but we'll need to remove it once we
> default to strong refs, as weak refs will be rarely used then.

Hmm, I didn't realize we had plans to default to strong references in the future.  I guess, though, we could perform the check only when the client enables weak references.  How does that sound?
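The "fast fail" idea being discussed can be sketched like this. It is a hypothetical illustration, not the real XPCOM QueryInterface: interface names are plain strings here, and `makeHelper` is a mock subclass factory. The point is that a subclass missing the required interface declarations throws up front at init time rather than failing mysteriously later when the weak observer reference is needed.

```javascript
// Mocked init: check up front that the helper supports what weak observer
// registration requires. Each QueryInterface call throws if the interface
// was not declared, forcing a fast fail.
function initDOMRequestHelper(helper) {
  helper.QueryInterface("nsISupportsWeakReference");
  helper.QueryInterface("nsIObserver");
  // ...proceed with weak observer registration...
}

// Mock helper whose QueryInterface only accepts the declared interfaces.
function makeHelper(interfaces) {
  return {
    QueryInterface(iid) {
      if (!interfaces.includes(iid)) {
        throw new Error("NS_NOINTERFACE: " + iid);
      }
      return this;
    }
  };
}

// Declares both interfaces: init succeeds.
initDOMRequestHelper(makeHelper(["nsISupportsWeakReference", "nsIObserver"]));

// Forgot nsISupportsWeakReference: init throws immediately.
try {
  initDOMRequestHelper(makeHelper(["nsIObserver"]));
} catch (e) {
  console.log("fast fail:", e.message);
}
```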
https://hg.mozilla.org/mozilla-central/rev/05595c39906f
https://hg.mozilla.org/mozilla-central/rev/26ba59b5e2bd
https://hg.mozilla.org/mozilla-central/rev/18de3f6395e5
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
Needs branch-specific patches for b2g26.
Flags: needinfo?(bkelly)
Updated part 1 patch for b2g26_v1_2 branch.  Patches for part 2 and part 3 apply cleanly for me.

Mochitest try build to make sure I didn't break anything.  Since we throw if the interface is missing, this should catch any problems with the fix to part 1:

  https://tbpl.mozilla.org/?tree=Try&rev=a8ecb415ed31

Do I add checkin-needed keyword again if this ends up green?
Flags: needinfo?(bkelly)
Hmm... that try seems to have picked up extra revisions.  And further tries only show the empty try revision and not my patches.  I must be doing something wrong.  Building locally and will test in the morning.
Try shows all new commits you push, and no others.  So the first time, you were the first person to push some of those other random commits.  Then the second time, you'd already pushed those revisions, so it didn't show them.  So the behavior you saw was perfectly normal, if also confusing.
No need, I've got bug queries for it :)
No longer blocks: 915138
Depends on: 944311
I wanted to follow up on this from a data perspective. I spoke with some of you about the test data from before the fix, which we found showed increasing startup times across test run replicates, along with some other patterns that made the replicates look "not independent". Looking at some data from after the changes landed confirms that the patches seem to have resolved the problem of the increasing startup times - the strong increasing trends are mostly gone.

There are some other issues that have not been affected by the changes: some test runs still show weak increasing trends; in some cases the first few replicates are higher than the others and seem to "burn in" to a constant level; and in some cases there is a spike at the same place (eg. replicate 14) for several test runs. All of these issues are undesirable from the point of view of comparing test outcomes, as we'd want the replicates to be as similar and pattern-less as possible. However, these are not necessarily bugs, and could be expected behaviour, in which case we just need to be aware that they occur, and handle them appropriately. 

I'm linking a couple of plots to illustrate the results. The top row has test runs from before the patches landed (up until late October), and the bottom row has runs from afterwards (beginning of November onward), grouped by app. Each test run across 30 replicates is plotted as a black line, several of which are shown on each plot. A red fitted trend line is superimposed on top. Ideally, the red line would be horizontal, meaning no trend or pattern as the test moves through the replicates. As we can see, the red lines in the bottom row are much closer to horizontal than in the top row. 

https://drive.google.com/a/mozilla.com/file/d/0BzxrdTWOmycCVENfSzhPMmwwalU/edit?usp=sharing
https://drive.google.com/a/mozilla.com/file/d/0BzxrdTWOmycCX3JzaVdoX1V3Xzg/edit?usp=sharing
No longer depends on: 944311
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.