Closed
Bug 908261
Opened 12 years ago
Closed 12 years ago
Performance regression when running Marionette commands
Categories
(Firefox OS Graveyard :: General, defect)
Tracking
(Not tracked)
RESOLVED
DUPLICATE
of bug 910216
People
(Reporter: davehunt, Unassigned)
References
Details
(Keywords: perf, Whiteboard: [c= p= s=2013.11.08 u=])
I'm not sure if this is Marionette related yet, but we noticed that our B2G performance tests doubled in total run time recently. The results themselves do not appear to have been affected. I added a bunch of debug logging to the tool that's used to run the tests, and found the following:
Last good build:
2013-08-22 00:34:14 | DEBUG | Killing all running apps
2013-08-22 00:34:14 | DEBUG | Disabling WiFi
2013-08-22 00:34:14 | DEBUG | Waiting for 5 seconds
...
2013-08-22 01:54:38 | DEBUG | Killing all running apps
2013-08-22 01:54:39 | DEBUG | Disabling WiFi
2013-08-22 01:54:40 | DEBUG | Waiting for 5 seconds
First bad build:
2013-08-22 07:39:54 | DEBUG | Killing all running apps
2013-08-22 07:39:54 | DEBUG | Disabling WiFi
2013-08-22 07:39:55 | DEBUG | Waiting for 5 seconds
...
2013-08-22 09:42:57 | DEBUG | Killing all running apps
2013-08-22 09:43:05 | DEBUG | Disabling WiFi
2013-08-22 09:43:09 | DEBUG | Waiting for 5 seconds
The above shows a snippet of the console log at the beginning of the test run, and at the end. You can see that in the 'good' build the command go from completing in within one second to being separated by a second. In the 'bad' build they increase by several seconds. We launch at least 10 apps during the tests, and repeat each launch 30 times, so this small increase is enough to make a large impact.
The commands in question simply execute JavaScript snippets to kill all running apps, and disable the wifi. Of course, it may be something else that's causing these to run slowly, and may not simply be due to us calling these commands many times.
Last good build:
Gaia revision: 1a38ad4df2edb9b608f77f1691cb6c4450f98d8e
Gecko revision: a62c3344f16902abe6a924fd96eadd2ea14672a2
First bad build:
Gaia revision: 1a38ad4df2edb9b608f77f1691cb6c4450f98d8e
Gecko revision: f91fb63da2d6cead5c03f8d075afce68d7fe1921
Comment 1•12 years ago
|
||
That's bad.
Unfortunately, there was a significant Marionette landing made between the good and bad builds, the modal dialog support from bug 779284, so my first guess is that this introduced some sort of leak.
Malini, can you make a build with your change backed out and see if this problem still reproduces?
Comment 2•12 years ago
|
||
(In reply to Jonathan Griffin (:jgriffin) from comment #1)
> That's bad.
>
> Unfortunately, there was a significant Marionette landing made between the
> good and bad builds, the modal dialog support from bug 779284, so my first
> guess is that this introduced some sort of leak.
>
> Malini, can you make a build with your change backed out and see if this
> problem still reproduces?
Sure, Dave, how do I run these tests?
Reporter | ||
Comment 3•12 years ago
|
||
(In reply to Malini Das [:mdas] from comment #2)
> Sure, Dave, how do I run these tests?
To run the same as our CI you can pip install b2gperf and execute:
$ b2gperf --delay=5 Phone Contacts Messages Settings Gallery Video Music Camera Email Calendar Browser Clock FM Radio Usage
However if you would like the additional logging, you can install the latest from https://github.com/mozilla/b2gperf and include the command line argument --log-level=DEBUG
Also, the above command will take a while to run (~3hrs). You can reduce the delay and settle time using the --delay and --settle-time arguments, and can avoid a restart using --no-restart, however I'm unsure how these will affect your ability to replicate the issue. You can also reduce the number of apps under test.
Comment 4•12 years ago
|
||
Dave, can you confirm that this test is restarting B2G between each app launch? If that's the case, I don't see how this could be a Marionette (or gecko) issue at all, since restarting B2G would wipe the slate clean, memory-leak wise.
Also, do you notice this same regression in mozilla-b2g18, or does it just affect mozilla-central?
Reporter | ||
Comment 5•12 years ago
|
||
(In reply to Jonathan Griffin (:jgriffin) from comment #4)
> Dave, can you confirm that this test is restarting B2G between each app
> launch? If that's the case, I don't see how this could be a Marionette (or
> gecko) issue at all, since restarting B2G would wipe the slate clean,
> memory-leak wise.
>
> Also, do you notice this same regression in mozilla-b2g18, or does it just
> affect mozilla-central?
Yes, the b2g process is stopped/started between each block of 30 iterations. It's just affecting mozilla-central.
Comment 6•12 years ago
|
||
(In reply to Dave Hunt (:davehunt) from comment #5)
> (In reply to Jonathan Griffin (:jgriffin) from comment #4)
> > Dave, can you confirm that this test is restarting B2G between each app
> > launch? If that's the case, I don't see how this could be a Marionette (or
> > gecko) issue at all, since restarting B2G would wipe the slate clean,
> > memory-leak wise.
> >
> > Also, do you notice this same regression in mozilla-b2g18, or does it just
> > affect mozilla-central?
>
> Yes, the b2g process is stopped/started between each block of 30 iterations.
> It's just affecting mozilla-central.
I haven't had the time to look at this yet, and I likely will not before I go on PTO, but the modal dialog change was added to both b2g18 and central, so it shouldn't be the cause of this regression.
Also, since b2g is being restarted periodically, marionette shouldn't be the cause since we don't persist anything past our log file.
Comment 7•12 years ago
|
||
Ah, jgriffin has shown that it can be leaking frames: https://bugzilla.mozilla.org/show_bug.cgi?id=779284#c51
Comment 8•12 years ago
|
||
Backing the modal dialog commit out of mozilla-central reduced the runtime of the master perf jobs from about 3hr to about 1.5hr, so apparently this was the cause. :(
Comment 9•12 years ago
|
||
This problem is caused by 909129; specifically, the frame manager patch in bug 779284 breaks session teardown in such a way that the imported script cache is never cleared.
Depends on: 909129
Comment 10•12 years ago
|
||
This bug dropped off my radar, but this slowdown issue is partly due to changes in WiFi management: https://bugzilla.mozilla.org/show_bug.cgi?id=910216#c10
There was no traction on it at the time, and since then, the Wifi code has changed. I'd like to confirm that the changes are still causing regression, and then file a new bug against this code.
Reporter | ||
Comment 11•12 years ago
|
||
The b2gperf tests have changed significantly since this bug was raised. I haven't seen any recent regressions in performance, and the data in Jenkins shows no overall increase since October 17th (we only keep data for the last 100 builds). I would be inclined to resolved this as fixed.
Comment 12•12 years ago
|
||
The issue seen here is going to be resolved with Bug 910216.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE
Updated•12 years ago
|
Whiteboard: [c= p= s= u=] → [c= p= s=2013.11.08 u=]
You need to log in
before you can comment on or make changes to this bug.
Description
•