Closed
Bug 1046967
Opened 10 years ago
Closed 10 years ago
Performance (b2gperf) tests crashing on b2g-inbound builds
Categories
(Firefox OS Graveyard :: Performance, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: davehunt, Assigned: hub)
References
Details
(4 keywords, Whiteboard: [c=automation p=1 s= u=])
Attachments
(2 files)
Since yesterday the performance tests have been crashing after just a few iterations. I can reproduce this locally and it doesn't appear to matter which application is being tested (replicated with both Phone and Contacts).
device_firmware_date: 1403855878
device_firmware_version_incremental: 110
device_firmware_version_release: 4.3
device_id: flame
Last good:
application_buildid: 20140730104209
application_changeset: b8d783033da7
build_changeset: 3aa6abd313f965a84aa86c6b213dc154e4875139
gaia_changeset: b67ddd7d40b52e65199478b8d6631c2c28fdf41d
gaia_date: 1406740488
platform_buildid: 20140730104209
platform_changeset: b8d783033da7
First bad:
application_buildid: 20140730105005
application_changeset: 4cc9e0c5dd67
build_changeset: 3aa6abd313f965a84aa86c6b213dc154e4875139
gaia_changeset: c2d7dafab9dcadf1b5a099972d4c7647dcc4e276
gaia_date: 1406740488
platform_buildid: 20140730105005
platform_changeset: 4cc9e0c5dd67
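For reference, the two build listings above can be compared field by field to see what actually changed between the last good and first bad builds. A minimal Python sketch follows; the helper name changed_fields is made up for illustration, and the values are copied from the listings above.

```python
# Build metadata copied from the "Last good" and "First bad" listings.
LAST_GOOD = {
    "application_buildid": "20140730104209",
    "application_changeset": "b8d783033da7",
    "build_changeset": "3aa6abd313f965a84aa86c6b213dc154e4875139",
    "gaia_changeset": "b67ddd7d40b52e65199478b8d6631c2c28fdf41d",
    "gaia_date": "1406740488",
    "platform_buildid": "20140730104209",
    "platform_changeset": "b8d783033da7",
}

FIRST_BAD = {
    "application_buildid": "20140730105005",
    "application_changeset": "4cc9e0c5dd67",
    "build_changeset": "3aa6abd313f965a84aa86c6b213dc154e4875139",
    "gaia_changeset": "c2d7dafab9dcadf1b5a099972d4c7647dcc4e276",
    "gaia_date": "1406740488",
    "platform_buildid": "20140730105005",
    "platform_changeset": "4cc9e0c5dd67",
}


def changed_fields(good, bad):
    """Return the sorted names of fields whose values differ (hypothetical helper)."""
    return sorted(k for k in good.keys() | bad.keys() if good.get(k) != bad.get(k))


if __name__ == "__main__":
    for name in changed_fields(LAST_GOOD, FIRST_BAD):
        print(name, LAST_GOOD[name], "->", FIRST_BAD[name])
```

With these values, the application and platform build IDs and changesets differ, as does gaia_changeset; build_changeset and gaia_date are identical in both builds.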
Comment 1•10 years ago
Hub,
I need you to look into this and identify the root cause. Please work with Dave Hunt and anyone else needed to resolve this.
Thanks,
Mike
Severity: normal → blocker
Status: NEW → ASSIGNED
Component: General → Performance
Keywords: perf
Priority: -- → P1
Whiteboard: [c=automation p= s= u=]
Reporter
Comment 2•10 years ago
Example console output demonstrating the issue is below. When I replicate this locally, I see the device reboot.
b2gperf --address=localhost:2828 --device=356cd072 --delay=10 --sources=sources.xml --testvars=/home/webqa/webqa-credentials/b2g/b2g-13.1.json --dz-project=b2g --dz-branch=master --dz-device=flame --dz-key=**** --dz-secret=**** --dz-build-url=http://jenkins1.qa.scl3.mozilla.com/job/flame.b2g-inbound.perf.b2gperf/950/ --reset Phone Contacts Messages Settings Gallery Video Music Camera Email Calendar Clock FM Radio Usage Template Browser
2014-07-30 11:56:51,959 B2GPerfRunner INFO | Running B2GPerfLaunchTest
2014-07-30 11:58:37,820 B2GPerfRunner INFO | Phone [1/30]
2014-07-30 11:58:49,098 B2GPerfRunner INFO | Phone [2/30]
2014-07-30 11:59:00,470 B2GPerfRunner INFO | Phone [3/30]
2014-07-30 11:59:11,797 B2GPerfRunner INFO | Phone [4/30]
2014-07-30 11:59:23,183 B2GPerfRunner INFO | Phone [5/30]
2014-07-30 11:59:34,167 B2GPerfRunner INFO | Phone [6/30]
2014-07-30 11:59:45,144 B2GPerfRunner INFO | Phone [7/30]
2014-07-30 11:59:56,051 B2GPerfRunner INFO | Phone [8/30]
2014-07-30 12:00:07,158 B2GPerfRunner INFO | Phone [9/30]
Traceback (most recent call last):
  File "/var/jenkins/1/workspace/flame.b2g-inbound.perf.b2gperf/.env/bin/b2gperf", line 9, in <module>
    load_entry_point('b2gperf==0.32', 'console_scripts', 'b2gperf')()
  File "/var/jenkins/1/workspace/flame.b2g-inbound.perf.b2gperf/.env/local/lib/python2.7/site-packages/b2gperf/b2gperf.py", line 595, in cli
    b2gperf.measure_app_perf(args)
  File "/var/jenkins/1/workspace/flame.b2g-inbound.perf.b2gperf/.env/local/lib/python2.7/site-packages/b2gperf/b2gperf.py", line 201, in measure_app_perf
    test.run()
  File "/var/jenkins/1/workspace/flame.b2g-inbound.perf.b2gperf/.env/local/lib/python2.7/site-packages/b2gperf/b2gperf.py", line 338, in run
    self.test()
  File "/var/jenkins/1/workspace/flame.b2g-inbound.perf.b2gperf/.env/local/lib/python2.7/site-packages/b2gperf/b2gperf.py", line 377, in test
    'launch("%s")' % self.app_name)
  File "/var/jenkins/1/workspace/flame.b2g-inbound.perf.b2gperf/.env/local/lib/python2.7/site-packages/marionette/marionette.py", line 1166, in execute_async_script
    filename=os.path.basename(frame[0]))
  File "/var/jenkins/1/workspace/flame.b2g-inbound.perf.b2gperf/.env/local/lib/python2.7/site-packages/marionette/decorators.py", line 35, in _
    return func(*args, **kwargs)
  File "/var/jenkins/1/workspace/flame.b2g-inbound.perf.b2gperf/.env/local/lib/python2.7/site-packages/marionette/marionette.py", line 590, in _send_message
    response = self.client.send(message)
  File "/var/jenkins/1/workspace/flame.b2g-inbound.perf.b2gperf/.env/local/lib/python2.7/site-packages/marionette_transport/transport.py", line 100, in send
    response = self.receive()
  File "/var/jenkins/1/workspace/flame.b2g-inbound.perf.b2gperf/.env/local/lib/python2.7/site-packages/marionette_transport/transport.py", line 57, in receive
    raise IOError(self.connection_lost_msg)
IOError: Connection to Marionette server is lost. Check gecko.log (desktop firefox) or logcat (b2g) for errors.
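To record how far a run gets before the connection drops, the launch loop could be wrapped roughly as follows. This is only a sketch: run_iterations and make_flaky_launcher are hypothetical names, the launcher is a stand-in rather than the real b2gperf or Marionette API, and the only detail taken from the log above is that the failure surfaces as an IOError.

```python
def run_iterations(launch, app_name, iterations):
    """Call launch(app_name) up to `iterations` times.

    Returns (completed, error): the number of launches that finished, and the
    IOError that stopped the run, or None if every launch succeeded.
    """
    completed = 0
    for _ in range(iterations):
        try:
            launch(app_name)
        except IOError as exc:
            # Stop at the first lost connection and report progress so far.
            return completed, exc
        completed += 1
    return completed, None


def make_flaky_launcher(fail_on):
    """Build a stand-in launcher that raises IOError on call number `fail_on`."""
    calls = {"n": 0}

    def launch(app_name):
        calls["n"] += 1
        if calls["n"] == fail_on:
            raise IOError("Connection to Marionette server is lost.")

    return launch


if __name__ == "__main__":
    completed, error = run_iterations(make_flaky_launcher(3), "Phone", 30)
    print("completed:", completed, "error:", error)
```

Returning the count alongside the exception keeps the iteration number available for the report, instead of losing it in an unhandled traceback.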
Reporter
Updated•10 years ago
Summary: Performance tests crashing on b2g-inbound builds → Performance (b2gperf) tests crashing on b2g-inbound builds
Reporter
Comment 3•10 years ago
I've just tested locally with the two b2g-inbound builds around the regression (without resetting gaia) and was unable to replicate the crash. This would imply it's a gaia issue between b67ddd7d40b52e65199478b8d6631c2c28fdf41d and c2d7dafab9dcadf1b5a099972d4c7647dcc4e276; however, I've run out of time today to investigate this.
Comment 4•10 years ago
(In reply to Dave Hunt (:davehunt) from comment #3)
> I've just tested locally with the two b2g-inbound builds around the
> regression (without resetting gaia) and was unable to replicate the crash.
> This would imply it's a gaia issue between
> b67ddd7d40b52e65199478b8d6631c2c28fdf41d and
> c2d7dafab9dcadf1b5a099972d4c7647dcc4e276 however I've run out of time today
> for investigating this.
https://github.com/mozilla-b2g/gaia/compare/b67ddd7d40b52e65199478b8d6631c2c28fdf41d...c2d7dafab9dcadf1b5a099972d4c7647dcc4e276
Comment 5•10 years ago
(In reply to Jason Smith [:jsmith] - At Work Week, Slow to Respond from comment #4)
> (In reply to Dave Hunt (:davehunt) from comment #3)
> > I've just tested locally with the two b2g-inbound builds around the
> > regression (without resetting gaia) and was unable to replicate the crash.
> > This would imply it's a gaia issue between
> > b67ddd7d40b52e65199478b8d6631c2c28fdf41d and
> > c2d7dafab9dcadf1b5a099972d4c7647dcc4e276 however I've run out of time today
> > for investigating this.
>
> https://github.com/mozilla-b2g/gaia/compare/
> b67ddd7d40b52e65199478b8d6631c2c28fdf41d...
> c2d7dafab9dcadf1b5a099972d4c7647dcc4e276
Maybe bug 1045132 caused this?
ahal - what do you think?
Flags: needinfo?(ahalberstadt)
Comment 6•10 years ago
(In reply to Jason Smith [:jsmith] - At Work Week, Slow to Respond from comment #5)
> (In reply to Jason Smith [:jsmith] - At Work Week, Slow to Respond from
> comment #4)
> > (In reply to Dave Hunt (:davehunt) from comment #3)
> > > I've just tested locally with the two b2g-inbound builds around the
> > > regression (without resetting gaia) and was unable to replicate the crash.
> > > This would imply it's a gaia issue between
> > > b67ddd7d40b52e65199478b8d6631c2c28fdf41d and
> > > c2d7dafab9dcadf1b5a099972d4c7647dcc4e276 however I've run out of time today
> > > for investigating this.
> >
> > https://github.com/mozilla-b2g/gaia/compare/
> > b67ddd7d40b52e65199478b8d6631c2c28fdf41d...
> > c2d7dafab9dcadf1b5a099972d4c7647dcc4e276
>
> Maybe bug 1045132 caused this?
>
> ahal - what do you think?
Ack. Mistyped the bug #. Meant to say bug 1045142.
Comment 7•10 years ago
b2g inbound push log as a point of reference - http://hg.mozilla.org/integration/b2g-inbound/pushloghtml?fromchange=b8d783033da7&tochange=4cc9e0c5dd67
Comment 8•10 years ago
I just dug into our smoketest reports for today as a point of comparison. The bug that's causing us trouble on our side is bug 1038854. It's causing the camera to fail to start (and might also be why email is crashing).
Comment 9•10 years ago
Ting, can you help here? We need to understand what about your changes in bug 1038854 is causing these test crashes.
Flags: needinfo?(tchou)
Comment 10•10 years ago
ahal mentioned in IRC that bug 1045132 was unlikely to be the cause of this bug, as he thinks the runner service isn't used by anything yet.
Flags: needinfo?(ahalberstadt)
Comment 11•10 years ago
(In reply to Mike Lee [:mlee] from comment #9)
> Ting, can you help here? We need to understand what about your changes in
> bug 1038854 is causing these test crashes.
Note - if bug 1038854 is the cause, then this should be resolved when a new build gets spun with the backout included.
Assignee
Comment 12•10 years ago
I noticed this yesterday on my device too. I'll update my tree and try again.
Assignee
Comment 13•10 years ago
By "noticed this", I mean with |make test-perf|. It is an actual Gecko crash: the screen says "B2G crashed".
Looking at bug 1045142, I doubt this code is used anywhere with |make test-perf|.
Assignee
Comment 14•10 years ago
I just updated and rebuilt. My top gecko commit is:
commit c60b44a7b137ed1ebb3444efebb089d755424d54
Author: Wes Kocher <wkocher@mozilla.com>
Date: Thu Jul 31 15:04:49 2014 -0700
Backed out changeset f73cd738c1fe (bug 1038854) a=backout
which is the backout for the bug mentioned above.
It still crashes. I'll try to dig further, but we might need to bisect.
Assignee
Comment 15•10 years ago
Bisecting it right now.
Assignee
Updated•10 years ago
Keywords: regression
Assignee
Updated•10 years ago
Assignee: nobody → hub
Whiteboard: [c=automation p= s= u=] → [c=automation p=1 s= u=]
Assignee
Comment 17•10 years ago
To reproduce: |APP=clock RESTART_B2G=0 make test-perf|
It crashes b2g when doing that.
Assignee
Comment 18•10 years ago
I confirm that bug 1038854 isn't the source, as the crash occurs both before that bug's patch was checked in and after it was reverted.
Comment 19•10 years ago
[Blocking Requested - why for this release]:
Regression in an existing test suite that must stay up to allow us to do performance measurements.
blocking-b2g: --- → 2.1?
Keywords: qablocker
Comment 20•10 years ago
(In reply to Hubert Figuiere [:hub] from comment #18)
> I confirm that bug 1038854 isn't the source as the crash occurs before this
> bug was checked in and after it was reverted.
hub - can you get a stack for the crash being seen here? With the stack, I could search Bugzilla to see whether this crash is already on file.
Flags: needinfo?(hub)
Comment 21•10 years ago
Comment 22•10 years ago
Comment 23•10 years ago
Not sure if I am running into the same issue here, but when I do |make test-perf|, the b2g process eventually crashes, and it keeps restarting and crashing even after a reboot. The only fix is to reflash the phone. Attachment 8466516 contains the dmesg output and attachment 8466517 shows the gdb stack trace.
Updated•10 years ago
Flags: needinfo?(hub)
That's bug 820716.
(You can work around by not using a debug build.)
Assignee
Comment 26•10 years ago
I don't see the "keeps restarting" behavior that Wander describes, though.
Bisect result
8062fdbcecee32574f64f4a0553a4da053a91d93 is the first bad commit
commit 8062fdbcecee32574f64f4a0553a4da053a91d93
Author: Sean Lin <selin@mozilla.com>
Date: Tue Jun 24 10:51:48 2014 +0800
Bug 874353 - Remove CPU wake lock control from ContentParent. r=gene, khuey
:040000 040000 08e93bb32d9c44606f7ea3860e37ed657258c16f f96f62965e32c498aafb5b384843b3bf08ac4dcc M dom
Assignee
Comment 27•10 years ago
This is a git bisect using the B2G tree, so the sha1 needs to be mapped to the actual hg sha1, in case that wasn't clear.
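For anyone unfamiliar with bisecting, the idea is a binary search over the commit history between a known-good and a known-bad revision. The Python sketch below is a simplified illustration only: it assumes a linear history and a crash that persists once introduced, whereas git bisect itself works on the real commit graph. The names first_bad and is_bad are hypothetical, and the commit labels in the example are placeholders, not real revisions.

```python
def first_bad(commits, is_bad):
    """Return the first bad commit in `commits`.

    `commits` is ordered oldest to newest; commits[0] is assumed known good
    and commits[-1] known bad. `is_bad(commit)` stands in for building that
    revision and checking whether the test run crashes.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the regression is at mid or earlier
        else:
            lo = mid  # the regression is after mid
    return commits[hi]


if __name__ == "__main__":
    history = ["c%d" % i for i in range(10)]  # placeholder commit labels
    bad_from = 6  # pretend the regression landed in c6
    print(first_bad(history, lambda c: history.index(c) >= bad_from))
```

Each step halves the remaining range, so even a long push range needs only a handful of builds and test runs.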
Assignee
Comment 28•10 years ago
Also, the patch has already been backed out, possibly because of bug 1046956, and after that it no longer crashes.
Backtracking to just before the backout: it still crashes.
At the backout: it no longer crashes.
The backout is git revision:
commit 19e5d2d26c417bd79a6c33d7fb1b4bedfb4ec713
Author: Kyle Huey <khuey@kylehuey.com>
Date: Fri Aug 1 11:02:55 2014 -0700
Back out bug 874353, which is suspected of causing bug 1046956. r=me a=backout
Depends on: 1048111
Comment 29•10 years ago
Hub - Can we close this then if we've confirmed this no longer reproduces with the backout of bug 874353?
Flags: needinfo?(hub)
Assignee
Comment 30•10 years ago
Of course we can.
See comment 28 for the resolution.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Flags: needinfo?(hub)
Resolution: --- → FIXED
Updated•10 years ago
blocking-b2g: 2.1? → ---