Closed Bug 1010748 Opened 11 years ago Closed 11 years ago

Running make test-perf never completes or times out when run in Jenkins

Categories

(Firefox OS Graveyard :: Gaia::PerformanceTest, defect, P1)

ARM
Gonk (Firefox OS)

Tracking

(b2g-v1.3T fixed, b2g-v1.4 wontfix, b2g-v2.0 affected, b2g-v2.1 fixed)

RESOLVED FIXED
2.0 S4 (20june)
Tracking Status
b2g-v1.3T --- fixed
b2g-v1.4 --- wontfix
b2g-v2.0 --- affected
b2g-v2.1 --- fixed

People

(Reporter: davehunt, Assigned: hub)

Details

(Keywords: perf, Whiteboard: [c=automation p=2 s= u=])

Attachments

(4 files, 1 obsolete file)

+++ This bug was initially created as a clone of Bug #962040 +++
Looks like we have another issue with make test-perf. It's affecting both the v1.3 and master jobs. It appears that the job never completes (it's been running for over 7 hours at times) and should really be timing out in Jenkins way before that.
Here are a couple of example builds that have demonstrated this behaviour:
http://selenium.qa.mtv2.mozilla.com:8080/view/B2G%20Perf/job/b2g.tarako.mozilla-b2g28_v1_3t.v1.3t.mozperftest/171/
http://selenium.qa.mtv2.mozilla.com:8080/view/B2G%20Perf/job/b2g.hamachi.mozilla-central.master.mozperftest/2701/
And I have no idea what is happening, all the more so since there have been no changes to the testing framework on 1.3T. But then everything is green now.
Also, the Tarako failure has this in the logs:
find: `xulrunner-sdk-26/xulrunner-sdk/bin': No such file or directory
dirname: missing operand
Try `dirname --help' for more information.
Which shouldn't happen.
Is this still happening?
Flags: needinfo?(dave.hunt)
Priority: -- → P1
Yes, in fact I've just had to abort a mozilla-central job that was running for over 9hrs and a v1.3t job that was over 6hrs.
Flags: needinfo?(dave.hunt)
So is it only on 1.3t (Tarako)?
No, see comment 3, this is also affecting mozilla-central (hamachi).
Assignee: nobody → hub
Status: NEW → ASSIGNED
Whiteboard: [c=automation p= s= u=] → [c=automation p=1 s= u=]
Again, builds manually aborted after 22hrs (mozilla-central) and 27hrs (1.3t)
I have seen some hangs after b2g actually completely crashed. It was today with a reasonably recent m-c on Flame.
It is also happening in the android emulator. Seems to happen in settings. B2g crashes and for some reason the timeout in Sockit never kicks in.
What's happening is that we don't test for recv() == 0. This causes an infinite loop: once the peer has closed the connection, poll() keeps reporting the socket as readable and recvfrom() keeps returning 0 bytes.
poll([{fd=11, events=POLLIN}], 1, 50500) = 1 ([{fd=11, revents=POLLIN}])
recvfrom(11, "", 1, 0, NULL, NULL) = 0
poll([{fd=11, events=POLLIN}], 1, 50500) = 1 ([{fd=11, revents=POLLIN}])
recvfrom(11, "", 1, 0, NULL, NULL) = 0
poll([{fd=11, events=POLLIN}], 1, 50500) = 1 ([{fd=11, revents=POLLIN}])
recvfrom(11, "", 1, 0, NULL, NULL) = 0
poll([{fd=11, events=POLLIN}], 1, 50500) = 1 ([{fd=11, revents=POLLIN}])
recvfrom(11, "", 1, 0, NULL, NULL) = 0
Working on a fix in sockit-to-me.
Whiteboard: [c=automation p=1 s= u=] → [c=automation p=2 s= u=]
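The missing recv() == 0 check described in the comment above can be sketched as follows. This is a minimal illustration in Python, not the actual sockit-to-me patch: the names read_exact and PeerClosed are hypothetical, and the 50500 ms default is simply the poll timeout seen in the trace. The point it shows is that a zero-length read means the peer closed the connection, so the loop has to stop instead of polling again.

```python
import select
import socket


class PeerClosed(Exception):
    """Hypothetical error: the remote end closed before all bytes arrived."""


def read_exact(sock, nbytes, timeout_ms=50500):
    """Read exactly nbytes from sock, raising on timeout or on EOF."""
    poller = select.poll()
    poller.register(sock.fileno(), select.POLLIN)
    chunks = []
    remaining = nbytes
    while remaining > 0:
        # poll() returns an empty list when nothing happened within the timeout.
        if not poller.poll(timeout_ms):
            raise TimeoutError("no data within %d ms" % timeout_ms)
        chunk = sock.recv(remaining)
        if not chunk:
            # recv() returned 0 bytes: orderly shutdown by the peer.
            # Without this check the loop never ends, because poll()
            # keeps reporting the closed socket as readable.
            raise PeerClosed("peer closed with %d bytes outstanding" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Note that the timeout here applies to each poll() call, not to the read as a whole. That is why the timeout alone cannot rescue the loop in the trace: every poll() returns immediately with POLLIN set, so the timeout never expires.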
I'd need a new release of sockit-to-me
Attachment #8431636 - Flags: review?(aus)
We haven't had a successful run on v1.3t for 11 days.
If we crash on v1.3t when running test-perf, maybe we have a more important bug...
I'll try to port the current patch to the 1.3t branch when it gets approved - just to make the test more robust and not hang. I need to get my Tarako reflashed to test again with it. But if we get crashes in b2g (I do on master with flame) that's part of the problem.
I ran |make test-perf| today with the latest pvt build, after reflashing the base build. I didn't get a b2g crash, nor a hang in the tests.
New version (0.2.2) of sockit-to-me has been pushed to npm.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Do you need to update node_modules in its separate repository, and package.json+hash in gaia?
Flags: needinfo?(hub)
The b2g.tarako.mozilla-b2g28_v1_3t.v1.3t.mozperftest job is still using sockit-to-me 0.1.8
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
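A quick way to confirm which sockit-to-me a job actually picked up is to read the version from the installed module's package.json and compare it with 0.2.2, the release mentioned above. The sketch below is only an illustration: the function names are hypothetical, the module directory passed in is an assumption about where the job keeps its node modules, and the comparison handles plain dotted numeric versions only.

```python
import json
import os


def parse_version(text):
    """Turn a dotted numeric version such as '0.2.2' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def installed_version(module_dir):
    """Read the 'version' field from module_dir/package.json."""
    with open(os.path.join(module_dir, "package.json")) as handle:
        return json.load(handle)["version"]


def has_fix(module_dir, minimum="0.2.2"):
    """Return True if the installed module is at least the given version."""
    return parse_version(installed_version(module_dir)) >= parse_version(minimum)
```

Comparing tuples of integers rather than raw strings avoids ranking a version like 0.10.0 below 0.2.2.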
(In reply to Julien Wajsberg [:julienw] from comment #17)
> Do you need to update node_modules in its separate repository, and
> package.json+hash in gaia?
Yes, I do. I didn't notice the bug had been marked as "FIXED". *sigh* But I was about to go to that next.
Status: REOPENED → ASSIGNED
Flags: needinfo?(hub)
Things that need to happen:
1. Build the sockit-to-me modules in the gaia node modules
2. Bump the version in marionette-js-client
3. Update gaia node modules, and update the sha1 in gaia
4. Backport to 1.3t
For 1, I need to set up an Ubuntu VM on this machine as I'm travelling.
:aus I forgot to check the binaries in. No biggie for the package; I'll take care of that in the gaia-node-modules PR. Thanks
Attachment #8435851 - Flags: review?(aus)
Comment on attachment 8435850
Link to Github pull-request: https://github.com/mozilla-b2g/gaia-node-modules/pull/45
I'll review stamp it as it's just node_modules. Just make sure we have a green build before landing in master, thanks!
Attachment #8435850 - Flags: review?(kgrandon) → review+
Comment on attachment 8435867
Link to Github pull-request: https://github.com/mozilla-b2g/gaia/pull/20149
R+ assuming travis is green. Is there no package.json bump necessary?
Attachment #8435867 - Flags: review?(kgrandon) → review+
(In reply to Kevin Grandon :kgrandon from comment #26)
> R+ assuming travis is green. Is there no package.json bump necessary?
No. It is in marionette-client, and even then, just npm update pulled the right version.
Comment on attachment 8435867
Link to Github pull-request: https://github.com/mozilla-b2g/gaia/pull/20149
Marking obsolete: a later change moved the gaia_node_module past this revision.
Attachment #8435867 - Attachment is obsolete: true
Marking as fixed. Gaia commit 5b6f7652dd492281fbcb31cbbcf9c2a2d43543c9 brought the new sockit-to-me as a side effect.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → 2.0 S4 (20june)
Note: I don't know if we want this for Tarako.
Comment on attachment 8435851
Link to Github pull-request: https://github.com/mozilla-b2g/sockit-to-me/pull/17
This looks OK but we should disable support for node 0.8 (or fix the tests) before merging.
Attachment #8435851 - Flags: review?(aus) → review+
Hamachi on master looks to be very stable now, but this is still happening on Tarako on v1.3t.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Requesting approval. Note to uplifters: I'll do it, as it is not trivial. Note to approvers: NPOTB (not part of the build). This is only the tests.
Status: REOPENED → RESOLVED
blocking-b2g: --- → 1.3T?
Closed: 11 years ago
Resolution: --- → FIXED
You don't need a blocking flag, just land it with a=npotb.
blocking-b2g: 1.3T? → ---
Comment on attachment 8439211
Link to Github pull-request: https://github.com/mozilla-b2g/gaia/pull/20442
Kevin, do you mind reviewing? This is for Tarako.
Attachment #8439211 - Flags: review?(kgrandon)
Comment on attachment 8439211
Link to Github pull-request: https://github.com/mozilla-b2g/gaia/pull/20442
I did not test, but I generally don't mind review stamping these as they're not part of the build. Just make sure that travis or gaia-try looks ok (though I think we have a few intermittent failures there).
Attachment #8439211 - Flags: review?(kgrandon) → review+
Yep, looks like 1.3t is working well now. Thanks!