Closed Bug 817638 Opened 12 years ago Closed 11 years ago

Excludetest list of all content test failures in content for b2g.json

Categories

(Testing :: Mochitest, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla22

People

(Reporter: martijn.martijn, Assigned: martijn.martijn)

Attachments

(2 files, 12 obsolete files)

30.26 KB, text/plain
72.32 KB, patch
Attached file b2g.json for content (obsolete) —
I've attached a b2g.json file that gave a successful mochitest run inside the content directory on the b2g emulator.

There might still be intermittent failures lurking around, but I've made 3 successful runs with it without any errors.

The exclude list is probably too big; it probably contains tests that do not actually fail.
I got various problems with timeouts while running the tests, with failures that don't seem to be related to the test that was failing.
Ok, sorry for the delay, this is an include list of all the content tests that are passing for sure.
Those are 590 files and 55399 mochitests.
Depends on: 819793
(In reply to Jonathan Griffin (:jgriffin) from comment #2)
> pushed to try: https://tbpl.mozilla.org/?tree=Try&rev=9aa32b2d8e09

Ok, that try run gives tons of failures.
I think this was from the corrupt b2g.json I posted accidentally, with the spurious comma at the end of the include list.
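A single trailing comma is enough to make a file unparseable under strict JSON rules, which would fit the wholesale failures in that try run. A minimal sketch, assuming the harness reads b2g.json with a strict JSON parser; the helper name `checkManifest` and the `runtests` key are illustrative assumptions, not taken from the harness:

```javascript
// Hypothetical helper: report whether a manifest string is strict JSON.
function checkManifest(text) {
  try {
    JSON.parse(text);
    return { ok: true, error: null };
  } catch (e) {
    // JSON.parse throws a SyntaxError on a trailing comma.
    return { ok: false, error: e.message };
  }
}

// Assumed shape: an object mapping test paths to (possibly empty) notes.
const good = '{ "runtests": { "content/base": "" } }';
const bad = '{ "runtests": { "content/base": "", } }';

console.log(checkManifest(good).ok); // true
console.log(checkManifest(bad).ok);  // false
```

Running a check like this before attaching a new version of the file would catch the problem without needing a full try run.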
Pushed version 1 (the blacklist) to try:

https://tbpl.mozilla.org/?tree=Try&rev=535d4d036335
(In reply to Jonathan Griffin (:jgriffin) from comment #4)
> Pushed version 1 (the blacklist) to try:
> 
> https://tbpl.mozilla.org/?tree=Try&rev=535d4d036335

There are a bunch of content/media failures here that will need to be excluded.  I've retriggered chunk 3, since it caused a crash, but that may have nothing to do with the particular test that was running; see bug 821420.
(In reply to Jonathan Griffin (:jgriffin) from comment #5)
> There are a bunch of content/media failures here that will need to be
> excluded. 

The errors in that try-run seem to have happened because the media files didn't exist for some reason.
Attached file b2g.json for content, updated (obsolete) —
Retested, I got some new failures, so added those to the exclude list.
test_chaining.html didn't seem to be failing. It might work on try-server now, too.
Attachment #687817 - Attachment is obsolete: true
Attached file 709957: b2g.json for content, updated (obsolete) —
2 failures in the testrun:
test_input_sanitization.html | Test timed out.
test_input_typing_sanitization.html | [SimpleTest.finish()] this test already called finish!
I added those to the exclude list.
(In reply to Jonathan Griffin (:jgriffin) from comment #10)
> Pushed new version to try:
> https://tbpl.mozilla.org/?tree=Try&rev=38e06a2499f0

Part 6 of that run has errors like this one:
18:27:44 INFO - 265 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_reset_src.html | We expected 'tests/content/media/test/320x240.ogv' to exist, but it doesn't!

That is a similar error to the one in the last try server run, but now only in this different test file.
This is worrying, I don't get these errors.

The error is generated here:
http://mxr.mozilla.org/mozilla-central/source/content/media/test/manifest.js#224
That part of the code in manifest.js seems to be needed only by test_info_leak.html, but it is run by every test file that uses manifest.js.

So one way of fixing this is to move that block of code to test_info_leak.html.

But that is only hiding a deeper issue that is causing this, because this failure shouldn't be happening in the first place.
test_reset_src.html seems to be the first test that is calling manifest.js, that might give a clue.
That would mean that if I excluded test_reset_src.html, this failure would move to the next test file that calls manifest.js.
Hmm, the other files that use manifest.js, are already excluded.
I'll just add test_reset_src.html to the exclude list for now, then.
I haven't tried it locally, but it should be green on my machine anyway.
Attachment #709957 - Attachment is obsolete: true
Attachment #710120 - Attachment is obsolete: true
Ok, I talked with Doug Turner about this.
He says that code like this:
  var dirSvc = Cc["@mozilla.org/file/directory_service;1"].
               getService(Ci.nsIProperties);
  var f = dirSvc.get("CurWorkD", Ci.nsILocalFile);
doesn't work correctly in the child process (it should work fine in the chrome process).

I was looking at some other tests that were using similar code and I already excluded those.
Doug asked me to provide a simple testcase that is failing with this code.
There are 54209 tests that are being run with the last b2g.json file.
Currently running the mochitests with this patch on my device.
Ok, all tests pass on my machine with that latest patch. Total test run count: 222396
Tryrun is all green.
These chunks are getting to be very long-running; I will file a bug to increase the number of chunks before turning this on.
Depends on: 840236
https://hg.mozilla.org/integration/mozilla-inbound/rev/b069f50c139e
Assignee: nobody → martijn.martijn
Target Milestone: --- → mozilla21
I can add the test failures to the exclude list, but the B2G process crashes are worrisome.
Yes, feel free to file specific bugs for these failures before adding them to the exclude list.
Ok, the test_text_selection.html file was checked in at 2013-02-11 17:22 +1100, so that was a case of a new test that is failing on b2g.
The same goes for test_bug839753.html, which was checked in at 2013-02-12 08:42 -0800.

I'm rerunning all the tests to catch all new tests that were checked in and I'll upload an updated patch after that.
I realize that in this case, the new tests came in before we landed the exclude list, so that makes sense. 

For anything that fails that's checked in after we land the exclude list, though, I would expect devs to be responsible for either making the test work or (less optimal) excluding it themselves on B2G. We'll have to figure out how to message that out.
That's how it will work.  If someone checks in a test that fails, the sheriffs will back it out, and it will be up to the test author to deal with that, either by changing the test or excluding it.  We won't be in the loop for that decision.
Attached patch b2g.json patch with content (obsolete) — Splinter Review
Ok, this is the latest diff.
Attachment #710687 - Attachment is obsolete: true
Attachment #711237 - Attachment is obsolete: true
Try-server all green.
pushed to try again, since 5 days elapsed since the last.  I'll land tomorrow if this is green.

https://tbpl.mozilla.org/?tree=Try&rev=49a73c62c13a
There were two oranges in the try run; I've retriggered them to see if they were just random B2G crashes that we sometimes experience during mochitests.
These retriggers disappeared for some reason, probably related to the fact that we've moved these test jobs over to Amazon VMs.  I've pushed to try again; these results should be back within 2 hours.

https://tbpl.mozilla.org/?tree=Try&rev=ef490be43574
Looks like this run has the same failures as the previous, so a few more excludes will be needed.
Attached patch b2g.json patch with content (obsolete) — Splinter Review
Yeah, content/xbl/test/test_bug821850.html and content/base/test/test_xbl_userdata.xhtml were added recently (22/2 and 23/2).

And it's kind of known that xbl is not working currently for b2g (might be as simple as flipping a pref to fix that).

So that seems to be the same failures still, I added those to this patch.
I think it is safe to land this on mozilla-central as it is.

It's weird, though, that those test failures cause tbpl to report "B2G process has crashed" in the log.
Attachment #715935 - Attachment is obsolete: true
This patch seems to have caused most B2G mochitest chunks to die in a weird way in inbound, so I've backed it out to see if it fixes it:

https://hg.mozilla.org/integration/mozilla-inbound/rev/d859fffb9648
Attached patch b2g.json patch with content (obsolete) — Splinter Review
My guess is that the parsing of the b2g.json file is taking too long, because the exclude list got much larger.

With this patch, I'm adding content, but am excluding the content/html/content/test/ and content/media/ subdirectories for now, because those would add a lot of entries to the exclude list.
With this the startup of the b2g mochitest run is reasonable, parsing of the b2g.json file seems to take <30s on my Macbook Pro, while it would take 1 minute or so with the previous patch.
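The reason excluding whole subdirectories shrinks the list can be sketched as a prefix match: one directory key stands in for every file beneath it. This is only an illustration under assumptions. The key names `runtests` and `excludetests`, the helper names, and the matching rule are hypothetical here, and the real harness may match paths differently:

```javascript
// Hypothetical filter: does `testPath` fall under any key of `entries`?
// A key matches if it equals the path or names a parent directory of it.
function matchesAny(entries, testPath) {
  return Object.keys(entries).some((key) => {
    const dir = key.endsWith("/") ? key : key + "/";
    return testPath === key || testPath.startsWith(dir);
  });
}

// A test runs only if it is included and not excluded.
function shouldRun(manifest, testPath) {
  return matchesAny(manifest.runtests, testPath) &&
         !matchesAny(manifest.excludetests, testPath);
}

// Assumed manifest shape, using paths mentioned in this bug.
const manifest = {
  runtests: { "content": "" },
  excludetests: {
    "content/media": "",
    "content/html/content/test": "",
    "content/base/test/test_bug282547.html": "",
  },
};

console.log(shouldRun(manifest, "content/media/test/test_reset_src.html")); // false
console.log(shouldRun(manifest, "content/base/test/test_object.html"));     // true
```

With directory keys, two entries replace what would otherwise be one entry per failing file in those subdirectories, at the cost of also skipping tests there that would pass.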

I also removed all the error messages, since they aren't that useful anyway and it might speed up the parsing of the b2g.json also a little bit.
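Blanking the messages can be sketched as below. The helper name `stripNotes` is hypothetical, the two entries reuse the failure messages quoted earlier in this bug, and bare file names are used for brevity where the real file would carry full paths. Whether the smaller file actually parses noticeably faster is not something this sketch measures:

```javascript
// Hypothetical helper: return a copy of the exclude map with every
// note replaced by an empty string, keeping the test paths intact.
function stripNotes(excludes) {
  const stripped = {};
  for (const path of Object.keys(excludes)) {
    stripped[path] = "";
  }
  return stripped;
}

const withNotes = {
  "test_input_sanitization.html": "Test timed out.",
  "test_input_typing_sanitization.html":
    "[SimpleTest.finish()] this test already called finish!",
};

const before = JSON.stringify(withNotes).length;
const after = JSON.stringify(stripNotes(withNotes)).length;
console.log(after < before); // true
```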

I'm currently rerunning with this patch applied, so far so good.
Attachment #719162 - Attachment is obsolete: true
It's not clear to me at what size the exclude list of the b2g.json file becomes too big, but hopefully the mozilla-central b2g mochitest run can handle this latest patch.

Then, I guess, it's a matter of reducing the number of entries in the exclude list.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Ok, my test run didn't give any failures.
This patch doesn't apply cleanly to mozilla-central; can you provide a version that does?

applying b2g.json.diff
patching file testing/mochitest/b2g.json
Hunk #1 FAILED at 0
1 out of 1 hunks FAILED -- saving rejects to file testing/mochitest/b2g.json.rej
patch failed, unable to continue (try -v)
patch failed, rejects left in working dir
errors during apply, please fix and refresh b2g.json.diff
Attached patch b2g.json patch with content (obsolete) — Splinter Review
Ok, new patch, this should apply.
I haven't carried out a whole run with this patch applied yet (something went wrong last night while trying to run it). It's currently in dom-level-2 and all green until there.
Attachment #719453 - Attachment is obsolete: true
Ok, test run was all green.
With an overabundance of caution, I pushed to try:

https://tbpl.mozilla.org/?tree=Try&rev=e6f98f402bf0
Attached patch updated patch (obsolete) — Splinter Review
There are 3 failures, I added 2 of them in this patch:
+        "content/base/test/test_bug282547.html":"",
+        "content/events/test/test_focus_disabled.html":"",

The dom/mobilemessage/tests/test_sms_basics.html failure seems to be an intermittent failure according to bug 795539, so that's nothing new here.

Currently rerunning the mochitests.
(In reply to Martijn Wargers [:mw22] (QA - IRC nick: mw22) from comment #46)
> Currently rerunning the mochitests.

No failures here.
Have been traveling for a few days.  Pushed to try again, just in case:

https://tbpl.mozilla.org/?tree=Try&rev=659de4c3fc2f
The failure in chunk 1 is caused by content/base/test/test_object.html , which appears to be a new test. I'll add that one to the exclude list.
The failure in chunk 2 seems to be an intermittent failure, for which various bugs have been filed (I don't really know which one of the bugs covers this issue). So I think we can ignore that one for now.
The failure in chunk 3 is the test_sms_basics.html one again.
(In reply to Martijn Wargers [:mw22] (QA - IRC nick: mw22) from comment #49)
> The failure in chunk 2 seems to be an intermittent failure, for which
> various bugs have been filed (I don't really know which one of the bugs
> covers this issue). So I think we can ignore that one for now.

A better way to put that would be "timing out after 1200 seconds without output is a totally generic way to fail, but looking at the log, it appears that b2g failed to even start up, so we'll have to retrigger that one since we didn't run any tests at all."

I also retriggered the mochitest-3, since a test which times out after 330 seconds without output is also fatal, so you didn't run any of the tests after that known failure.
Attached patch updated patch (obsolete) — Splinter Review
(In reply to Phil Ringnalda (:philor) from comment #50)
Yeah, you're right. Soon, I'll retest locally again.
This updated patch could be resubmitted to tryserver in the meantime.
Attachment #721322 - Attachment is obsolete: true
Attachment #722133 - Attachment is obsolete: true
Hrm, got already 1 failure in this run, so I'll need to update the patch.
I'm getting this failure (reproducible for me):
"dom/mobilemessage/tests/test_sms_basics.html":" navigator.mozSms should return null - got [object MozSmsManager], expected null"

This is happening because of the check-in of dom/mobilemessage/tests/test_sms_basics.html , the patch from bug 844429.
There, "dom/sms" was renamed to "dom/mobilemessage": "". My patch didn't have that change yet. Now I can replace it with "dom/mobilemessage/tests/test_sms_basics.html", since that's the only one that is failing.
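The failure mode here is an exclude entry that no longer matches anything after a directory rename, so the test silently starts running again. A rough sketch of a check for such stale entries, assuming exclude keys are matched as exact paths or parent directories; the helper name `findStaleEntries` and the matching rule are assumptions, not harness behavior:

```javascript
// Hypothetical check: list exclude keys that match no known test path,
// either exactly or as a parent directory.
function findStaleEntries(excludes, knownPaths) {
  return Object.keys(excludes).filter((key) => {
    const dir = key.endsWith("/") ? key : key + "/";
    return !knownPaths.some((p) => p === key || p.startsWith(dir));
  });
}

// After the rename, only the dom/mobilemessage path exists in the tree.
const knownPaths = ["dom/mobilemessage/tests/test_sms_basics.html"];
const excludes = {
  "dom/sms": "",
  "dom/mobilemessage/tests/test_sms_basics.html": "",
};

// Logs an array containing only "dom/sms".
console.log(findStaleEntries(excludes, knownPaths));
```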

Btw, I noticed that the longer the mochitests are running, the slower they get. It's really very slow at the end.
Attached patch updated patch (obsolete) — Splinter Review
Attachment #724462 - Attachment is obsolete: true
Attached patch updated patch — Splinter Review
For some reason test_object.html was not included in the exclude list.
Attachment #724673 - Attachment is obsolete: true
Ok, all tests passing on my b2g test run with that latest patch.
(In reply to Martijn Wargers [:mw22] (QA - IRC nick: mw22) from comment #53)
> 
> Btw, I noticed that the longer the mochitests are running, the slower they
> get. It's really very slow at the end.

We've noticed this in other non-mochitest test runs as well; there seems likely to be some kind of memory leak happening.
(In reply to Jonathan Griffin (:jgriffin) from comment #57)
> > Btw, I noticed that the longer the mochitests are running, the slower they
> > get. It's really very slow at the end.
> 
> We've noticed this in other non-mochitest test runs as well; there seems
> likely to be some kind of memory leak happening.

I filed bug 851187 for it.
Everything is green on the try server, except for this one:
16:33:30 WARNING - TEST-UNEXPECTED-FAIL | /tests/dom/tests/mochitest/dom-level2-core/test_documentcreateattributeNS01.html | application timed out after 330 seconds with no output
16:34:35 ERROR - Return code: 1
16:34:36 ERROR - F/libc ( 817): Fatal signal 11 (SIGSEGV) at 0x43f00000 (code=2)
16:34:36 ERROR - This usually indicates the B2G process has crashed
16:34:36 ERROR - F/libc ( 853): Fatal signal 11 (SIGSEGV) at 0x00000030 (code=1)
16:34:36 ERROR - This usually indicates the B2G process has crashed

I don't know what to do with this one. I guess this is an intermittent orange?
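The log lines quoted above mix three different things: a harness timeout, process crashes, and (in earlier runs) ordinary test failures. As pointed out earlier in this bug, a timeout with no output is fatal to the chunk, so it says little about the test named on that line. A small sketch that separates the three by pattern; the function name `summarizeLog` and the regular expressions are assumptions based only on the log text quoted in this bug, not on any real log parser:

```javascript
// Hypothetical classifier for log lines of the kind quoted in this bug.
function summarizeLog(lines) {
  const summary = { testFailures: 0, harnessTimeouts: 0, crashes: 0 };
  for (const line of lines) {
    if (/Fatal signal \d+/.test(line)) {
      // Native crash reported by the device log.
      summary.crashes++;
    } else if (line.includes("TEST-UNEXPECTED-FAIL")) {
      if (/timed out after \d+ seconds with(out| no) output/.test(line)) {
        // The harness gave up; later tests in the chunk did not run.
        summary.harnessTimeouts++;
      } else {
        summary.testFailures++;
      }
    }
  }
  return summary;
}

const sample = [
  "16:33:30 WARNING - TEST-UNEXPECTED-FAIL | /tests/dom/tests/mochitest/dom-level2-core/test_documentcreateattributeNS01.html | application timed out after 330 seconds with no output",
  "16:34:36 ERROR - F/libc ( 817): Fatal signal 11 (SIGSEGV) at 0x43f00000 (code=2)",
  "18:27:44 INFO - 265 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_reset_src.html | We expected 'tests/content/media/test/320x240.ogv' to exist, but it doesn't!",
];

console.log(summarizeLog(sample));
// { testFailures: 1, harnessTimeouts: 1, crashes: 1 }
```

Only the `testFailures` bucket points at tests that are candidates for the exclude list; the other two call for a retrigger or a separate bug.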
I'm trying to imagine what an absolute joy it must be to not even know of the existence of bug 821420, but I just can't picture it.

That'd be bug 821420. When you wind up hitting the same pattern in reftests and crashtests, that'd be bug 818103.
Ok, after a retry of the failed run, it is green, so tentatively saying the last patch is ready to be checked in on mozilla-central.
https://hg.mozilla.org/mozilla-central/rev/9ed8fb86e214
Status: REOPENED → RESOLVED
Closed: 12 years ago → 11 years ago
Resolution: --- → FIXED
It looks to me like this bug disabled the Web Audio tests on b2g?  Or some other patch did, since we used to run these tests on b2g unless we suddenly stopped.  It would be nice if you get an r+ from respective module owners when you stop running tests on b2g by editing b2g.json, otherwise everyone needs to watch the changes to this file to make sure their tests are not suddenly disabled without them knowing.
No, this bug enabled content/ mochitests for b2g and added those tests in content/ to the exclude list that were failing in b2g.
Afaik, the web audio tests didn't run on b2g on tbpl.

If I were to stop running tests on b2g tbpl, I would certainly CC the relevant developers, and more likely I would work on getting the tests to not fail anymore.