811779 - Expand set of reftests running on m-i/m-c/try

Assignee

Description

•

13 years ago

We are currently running b2g emulator reftest-sanity tests on the core branches. We should expand the set of reftests being run to all passing tests.

Andrew Halberstadt [:ahal]

Assignee

Comment 1

•

13 years ago

Attached patch Base patch (obsolete) — Details — Splinter Review

From Aug-Oct I triaged most of the failing/random reftests that cropped up and ended up with this patch. It's been a while since I ran it, so there are probably new failures by now. I plan on checking it in to cedar and trying to get a stable green run again.

Andrew Halberstadt [:ahal]

Assignee

Updated

•

13 years ago

Depends on: 811783

Andrew Halberstadt [:ahal]

Assignee

Comment 2

•

13 years ago

Pushed to cedar: https://hg.mozilla.org/projects/cedar/rev/a0b15032b295

Andrew Halberstadt [:ahal]

Assignee

Comment 3

•

13 years ago

Most of the chunks are getting killed because they are taking more than an hour to run. When running these on my desktop they took around 20 min, but I guess the slaves aren't as powerful. We'll want to:

1) Figure out how to speed them up
2) Use more chunks
3) Possibly get the emulators running on mac where wait times aren't as high

This is all outside the scope of this bug. It'll be a slow process.

Aki Sasaki (not active)

Comment 4

•

13 years ago

I overwrote local changes on Cedar in the latest merge: https://hg.mozilla.org/projects/cedar/diff/5b7cce7a7f1b/layout/reftests/font-inflation/reftest.list

Andrew Halberstadt [:ahal]

Assignee

Comment 5

•

13 years ago

For posterity here is :cjones' rankings of reftest b2g importance:

== Critical ==
I wouldn't consider shipping a phone without knowing the exact state of these tests. In order of importance.

crashtests
layout/reftests/reftest-sanity
layout/reftests/bugs
layout/reftests/invalidation

== High priority ==
Tests that are critical to the project and for which desktop/android coverage is *not* mostly sufficient. In no particular order.

content/canvas/test/reftest
image/test/reftest
gfx/tests/reftest
layout/reftests/position-dynamic-changes
layout/reftests/text
layout/reftests/canvas
layout/reftests/svg/smil
layout/reftests/svg/as-image
layout/reftests/font-inflation
layout/reftests/transform
layout/reftests/image
layout/reftests/scrolling
layout/reftests/forms
layout/reftests/css-gradients
layout/reftests/ogg-video
layout/reftests/transform-3d
layout/reftests/layers
layout/reftests/flexbox
layout/reftests/webm-video
layout/reftests/selection
layout/reftests/css-selectors
layout/reftests/css-calc
layout/reftests/font-face

== Normal priority ==
Tests that we should run but for which desktop/android coverage *is* mostly sufficient. In no particular order.

content/html/content/reftests
content/test/reftest
layout/reftests/border-radius
layout/reftests/cssom
layout/reftests/text-shadow
layout/reftests/columns
layout/reftests/list-item
layout/reftests/table-width
layout/reftests/css-ui-valid
layout/reftests/css-optional
layout/reftests/box-sizing
layout/reftests/bidi
layout/reftests/font-matching
layout/reftests/table-background
layout/reftests/text-indent
layout/reftests/marquee
layout/reftests/image-element
layout/reftests/indic-shaping
layout/reftests/line-breaking
layout/reftests/datalist
layout/reftests/css-transitions
layout/reftests/svg
layout/reftests/css-visited
layout/reftests/css-charset
layout/reftests/table-dom
layout/reftests/counters
layout/reftests/css-parsing
layout/reftests/unicode
layout/reftests/text-decoration
layout/reftests/box-ordinal
layout/reftests/abs-pos
layout/reftests/table-overflow
layout/reftests/css-placeholder
layout/reftests/css-default
layout/reftests/text-transform
layout/reftests/text-overflow
layout/reftests/pagination
layout/reftests/image-rect
layout/reftests/z-index
layout/reftests/percent-overflow-sizing
layout/reftests/object
layout/reftests/font-features
layout/reftests/image-region
layout/reftests/inline-borderpadding
layout/reftests/css-disabled
layout/reftests/pixel-rounding
layout/reftests/native-theme
layout/reftests/box-shadow
layout/reftests/table-bordercollapse
layout/reftests/floats
layout/reftests/css-import
layout/reftests/text-svgglyphs
layout/reftests/generated-content
layout/reftests/table-anonymous-boxes
layout/reftests/w3c-css
layout/reftests/first-line
layout/reftests/box-properties
layout/reftests/css-mediaqueries
layout/reftests/css-valid
layout/reftests/css-invalid
layout/reftests/first-letter
layout/reftests/css-ui-invalid
layout/reftests/box
layout/reftests/css-enabled
layout/reftests/backgrounds
layout/reftests/ib-split
layout/reftests/tab-size
layout/reftests/border-image
layout/reftests/css-submit-invalid
layout/reftests/margin-collapsing
layout/reftests/dom
layout/reftests/css-required
layout/reftests/css-valuesandunits
layout/reftests/mathml
parser/htmlparser/tests/reftest
editor/reftests
netwerk/test/reftest
toolkit/content/tests/reftests
widget/reftests

== Completely worthless ==
(Just a waste of CPU cycles, please don't run.)

dom/plugins/test/reftest
editor/reftests/xul
layout/reftests/printing
layout/reftests/xul
layout/reftests/xul-document-load
layout/xul
toolkit/themes/pinstripe/reftests

Andrew Halberstadt [:ahal]

Assignee

Comment 6

•

13 years ago

I disabled everything except the critical and high priority tests on cedar. Unfortunately because chunking doesn't take into account skipped tests, and due to the uneven distribution of skipped tests, some chunks are still timing out (since they are still running 1000+ tests while other chunks are only running ~100-200 tests).

Andrew Halberstadt [:ahal]

Assignee

Comment 7

•

13 years ago

The easiest solution would probably be to create a separate reftest_b2g.list root manifest. This way we wouldn't technically be skipping everything and all chunks would see an even distribution of tests.
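The imbalance from comment 6, and why a separate root manifest avoids it, can be illustrated with a small sketch. This is not the real reftest harness code: the `chunk` and `runnable_counts` helpers, the test dictionaries, and the 4000-test manifest are all made up for illustration, and it assumes the harness splits the full test list into contiguous equal slices before skipped tests are dropped.

```python
# Hypothetical sketch, not the actual reftest chunking implementation.

def chunk(items, total_chunks):
    """Split items into total_chunks contiguous, near-equal slices."""
    size, extra = divmod(len(items), total_chunks)
    chunks, start = [], 0
    for i in range(total_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def runnable_counts(tests, total_chunks, filter_first):
    """Return how many non-skipped tests each chunk actually runs."""
    if filter_first:
        # Separate manifest: skipped tests never reach the chunker.
        tests = [t for t in tests if not t["skip"]]
    return [
        sum(1 for t in c if not t["skip"])
        for c in chunk(tests, total_chunks)
    ]


# Made-up manifest: 4000 tests, the first 1000 enabled, the rest skipped.
tests = [{"name": f"test-{i}", "skip": i >= 1000} for i in range(4000)]

skewed = runnable_counts(tests, 4, filter_first=False)
even = runnable_counts(tests, 4, filter_first=True)

print(skewed)  # [1000, 0, 0, 0]
print(even)    # [250, 250, 250, 250]
```

Both runs execute the same 1000 tests in total; only the distribution across chunks differs.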

cmtalbert

Comment 8

•

13 years ago

Yes, for now, comment 7 is the way forward. The chunking problem itself is being worked on in bug 818156.

Depends on: 818156

Andrew Halberstadt [:ahal]

Assignee

Updated

•

13 years ago

Depends on: 820958

Andrew Halberstadt [:ahal]

Assignee

Comment 9

•

12 years ago

Attached patch Patch 1.0 - Enable larger set of b2g reftests — Details — Splinter Review

I'm fairly confident that the set of reftests enabled by this patch is green enough on cedar to get the ball rolling.

Notes:
* the two important files to look at are layout/reftests/reftest.list to see the overall set of tests that will be run and layout/tools/reftest/runreftestb2g.py since I had to turn off <iframe mozbrowser> due to bug 785074 which causes tons of additional failures
* I used skip-if instead of random or fails-if to avoid unnecessary test slave load
* after landing this patch we'll need to update the mozharness configs to point to the root manifest instead of reftest-sanity
* if you'd rather I create a separate root manifest for B2G as opposed to skip-if'ing everything in the main one, I can attach a new patch

There are obviously still some fundamental problems with the reftest harness on B2G and this patch isn't going to make anyone happy (including myself). But at the end of the day it will put us in a better position in terms of test coverage than we are currently.

Attachment #681547 - Attachment is obsolete: true

Attachment #699322 - Flags: review?(jgriffin)

Attachment #699322 - Flags: feedback?(jones.chris.g)

Jonathan Griffin (:jgriffin)

Comment 10

•

12 years ago

Comment on attachment 699322 [details] [diff] [review]
Patch 1.0 - Enable larger set of b2g reftests

Review of attachment 699322 [details] [diff] [review]:
-----------------------------------------------------------------

Looks like we're still getting the odd random orange on cedar; I guess we can cover those with new bugs, or skip those as well if they become too frequent.

Attachment #699322 - Flags: review?(jgriffin) → review+

Andrew Halberstadt [:ahal]

Assignee

Comment 11

•

12 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/d932f2172ce2

Whiteboard: [automation-needed-in-aurora][automation-needed-in-b2g18]

Ed Morley [:emorley]

Comment 12

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/d932f2172ce2

Status: ASSIGNED → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla21

Ed Morley [:emorley]

Comment 13

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/d932f2172ce2

Ed Morley [:emorley]

Comment 14

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/d932f2172ce2

Andrew Halberstadt [:ahal]

Assignee

Comment 15

•

12 years ago

https://hg.mozilla.org/releases/mozilla-aurora/rev/885f829b692b

This patch applied cleanly to aurora, but there were massive differences on the b2g-18 branch. I don't think that merging it by hand will produce a green test run anyway, so I'd advocate not turning these tests on there and waiting for the next merge.

Andrew Halberstadt [:ahal]

Assignee

Updated

•

12 years ago

Whiteboard: [automation-needed-in-aurora][automation-needed-in-b2g18]

Andrew Halberstadt [:ahal]

Assignee

Comment 17

•

12 years ago

I made a typo when merging the root manifest to aurora: https://hg.mozilla.org/releases/mozilla-aurora/rev/a6a8dd94822b

Jonathan Griffin (:jgriffin)

Comment 18

•

12 years ago

(In reply to Andrew Halberstadt [:ahal] from comment #15)
> https://hg.mozilla.org/releases/mozilla-aurora/rev/885f829b692b
>
> This patch applied cleanly to aurora, but there were massive differences on
> the b2g-18 branch. I don't think that merging it by hand will produce a
> green test run anyway, so I'd advocate not turning these tests on there and
> waiting for the next merge.

There aren't any planned merges to b2g18. We may need to bite the bullet and hide the tests on b2g18 until we can exclude all of the failures there.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 19

•

12 years ago

Comment on attachment 699322 [details] [diff] [review]
Patch 1.0 - Enable larger set of b2g reftests

Sorry, this f? hit me at a really bad crunch time. I didn't look through all the manifest changes but it's usually bad form to disable tests without a bug to re-enable or comment explaining why.

Attachment #699322 - Flags: feedback?(jones.chris.g)

Andrew Halberstadt [:ahal]

Assignee

Comment 20

•

12 years ago

(In reply to Chris Jones [:cjones] [:warhammer] from comment #19)
> Sorry, this f? hit me at a really bad crunch time

No worries, I mostly just wanted you to be aware of this horrible patch and the fact that more tests are running.

> it's usually bad form to disable tests without
> a bug to re-enable or comment explaining why.

Agreed, though:

A) We are running 10 chunks at 30+ minutes each for over 5 hours of B2G reftest per push. Realistically these tests are never coming back on with emulators as we just don't have capacity for even this much. When we switch to pandaboards I'll re-enable everything, re-triage on pandas and emulator reftests will be phased out.

B) There are so many failures (possibly in the thousands) that I don't know if it is harness, platform, emulator or test related. The best I could do is comment with a tracking bug which isn't much more useful than nothing at all.

Daniel Holbert [:dholbert] (vacation until Jun 23)

Comment 21

•

12 years ago

Are there any tracking bugs filed on further-increasing the number of reftests that are run on B2G?

We've got a frightening number of entire subdirectories marked as "skip-if(B2G)", from this bug's changeset - this part in particular, tweaking the toplevel reftest.list file:
http://hg.mozilla.org/mozilla-central/diff/d932f2172ce2/layout/reftests/reftest.list

(I'm assuming the situation was worse beforehand, but this still leaves us in a pretty bad state, reftest-coverage-wise, and I'm hoping we have plans to get better. :))

Jonathan Griffin (:jgriffin)

Comment 22

•

12 years ago

ahal, can you answer dholbert?

Flags: needinfo?(ahalberstadt)

Andrew Halberstadt [:ahal]

Assignee

Comment 23

•

12 years ago

(In reply to Daniel Holbert [:dholbert] from comment #21)
> Are there any tracking bugs filed on further-increasing the number of
> reftests that are run on B2G?
>
> We've got a frightening number of entire subdirectories marked as
> "skip-if(B2G)", from this bug's changeset - this part in particular,
> tweaking the toplevel reftest.list file:
> http://hg.mozilla.org/mozilla-central/diff/d932f2172ce2/layout/reftests/reftest.list
>
> (I'm assuming the situation was worse beforehand, but this still leaves us
> in a pretty bad state, reftest-coverage-wise, and I'm hoping we have plans
> to get better. :))

Yes, I'd really like to get more tests enabled as well. The main problem is that reftests can't be run on the Ubuntu AWS VMs (bug 818968) so we have to keep them running on actual hardware. Combined with the fact that they are *very* slow on the emulators (~40 minutes for 500 tests) we can't just wholesale enable them without getting long test backups.

That being said, since we've moved other tests off of the Fedora pool, we do have a bit of spare capacity to enable some more tests. I've blogged/posted to dev.b2g about this in the past but no one seemed interested.

To answer your question, there aren't any bugs filed to enable specific swathes of tests, but if you would like to, feel free to make them block the 'b2g-reftest' main tracking bug. Or if you want to give me a list of which tests are currently disabled that you think would be most useful to enable, I'd be happy to enable them when I have a few spare cycles.

Flags: needinfo?(ahalberstadt)

Phil Ringnalda (:philor)

Comment 24

•

12 years ago

Not really 40 minutes for 500 tests, it's actually more like 25 minutes for 500 tests, and 15 minutes of setup/teardown time per hunk. We could easily add another 1000 tests and not lose anything, just by switching from 10 hunks to 5.
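The arithmetic behind that estimate can be checked with a rough sketch. The 15-minute overhead and the 25 minutes per 500 tests come from the comment above; the assumption that each of the 10 current chunks runs about 500 tests, and the names `SETUP_MIN_PER_CHUNK`, `TESTS_PER_MINUTE` and `machine_minutes`, are illustrative only.

```python
# Rough capacity estimate; the per-chunk test count is an assumption.
SETUP_MIN_PER_CHUNK = 15     # setup/teardown per chunk, from the comment
TESTS_PER_MINUTE = 500 / 25  # 25 minutes of run time per 500 tests


def machine_minutes(chunks, tests):
    """Total machine time: fixed overhead per chunk plus test run time."""
    return chunks * SETUP_MIN_PER_CHUNK + tests / TESTS_PER_MINUTE


current_tests = 10 * 500  # assumed: 10 chunks of roughly 500 tests each
budget = machine_minutes(10, current_tests)

# Tests that fit in the same machine-time budget with only 5 chunks.
extra_tests = (budget - 5 * SETUP_MIN_PER_CHUNK) * TESTS_PER_MINUTE - current_tests

# Duration of one chunk if 1000 tests are added and spread over 5 chunks.
per_chunk_minutes = machine_minutes(1, (current_tests + 1000) / 5)

print(int(budget))             # 400
print(int(extra_tests))        # 1500
print(int(per_chunk_minutes))  # 75
```

Under these assumptions, halving the chunk count frees enough overhead for about 1500 more tests at the same total machine time, which is consistent with the "another 1000 tests" figure. The trade-off is that each individual chunk runs longer.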

Base patch, 13 years ago, Andrew Halberstadt [:ahal], 245.42 KB, patch
Patch 1.0 - Enable larger set of b2g reftests, 12 years ago, Andrew Halberstadt [:ahal], 323.54 KB, patch, jgriffin: review+