Closed
Bug 623613
Opened 14 years ago
Closed 14 years ago
Enable crashtests for out-of-process content on Fedora machines
Categories
(Release Engineering :: General, defect, P2)
Tracking
(fennec2.0+)
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
fennec | 2.0+ | --- |
People
(Reporter: cjones, Assigned: armenzg)
References
Details
(Whiteboard: [unittest][new suite])
Attachments
(2 files, 3 obsolete files)
1.15 KB,
patch
|
lsblakk
:
review+
armenzg
:
checked-in+
|
Details | Diff | Splinter Review |
2.75 KB,
patch
|
lsblakk
:
review+
armenzg
:
checked-in+
|
Details | Diff | Splinter Review |
We would like to run reftests with out-of-process content in order to test the layout and gfx code used by fennec. This is a simple intermediate step to running reftests on n900s and/or tegras, which involves an unknown amount of work. We also need this support in place for multi-process FF.
To enable out-of-process content, the unit-tests script just needs to set the pref
browser.tabs.remote=true
If that works correctly, the reftest harness will print
REFTEST INFO | [CONTENT] Using browser remote=true
It would also be nice to run reftests with out-of-process content and GL compositing, which we're going to ship in fennec, but I don't know what's required to enable GL on our fedora slaves. Will file another bug.
Reporter | ||
Comment 1•14 years ago
|
||
bjacob tells me the slaves use the proprietary nvidia driver. We're passing all reftests with out-of-process content and GL enabled on my desktop with an nvidia driver, so I think we'll get reftest-ipc-gpu this "for free" with the other prereqs here.
It's probably best to turn on non-GL first.
Reporter | ||
Comment 2•14 years ago
|
||
I think this should block because it will likely be our only feasible path to automated m-c testing of rendering regressions in fennec, in the 4.0 timeframe. Desktop linux is the closest approximation to fennec-on-device.
tracking-fennec: --- → ?
Comment 3•14 years ago
|
||
Do you want this pref set for the existing reftest / crashtest runs, or do you want an additional set of runs with this pref set?
Which branches should this be applied to?
Reporter | ||
Comment 4•14 years ago
|
||
These need to be additional runs for now, say RIPC (reftest ipc) and CIPC (crashtest IPC) for short.
Re: branches, let's say
- m-c
- tracemonkey
- places
- project twigs
- electrolysis
in that order of importance.
Updated•14 years ago
|
tracking-fennec: ? → 2.0-
Reporter | ||
Comment 5•14 years ago
|
||
It's ridiculous to me that we'd blocking- a bug to *turn on code that already exists for running automated rendering tests on a product we're shipping branded Firefox*. And without rationale.
Comment 6•14 years ago
|
||
(In reply to comment #5)
> It's ridiculous to me that we'd blocking- a bug to *turn on code that already
> exists for running automated rendering tests on a product we're shipping
> branded Firefox*. And without rationale.
We shouldn't be branding these as Firefox. Blocking- doesn't mean we shouldn't do it, it just means we won't hold the release for it.
Comment 7•14 years ago
|
||
(In reply to comment #5)
> It's ridiculous to me that we'd blocking- a bug to *turn on code that already
> exists for running automated rendering tests on a product we're shipping
> branded Firefox*. And without rationale.
I was told that blocking- only means "we would not stop shipping a product, but
the bug is still important"
Reporter | ||
Comment 8•14 years ago
|
||
(In reply to comment #6)
Please see comment 0 and comment 2. I don't particularly care about running reftests on the linux-desktop fennec builds we ship to extension developers; I care about running them on the fennec we ship to users. Running them on linux desktop (with FORCE_SHMEM, i.e. avoid X11) is the closest approximation to that feasible in the 4.0 timeframe
(In reply to comment #7)
> I was told that blocking- only means "we would not stop shipping a product, but
> the bug is still important"
Why doesn't *turning on* these automated rendering tests block? *Not* turning them on means fennec folks will continue wasting time fixing regressions that tests could have caught. E.g., if someone pushes a green patch to m-c that breaks fennec in the eleventh hour before fennec release, fennec folk will have unpleasant options: (i) frantically try to fix bustage in the eleventh hour; (ii) slip release; (iii) beg and plead to have the m-c push backed out; (iv) rely on QA to do machines' jobs.
Reporter | ||
Comment 9•14 years ago
|
||
Let me disentangle comment 8 a bit. The important goals here are, in order
(1) know whether fennec renders our reftests correctly and doesn't crash on the crashtests
(2a) know which m-c and m-b pushes break reftests/crashtests
(2b) conserve developer time by running (ref|crash)test-ipc on m-c and tryserver, instead of tracking down regressions days after the fact
Whether (1) is blocking- or not is a decision for product management. I would be sad if the blocking- on this bug encompassed this goal too; a blocking- for bug 615386 would be equivalent though. If (1) itself blocks, and this bug *doesn't* block, then we need another system for periodically running these tests. Whose problem does that become? QA's?
If (2) doesn't block, fine, but IMHO the decision needs to be made wrt the risks from not having this infrastructure (v. comment 8).
(I'm assuming here that other groups aren't going to take on tasks that only affect blocking- bugs, before 4.0 ships.)
Updated•14 years ago
|
Priority: -- → P3
Whiteboard: [unittest][new suite]
Updated•14 years ago
|
tracking-fennec: 2.0- → 2.0+
Comment 10•14 years ago
|
||
This is the only current approach to getting e10s (<browser remote="true"/>)
reftests, so this blocks for Fennec
Comment 12•14 years ago
|
||
I think these are the proper buildbot config changes. Note this will enable these on all OS's, so they may not be exactly right.
Attachment #511159 -
Flags: feedback?(catlee)
Comment 13•14 years ago
|
||
These are the changes to add the new suite to the unittest steps. It's untested, the entire tools staging area is booked for other efforts right now, so I'll need your help (catlee, releng, etc) testing this, or I can test it once our staging area frees up next week.
I have two questions here, both for :cjones
1. The direct3D and opengl tests set the 'MOZ_ACCELERATED' environment variable. Should the reftest-ipc/crashtest-ipc tests do the same? You don't have that in your makefile, so I didn't think so, but I figured it wouldn't hurt to double check.
2. When a reftest runs without USE_WIDGET_LAYERS in the canvas context, then we will get the following output:
REFTEST TEST-UNEXPECTED-FAIL | | can't drawWindow remote content
REFTEST INFO | WARNING: USE_WIDGET_LAYERS disabled
Now, the way things are set up, the 'TEST-UNEXPECTED-FAIL' will definitely cause an orange to show up in the test automation system. So, as long as we get both lines, we're ok. However, if we only get the "INFO" line, then we won't see anything go orange. Is that what you want? From our discussion earlier, I tend to think that you'd want the WARNING: USE_WIDGET_LAYERS line to also cause an orange so it should be marked with a "TEST-UNEXPECTED-FAIL" string. I don't think we'll ever actually not see both of these messages but as they are in two different if statements in reftest.js [1] I want to be sure we cover our bases. If you want, I'll attach an additional patch to change the "WARNING" line to a "UNEXPECTED-FAIL" line as that would be cleaner than trying to change buildbot to ignore all "INFO" lines except this one. (buildbot currently ignores all INFO lines)
[1]: http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/reftest.js#942
Attachment #511162 -
Flags: feedback?(catlee)
Reporter | ||
Comment 14•14 years ago
|
||
(In reply to comment #13)
> I have two questions here, both for :cjones
> 1. The direct3D and opengl tests set the 'MOZ_ACCELERATED' environment
> variable. Should the reftest-ipc/crashtest-ipc tests do the same?
No, those are unsupported configurations currently. In fact, on windows we have to force acceleration *off*. See
http://mxr.mozilla.org/mozilla-central/source/testing/testsuite-targets.mk#115
> 2. When a reftest runs without USE_WIDGET_LAYERS in the canvas context, then we
> will get the following output:
> REFTEST TEST-UNEXPECTED-FAIL | | can't drawWindow remote content
Correct, this is the error that means the reftest isn't actually testing anything.
> Now, the way things are set up, the 'TEST-UNEXPECTED-FAIL' will definitely
> cause an orange to show up in the test automation system. So, as long as we get
> both lines, we're ok. However, if we only get the "INFO" line, then we won't
> see anything go orange. Is that what you want?
In the current code, any time we try to drawWindow() but can't USE_WIDGET_LAYERS, we'll get "REFTEST TEST-UNEXPECTED-FAIL | | can't drawWindow remote content" in the log and the test will fail. This is by design and won't change. So we're OK. (In fact, because of this, I incidentally discovered that our 10.6 slaves were set at too low a resolution, bug 622229.)
Updated•14 years ago
|
Attachment #511159 -
Flags: feedback?(catlee) → feedback+
Updated•14 years ago
|
Attachment #511162 -
Flags: feedback?(catlee) → feedback+
Reporter | ||
Comment 15•14 years ago
|
||
Almost forgot --- to use more of the code that fennec-on-device does, we should run these tests with the environment variable MOZ_LAYERS_FORCE_SHMEM_SURFACES=1 (in the same way that MOZ_ACCELERATED is used for windows/mac).
Updated•14 years ago
|
Assignee: nobody → ctalbert
Comment 16•14 years ago
|
||
Is this ready to land?
Assignee | ||
Updated•14 years ago
|
Assignee: ctalbert → armenzg
Priority: P3 → P2
Assignee | ||
Comment 17•14 years ago
|
||
Unless I am doing completely wrong we should not enable these test suites until we have all oranges fixed.
elif suite in ('reftest', 'direct3D', 'opengl'):
return ['reftest/tests/layout/reftests/reftest.list']
+ elif suite in ('reftest-ipc'):
+ return ['--setpref=browser.tabs.remote=true',
+ '--setpref=layers.accelerate-all=true',
+ 'reftest/tests/layout/reftests/reftest.list']
Reftest-ipc 42 failures:
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1298935831.1298937989.24362.gz&fulltext=1
FTR changeset f7de2d113096
NOTE: I have to re-run crashtest-ipc as I used the reftest.list rather than crashtest.ipc.
Assignee | ||
Comment 18•14 years ago
|
||
Does this look good for crashtest-ipc?
+ elif suite == 'crashtest-ipc':
+ return ['--setpref=browser.tabs.remote=true',
+ '--setpref=layers.accelerate-all=true',
+ 'reftest/tests/layout/crashtest/crashtest.list']
Notice that for reftest we are getting to run hardware acceleration for the first time.
Comment 19•14 years ago
|
||
Just peeked at one failure in this log that caught my eye:
REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/reftest/tests/modules/plugin/test/reftest/plugin-sanity.html | image comparison (==)
First, both images have a vertical scrollbar(?!), which seems bound to cause problems. Second, the test plugin doesn't appear to be installed, as the test image is showing the plugin placeholder UI. The reftest commandline does have "--extra-profile-file=bin/plugins", so unless the test plugin is missing or isn't loading for some reason, I don't know what's up there.
Comment 20•14 years ago
|
||
Looking up in the log, the test plugin appears to be there:
Archive: firefox-4.0b13pre.en-US.linux-i686.tests.zip
inflating: bin/plugins/libnptest.so
Reporter | ||
Comment 21•14 years ago
|
||
(In reply to comment #17)
> Unless I am doing completely wrong we should not enable these test suites until
> we have all oranges fixed.
> elif suite in ('reftest', 'direct3D', 'opengl'):
> return ['reftest/tests/layout/reftests/reftest.list']
> + elif suite in ('reftest-ipc'):
> + return ['--setpref=browser.tabs.remote=true',
> + '--setpref=layers.accelerate-all=true',
> + 'reftest/tests/layout/reftests/reftest.list']
>
(In reply to comment #18)
> Does this look good for crashtest-ipc?
> + elif suite == 'crashtest-ipc':
> + return ['--setpref=browser.tabs.remote=true',
> + '--setpref=layers.accelerate-all=true',
> + 'reftest/tests/layout/crashtest/crashtest.list']
>
> Notice that for reftest we are getting to run hardware acceleration for the
> first time.
For both of these
- We don't want layers.accelerate-all=true set for reftest runs, because that's not the default configuration on fennec.
- We also would like MOZ_LAYERS_FORCE_SHMEM_SURFACES=1 set in the environment. This has the reftests use more of the code that we use on fennec.
(In reply to comment #19)
> Just peeked at one failure in this log that caught my eye:
> REFTEST TEST-UNEXPECTED-FAIL |
> file:///home/cltbld/talos-slave/test/build/reftest/tests/modules/plugin/test/reftest/plugin-sanity.html
> | image comparison (==)
>
> First, both images have a vertical scrollbar(?!), which seems bound to cause
> problems. Second, the test plugin doesn't appear to be installed, as the test
> image is showing the plugin placeholder UI. The reftest commandline does have
> "--extra-profile-file=bin/plugins", so unless the test plugin is missing or
> isn't loading for some reason, I don't know what's up there.
Yeah, windowless plugin tests ought to be working fine, they were on my desktop last I checked. Maybe the content process can't find the plugins dir for some reason?
I pushed a tryserver run last week that forced on reftest-ipc, and I'm aware of the failures. The tests that are failing have regressed in the last couple of months. Haven't had time to fix them again yet.
Reporter | ||
Comment 22•14 years ago
|
||
(In reply to comment #21)
> - We don't want layers.accelerate-all=true set for reftest runs,
Oh and I just remembered, that pref was changed in the last month or so. The new way to force GPU rendering of layers is "layers.acceleration.force-enabled=true", but again to be clear we don't want that for the fedora reftests.
Assignee | ||
Comment 23•14 years ago
|
||
Attachment #511159 -
Attachment is obsolete: true
Attachment #515960 -
Flags: review?(lsblakk)
Comment 24•14 years ago
|
||
Comment on attachment 515960 [details] [diff] [review]
[buildbot-configs] add crashtest-ipc and reftest-ipc for Fedora32 machines
looks fine to me.
Attachment #515960 -
Flags: review?(lsblakk) → review+
Assignee | ||
Comment 25•14 years ago
|
||
Attachment #511162 -
Attachment is obsolete: true
Attachment #515962 -
Flags: review?(lsblakk)
Assignee | ||
Comment 26•14 years ago
|
||
crashtest-ipc 0 failures:
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1299006497.1299006857.12634.gz&fulltext=1
reftest-ipc 42 failures:
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1299007022.1299009197.25812.gz&fulltext=1
I see MOZ_LAYERS_FORCE_SHMEM_SURFACES=1 in both test runs.
If I am not setting hardware acceleration why do I still see this for reftest-ipc?
> REFTEST INFO | drawWindow flags = DRAWWINDOW_DRAW_CARET | DRAWWINDOW_DRAW_VIEW | DRAWWINDOW_USE_WIDGET_LAYERS; window size = 801,1000; test browser size = 800,1000
Am I missing something?
Comment 27•14 years ago
|
||
In there current state, this doesn't block Fennec release. Moving to 2.0next
tracking-fennec: 2.0+ → 2.0next+
Reporter | ||
Comment 28•14 years ago
|
||
Thanks!
(In reply to comment #26)
> If I am not setting hardware acceleration why do I still see this for
> reftest-ipc?
> > REFTEST INFO | drawWindow flags = DRAWWINDOW_DRAW_CARET | DRAWWINDOW_DRAW_VIEW | DRAWWINDOW_USE_WIDGET_LAYERS; window size = 801,1000; test browser size = 800,1000
>
> Am I missing something?
DRAWWINDOW_USE_WIDGET_LAYERS is independent from hardware accel; it just means that the reftest is taking a snapshot of what would have been drawn to firefox's actual window, instead of having to draw temporary content.
Reporter | ||
Comment 29•14 years ago
|
||
So that we can turn on at least some testing on tinderbox, we'll get crashtest-ipc ready to go and work on reftest-ipc in a separate bug. Even if there's a single subdir that passes all reftests initially, we should turn on tests just for that dir so we can detect wholesale bustage.
Summary: Enable reftests+crashtests for out-of-process content on Fedora machines → Enable crashtests for out-of-process content on Fedora machines
Reporter | ||
Comment 30•14 years ago
|
||
From comment 26, it appears that crashtest-ipc is ready to go in opt builds. I saw two issues (IIRC) in debug builds with my last push that I or someone else can fix in the near future.
Reporter | ||
Comment 31•14 years ago
|
||
Filed bug 637858 for reftests.
Reporter | ||
Comment 32•14 years ago
|
||
For my own records: on my last try push (263d877928ac)
- the linux opt crashtests were green
- the linux debug crashtests both failed with logs like
s: talos-r3-fed-008
REFTEST TEST-UNEXPECTED-PASS | file:///home/cltbld/talos-slave/test/build/reftest/tests/editor/libeditor/html/crashtests/615015-1.html | assertion count 0 is less than expected 6 assertions
REFTEST TEST-UNEXPECTED-PASS | file:///home/cltbld/talos-slave/test/build/reftest/tests/layout/generic/crashtests/586806-1.html | assertion count 0 is less than expected 11 assertions
Bug 590426 - Intermittent 586806-1.html | timed out waiting for reftest-wait to be removed (after onload fired) TEST-UNEXPECTED-FAIL | tab process 1984 | automationutils.processLeakLog() | leaked 1388 bytes during test execution
TEST-UNEXPECTED-FAIL | tab process 1984 | automationutils.processLeakLog() | leaked 1 instance of AtomImpl with size 20 bytes
TEST-UNEXPECTED-FAIL | tab process 1984 | automationutils.processLeakLog() | leaked 4 instances of FontTableBlobData with size 12 bytes each (48 bytes total)
TEST-UNEXPECTED-FAIL | tab process 1984 | automationutils.processLeakLog() | leaked 2 instances of PuppetWidget with size 216 bytes each (432 bytes total)
TEST-UNEXPECTED-FAIL | tab process 1984 | automationutils.processLeakLog() | leaked 1 instance of gfxASurface with size 24 bytes
TEST-UNEXPECTED-FAIL | tab process 1984 | automationutils.processLeakLog() | leaked 1 instance of gfxFcFontSet with size 36 bytes
Reporter | ||
Comment 33•14 years ago
|
||
(In reply to comment #27)
> In there current state, this doesn't block Fennec release. Moving to 2.0next
Armen, these test suites are ready to be enabled
- crashtest-ipc on linux opt builds
- crashtest-ipc on windows opt builds
If the config changes for one or both of these are ready to go, it would be great to get them turned on, regardless of the blocking flags.
Comment 34•14 years ago
|
||
OK, based on these bug comments and some IRC chatter, it looks like we could have these turned by end of week.
That would be a big help in testing Fennec during the RC cycle coming up very soon. Unless there are hidden issues I am unaware of, let's get the last little bit of work done on this.
tracking-fennec: 2.0next+ → 2.0+
Assignee | ||
Comment 35•14 years ago
|
||
Please don't overuse the blocking flag because it makes me have to drop other important work and goals for this quarter.
If it does block from shipping fennec or a feature from landing please keep the flag up but if it doesn't please adjust accordingly.
I am going to test this one more time:
- linux opt reftest-ipc (limited subset)
- linux opt crashtest-ipc
- winXp and Win7 crashtest-ipc
If any of the above won't make it for tomorrow's deployment. I am sorry it will have to wait for later on as I can't be reshuffling work all the time. We can deal with oranges today if anyone arises.
If you believe that other platforms or a new set of test suites are "now ready" please file a separate a bug and we will deal with it in the future. Every new platform or new test suite requires me to test from scratch on staging and it is quite tiresome to deal with machines to do what I need them to do. I would rather deal with 3-4 platforms and new suites in one shot four weeks from now than be adding one or two every other week. Please take it into consideration.
I hope this makes sense and I don't sound like an ogre because I am trying to help things get going (as I have already shown) in a reasonable matter without overburdening me or others in my team. I know you are grateful and I love working with you but I thought of making it clear what makes things easier for me.
Please let me know if you have any questions and/or if you want to discuss the scope I just mentioned.
Assignee | ||
Comment 36•14 years ago
|
||
Both Windows crashtest-ipc builds crashed:
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1299085149.1299085775.20007.gz&fulltext=1
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1299084400.1299084903.14376.gz&fulltext=1
Could you please check if I did something incorrectly?
Linux crashtest-ipc and reftest-ipc passed:
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1299084975.1299085051.15632.gz&fulltext=1
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1299085226.1299085606.19259.gz&fulltext=1
Would you prefer to focus on just Linux and leave Windows for another time?
I could deploy Fedora test suites tomorrow.
Comment 37•14 years ago
|
||
(In reply to comment #36)
> Would you prefer to focus on just Linux and leave Windows for another time?
>
> I could deploy Fedora test suites tomorrow.
This bug is only about Linux, so let's move ahead with getting the Linux tests up and running. Thanks for getting this wrapped up! We _do_ love you too :)
Reporter | ||
Comment 38•14 years ago
|
||
(In reply to comment #36)
> Both Windows crashtest-ipc builds crashed:
> http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1299085149.1299085775.20007.gz&fulltext=1
> http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1299084400.1299084903.14376.gz&fulltext=1
>
> Could you please check if I did something incorrectly?
>
No, your configuration looks correct. The first crash was due to out-of-memory. The second is a hang on startup with no other useful info. (These are probably recent regressions.)
> Linux crashtest-ipc and reftest-ipc passed:
> http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1299084975.1299085051.15632.gz&fulltext=1
> http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1299085226.1299085606.19259.gz&fulltext=1
>
> Would you prefer to focus on just Linux and leave Windows for another time?
Yes, that sounds good.
> I could deploy Fedora test suites tomorrow.
Thanks!
Assignee | ||
Comment 39•14 years ago
|
||
lsblakk this is the last patch that needs review. I would like to land this tomorrow morning.
The only difference is that we run a subset of reftests for ipc instead of the whole thing.
cjones, mfinkle: Almost done! Your welcome :D
Attachment #515962 -
Attachment is obsolete: true
Attachment #516328 -
Flags: review?(lsblakk)
Attachment #515962 -
Flags: review?(lsblakk)
Comment 40•14 years ago
|
||
Comment on attachment 516328 [details] [diff] [review]
[buildbotcustom] changes to add reftest-ipc and crashtest-ipc (v0)
looks good.
Attachment #516328 -
Flags: review?(lsblakk) → review+
Assignee | ||
Comment 41•14 years ago
|
||
Comment on attachment 515960 [details] [diff] [review]
[buildbot-configs] add crashtest-ipc and reftest-ipc for Fedora32 machines
http://hg.mozilla.org/build/buildbot-configs/rev/183ab6d02f19
Attachment #515960 -
Flags: checked-in+
Assignee | ||
Comment 42•14 years ago
|
||
Comment on attachment 516328 [details] [diff] [review]
[buildbotcustom] changes to add reftest-ipc and crashtest-ipc (v0)
http://hg.mozilla.org/build/buildbotcustom/rev/30893ed24016
Both changes to be deployed in production tomorrow morning.
Attachment #516328 -
Flags: checked-in+
Comment 43•14 years ago
|
||
This was deployed last night in a reconfig.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Updated•11 years ago
|
Product: mozilla.org → Release Engineering
You need to log in
before you can comment on or make changes to this bug.
Description
•