Bug 514334 Opened 16 years ago Closed 16 years ago

Intermittent crash when running reftests

Categories: Core :: General
Type: defect
Version: 1.9.2 Branch
Hardware: ARM / Windows CE
Priority: Not set
Severity: normal

Status: RESOLVED FIXED
Tracking: status1.9.2 --- wanted

Reporter: cmtalbert
Assigned: vlad

Whiteboard: [nv]

Attachments: 1 file

There is an intermittent crash that occurs when running our remote reftests. It doesn't appear to be related to any particular test, because it crashes in different places.

= Steps =
1. Install remotereftest_final.xpi (http://people.mozilla.org/~ctalbert/gfxtest/remotereftest_final.xpi) on your build.
2. After restarting, go to Tools -> Remote Reftest.
3. Enter an http address for a 1.9.2 build tree that has been loaded into a webserver using the httpd.js script (instructions: http://wiki.github.com/jonallengriffin/moz-remote-reftest).
4. Enter 3698 (or the current count of reftests in your tree) as the "number per chunk" value so that all tests run in one chunk (without restarting the window that the reftests run inside).

= Expected =
Tests run to completion with no crashes.

= Actual =
The tests crash at some point, though where they crash is not deterministic.

We also see crashes when running in chunks of 50 (every 50 tests we close and restart the window that runs the reftests). It's not yet known whether this is the same crash or a different one.

If you are trying to repro this on the corp network, I'm serving the reftests at http://10.250.5.119:8888/layout/reftests/reftest.list (it's a 1.9.2 tree that is updated daily to stay in sync with the nightly builds).
Whiteboard: [nv]
Hrm.. I'll try with this extension, but just running the reftests with the stock code, I'm on test 2000 or so with no crash so far.
Assignee: nobody → vladimir
I should ask -- with which build is this crash seen?
The actual crash is most likely caused by bug 508860. The reftests have done some aggressive caching since bug 467987, and the problem was that a number of 800x1000 canvas elements were being "cached", exhausting all available directdraw/vram space for offscreen surfaces. Even when they're released in JS, it still takes a GC run to actually destroy the XPCOM objects, so we had a bunch sitting around at once. Eventually we failed to create a new one, and we'd run into bug 508860 because the surface was NULL.

So, here's a patch that adds a new reftest flag, -reftestnocache, that disables the reftest caching. We should use this on mobile/limited-memory devices. (Maybe we should even detect how much memory we have, and turn on nocache if < 1GB?)
Attachment #398762 - Flags: review?(roc)
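As a rough illustration of the approach, here is a minimal sketch of how such a flag could gate the harness's canvas cache. gNoCanvasCache is the global referred to later in this bug; the URL-keyed cache map and the helper names are assumptions made for illustration, not the actual reftest.js code.

    // Sketch only: a nocache gate in the reftest harness (JavaScript).
    // gNoCanvasCache comes from this bug's patch; everything else here
    // (gCachedCanvases, getCanvasForURL, makeCanvas) is hypothetical.
    var gNoCanvasCache = false;   // set to true by -reftestnocache
    var gCachedCanvases = {};     // assumed: canvases keyed by test URL

    function getCanvasForURL(url, makeCanvas) {
        if (gNoCanvasCache) {
            // Memory-constrained devices: always draw into a fresh
            // canvas and let it be GC'd, instead of pinning roughly
            // 800*1000*4 bytes of surface memory per cached entry.
            return makeCanvas(url);
        }
        if (!(url in gCachedCanvases))
            gCachedCanvases[url] = makeCanvas(url);
        return gCachedCanvases[url];
    }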
Can't we detect that the offscreen surface allocation failed and try allocating a gfxImageSurface instead? That should work.
It might, but at that point we're already under heavy memory pressure -- with 12 outstanding canvases, that's 12*800*1000*4 =~ 38MB in use just for that. Some of these devices only have 128MB total, which is shared between the CPU and GPU.

On the device I was testing on, running with this no-cache flag actually sped up the reftests by a decent amount, because there wasn't as much memory thrashing/management needing to be done by the OS and/or video drivers.
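Spelling out that estimate (an illustrative check, not part of the original comment):

    // Worked check of the ~38MB figure above.
    var canvases = 12;
    var bytesPerPixel = 4;   // ARGB32
    var bytes = canvases * 800 * 1000 * bytesPerPixel;
    // bytes == 38,400,000 =~ 38.4MB -- roughly 30% of a 128MB
    // device's total memory, before counting anything else the
    // browser and OS have allocated.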
Running the reftests on a debug build from last Thursday, I reproduced this crash, but the ActiveSync connection died and I wasn't able to get a stack trace from Visual Studio. I'm running the configuration again with these changes and the gNoCanvasCache flag set to true. I'm running this set on a release build: Mozilla/5.0 (Windows; U; WindowsCE 6.0; en-US; rv:1.9.2a2pre) Gecko/20090908 Namoroka/3.6a2pre. The tests are certainly running a lot faster. I'll update this bug tomorrow once the tests are finished.
All the tests ran in one chunk without any crashes at all, so it must have been memory issues that were crashing us. Can we get this option checked in to the reftest source? I think it will help us run these tests on memory-constrained devices in the future.

So, the crash is solved. The failures we are seeing on the device remain consistent: a mix of things we expect to fail (color depth, printing, video) and an entire set of things we don't expect to fail. I've uploaded the data from this run at http://people.mozilla.org/~ctalbert/wince/reftests/20090908
http://hg.mozilla.org/mozilla-central/rev/740d8456dd78

I think we're going to want this on 1.9.2 as well, since that's what we're testing on mobile devices.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Depends on: 520303
This doesn't look like it was ever landed on 1.9.2 -- still need it there?
nope, not needed -- we're not shipping this on mobile any more