Closed Bug 918419 Opened 11 years ago Closed 9 years ago

Intermittent bmp-corrupted/wrapper.html?invalid-compression.bmp | image comparison (==), max difference: 255, number of differing pixels: 242

Categories

(Core :: Graphics: ImageLib, defect)

Platform: x86, Windows 8
Type: defect
Priority: Not set
Severity: normal

RESOLVED WORKSFORME

People

(Reporter: emorley, Assigned: smaug)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

WINNT 6.2 mozilla-central opt test reftest on 2013-09-19 09:03:24 PDT for push c85a238fa3ab

slave: t-w864-ix-069

https://tbpl.mozilla.org/php/getParsedLog.php?id=28097281&tree=Mozilla-Central

{
09:08:06     INFO -  REFTEST TEST-UNEXPECTED-FAIL | file:///C:/slave/test/build/tests/reftest/tests/image/test/reftest/bmp/bmp-corrupted/wrapper.html?invalid-compression.bmp | image comparison (==), max difference: 255, number of differing pixels: 242
}
Attached file reftest log
Blocks: 813742
This orange happens regularly on Try with the parallel reftests patch from bug 813742 applied; the likeliest cause is the use of --this-chunk and --total-chunks.  Finding values for those two switches that include this test should help pinpoint what's going on with this failure.

Um, hm, Joe seems to be unavailable.  Daniel, do you know of anybody who would be available to help with this?
Flags: needinfo?(dholbert)
hg blame (plus a few levels of indirection) says Brian Bondy added wrapper.html and invalid-compression.bmp back in http://hg.mozilla.org/mozilla-central/rev/5401cb3d7350 for bug 600556, so he's the first person I'd suggest asking.
Blocks: 600556
Flags: needinfo?(dholbert) → needinfo?(netzen)
Joe is unfortunately no longer working here.  Does --this-chunk control how much data is passed to imagelib at a time to decode? If so, I haven't worked on it for a long time, but I remember there being various bugs in our decoders if the chunk size is too small. I'd suggest using larger values.
Flags: needinfo?(netzen)
(In reply to Brian R. Bondy [:bbondy] from comment #19)
> Joe is unfortunately no longer working here.  Does --this-chunk control how
> much data is passed to imagelib at a time to decode? If so, I haven't worked
> on it for a long time, but I remember there being various bugs in our
> decoders if the chunk size is too small. I'd suggest using larger values.

--total-chunks splits the total # of reftests into N conceptual chunks; --this-chunk picks one of those chunks (containing several tests) to run.
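
As a rough illustration of the chunking described above, here is a minimal Python sketch. It assumes the tests are split into contiguous, near-equal runs and that chunks are numbered from 1; the real harness may partition differently. The names chunk_bounds and select_chunk are hypothetical and are not taken from the reftest code.

```python
def chunk_bounds(num_tests, this_chunk, total_chunks):
    """Return the half-open [start, end) index range of a 1-based chunk.

    Assumption: a contiguous, near-equal split. This is an illustration,
    not the partitioning the reftest harness actually uses.
    """
    if not 1 <= this_chunk <= total_chunks:
        raise ValueError("this_chunk must be in 1..total_chunks")
    start = (this_chunk - 1) * num_tests // total_chunks
    end = this_chunk * num_tests // total_chunks
    return start, end


def select_chunk(tests, this_chunk, total_chunks):
    """Return the tests that one chunk would run (hypothetical helper)."""
    start, end = chunk_bounds(len(tests), this_chunk, total_chunks)
    return tests[start:end]


# Ten placeholder tests split into three chunks.
tests = ["test-%02d" % i for i in range(10)]
chunks = [select_chunk(tests, m, 3) for m in (1, 2, 3)]
print([len(c) for c in chunks])
```

Under this assumed scheme every test lands in exactly one chunk, so a chunk never sees whatever the tests in earlier chunks would have set up in a full run.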

My working hypothesis is that running some subset of the tests consistently bypasses some piece of initialization (that would be triggered by tests running before the subset in a full run) that's important for tests in that subset, which causes intermittent oranges in that subset to permaorange.  This hypothesis doesn't really explain why the intermittent is there in the first place, but the hypothesis seems consistent with the symptoms and with other problems seen with parallel reftests.
Gotcha, so a completely different "chunk" concept than I was thinking :)

To the best of my knowledge, no test here depends on any other tests, but perhaps there is a bug that makes it so.

Could you explain a bit more about how it runs, so I can look at the code for some ideas?
Does each chunk correspond to a single process, and then each process runs #tests_total/N_chunks tests sequentially?  And each chunk process runs concurrently?
(In reply to Brian R. Bondy [:bbondy] from comment #21)
> Gotcha, so a completely different "chunk" concept than I was thinking :)

Indeed! :)

> Could you explain a bit more about how it runs so I can look at the code for
> some ideas?
> Does each chunk correspond to a single process, and then each process runs
> #tests_total/N_chunks tests sequentially?  And each chunk process runs
> concurrently?

That's the gist of it, yes (there's a bit of additional fiddling with needs-focus tests that get run entirely separately).  Theoretically, if this test is in failing chunk M/N when running in parallel, you should be able to run the complete reftests.list with:

  --this-chunk=M --total-chunks=N --focus-filter-mode=non-needs-focus

and the test should consistently fail.
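
One way to find M for a given N is sketched below in Python. It assumes a contiguous, near-equal split with 1-based chunk numbers, which may not match the harness, and it uses a placeholder test list rather than the real contents of reftests.list. The helpers chunk_containing and repro_flags are hypothetical; only the three switches come from this bug.

```python
def chunk_containing(tests, target, total_chunks):
    """Return the 1-based chunk holding target under the assumed split."""
    index = tests.index(target)  # raises ValueError if target is absent
    n = len(tests)
    for this_chunk in range(1, total_chunks + 1):
        start = (this_chunk - 1) * n // total_chunks
        end = this_chunk * n // total_chunks
        if start <= index < end:
            return this_chunk
    raise AssertionError("unreachable: every index falls in some chunk")


def repro_flags(this_chunk, total_chunks):
    """Build the switches quoted in this bug for one chunk."""
    return [
        "--this-chunk=%d" % this_chunk,
        "--total-chunks=%d" % total_chunks,
        "--focus-filter-mode=non-needs-focus",
    ]


# Placeholder ordering; a real run would take the order from reftests.list.
target = "bmp-corrupted/wrapper.html?invalid-compression.bmp"
tests = ["placeholder-%d.html" % i for i in range(7)]
tests.append(target)
tests.extend("placeholder-%d.html" % i for i in range(7, 12))

m = chunk_containing(tests, target, 4)
flags = repro_flags(m, 4)
print(" ".join(flags))
```

Repeating the lookup for several values of N gives a set of candidate (M, N) pairs to try when reproducing the failure locally.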

That doesn't feel like I added much information, but maybe confirmation of your intuition is sufficient.  Please let me know if you need more information.
Sorry, I haven't had time to look at this yet. If you can find another owner for this bug, please do.
I think I need to take this, since changes to the event loop end up triggering this all the time.
Assignee: nobody → bugs
Depends on: 1051530
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME