Closed Bug 2036329 Opened 18 days ago Closed 12 days ago

jxl decoder flush pixels when nothing has changed

Categories

(Core :: Graphics: ImageLib, defect)

defect

Tracking

()

RESOLVED FIXED
152 Branch
Tracking Status
firefox-esr140 --- unaffected
firefox150 --- unaffected
firefox151 --- disabled
firefox152 --- fixed

People

(Reporter: tnikkel, Assigned: tnikkel)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: regression)

Attachments

(1 file)

We call FlushPartialFrame after every time we add jxl-rs to process data and it returns. Smaller issue: we should probably only do this when we run out of data to process. Main issue: the flush_pixels interface makes it impossible to tell when any pixel data has actually changed, and also impossible to tell if there even is any pixel data, so we might be processing all zero pixel data. This needs a library side fix. https://github.com/libjxl/jxl-rs/pull/755

I assume bug 2023488 caused this. Bug 2034984 makes this worse because it makes us repeat the CMS step every time (whereas before that was handled internally by jxl-rs so it knew when there was new pixels to call cms for). And in fact bug 2034984 makes one of our gtest go past the timeout limit on tsan builds.

Set release status flags based on info from the regressing bug 2023488

Depends on: 2036780

Pref:image.jxl.enabled

Every time we call flush_pixels we have no idea if jxl-rs put new pixels in the buffer. So we might do a lot of extra processing on unchanged pixels. Even worse, we have no idea if jxl-rs ever put pixels in the buffer, so we might be processing all zeroes, which will cause us to prematurely show an all black image. Although that might be preferable to all white when in dark mode it goes against our convention to use all white, but in this situation we should actually just not output at all until jxl-rs tells us that it rendered something.

We hit the JxlDecoderStatus::NeedMoreData case in more cases then just when we have exhausted our data, so move the FlushPartialFrame call into DoDecode where we are sure that we are pausing (or stopping). We need to call in the SourceBufferIterator::COMPLETE case too because this might be the last and only chance to do that. (I added gtests for that specific situation in upcoming patches.)

I added a gtest for the original bug as observed in this patch by just counting the number of WritePixelRowsToPipe calls. Since this is important to not waste effort and not regress I think it's worth a debug only field to implement this.

We only stumbled upon this because one of our gtests that was feeding a large jxl in small byte chunks was timing out on linux tsan jobs. Bug 2034984 moved cms handling out of internal calls of jxl-rs and into the C++ pipeline, and thus moved it from a place where jxl-rs knew when it had to call CMS again because new rendering happened to a place where we did not know that and hence caused us to do useless CMS work on top of the existing work and that tipped that test over the timeout threshold.

I also wrote gtests for the early black issue mentioned above, coming up in later patches. The regression test in this patch is focused on the initial issue of this bug (too many useless WritePixelRowsToPipe calls).

Backed out for causing rust failures.

Flags: needinfo?(tnikkel)
Flags: needinfo?(tnikkel)

Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/59686 for changes under testing/web-platform/tests

Upstream PR merged by moz-wptsync-bot

Status: NEW → RESOLVED
Closed: 12 days ago
Resolution: --- → FIXED
Target Milestone: --- → 152 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: