Closed Bug 1258741 Opened 4 years ago Closed 3 years ago

Firefox treats invalid image opened in background and foreground tab in a different ways

Categories

(Core :: ImageLib, defect, P3)

37 Branch
x86_64
Windows 7
defect

Tracking

()

VERIFIED FIXED
mozilla52
Tracking Status
firefox47 --- wontfix
firefox48 --- wontfix
firefox49 --- wontfix
firefox-esr38 --- wontfix
firefox-esr45 --- affected
firefox50 --- fix-optional
firefox51 --- fix-optional
firefox52 --- verified

People

(Reporter: arni2033, Assigned: aosmond, NeedInfo)

References

()

Details

(Keywords: regression, Whiteboard: [gfx-noted])

Attachments

(3 files, 6 obsolete files)

>>>   My Info:   Win7_64, Nightly 48, 32bit, ID 20160320030409
STR_A:
1. Open new tab, paste https://bug12460.bmoattachments.org/attachment.cgi?id=1385
   in urlbar, then press Enter

AR:
 There's a text line on a white background:   "The image “https://bug12460.bmoattachments.org/attachment.cgi?id=1385” cannot be displayed because it contains errors."

STR_B:
1. Middle-click link https://bug12460.bmoattachments.org/attachment.cgi?id=1385
   to open image in a new background tab, then switch to that tab (Ctrl+Tab)

AR:  There're 2 possible results, depending on how many times you repeated Step 1
 (1) Either a small white rectangle with no text (and the tab has no favicon), OR
 (2) completely black page (and the tab has Bugzilla favicon)

ER:
 STR_B should have the same result as STR_A

Note:
 Firefox 38 ESR behaves the way described above.
 Firefox 32 showed the image in both cases == handled (A) and (B) the same way.
User Agent 	Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:48.0) Gecko/20100101 Firefox/48.0

Hello.
I have tested your issue on latest FF release (version: 45.0.1, build ID: 20160315153207) and latest Nightly build (version: 48.0a1, build ID: 20160327030437) and managed to reproduce it. I have opened the image from STR_B by clicking on the scroll mouse button. I couldn't reproduce the issue every time I opened the link this way. The issue does not reproduce if the image is opened by right clicking on the link and then clicking on "Open in new tab".
Component: Untriaged → ImageLib
OS: Unspecified → Windows 7
Hardware: Unspecified → x86_64
Can we get a regression range?  Sounds like it broke some time between 32 and 38?  Parking with me until we get the regression range.
Assignee: nobody → milan
Whiteboard: [gfx-noted]
This is probably a race condition on when we detect the error in the image. Looks like we are able to decode enough of the image to draw something before we encounter the error. But if decoding gets to the error before we lay out the document we might never draw it and just report the error.

I would expect this bug to have existed for a long time, but other changes could have made it easier to trigger.
That regression range doesn't make much sense. Since this bug is timing related it would easily happen or not happen in any build, making it hard to get an accurate regression window.
(In reply to Timothy Nikkel (:tnikkel) from comment #6)
> That regression range doesn't make much sense. Since this bug is timing related it would easily
> happen or not happen in any build, making it hard to get an accurate regression window.
How can you say that?! Pushlog (сomment 5) is wrong, but not because the issue it's hard to detect!
This ISN'T the first time I find people unable to follow my STR, yes. I marked comment 5 as obsolete

What's important here is when this started happening, at any percentage. It's ~50% reproducible on 2015-01-17 build and 0% reproducible on 2015-01-16 build. How did _I_ perform the testing:
(1) open this bug in a build   (2) perform STR_B  100 (ONE_HUNDRED) times   (3) check the result
If there's at least one tab not saying "image contains errors" (or not showing image) - it's a bug.

This was caused by (almost sure) bug 1079627. Regression range:
> https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=cac6192956ab&tochange=369a8f14ccf8
See Also: → 1079627
(In reply to arni2033 from comment #7)
> (In reply to Timothy Nikkel (:tnikkel) from comment #6)
> > That regression range doesn't make much sense. Since this bug is timing related it would easily
> > happen or not happen in any build, making it hard to get an accurate regression window.
> How can you say that?!

You said the issue only happens 50% of the time, so if someone testing only did one test on each build they could easily get led astray when trying to find a regression range.

Thank you for providing the real regression range.
Sorry, I indeed haven't mentioned that it happens in ~50% cases on old builds because I didn't know it back then. I just overreact sometimes.
Though double-checking regression range with a "stress-tests" like comment 7 is rather a good thing
Assignee: milan → seth
Version: Trunk → 37 Branch
The displayed error is controlled by the status in the LOAD_COMPLETE message. When the image is opened via a background tab, LOAD_COMPLETE happens before the decode error, so only the status bits SIZE_AVAILABLE and LOAD_COMPLETE are present; no update is ever issued, so the listener is unaware of the failure, hence the empty white box. When the image is opened directly, LOAD_COMPLETE happens after the decode error, so the status bit ERROR is set, which shows the error message.
Assignee: seth.bugzilla → aosmond
Status: NEW → ASSIGNED
Probably won't get fixed for 49, so bumping this out one version. I confirmed the issue affects 51 as well.
(In reply to Andrew Osmond [:aosmond] from comment #10)
> The displayed error is controlled by the status in the LOAD_COMPLETE
> message. When the image is opened via a background tab, LOAD_COMPLETE
> happens before the decode error, so only the status bits SIZE_AVAILABLE and
> LOAD_COMPLETE are present; no update is ever issued, so the listener is
> unaware of the failure, hence the empty white box. When the image is opened
> directly, LOAD_COMPLETE happens after the decode error, so the status bit
> ERROR is set, which shows the error message.

Actually it appears the ERROR status bit is never set in the background case.
First remove Decoder::WasAborted because this logic is not in use, and complicates the event dispatching.
Second, switch ImageDocument to listen to DECODE_COMPLETE instead of LOAD_COMPLETE. We guarantee we will always dispatch DECODE_COMPLETE at the very end once we either have the image or not, which seems to be what ImageDocument really wants; LOAD_COMPLETE just means we have all of the required data from the network/disk, not that we have the image processed.

Also, consistently report FLAG_HAS_ERROR when Decoder::HasError is true. If we don't do this, we have a split brain effect where some parts of the code are aware the image did not properly decode, and others are not. This should not affect truncated images as they do not set Decoder::mError (streaming lexer transitions to TerminalState::SUCCESS and we determine whether the image is good or not based on if we have any frame data).

try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f5ed1e7879cc
Appears to have broken reftests, see https://treeherder.mozilla.org/logviewer.html#?job_id=27735297&repo=try#L155072.


 14:19:10     INFO - Assertion failure: mPresContext->mLayoutPhaseCount[eLayoutPhase_Paint] == 0 (constructing frames in the middle of a paint), at /builds/slave/try-m64-d-00000000000000000000/build/src/layout/base/nsAutoLayoutPhase.cpp:51
 14:19:10     INFO - #01: PresShell::ContentRemoved(nsIDocument*, nsIContent*, nsIContent*, int, nsIContent*) [mfbt/RefPtr.h:306]
 14:19:10     INFO - #02: nsNodeUtils::ContentRemoved(nsINode*, nsIContent*, int, nsIContent*) [xpcom/glue/nsTObserverArray.h:353]
 14:19:10     INFO - #03: nsINode::doRemoveChildAt(unsigned int, bool, nsIContent*, nsAttrAndChildArray&) [dom/base/nsINode.cpp:1914]
 14:19:10     INFO - #04: mozilla::dom::FragmentOrElement::RemoveChildAt(unsigned int, bool) [xpcom/glue/nsCOMPtr.h:402]
 14:19:10     INFO - #05: nsContentUtils::SetNodeTextContent(nsIContent*, nsAString_internal const&, bool) [dom/base/nsContentUtils.cpp:4744]
 14:19:10     INFO - #06: nsDocument::SetTitle(nsAString_internal const&) [dom/base/nsDocument.cpp:6690]
 14:19:10     INFO - #07: mozilla::dom::MediaDocument::UpdateTitleAndCharset(nsACString_internal const&, nsIChannel*, char const* const*, int, int, nsAString_internal const&) [dom/html/MediaDocument.cpp:424]
 14:19:10     INFO - #08: mozilla::dom::ImageDocument::UpdateTitleAndCharset() [xpcom/string/nsTSubstring.h:95]
 14:19:11     INFO - #09: mozilla::dom::ImageDocument::OnLoadComplete(imgIRequest*, nsresult) [dom/html/ImageDocument.cpp:569]
 14:19:11     INFO - #10: mozilla::dom::ImageDocument::Notify(imgIRequest*, int, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const*) [dom/html/ImageDocument.cpp:502]
 14:19:11     INFO - #11: non-virtual thunk to mozilla::dom::ImageDocument::Notify(imgIRequest*, int, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const*) [dom/html/ImageDocument.cpp:481]
 14:19:11     INFO - #12: nsImageLoadingContent::Notify(imgIRequest*, int, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const*) [dom/base/nsImageLoadingContent.cpp:151]
 14:19:11     INFO - #13: imgRequestProxy::Notify(int, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const*) [xpcom/glue/nsCOMPtr.h:402]
 14:19:11     INFO - #14: mozilla::image::ImageObserverNotifier<const mozilla::image::ObserverTable *>::operator()<(lambda at /builds/slave/try-m64-d-00000000000000000000/build/src/image/ProgressTracker.cpp:355:12)> [mfbt/RefPtr.h:40]
 14:19:11     INFO - #15: void mozilla::image::SyncNotifyInternal<mozilla::image::ObserverTable const*>(mozilla::image::ObserverTable const* const&, bool, unsigned int, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&) [image/ProgressTracker.cpp:358]
 14:19:11     INFO - #16: mozilla::image::ProgressTracker::SyncNotifyProgress(unsigned int, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&) [image/CopyOnWrite.h:42]
 14:19:11     INFO - #17: mozilla::image::RasterImage::NotifyProgress(unsigned int, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::Maybe<unsigned int> const&, mozilla::image::DecoderFlags, mozilla::image::SurfaceFlags) [mfbt/RefPtr.h:40]
 14:19:11     INFO - #18: mozilla::image::RasterImage::NotifyDecodeComplete(mozilla::image::DecoderFinalStatus const&, mozilla::image::ImageMetadata const&, mozilla::image::DecoderTelemetry const&, unsigned int, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::Maybe<unsigned int> const&, mozilla::image::DecoderFlags, mozilla::image::SurfaceFlags) [image/RasterImage.cpp:1613]
 14:19:11     INFO - #19: mozilla::image::IDecodingTask::NotifyDecodeComplete(mozilla::NotNull<mozilla::image::RasterImage*>, mozilla::NotNull<mozilla::image::Decoder*>) [image/IDecodingTask.cpp:78]
 14:19:11     INFO - #20: mozilla::image::DecodedSurfaceProvider::FinishDecoding() [mfbt/RefPtr.h:77]
 14:19:11     INFO - #21: mozilla::image::DecodedSurfaceProvider::Run() [xpcom/glue/Mutex.h:169]
 14:19:11     INFO - #22: mozilla::image::LaunchDecodingTask [image/RasterImage.cpp:1098]
 14:19:11     INFO - #23: mozilla::image::RasterImage::Decode(mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> const&, unsigned int, mozilla::image::PlaybackType) [mfbt/RefPtr.h:40]
 14:19:11     INFO - #24: mozilla::image::RasterImage::LookupFrame(mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> const&, unsigned int, mozilla::image::PlaybackType) [image/RasterImage.cpp:344]
 14:19:11     INFO - #25: mozilla::image::RasterImage::Draw(gfxContext*, mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> const&, mozilla::image::ImageRegion const&, unsigned int, mozilla::gfx::SamplingFilter, mozilla::Maybe<mozilla::SVGImageContext> const&, unsigned int) [mozglue/misc/TimeStamp.h:422]

It appears triggering on the DECODE_COMPLETE caused us to update ImageDocument later than expected. Elsewhere we seem to get around LOAD_COMPLETE not having the error yet by checking the image status during reflow/etc. Will need to see if a similar solution is available to us for ImageDocument.
Second kick at the can. Should pass the failing reftest now as I moved only the error checking to DECODE_COMPLETE (when we should know for certain if it has failed or not) and left the success case to LOAD_COMPLETE. Also confirmed that it was only the updating of the title on DECODE_COMPLETE which was the problem for that reftest -- even if an error occurs, setting the error message during the paint appears to be safe.

try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=6a7e4e9ef3fc
Attachment #8793029 - Attachment is obsolete: true
Attachment #8793028 - Flags: review?(edwin)
Attachment #8793028 - Flags: review?(edwin) → review?(tnikkel)
Attachment #8793299 - Flags: review?(tnikkel)
Comment on attachment 8793028 [details] [diff] [review]
Part 1. Remove dead/unused image decoder aborted flag. v1

Sweet, code removal.
Attachment #8793028 - Flags: review?(tnikkel) → review+
Comment on attachment 8793299 [details] [diff] [review]
Part 2. Make image decoders consistently report errors to higher layers. v2

>@@ -200,19 +200,19 @@ Decoder::CompleteDecode()
>-    // Even if we encountered an error, we're still usable if we have at least
>-    // one complete frame.
>-    if (GetCompleteFrameCount() > 0) {
>+    // We're still usable if we have at least one complete frame and we did not
>+    // encounter an error (e.g. source was truncated).
>+    if (!HasError() && GetCompleteFrameCount() > 0) {
>       // We're usable, so do exactly what we should have when the decoder
>       // completed.

We only have the ability to convey error/not error, but we're really trying to convey more than 2 options: Ok/usable image data but some error/error and no usable image data.

This change will probably cause us to not display images that we displayed before. So I'm not sure which is the better state to be in.
(In reply to Timothy Nikkel (:tnikkel) from comment #18)
> Comment on attachment 8793299 [details] [diff] [review]
> 
> We only have the ability to convey error/not error, but we're really trying
> to convey more than 2 options: Ok/usable image data but some error/error and
> no usable image data.
> 
> This change will probably cause us to not display images that we displayed
> before. So I'm not sure which is the better state to be in.

I will rewrite the patch to flip the condition so that we consistently *accept* partially decoded images. The "usable image data but some error" state should be internal to the Decoder, and RasterImage should only care if it has all of the necessary data to render something (at least one frame, metadata, etc).
This actually renders something useful out of the attached image, essentially the same as Chrome without the last partially decoded frame with the errors.

try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=2a391a73790e
Attachment #8793299 - Attachment is obsolete: true
Attachment #8793299 - Flags: review?(tnikkel)
Attachment #8794942 - Flags: review?(tnikkel)
Fixed minor style nit.
Attachment #8794942 - Attachment is obsolete: true
Attachment #8794942 - Flags: review?(tnikkel)
Attachment #8794944 - Flags: review?(tnikkel)
...

Actually attach updated patch file this time. Apologies for the review email spam :).
Attachment #8794944 - Attachment is obsolete: true
Attachment #8794944 - Flags: review?(tnikkel)
Attachment #8794946 - Flags: review?(tnikkel)
Waiting for try to open again. This should fix the reftest/mochitest failures.
Attachment #8794946 - Attachment is obsolete: true
Attachment #8794946 - Flags: review?(tnikkel)
try again on new revision as the last run had some interesting image related failures that don't *seem* related to the above patches: https://treeherder.mozilla.org/#/jobs?repo=try&revision=2dbb6a06ecf755c556e99481cab08b549182c05c
Attachment #8795023 - Flags: review?(tnikkel)
Comment on attachment 8795023 [details] [diff] [review]
Part 2. Ensure we consistently render partially decoded images., v2

>@@ -190,46 +190,44 @@ Decoder::CompleteDecode()
>+  // If the implementation left us mid-frame, finish that up. Note that it may
>+  // have left us transparent.
>+  if (mInFrame) {
>+    PostHasTransparency();
>     PostFrameStop();
>   }

In the old code we would call PostHasTransparency anytime mDecodeDone was false, so we should probably continue to do so even if a decoder had finished the current frame.

>+  // Only act on errors if we have no usable frames from the decoder.
>+  if (aStatus.mHadError &&
>+      (!mAnimationState || mAnimationState->KnownFrameCount() == 0)) {
>+    DoError();
>+  }

Is the condition |(!mAnimationState || mAnimationState->KnownFrameCount() == 0)| a sneaky way of including metadata decodes too? Can we make just that explicit?

Can we also call DoError if a metadata decode failed to get a size like before?

I assume that it is intentional that we call DoError whether or not aStatus.mFinished is true now (whereas we didn't before)?

>-    // If we were waiting to fire the load event, go ahead and fire it now.
>-    if (mLoadProgress && aStatus.mWasMetadataDecode) {
>-      NotifyForLoadEvent(*mLoadProgress);
>-      mLoadProgress = Nothing();
>-      NotifyProgress(FLAG_ONLOAD_UNBLOCKED);
>-    }
>+  // XXX(aosmond): Can we get this far without mFinished == true?
>+  if (!aStatus.mFinished || !aStatus.mWasMetadataDecode) {
>+    return;
>+  }

I prefer to use early returns like this only for unusual cases; I think it makes it easier to follow the "usual case". So we can wrap the remaining bits of code in this function in a |if (aStatus.mFinished && !aStatus.mWasMetadataDecode)|?

>+  // If we were waiting to fire the load event, go ahead and fire it now.
>+  if (mLoadProgress) {
>+    NotifyForLoadEvent(*mLoadProgress);
>+    mLoadProgress = Nothing();
>+    NotifyProgress(FLAG_ONLOAD_UNBLOCKED);
>   }
>
>   // If we were a metadata decode and a full decode was requested, do it.
>-  if (aStatus.mFinished && aStatus.mWasMetadataDecode && mWantFullDecode) {
>+  if (mWantFullDecode) {
>     mWantFullDecode = false;
>     RequestDecodeForSize(mSize, DECODE_FLAGS_DEFAULT);
>   }

Sorry to be so nitpicky, but in the past these functions have proven to be delicate.
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=c60c6132de31de2db785d1740a40008fe6a90a76

(In reply to Timothy Nikkel (:tnikkel) from comment #26)
> Comment on attachment 8795023 [details] [diff] [review]
> Part 2. Ensure we consistently render partially decoded images., v2
> 
> >@@ -190,46 +190,44 @@ Decoder::CompleteDecode()
> >+  // If the implementation left us mid-frame, finish that up. Note that it may
> >+  // have left us transparent.
> >+  if (mInFrame) {
> >+    PostHasTransparency();
> >     PostFrameStop();
> >   }
> 
> In the old code we would call PostHasTransparency anytime mDecodeDone was
> false, so we should probably continue to do so even if a decoder had
> finished the current frame.
> 

Done, with the ammendum that it originally only called PostHasTransparency if mDecodeDone *and* the frame count is > 0.

> >+  // Only act on errors if we have no usable frames from the decoder.
> >+  if (aStatus.mHadError &&
> >+      (!mAnimationState || mAnimationState->KnownFrameCount() == 0)) {
> >+    DoError();
> >+  }
> 
> Is the condition |(!mAnimationState || mAnimationState->KnownFrameCount() ==
> 0)| a sneaky way of including metadata decodes too? Can we make just that
> explicit?
> 
> Can we also call DoError if a metadata decode failed to get a size like
> before?
> 

I wrote it with the intention of excluding the case where we have frames to display. Since an error always aborts the current frame it is working on, still images never have a frame, nor can metadata decodes, so we only need to check the animated case.

I added the explicit metadata decode size check back however it should be unreachable code. Decoder::CompleteDecode always happens before RasterImage::NotifyDecodeComplete, and Decoder::CompleteDecode sets aStatus.mHadError if it is a metadata decode and no size was determined.

> I assume that it is intentional that we call DoError whether or not
> aStatus.mFinished is true now (whereas we didn't before)?
> 
> >-    // If we were waiting to fire the load event, go ahead and fire it now.
> >-    if (mLoadProgress && aStatus.mWasMetadataDecode) {
> >-      NotifyForLoadEvent(*mLoadProgress);
> >-      mLoadProgress = Nothing();
> >-      NotifyProgress(FLAG_ONLOAD_UNBLOCKED);
> >-    }
> >+  // XXX(aosmond): Can we get this far without mFinished == true?
> >+  if (!aStatus.mFinished || !aStatus.mWasMetadataDecode) {
> >+    return;
> >+  }
> 
> I prefer to use early returns like this only for unusual cases; I think it
> makes it easier to follow the "usual case". So we can wrap the remaining
> bits of code in this function in a |if (aStatus.mFinished &&
> !aStatus.mWasMetadataDecode)|?
> 

Done.

> >+  // If we were waiting to fire the load event, go ahead and fire it now.
> >+  if (mLoadProgress) {
> >+    NotifyForLoadEvent(*mLoadProgress);
> >+    mLoadProgress = Nothing();
> >+    NotifyProgress(FLAG_ONLOAD_UNBLOCKED);
> >   }
> >
> >   // If we were a metadata decode and a full decode was requested, do it.
> >-  if (aStatus.mFinished && aStatus.mWasMetadataDecode && mWantFullDecode) {
> >+  if (mWantFullDecode) {
> >     mWantFullDecode = false;
> >     RequestDecodeForSize(mSize, DECODE_FLAGS_DEFAULT);
> >   }
> 
> Sorry to be so nitpicky, but in the past these functions have proven to be
> delicate.

No, no, I have been burned by these events too many times, the caution is much appreciated :).
Attachment #8795023 - Attachment is obsolete: true
Attachment #8795023 - Flags: review?(tnikkel)
Attachment #8802119 - Flags: review?(tnikkel)
Comment on attachment 8802119 [details] [diff] [review]
Part 2. Ensure we consistently render partially decoded images., v3

Thanks!
Attachment #8802119 - Flags: review?(tnikkel) → review+
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5238fbaf49fa
Part 1. Remove dead/unused image decoder aborted flag. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/d56e9b123ed6
Part 2. Ensure we consistently render partially decoded images. r=tnikkel
https://hg.mozilla.org/mozilla-central/rev/5238fbaf49fa
https://hg.mozilla.org/mozilla-central/rev/d56e9b123ed6
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Hi arni2033, 
could you please verify this issue is fixed as expected on a latest Nightly build? Thanks!
Flags: needinfo?(arni2033)
Flags: qe-verify+
Duplicate of this bug: 1338551
I've reproduced the issue described in comment 0 using an affected build: Firefox Nightly 48.0a1(Build Id:20160320030409).
I have verified that the issue is not reproducible using 54.0a1(Build Id:20170303030202) and using 52.0 (Build Id:20170302120751) on Windows 7 64bit.
Flags: qe-verify+
Status: RESOLVED → VERIFIED
Flags: in-qa-testsuite+
You need to log in before you can comment on or make changes to this bug.