Closed Bug 763593 Opened 13 years ago Closed 13 years ago

Firefox 13 incorrectly displays photos on Skydrive

Categories

(Core :: Graphics: ImageLib, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla16
Tracking Status
firefox14 + disabled
firefox15 + fixed
firefox16 + fixed

People

(Reporter: robert.h.henson, Assigned: justin.lebar+bug)

References

Details

(Keywords: regression)

Attachments

(2 files, 1 obsolete file)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20100101 Firefox/13.0 Build ID: 20120601045813 Steps to reproduce: I tried to display photographs on a Skydrive page using Firefox 13 running under Windows 7. http://tinyurl.com/bv6uy2t This was repeated on another set of photos on a similar Skydrive page. A further similar page was tried, but on Dropbox. Actual results: The page displayed a row of thumbnails of the photos. Attempting to display the photos by clicking the thumbnails resulted in some of the photos being displayed, but not others. Refreshing the page did not help. Using the back and forward arrows failed too. It was the same photos that did not display each time. Confirmation by many other people revealed that all worked correctly and as expected with Firefox 12, but not with Firefox 13. It did not help to run Firefox in Safe Mode, ruling out extensions and plug-ins as a source of problems. It did not help to reinstall Firefox 13 (two people tried) I repeated the check in Linux with Firefox 12 - the photographs displayed just fine. I upgraded to Firefox 13 and immediately repeated the test - it now failed in identical manner to all the other attempts. Clicking "View Original" correctly displayed the photos one at a time. Viewing the photos in Internet Explorer 9 or Chrome displays them correctly, only Firefox and then only version 13 fails. Numerous other people have had the same result. Some of the results are discussed in the mozilla.firefox.support newsgroup under the heading "Firefox 13 problem displaying photos". The bug could not be reproduced on a similar page on Dropbox - it only happens on Skydrive pages (two different ones tried with identical results). Expected results: The photos should all have displayed have displayed from the thumbnails and from the left/right arrows. There should be no difference in this regard between Firefox 12 and Firefox 13. There should be no difference between the way Firefox handled sets of photos on Dropbox and on Skydrive.
Is this reproducible in a new Profile too?
It looks like Skydrive does some magic where they serve a low-res version of the photo and then serve a higher-res version of the photo. (Whatever's going on, it's likely due to an interaction with this magic -- possibly from an intermediate broken-image/franken-image getting cached somehow? Assuming this is either an ImageLib or a networking bug. Kicking to ImageLib for now.) I can hit this bug (or something like it) in Nightly -- paging through the photos quickly with leftarrow / rightarrow will usually hit at least one photo that doesn't load (the screen is just gray). I also noticed a number of messages like this in my Error Console when testing this: { Error: Image corrupt or truncated: https://byfiles.storage.live.com/y1pBZxTIhDXcp0qWlb7D5iQW33zRM4-oN69YtJ_n1lvnNEenioyLEzAQ3KkLJc5n_11cNfwyV9i48U/Aviemore_004.jpg?psid=1 Source File: https://byfiles.storage.live.com/y1pBZxTIhDXcp0qWlb7D5iQW33zRM4-oN69YtJ_n1lvnNEenioyLEzAQ3KkLJc5n_11cNfwyV9i48U/Aviemore_004.jpg?psid=1 Line: 0 } (Note: refreshing the page _did_ help for me, though -- unlike comment 0. Bob: are you sure it doesn't help for you? What if you do a "full" refresh with Ctrl+Shift+R? (no resources loaded from cache))
Status: UNCONFIRMED → NEW
Component: Untriaged → ImageLib
Ever confirmed: true
Product: Firefox → Core
QA Contact: untriaged → imagelib
Version: 13 Branch → Trunk
My nightly build was: Mozilla/5.0 (X11; Linux x86_64; rv:16.0) Gecko/16.0 Firefox/16.0a1 and (per comment 1) I was indeed able to reproduce it in a fresh profile.
OS: Windows 7 → All
Hardware: x86_64 → All
works in Firefox 12, broken in Firefox 13 (and all later.) need a regression window.
> (Note: refreshing the page _did_ help for me, though -- unlike comment 0. > Bob: are you sure it doesn't help for you? What if you do a "full" refresh > with Ctrl+Shift+R? (no resources loaded from cache)) No - I did that, as was suggested to me, but it made no difference.
Regression window(m-c) Good: http://hg.mozilla.org/mozilla-central/rev/c33438bd5706 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20120311 Firefox/13.0a1 ID:20120311005632 Bad: http://hg.mozilla.org/mozilla-central/rev/5ec9524de1af Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20120312 Firefox/13.0a1 ID:20120312031136 Pushlog: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=c33438bd5706&tochange=5ec9524de1af Regression window(m-i) Good: http://hg.mozilla.org/integration/mozilla-inbound/rev/bd50a33d8c69 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20120309 Firefox/13.0a1 ID:20120309133232 Bad: http://hg.mozilla.org/integration/mozilla-inbound/rev/55e63a03ccad Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20120309 Firefox/13.0a1 ID:20120309152032 Pushlog: http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=bd50a33d8c69&tochange=55e63a03ccad Suspected: Bug 731419
FYI image.mem.discardable = false helps
As far as I can tell, this is a real bug, unrelated to bug 731419. What happens is, setting the src of an image eventually triggers a call to nsImageLoadingContent::ClearCurrentRequest(). ClearCurrentRequest unlocks the image, and, *if we haven't finished decoding it yet* we discard it. It's that dependence on having finished decoding which makes this bug appear intermittently. Discarding cancels the decode, which causes us to throw the "Image corrupt or truncated:" error. That error, in this case, should instead say "*we* truncated this image." mozilla::image::Decoder::Finish at src/image/src/Decoder.cpp:106 mozilla::image::RasterImage::ShutdownDecoder at src/image/src/RasterImage.cpp:2325 mozilla::image::RasterImage::UnlockImage at src/image/src/RasterImage.cpp:2688 imgRequestProxy::UnlockImage at src/image/src/imgRequestProxy.cpp:336 nsDocument::RemoveImage at src/content/base/src/nsDocument.cpp:8249 nsImageLoadingContent::UntrackImage at src/content/base/src/nsImageLoadingContent.cpp:1161 nsImageLoadingContent::ClearCurrentRequest at src/content/base/src/nsImageLoadingContent.cpp:1046 nsImageLoadingContent::PrepareCurrentRequest at src/content/base/src/nsImageLoadingContent.cpp:997 nsImageLoadingContent::MakePendingRequestCurrent at src/content/base/src/nsImageLoadingContent.cpp:1021 nsImageLoadingContent::LoadImage at src/content/base/src/nsImageLoadingContent.cpp:753 nsImageLoadingContent::LoadImage at src/content/base/src/nsImageLoadingContent.cpp:656 nsHTMLImageElement::SetAttr at src/content/html/content/src/nsHTMLImageElement.cpp:428 nsGenericHTMLElement::SetAttr at src/content/html/content/src/nsGenericHTMLElement.h:163 nsGenericHTMLElement::SetAttrHelper at src/content/html/content/src/nsGenericHTMLElement.cpp:2830 nsHTMLImageElement::SetSrc at src/content/html/content/src/nsHTMLImageElement.cpp:115
Okay, this is in fact being tickled by bug 731419; if I take out the call to RasterImage::RequestDiscard in nsDocument::RemoveImage, the bug goes away. But I don't think the problem is in the nsDocument code. If I modify RasterImage::UnlockImage and take out ShutdownDecoder(eShutdownIntent_Interrupted); ForceDiscard(); then everything also works properly. Since once an image isn't displayed, it's never displayed for the whole session, I think we're somehow getting into a state where we cancel the decode but don't clean out the RasterImage sufficiently and the image becomes permanently corrupted. I'm scared to try to figure out what this page is doing, but it seems to start and cancel many image decodes every time we navigate the page.
Tracking this regression, do we need to look into backing out bug 731419 or will a forward fix be investigated for this and bug 762460?
> will a forward fix be investigated for this and bug 762460? If it's not clear from comments 8 and 9, I'm investigating a forward fix here. Worst case, I don't think we need to back out all of bug 731419, as I indicated in comment 9.
> Worst case, I don't think we need to back out all of bug 731419, as I indicated in comment 9. Well, I guess I didn't really indicate as much there. Bug 731419 actually makes two changes. When a tab is closed, we immediately discard its image data. And when an image is "removed from the document", we also immediately discard it. It's only this second change which appears to be causing problems. But again, I think there's a deeper problem here. Joe et al., I could use some help understanding what's going on in comment 9. Again, it seems like the decoder shutdown gets us into a borked state from which we never recover.
> If I modify RasterImage::UnlockImage and take out > > ShutdownDecoder(eShutdownIntent_Interrupted); > ForceDiscard(); > > then everything also works properly. If I take out only the ForceDiscard(), the bug still manifests. I can't take out ShutdownDecoder and leave ForceDiscard, because you can't discard while there's an active decoder, but this suggests that the problem is indeed with the decoder and not with discarding per se.
Okay, I think I have this. I'll post a patch tonight or tomorrow.
Assignee: nobody → justin.lebar+bug
Attached patch Patch, v1 (obsolete) — Splinter Review
I have to admit, I'm not entirely sure how we get into this state to begin with, or why it's intermittent. But it seems what's going wrong is that MakePendingRequestCurrent() is called when both the pending and current requests point to the same image. We unlock the current request, killing its decoder, and then swap in the pending request, which is now borked (I'm not exactly sure how). I think I could write this patch differently: Lock the current request only if the current and pending requests are both for the same image. But even then, I'd have to check whether the image was locked; I don't want to lock an unlocked image, because that will kick off decodes and whatnot. Not checking that the two requests are for the same image seemed simpler to me. Because I don't quite understand how we get into this state, I haven't written a test. We really should figure this out, though. I'm also not confident we should take this patch on branches. There's a simpler one-line fix to nsDocument, which I'll post here in a moment.
Attachment #634303 - Flags: review?(joe)
Attachment #634303 - Flags: review?(bzbarsky)
Attachment #634304 - Flags: review?(bzbarsky)
Comment on attachment 634303 [details] [diff] [review] Patch, v1 r=me
Attachment #634303 - Flags: review?(bzbarsky) → review+
Comment on attachment 634304 [details] [diff] [review] Patch for branches, v1 Hmm. Why is it ok to do this?
> Why is it ok to do this? The code this patch removes was added in bug 731419. We won't discard these images as quickly, but we'll still eventually discard them. (Alternatively, we could take the other patch on branches; it's not *too* scary, except for the interface changes.)
Comment on attachment 634304 [details] [diff] [review] Patch for branches, v1 Ah. r=me, I guess. The interface changes worry me on beta, at least... I'm not sure whether taking this patch on beta is better than living with the regression for that cycle, though.
Attachment #634304 - Flags: review?(bzbarsky) → review+
Comment on attachment 634304 [details] [diff] [review] Patch for branches, v1 [Approval Request Comment] Bug caused by: bug 731419 User impact if declined: This bug, bug 762460. Testing completed (on m-c, etc.): Pushed to try. :) Risk to taking this patch (and alternatives if risky): See comment 20. This may increase memory usage temporarily for some users, which can lead to OOMs. But it's no worse than we were in FF12. String or UUID changes made by this patch: None.
Attachment #634304 - Flags: approval-mozilla-beta?
Comment on attachment 634303 [details] [diff] [review] Patch, v1 Review of attachment 634303 [details] [diff] [review]: ----------------------------------------------------------------- r=me, but I'd like it if we had an RAII class even more. ::: content/base/src/nsImageLoadingContent.cpp @@ +1030,5 @@ > + currentRequest->IsLocked(&currentRequestIsLocked); > + } > + if (currentRequestIsLocked) { > + currentRequest->LockImage(); > + } You should make an RAII class to do this! ::: image/public/imgIRequest.idl @@ +166,5 @@ > */ > void unlockImage(); > > /** > + * Checks whether the image is locked. s/Checks w/W/
Attachment #634303 - Flags: review?(joe) → review+
> s/Checks w/W/ s/Checks/Gets/? I'd like it to be a complete sentence, since it's a method. I guess we could make it an attribute... > You should make an RAII class to do this! It's kind of tricky because we only want to lock if the image is already locked. So it's either a really specific RAII class, or otherwise it has to be one of those Init()'able RAII classes.
(In reply to Justin Lebar [:jlebar] from comment #24) > > s/Checks w/W/ > > s/Checks/Gets/? I'd like it to be a complete sentence, since it's a method. > I guess we could make it an attribute... A readonly attribute, sure. > > You should make an RAII class to do this! > > It's kind of tricky because we only want to lock if the image is already > locked. So it's either a really specific RAII class, or otherwise it has to > be one of those Init()'able RAII classes. You could just lock the image unconditionally. Locking doesn't kick off decodes!
> You could just lock the image unconditionally. Locking doesn't kick off decodes! Oh, duh. Okay, that sounds much better.
Attached patch Patch v2Splinter Review
RAII, unconditional locking. Much smaller patch! No interface changes, so I guess we could take this on beta if we wanted.
Attachment #634303 - Attachment is obsolete: true
Attachment #634937 - Flags: review?(joe)
Comment on attachment 634937 [details] [diff] [review] Patch v2 Lovely
Attachment #634937 - Flags: review?(joe) → review+
Comment on attachment 634937 [details] [diff] [review] Patch v2 [Approval Request Comment] See approval request comment in comment 22. We don't need both this patch and the other in this bug approved for beta; just one or the other. The other patch is a straight backout, so is less risky, but this should have fewer negative effects (increased memory usage).
Attachment #634937 - Flags: approval-mozilla-beta?
Attachment #634937 - Flags: approval-mozilla-aurora?
Comment on attachment 634303 [details] [diff] [review] Patch, v1 Rescinding my review on the older patch, given how this bug developed.
Attachment #634303 - Flags: review+ → review-
Comment on attachment 634304 [details] [diff] [review] Patch for branches, v1 [Triage comment] Let's take this backout since it's less risky and puts us more comparably to where we're at with FF12 - thank you Justin for providing options. I'll leave the approval-aurora nom on the other one and we can re-evaluate it after the patch in comment 29 lands on central.
Attachment #634304 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment on attachment 634937 [details] [diff] [review] Patch v2 taking the patch in comment 22 for beta (backout) instead.
Attachment #634937 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla16
Comment on attachment 634937 [details] [diff] [review] Patch v2 [Triage Comment] We can take the forward fix for 15.
Attachment #634937 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
When trying to verify this on Firefox 15 beta 4, I've noticed a number of error messages in the Error Console on Windows 7: Timestamp: 8/14/2012 4:48:33 PM Error: The character encoding of the plain text document was not declared. The document will render with garbled text in some browser configurations if the document contains characters from outside the US-ASCII range. The character encoding of the file needs to be declared in the transfer protocol or file needs to use a byte order mark as an encoding signature. Source File: https://secure.wlxrs.com/wkbwMVPzjLD-5BZn35tBbw/Bucket2c.js Line: 0 On Windows 7 - all the photos are properly displayed. Moreover, as on Firefox 13, not all the photos from Skydrive are displayed on Firefox 15 beta 4 on Ubuntu 12.04 and Mac OS X 10.7. The same error as the one mentioned in Comment 2 is displayed in the Error Console.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: