Closed Bug 715308 Opened 13 years ago Closed 13 years ago

Decode ::Draw()'n images before other images

Categories

(Core :: Graphics: ImageLib, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla13

People

(Reporter: justin.lebar+bug, Assigned: justin.lebar+bug)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [snappy])

Attachments

(5 files, 11 obsolete files)

69.08 KB, patch
joe
: review+
Details | Diff | Splinter Review
8.68 KB, patch
Details | Diff | Splinter Review
31.10 KB, patch
joe
: review+
Details | Diff | Splinter Review
142 bytes, text/plain
joe
: review+
Details
3.18 KB, patch
joe
: review+
Details | Diff | Splinter Review
RasterImage::Draw() is called when an image is drawn on the screen, even if the image is not yet decoded. So I think we can and should prioritize the decoding of Draw()'n images. This should improve our "decode many images" story greatly.
Blocks: 676270
Bug 674547 (which I hoping to finish Real Soon Now) should be of help since the decode worker will have a queue of images that are waiting to decode. My local WIP is round-robin (i.e. do some decoding then punt to the end of the queue if it doesn't finish) to match the current behavior, but changing it to be sorted by some sort of priority in this bug should fairly straightforward, I think.
Yes, bug 674547 should block this bug. I'd forgotten about that one (even though I filed it!), and I was planning to do that work as part of this bug. I don't want to steal bug 674547 away from you, but we really want this one, and the sooner the better, because we've had problems with regressions in the past. Maybe you could post your incomplete patches there?
Depends on: 674547
Depends on: 715405
Adding [snappy]: When switching tabs to an image-heavy page with discarded images, we should decode the images which appear on the screen before the images offscreen. This will make decoding seem much faster.
Whiteboard: [snappy]
Assignee: nobody → justin.lebar+bug
I don't have the patches on hand, unfortunately (they're on a SATA drive, but the older P4 system I'm using in the interim can only use IDE drives). However, I got the build environment up and running and I don't think it would take too long to duplicate the work I had. I'll see what I can get done over the weekend. If it looks like I won't finish in a timely manor, I'll gladly hand off bug 674547. I'd hate for this to miss the Fx12 train due to being blocked on me and my personal issues.
> I'll see what I can get done over the weekend. Don't worry about it; I have a patch for this that's 95% done. I'd definitely like your feedback on the patch, if you're able to have a look! I'll mark it when I post the patch.
Attached patch Patch v1 (obsolete) — Splinter Review
Comment on attachment 586557 [details] [diff] [review] Patch v1 Note the dependency on bug 715405, which adds the linked list class.
Attachment #586557 - Flags: review?(joe)
Attachment #586557 - Flags: feedback?(rclickenbrock)
Blocks: 716103
Blocks: 716113
Attached patch Patch v1.1 (rebased to tip) (obsolete) — Splinter Review
Attachment #586602 - Flags: review?(joe)
Attachment #586602 - Flags: feedback?(rclickenbrock)
Attachment #586557 - Attachment is obsolete: true
Attachment #586557 - Flags: review?(joe)
Attachment #586557 - Flags: feedback?(rclickenbrock)
Attached patch Patch v1.2 (rebase properly) (obsolete) — Splinter Review
Comment on attachment 586602 [details] [diff] [review] Patch v1.1 (rebased to tip) Sorry for the churn; still working out the kinks in this git workflow I'm trying out.
Attachment #586602 - Attachment is obsolete: true
Attachment #586602 - Flags: review?(joe)
Attachment #586602 - Flags: feedback?(rclickenbrock)
Attachment #586607 - Flags: review?(joe)
Attachment #586607 - Flags: feedback?(rclickenbrock)
Whiteboard: [snappy] → [snappy][needs review]
Blocks: 716140
This is leaking on try because it's calling ClearOnShutdown() after XPCOM shutdown. Why on earth someone is decoding an image after XPCOM shutdown is beyond me...
Depends on: 716523
Well, now that I think about it, we don't need to ClearOnShutdown() the decode worker. Weak references should do the trick. (Unless trace-malloc will complain that I'm leaking a weakref, in which case I may throttle it.)
Attachment #586607 - Flags: review?(joe)
Whiteboard: [snappy][needs review] → [snappy]
Okay, fixed the leak. Still some test failures. https://tbpl.mozilla.org/?tree=Try&rev=eb29adb5609e
Attachment #587543 - Attachment description: Bug 715308 - Decode all available bytes in SourceDataComplete(). → Part 2 (straw man?) - Decode all available bytes on call to SourceDataComplete()
Attached patch Part 1, v2 (obsolete) — Splinter Review
Comment on attachment 587544 [details] [diff] [review] Part 1, v2 I think this would be cleaner if the code that does the actual decoding was factored out to a method on the RasterImage (RasterImage::Decode(), perhaps). That way in cases where immediate decoding is needed, we won't have to go through the DecodeWorker. > nsresult decoderStatus = decoder->GetDecoderError(); > if (NS_FAILED(decoderStatus)) { > DoError(); > return decoderStatus; > } > >- // Kill off the worker >- mWorker = nsnull; >- mWorkerPending = false; >+ // Kill off our decode request, if it's pending. (If not, this call is >+ // harmless.) >+ DecodeWorker::Singleton()->StopDecoding(this); It looks like StopDecoding() won't get called if finalizing the decoder triggers an error as we'll return before we make it here. This should probably happen before the decoder is finalized. > if (!mDecoded && mHasSourceData) { > mDrawStartTime = TimeStamp::Now(); >+ >+ // We're drawing this image, so indicate that we should decode it as soon >+ // as possible. >+ DecodeWorker::Singleton()->MarkAsASAP(this); > } The mHasSourceData condition will prevent prioritization of images that are mid-download. Is that what we actually want? It seems to me like we'd still want to prioritize the decoding even if we don't have all the data yet. >+ // If we aren't yet finished decoding and we have more data in hand, re-post >+ // ourselves to the decode worker. >+ if (aImg->mDecoder && !aImg->IsDecodeFinished() && haveMoreData) { >+ RequestDecode(aImg); >+ } nit: The !IsDecodeFinished() condition can be dropped. In the case where the decode has finished, we won't have a decoder because it would have been shutdown in the code just before this. Aside from that the patch looks good to me. I don't appear to be able to set flags, but feedback+ from me.
Comment on attachment 587543 [details] [diff] [review] Part 2 (straw man?) - Decode all available bytes on call to SourceDataComplete() I need to do some testing to determine how bad this is. The right solution might involve bug 505385, which would let us separate out the network load-done event from the decoder's decode-done event (and would, in general, untangle the mess that is these events). But that looks more complicated than I'd like to take on. I guess I could replicate what the code currently does -- frob the decoder and let it run for however long it wants to run. Joe, do you think that's a better solution than decoding all available data in SourceDataComplete()?
Thanks for the feedback, Robert! I'll have a closer look in the morning. (You should have editbugs privilege now, by the way. Let me know if you still can't set f+ and so on.)
Comment on attachment 587543 [details] [diff] [review] Part 2 (straw man?) - Decode all available bytes on call to SourceDataComplete() This looks good on try: https://tbpl.mozilla.org/?tree=Try&rev=35ac1b918eb5
Attachment #586607 - Flags: feedback?(rclickenbrock)
Attachment #587544 - Flags: feedback+
(In reply to Justin Lebar [:jlebar] from comment #18) > I guess I could replicate what the code currently does -- frob the decoder > and let it run for however long it wants to run. Joe, do you think that's a > better solution than decoding all available data in SourceDataComplete()? I think I need more specific verbs than "frob" to evaluate the solution :)
Heh, okay. :) So the three options are: 1) What we currently do: On SourceDataComplete(), synchronously invoke the decoder and let it run for one timeslice, where a timeslice is defined as however long the decoder decides to run for before it yields. 2) On SourceDataComplete(), synchronously invoke the decoder and wait for it to finish decoding the whole image. 3) On SourceDataComplete(), asynchronously invoke the decoder. Then we need some way to call back into the imgRequest and tell it when the decode has finished. There are tradeoffs associated with each of these options: (1) This is semantically incorrect -- we may fire onload before an image is decoded, and we may not notice when an image has an error. It also may defeat some of what we're trying to accomplish in this bug, in that we want to avoid each image being responsible for its own decoding, so that N images don't hog N timeslices in the event loop. (2) This may block the event loop for a long time if there's a lot to decode; it's no better and may be much worse than (1) in this respect. (3) This is probably the right approach, but it's complicated, particularly because the decoder notifications are a giant mess (bug 505385). I pushed a patch to try with approach (3), but it's a total hack: I add a new event, OnDecoderShutdown to a new interface, imgDecoderObserver2. When imgRequest calls RasterImage::SourceDataComplete(), it passes in a pointer to itself. The RasterImage keeps this pointer around, and, in ShutdownDecoder(), calls OnDecoderShutdown(rv). I feel bad adding more noodles to this spaghetti code (option 3). But I also don't like writing semantically-incorrect code (option 1).
Option 1 sucks, but it (importantly) doesn't suck more than we currently suck. So it's my second choice. Option 2 is right out because of the event loop blocking. Option 3 is what I prefer, and here's how I'd go about doing it: 1) Make imgIDecoderObserver::onStopDecode actually signal the end of decoding. This is likely to be difficult/messy, but is the right long-term solution. See bug 505385 and bug 435296, and the brain of Bobby Holley. 2) Inside imgStatusTracker, wait for *both* onStopDecode and onStopRequest to fire either of them to an observer. That way, one or the other can happen first, but we always send them out in the correct order to observers.
(In reply to Justin Lebar [:jlebar] from comment #22) > (1) This is semantically incorrect -- we may fire onload before an image is > decoded, and we may not notice when an image has an error. One would think so, but this isn't how the de-facto standard plays out. I was worried about this back when I was doing decode-on-draw, because there's really no way to tell if an image has errors without decoding it, but we need to fire either |onload| or |onerror| at _load_ time, not at decode time. But I investigated it, and it turned out that browsers only fire |onerror| if the network load fails, or if the mimetype sniff fails (ie, the first few bytes are garbage). See bug 435296 comment 27. So firing onload before an image is decoded isn't a big deal in itself. However, bz thought that any image that has _already_ started to decode should be allowed to finish before onload is fired. This is where the OnStartDecode->OnStopContainer onload blocking came from.
Decoding just 64 bytes on SourceDataComplete(), only if we've previously decoded 0 bytes, seems to work. That would be awesome...
Try run for 0375f9ea82c5 is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=0375f9ea82c5 Results (out of 218 total builds): exception: 69 success: 86 warnings: 27 failure: 36 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jlebar@mozilla.com-0375f9ea82c5
Comment 26 is tbplbot being pretty slow. The decode-64-bytes patch is https://tbpl.mozilla.org/?tree=Try&rev=f852066a6eae, which has some orange, but not as much.
Try run for f852066a6eae is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=f852066a6eae Results (out of 272 total builds): exception: 2 success: 233 warnings: 33 failure: 4 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jlebar@mozilla.com-f852066a6eae Timed out after 06 hours without completing.
Blocks: 702579
Blocks: image-suck
Review: :JOEDREW!
Attachment #586607 - Attachment is obsolete: true
Attachment #587543 - Attachment is obsolete: true
Attachment #587544 - Attachment is obsolete: true
Attachment #588116 - Flags: review?(joe)
Attachment #588117 - Flags: review?(joe)
Try run for e28f6812d616 is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=e28f6812d616 Results (out of 273 total builds): exception: 1 success: 236 warnings: 25 failure: 11 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jlebar@mozilla.com-e28f6812d616
That line of red TP4 runs is fishy. It looks unrelated, but I'll dig in.
Comment on attachment 588116 [details] [diff] [review] Part 1, v3: Decode RasterImage:Draw()'n images before other images. Review of attachment 588116 [details] [diff] [review]: ----------------------------------------------------------------- The biggest problems I have with this patch is that 1) ASAP requests can starve normal requests, and 2) AFAICT we don't round-robin through requests, so a long-decoding image can stop other images from getting any work done at all, which is a functional change from before. I think if we just toss the image to the end of the linked list after work has been done on it, that would satisfy me for #2. ::: image/src/RasterImage.cpp @@ +58,5 @@ > #include "ImageLogging.h" > +#include "mozilla/TimeStamp.h" > +#include "mozilla/Telemetry.h" > +#include "mozilla/Preferences.h" > +#include "mozilla/ClearOnShutdown.h" All four of these are included again below. You should just remove these additions. @@ +231,5 @@ > + > + DecodeWorker() > + : mPendingInEventLoop(false) > + { > + } {} @@ +2543,5 @@ > // If we get this far, dispatch the worker. We do this instead of starting > // any immediate decoding to guarantee that all our decode notifications are > // dispatched asynchronously, and to ensure we stay responsive. > + DecodeWorker::Singleton()->RequestDecode(this); > + return NS_OK; Add a blank line before the return. @@ +2906,5 @@ > + } > + > + request->mIsASAP = true; > + > + if (request->isInList()) { FWIW, I hope review comments for the LinkedList class cause LinkedList's capitalization to be Gecko-style. :) @@ +2909,5 @@ > + > + if (request->isInList()) { > + // If the decode request is in a list, it must be in the normal decode > + // requests list -- if it had been in the ASAP list, then mIsASAP would > + // have been true above. Move the request to the ASAP list. You should add a comment saying that we should already be pending in this case, because RequestDecode() below depends on it. You could add an assert/NS_ABORT_IF_FALSE if you wanted, for both pending and that the request is in the normal decode requests list. @@ +2928,5 @@ > + > + if (request->mIsASAP) { > + mASAPDecodeRequests.insertBack(request); > + } > + else { Cuddle else please. @@ +2961,5 @@ > + // During this trip through the event loop, we'll decode only ASAP requests > + // or only normal requests, but not both. We need to spin the event loop > + // when we transition from ASAP to normal requests so the ASAP requests have > + // a chance to be drawn. > + LinkedList<DecodeRequest> &requestList = LinkedList<DecodeRequest>& requestList @@ +2963,5 @@ > + // when we transition from ASAP to normal requests so the ASAP requests have > + // a chance to be drawn. > + LinkedList<DecodeRequest> &requestList = > + !mASAPDecodeRequests.isEmpty() ? > + mASAPDecodeRequests : mNormalDecodeRequests; Put this all on one line for ease of reading. (I'd actually prefer if/else, but that's impossible with references.) @@ +2969,5 @@ > + while (true) { > + TimeStamp startIteration = TimeStamp::Now(); > + > + // Have we been decoding for too long? > + if ((startIteration - eventStart).ToMilliseconds() >= gMaxMSBeforeYield) { I'd prefer if this while loop wasn't a while (true); perhaps a do... while loop would be better. @@ +2986,5 @@ > + > + // If decode requests are pending, re-post ourself to the event loop. > + if (!mASAPDecodeRequests.isEmpty() || !mNormalDecodeRequests.isEmpty()) { > + mPendingInEventLoop = true; > + NS_DispatchToCurrentThread(this); Put this (and the similar code in RequestDecode) inside a method: void EnsurePending() { if (!mPendingInEventLoop) // etc } @@ +2990,5 @@ > + NS_DispatchToCurrentThread(this); > + } > + > + Telemetry::Accumulate(Telemetry::IMAGE_DECODE_LATENCY, > + PRUint32((TimeStamp::Now() - eventStart).ToMilliseconds())); I think you need to test whether this is a size decode before accumulating, as they throw off the numbers. @@ +3082,5 @@ > + > + if (NS_FAILED(aImg->ShutdownDecoder(RasterImage::eShutdownIntent_Done))) { > + aImg->DoError(); > + } > + MOZ_ASSERT(!request->mIsASAP); I don't understand why this assert is here. @@ +3088,5 @@ > + > + // If we aren't yet finished decoding and we have more data in hand, re-post > + // ourselves to the decode worker. > + if (aImg->mDecoder && !aImg->IsDecodeFinished() && haveMoreData) { > + RequestDecode(aImg); I think this is already handled in ::Run(), above, right? We should only do it in one spot, probably Run(). And if it *is* Run(), we should rename this function, since it only decodes part of the image. ::: image/src/RasterImage.h @@ +168,3 @@ > class Decoder; > > +namespace internal { Why not make this a subclass of RasterImage? @@ +182,5 @@ > + DecodeRequest(RasterImage* aImage) > + : mImage(aImage) > + , mIsASAP(false) > + { > + } {} on the same line is the usual convention, to show that there is intentionally no body. ::: modules/libpref/src/init/all.js @@ +3283,5 @@ > // Chunk size for calls to the image decoders > pref("image.mem.decode_bytes_at_a_time", 4096); > > // The longest time we can spend in an iteration of an async decode > +pref("image.mem.max_ms_before_yield", 10); Why should we change this to 10 ms as part of this patch? Also: As currently written, this same time will be used as the timeout both for individual images and the multiple-image-decoder worker. Why should they have the same timeout?
Attachment #588116 - Flags: review?(joe) → review-
Comment on attachment 588117 [details] [diff] [review] Part 2, v1: Sync-decode the beginning of the image in RasterImage::SourceDataComplete() Review of attachment 588117 [details] [diff] [review]: ----------------------------------------------------------------- ::: image/src/RasterImage.cpp @@ +239,5 @@ > > + /* Decode some chunks of the given image. If aBriefDecode is set, decode > + * just a few bytes if no bytes have been decoded yet -- do just enough for > + * us to figure out if there's an error in the image. */ > + nsresult DecodeImage(RasterImage* aImg, bool aBriefDecode = false); Instead of a boolean, use an enumerated type. @@ +3019,5 @@ > // If an error is flagged, it probably happened while we were waiting > // in the event queue. Bail early, but no need to bother the run queue > // by returning an error. > if (aImg->mError) > + return NS_OK; You went to a bit of work removing the nsresults from imgDecodeWorker::Run() when moving it to DecodeImage, and then put them back in this patch. Maybe it'd be better just to leave them in. @@ +3031,5 @@ > > + PRUint32 maxBytes; > + if (!aBriefDecode) { > + // We really do mean it when we say "brief". > + maxBytes = 32; This should be a pref. @@ +3052,5 @@ > // 1) We don't have any data left to decode > // 2) The decode completes > // 3) We hit the deadline and need to yield to keep the UI snappy > while (haveMoreData && !aImg->IsDecodeFinished() && > + TimeStamp::Now() < deadline) { Unrelated change; you can put it in patch 1 if you want. @@ +3067,5 @@ > haveMoreData = aImg->mSourceData.Length() > aImg->mBytesDecoded; > + > + // Iterate only once if we're a brief decode. > + if (aBriefDecode) > + break; It would be cleaner if this was changed into a do...while loop, because this could be added to the condition.
Attachment #588117 - Flags: review?(joe) → review+
> 1) ASAP requests can starve normal requests Is this a problem? By definition, a normal request isn't onscreen. So why do we care if it's starved to make room for ASAP requests? (It'll become ASAP as soon as it is Draw()'n.) > 2) AFAICT we don't round-robin through requests, so a long-decoding image can stop other images from > getting any work done at all, which is a functional change from before. In DecodeWorker::Run, we popFirst(), but in DecodeWorker::RequestDecode() we insertBack(). So we do round-robin. > Also: As currently written, this same time will be used as the timeout both for individual images > and the multiple-image-decoder worker. Why should they have the same timeout? Why is it better to always multiplex one timeslice across multiple images? Currently, we basically do A) 5ms decode | 5ms decode | ... | main event loop | 5ms decode | ... But with this patch, we'll do B) 5ms decode | main event loop | 5ms decode | main event loop | In either case, each 5ms decode is usually just for one image. (In (A), if an image finishes before its 5ms is up, we move on to the next event immediately. In (B), if an image finishes before its 5ms is up, we move on to another image for the remaining 5ms.)
> + PRUint32 maxBytes; > + if (!aBriefDecode) { > + // We really do mean it when we say "brief". > + maxBytes = 32; > This should be a pref. I'm surprised you didn't give me more grief over this, because I can't prove that 32 is the right number here, short of pointing to a green run on try. :) It's a correctness issue that this value be big enough, so I'm not sure it should be a pref. > Why should we change this to 10 ms as part of this patch? We used 5ms previously not because we wanted to yield to the event loop every 5ms, but because N images could stack up in the event loop, and we ended up yielding after 5*N-ms. 5ms was about the smallest value I could use and still give us reasonably-sized chunks to decode. We can afford longer time-slices with this patch, since we don't have this 5*N-ms behavior. I picked 10ms because it's under 60fps. I can spin it off into a separate commit, if that's all you mean. Or we can leave it at 5ms, which I'm also fine with.
> Why not make [DecodeRequest] a subclass of RasterImage? You say "composition", I say "inheritance." I say "inheritance", you say "composition"... I did it to match the style of DiscardTrackerNode. But I can change it if you want.
Joe explained on IRC that he meant that DecodeRequest could be an inner class of RasterImage. That's fine with me!
>@@ +2961,5 @@ >> + // During this trip through the event loop, we'll decode only ASAP requests >> + // or only normal requests, but not both. We need to spin the event loop >> + // when we transition from ASAP to normal requests so the ASAP requests have >> + // a chance to be drawn. >> + LinkedList<DecodeRequest> &requestList = > >LinkedList<DecodeRequest>& requestList This file (and this patch) puts |*| on the variable names. Why put |&| on the typename? >@@ +2963,5 @@ >> + // when we transition from ASAP to normal requests so the ASAP requests have >> + // a chance to be drawn. >> + LinkedList<DecodeRequest> &requestList = >> + !mASAPDecodeRequests.isEmpty() ? >> + mASAPDecodeRequests : mNormalDecodeRequests; > >Put this all on one line for ease of reading. (I'd actually prefer if/else, but that's impossible with references.) That makes this a 121-character-long line. I made requestList a pointer so we don't have to have this argument. :) >@@ +2969,5 @@ >> + while (true) { >> + TimeStamp startIteration = TimeStamp::Now(); >> + >> + // Have we been decoding for too long? >> + if ((startIteration - eventStart).ToMilliseconds() >= gMaxMSBeforeYield) { > >I'd prefer if this while loop wasn't a while (true); perhaps a do... while loop would be better. I did it this way so I don't have to read the time twice, at the top of the loop (to time the decode) and at the bottom of the loop, for the while() condition. >> + NS_DispatchToCurrentThread(this); >> + } >> + >> + Telemetry::Accumulate(Telemetry::IMAGE_DECODE_LATENCY, >> + PRUint32((TimeStamp::Now() - eventStart).ToMilliseconds())); > >I think you need to test whether this is a size decode before accumulating, as they throw off the numbers. Well, except we might have done multiple things while we were here. It could have been two size decodes, or a size decode plus a normal decode. I think this is a pretty un-interesting piece of telemetry, tbh. But the goal is to tell us how long we spend holding up the event loop, so maybe it should stay how it is? You can now filter out small values in the telemetry dashboard, btw. >@@ +3088,5 @@ >> + >> + // If we aren't yet finished decoding and we have more data in hand, re-post >> + // ourselves to the decode worker. >> + if (aImg->mDecoder && !aImg->IsDecodeFinished() && haveMoreData) { >> + RequestDecode(aImg); > >I think this is already handled in ::Run(), above, right? No, but there is a bug here. We need to RequestDecode() here so that *this particular image* is decoded again sometime; we pop the decode request off the list before calling DecodeImage(). But the code in Run() to post back to the event loop is also necessary, because it might happen that we call DecodeImage(), finish decoding the whole image (so don't call RequestDecode), and then notice that we're out of time. In this case, we need to post ourselves back to the event loop! That's what the code in Run() does. However, the code in Run() should call EnsurePendingInEventLoop(), because we might already be pending; it currently assumes we're not pending, which is wrong. >We should only do it in one spot, probably Run(). And if it *is* Run(), we should rename this function, since it only decodes part of the image. Once you call RequestDecode() on an image, it'll get fully decoded. It just might take a few async steps. I guess it's kind of confusing that the method is used internally to mean "post me back and decode more of this image", but to the outside caller, this is just like recursion.
(In reply to Justin Lebar [:jlebar] from comment #38) > > This should be a pref. > > I'm surprised you didn't give me more grief over this, because I can't prove > that 32 is the right number here, short of pointing to a green run on try. > :) It's a correctness issue that this value be big enough, so I'm not sure > it should be a pref. I'm a bit concerned about this. What exactly goes orange on try if this is too low?
What is this "yield"? Don't all the operating systems do preemptive multitasking? If not, what purpose does yielding serve on preemptive environments, and if so, why shouldn't this be designed to let the operating system handle it? Is the answer just that the operating system isn't smart enough to?
(In reply to Robert Claypool from comment #43) > What is this "yield"? Don't all the operating systems do preemptive > multitasking? If not, what purpose does yielding serve on preemptive > environments, and if so, why shouldn't this be designed to let the operating > system handle it? > Is the answer just that the operating system isn't smart enough to? Everything here is on one thread. When we say "yield" here we're just referring to "returning from the current function so that the event loop can schedule something else".
(In reply to Justin Lebar [:jlebar] from comment #41) > >@@ +2961,5 @@ > >> + LinkedList<DecodeRequest> &requestList = > > > >LinkedList<DecodeRequest>& requestList > > This file (and this patch) puts |*| on the variable names. Why put |&| on > the typename? New code should put * and & on typenames. We just haven't bothered changing old code. > >@@ +2969,5 @@ > >> + while (true) { > >> + TimeStamp startIteration = TimeStamp::Now(); > >> + > >> + // Have we been decoding for too long? > >> + if ((startIteration - eventStart).ToMilliseconds() >= gMaxMSBeforeYield) { > > > >I'd prefer if this while loop wasn't a while (true); perhaps a do... while loop would be better. > > I did it this way so I don't have to read the time twice, at the top of the > loop (to time the decode) and at the bottom of the loop, for the while() > condition. If you feel strongly, add a comment :) > >> + Telemetry::Accumulate(Telemetry::IMAGE_DECODE_LATENCY, > >> + PRUint32((TimeStamp::Now() - eventStart).ToMilliseconds())); > > > >I think you need to test whether this is a size decode before accumulating, as they throw off the numbers. > > Well, except we might have done multiple things while we were here. It > could have been two size decodes, or a size decode plus a normal decode. > > I think this is a pretty un-interesting piece of telemetry, tbh. But the > goal is to tell us how long we spend holding up the event loop, so maybe it > should stay how it is? You can now filter out small values in the telemetry > dashboard, btw. Jeff, you added the test for size decodes. What do you think? > >We should only do it in one spot, probably Run(). And if it *is* Run(), we should rename this function, since it only decodes part of the image. > > Once you call RequestDecode() on an image, it'll get fully decoded. It just > might take a few async steps. I guess it's kind of confusing that the > method is used internally to mean "post me back and decode more of this > image", but to the outside caller, this is just like recursion. It's a little incestuous, though: > In DecodeWorker::Run, we popFirst(), but in DecodeWorker::RequestDecode() we > insertBack(). So we do round-robin. I'd *much* rather it be only in Run(), rather than in two different places.
> If you feel strongly, add a comment :) Oh, I was totally out to lunch. A do/while loop is much better here. Sorry! (We call TimeStamp::Now() way too much, still.)
I'll change all my pointers to type* name, but only on the condition I get to stand on the soapbox for a moment: The point of having a style guide is so that you all your code looks the same. Consistency is much more important than any particular style point. So it is in fact counter-productive to the basic purpose of a style guide to say "new code in existing files follows rule X; we just haven't changed the old code yet." If the old code's style is so egregious, then it can be fixed. But I seriously doubt that anyone is going to go through RasterImage and change all its pointers. We all have better things to do. So by enforcing this rule, we're actually making our style *worse*. Thank you. I will now proceed to do as I was told.
> I'd *much* rather it be only in Run(), rather than in two different places. This breaks the invariant you asked me to add in MarkAsASAP. Now request->isInList() does not imply mPendingInEventLoop, because we specifically add requests to the list without ensuring that the worker is pending.
> So it is in fact counter-productive to the basic purpose of a style guide to say "new code in > existing files follows rule X; we just haven't changed the old code yet." Note that this is basically the same argument you made in bug 715405 comment 22. :)
(In reply to Justin Lebar [:jlebar] from comment #48) > > I'd *much* rather it be only in Run(), rather than in two different places. > > This breaks the invariant you asked me to add in MarkAsASAP. Now > request->isInList() does not imply mPendingInEventLoop, because we > specifically add requests to the list without ensuring that the worker is > pending. Maybe that shouldn't be possible, then: >+ DecodeRequest *request = requestList.popFirst(); >+ >+ if (!request) { >+ // Nothing more to decode! >+ break; >+ } >+ >+ DecodeImage(request->mImage); if (!request->mImage->IsDecoded()) ContinueDecoding(request); (with an obvious implementation of IsDecoded, and ContinueDecoding() calling EnsurePendingInEventLoop(), and DecodeImage renamed to DecodeSomeImageData() or whatever.) Then you can remove the >+ // If decode requests are pending, re-post ourself to the event loop. >+ if (!mASAPDecodeRequests.isEmpty() || !mNormalDecodeRequests.isEmpty()) { >+ mPendingInEventLoop = true; >+ NS_DispatchToCurrentThread(this); >+ } hunk altogether.
(In reply to Justin Lebar [:jlebar] from comment #49) > > So it is in fact counter-productive to the basic purpose of a style guide to say "new code in > > existing files follows rule X; we just haven't changed the old code yet." > > Note that this is basically the same argument you made in bug 715405 comment > 22. :) Yeah, but I'm the module owner of Imagelib. :) (FWIW, I would accept a patch to switch us wholesale.)
No longer depends on: 674547
> Then you can remove the [if decode requests are pending, re-post ourself to the event loop] hunk > altogether. Sorry, I don't see how this solves the problem. To wit: Suppose we have two pending decodes. We call DecodeImage and decode the entirety of one image. Then we check the time and see that we've exhausted our timeslice. At this point, we must re-post ourselves to the event loop, because we still have a decode pending. But we're not going to re-post ourselves on behalf of the image we just finished decoding; that is, the ContinueDecoding(request) call in comment 50 will not be hit. So we still need the hunk at the end of comment 50. I don't think we can remove the [repost if decodes are still pending] hunk. We can make ContinueDecoding() not cause us to repost ourselves, and I'll post a patch with this. I'll let you be the judge of whether it's clearer this way.
Attached patch Patch v4 (obsolete) — Splinter Review
Outstanding questions: * Justin: Exactly what fails when the immediate-decode size is too small? (bholley, comment 42) * Jeff: Is the change to IMAGE_DECODE_LATENCY ok? (joe, comment 45)
One more oustanding question: * Joe: Is the new DecodeWorker API, where during DecodeWorker::Run, we add images to the list of pending decodes but don't post the worker to the event loop until it's done with its timeslice, better? Note the comment in MarkAsASAP; it's now incorrect to call that during DecodeWorker::Run, because of the issue with the invariant.
> I don't think we can remove the [repost if decodes are still pending] hunk. We > can make ContinueDecoding() not cause us to repost ourselves, and I'll post a > patch with this. I'll let you be the judge of whether it's clearer this way. I'll judge once I see the patch, but you do make a good point and I'm reasonably convinced. I think this answers comment 55; let me know if this is not the case.
Attachment #588117 - Attachment is obsolete: true
Attachment #588116 - Attachment is obsolete: true
Attachment #590003 - Flags: review?(joe)
Attachment #590003 - Flags: feedback?(jmuizelaar)
> * Justin: Exactly what fails when the immediate-decode size is too small? (bholley, comment 42) We'll find out in https://tbpl.mozilla.org/?tree=Try&rev=07afd9abe685
(In reply to Justin Lebar [:jlebar] from comment #57) > > * Justin: Exactly what fails when the immediate-decode size is too small? (bholley, comment 42) > > We'll find out in https://tbpl.mozilla.org/?tree=Try&rev=07afd9abe685 Looks green for decoding 4 bytes. Are you curious to go smaller than that, Bobby?
(In reply to Justin Lebar [:jlebar] from comment #58) > Looks green for decoding 4 bytes. Are you curious to go smaller than that, > Bobby? I'm not concerned with the actually number, just the question of what's limiting us here. You mentioned earlier (unless I'm mistaken) that there were failures when the number was too small, so I'm wondering (a) how small is too small, (b) what those failures are, and (c) why they fail. What happens if we write zero bytes?
> You mentioned earlier (unless I'm mistaken) that there were failures when the number was too small Yes, but in my mind, I meant that, if we don't write anything (which may not be the same as writing 0 bytes!), we know it's not going to work. Pushed to try with 0 bytes: https://tbpl.mozilla.org/?tree=Try&rev=53cfc1a752fe
(In reply to Justin Lebar [:jlebar] from comment #60) > Yes, but in my mind, I meant that, if we don't write anything (which may not > be the same as writing 0 bytes!) Wait, what? How so?
Maybe I'm being too coy -- knowing nothing about the code, it's certainly *possible* that writing 0 bytes would be different than not invoking the decoder. Maybe it is the same!
Comment on attachment 590003 [details] [diff] [review] Patch v4 Review of attachment 590003 [details] [diff] [review]: ----------------------------------------------------------------- ::: image/src/RasterImage.cpp @@ +3143,5 @@ > + // If we aren't yet finished decoding and we have more data in hand, re-post > + // ourselves to the decode worker. > + if (aImg->mDecoder && !aImg->IsDecodeFinished() && haveMoreData) { > + AddDecodeRequest(&aImg->mDecodeRequest); > + } Some quick feedback - Why can't this be in the Run() loop above, thereby centralizing all modification of the queues?
(In reply to Joe Drew (:JOEDREW!) from comment #63) > Some quick feedback - Why can't this be in the Run() loop above, thereby > centralizing all modification of the queues? FWIW, the refactoring I want to do in bug 718927 would be cleaner if this stayed as part of the decoding code. I'm planning on moving the decoding code here to a method on the RasterImage, after which it won't necessarily be called only by the DecoderWorker. For example, in bug 685516 we want to sync decode for a timeslice in RasterImage::RequestDecode() then decode the rest asynchronously. After the synchronous decoding yields, we need to check if we decoded all the data and invoke more decoding if we haven't. As part of the decoding code, this would only need to happen in once place rather than being duplicated at each callsite.
(In reply to Robert Lickenbrock [:rclick] from comment #64) > this would only need to happen in once place rather than being > duplicated at each callsite. s/once/one
> Some quick feedback - Why can't this be in the Run() loop above, thereby centralizing all > modification of the queues? I think that's a good idea, at least from the perspective of this patch. I'm tempted to do it; Robert, do you mind changing it when you do your refactoring? I hope it won't mess you up too much.
> Maybe I'm being too coy -- knowing nothing about the code, it's certainly *possible* that writing 0 > bytes would be different than not invoking the decoder. Maybe it is the same! Indeed, sync-writing 0 bytes is much, much worse than just not writing anything at all. 0 bytes turns almost every test suite orange, whereas not writing at all just failed a few suites. https://tbpl.mozilla.org/?tree=Try&rev=53cfc1a752fe
(In reply to Justin Lebar [:jlebar] from comment #66) > I think that's a good idea, at least from the perspective of this patch. > I'm tempted to do it; Robert, do you mind changing it when you do your > refactoring? I hope it won't mess you up too much. Changing it if I need to won't be a problem. Go ahead and do it. (In reply to Justin Lebar [:jlebar] from comment #67) > Indeed, sync-writing 0 bytes is much, much worse than just not writing > anything at all. 0 bytes turns almost every test suite orange, whereas not > writing at all just failed a few suites. The orange test suites all crash somewhere under DecodeImage(), so I suspect that writing 0 bytes to the decoder is broken in general. What kind of failures are there if nothing is written at all?
> What kind of failures are there if nothing is written at all? From comment 14: https://tbpl.mozilla.org/?tree=Try&rev=eb29adb5609e
(In reply to Justin Lebar [:jlebar] from comment #69) > > What kind of failures are there if nothing is written at all? > > From comment 14: https://tbpl.mozilla.org/?tree=Try&rev=eb29adb5609e So it causes us to report an incorrect size? This is probably indicative of a bug, since the decoders are supposed to wait until they have enough data before parsing out a size. Do we need a check? An assert? I just want to make sure that we don't sweep this issue under the rug. It might be the kind of thing that could cause intermittent orange.
> So it causes us to report an incorrect size? What makes you say that? I'd looked at [1], which is a reftest failure caused by us not noticing that an image is malformed, without the brief sync-decode. [1] https://tbpl.mozilla.org/php/getParsedLog.php?id=8428944&tree=Try#error0
(In reply to Justin Lebar [:jlebar] from comment #71) > > So it causes us to report an incorrect size? > > What makes you say that? https://tbpl.mozilla.org/php/getParsedLog.php?id=8428494&tree=Try
> This is probably indicative of a bug, since the decoders are supposed to wait until they have enough > data before parsing out a size. It could be that we don't have a size but onload gets called anyway, and then content notices that we don't have the size. Is there a better way to do a sync size decode than sync decoding 8 bytes of the image?
(In reply to Justin Lebar [:jlebar] from comment #73) > > This is probably indicative of a bug, since the decoders are supposed to wait until they have enough > > data before parsing out a size. > > It could be that we don't have a size but onload gets called anyway, and > then content notices that we don't have the size. In the current world, onload doesn't fire until all images fire the OnStopDecode+OnStopRequest siamese twins. This doesn't happen until a full size decode has happened, which means that every byte of data needs to have passed through the decoder, even if the decoder ignored most of it (because it was in size decode mode). So we've got some kind of bug here.
> In the current world, onload doesn't fire until all images fire the OnStopDecode+OnStopRequest > siamese twins. This doesn't happen until a full size decode has happened, which means that every > byte of data needs to have passed through the decoder, even if the decoder ignored most of it > (because it was in size decode mode). Are you sure this is correct? AIU the current code, onload is unblocked by SourceDataComplete. And SourceDataComplete invokes the decoder for as long as it wants to run; in particular, the decoder might not run to completion. If we take out this invocation, then tests fail as in comment 72 -- this indicates to me that the decoder is not looking at every byte of data before onload.
(In reply to Justin Lebar [:jlebar] from comment #76) > AIU the current code, onload is unblocked by SourceDataComplete. Not really. It's called from Decoder::PostDecodeDone() (superclass method), which is called at different times for different decoder implementations. In particular, some decoders call it when they detect the end of useful data (like the png and jpg decoders), while others call it in FinishInternal, which gets called when the decoder gets shut down. This only happens in SourceDataComplete() if we're not storing source data (which we almost always are). I'm pretty sure that the common case for decoder shutdown (in the BMP case) is in imgDecodeWorker::Run(), once all the bytes have been written through.
(In reply to Bobby Holley (:bholley) from comment #74) > In the current world, onload doesn't fire until all images fire the > OnStopDecode+OnStopRequest siamese twins. This doesn't happen until a full > size decode has happened, which means that every byte of data needs to have > passed through the decoder, even if the decoder ignored most of it (because > it was in size decode mode). The patch here makes the decoding in AddSourceData() and SourceDataComplete() asynchronous, so there's no guarantee the size decode will finish before OnStopDecode+OnStopRequest (and thus onload) fire. (In reply to Justin Lebar [:jlebar] from comment #73) > Is there a better way to do a sync size decode than sync decoding 8 bytes of > the image? One way of doing it is to check what type of decoder you have in SourceDataComplete(). If it's a size decoder, synchronously write all the data in one big chunk. The decoder will bail as soon as it's parsed and posted the dimensions, so this is cheap and shouldn't be a performance concern. AFAICT you should be able to do things asynchronously if it's a full decoder. In that case, Decoder::Init() fires OnStartDecode which triggers onload blocking until the decode finishes.
> One way of doing it is to check what type of decoder you have in SourceDataComplete(). I'd tried this, but it does not work. In [1], test_fileapi_slice passes, but test_bug369370 fails in the same way. This suggests that the issue is not only with size decodes. [1] https://tbpl.mozilla.org/php/getParsedLog.php?id=8428494&tree=Try
I talked with bholley about this on IRC, and it turns out that we should ignore most of what he's posted here: <bholley> jlebar: I was looking at the other siamese twins :-( <bholley> jlebar: I was looking at the internal OnStopDecoder/OnStopContainer, rather than the external OnStopDecode/OnStopRequest <bholley> jlebar: bottom line - my comment was wrong, OnStopDecode is fired on imgRequest::OnStopRequest So this business about onload not being fired until the full decode has happened isn't correct. There's still the question of: How many bytes should we decode in SourceDataComplete, and why is it necessary to do this at all?
(In reply to Justin Lebar [:jlebar] from comment #80) > I talked with bholley about this on IRC, and it turns out that we should > ignore most of what he's posted here Only starting from comment 74. Everything else should be valid.
It seems that we should decode at least as many bytes as we need for the longest mime-type sniff. 32 ought to be more than enough for now and the conceivable future, right?
Attached patch Patch v5 (obsolete) — Splinter Review
I reset the brief decode length to 32, since that should be more than long enough for any MIME type sniff, and made it a constant.
Attachment #590003 - Attachment is obsolete: true
Attachment #590003 - Flags: review?(joe)
Attachment #590003 - Flags: feedback?(jmuizelaar)
Attachment #590729 - Flags: review?(joe)
Attachment #590729 - Flags: feedback?(jmuizelaar)
Just tested on my Mac, and this patch apparently delays tab switching until all images on the tab are decoded. :-/
Attachment #590729 - Flags: review-
Whether the delay is long or not seems to be timing-dependent, at least on my Linux box. It's usually pretty fast to switch tabs, but sometimes it's slow like on my mac. (On the mac, it appears to be consistently very slow.)
> + } while ((TimeStamp::Now() - eventStart).ToMilliseconds() >= gMaxMSBeforeYield); Oh, well that would do it. <= works much better.
Attached patch Part 1, v6 (obsolete) — Splinter Review
Review: :joedrew! Tested on Linux and Mac; this works far better than any previous version. :)
Attachment #590729 - Attachment is obsolete: true
Attachment #590729 - Flags: review?(joe)
Attachment #590729 - Flags: feedback?(jmuizelaar)
Attachment #590957 - Flags: review?
Attachment #590957 - Flags: review?(joe)
Attachment #590957 - Flags: feedback?(jmuizelaar)
Attachment #590957 - Flags: review?
In case it got lost in the noise here, Jeff, the f? is on changing Telemetry::IMAGE_DECODE_LATENCY, per comment 45.
Comment on attachment 590957 [details] [diff] [review] Part 1, v6 Review of attachment 590957 [details] [diff] [review]: ----------------------------------------------------------------- ::: image/src/RasterImage.cpp @@ +176,5 @@ > } > > namespace mozilla { > namespace image { > +namespace internal { Can you move this to be a child class of RasterImage? class RasterImage { class DecodeWorker { // @@ +181,5 @@ > + > +/* DecodeWorker is a singleton class which we use when decoding large images. > + * > + * When we decode an image larger than gMaxBytesForSyncDecode, we call > + * DecodeWorker::RequestDecode() for the image. The DecodeWorker instance Here you should reword to make it a little more clear that the image gets added to a queue. @@ +199,5 @@ > + > + /** > + * Ask the DecodeWorker to asynchronously decode some of this image. > + */ > + void RequestDecode(RasterImage* aImg); It's really an implementation detail of this function that it only decodes some of the image, right? If we really want to make a distinction, we could have a private function to enqueue an image, and have this just call that. @@ +258,5 @@ > + /* Decode some chunks of the given image. If aDecodeType is brief, decode > + * just a few bytes if no bytes have been decoded yet -- do just enough for > + * us to figure out if there's an error in the image. */ > + nsresult DecodeImage(RasterImage* aImg, > + DecodeType aDecodeType = DECODE_TYPE_NORMAL); DecodeSomeOfImage? Decode5msOfImage()? You should change this name, as it doesn't decode the whole image, it just decodes until it runs out of time. @@ +261,5 @@ > + nsresult DecodeImage(RasterImage* aImg, > + DecodeType aDecodeType = DECODE_TYPE_NORMAL); > + > + LinkedList<DecodeRequest> mASAPDecodeRequests; > + LinkedList<DecodeRequest> mNormalDecodeRequests; Right above here, can you add |private: // members| (and similarly put |private: // statics| and |private: // methods|)? I like the separation. @@ +1658,5 @@ > } > > + // If there's a decoder open, synchronously decode the beginning of the image > + // to check for errors. (If the beginning has already been decoded, this > + // does nothing.) Then kick off an async decode of the rest of the image. Given this explicit behaviour (I recognize that this is not a change), I would like to see a test that documents and ensures what we do with onload and onerror in the case of images that are invalid in the first 32 bytes and invalid past that. @@ +3016,5 @@ > + // this request to the back of the list. > + if (image->mDecoder && > + !image->mError && > + !image->IsDecodeFinished() && > + image->mSourceData.Length() > image->mBytesDecoded) { It would be nice to encapsulate this in a RasterImage::IsFinishedDecoding() or similar. Actually, why doesn't RasterImage::IsDecodeFinished() suffice? @@ +3071,5 @@ > + // Size decodes are cheap and we more or less want them to be > + // synchronous. Write all the data in that case, otherwise write a > + // chunk. > + maxBytes = aImg->mDecoder->IsSizeDecode() > + ? aImg->mSourceData.Length() : gDecodeBytesAtATime; Can you make this an if? The ternary operator makes my tummy hurt. Alternately, put this on the same line.
Attachment #590957 - Flags: review?(joe) → review+
> + // this request to the back of the list. > + if (image->mDecoder && > + !image->mError && > + !image->IsDecodeFinished() && > + image->mSourceData.Length() > image->mBytesDecoded) { > > It would be nice to encapsulate this in a RasterImage::IsFinishedDecoding() or similar. > > Actually, why doesn't RasterImage::IsDecodeFinished() suffice? Well, you need the mDecoder check, because IsDecodeFinished() doesn't work without that. And it's not clear that you don't need the !mError clause. However, I can certainly get rid of mSourceData clause. Would that be sufficient?
> Given this explicit behaviour (I recognize that this is not a change), I would like to see a test > that documents and ensures what we do with onload and onerror in the case of images that are invalid > in the first 32 bytes and invalid past that. Which one of these "invalid"s should be "valid"?
Neither! I want tests for onload and/or onerror in the case of: - Valid images - Invalid image in the first 32 bytes - Invalid image past the first 32 bytes
> - Valid images I presume this is already covered e.g. by any simple reftest which uses a valid image? > - Invalid image in the first 32 bytes I think this is covered by layout/reftests/object/malformed-should-fallback.html. It does <object data="extra/malformed_image.png">PASS</object> where the contents of malformed_image.png are "FAIL". > - Invalid image past the first 32 bytes I'll add a test for this one.
> where the contents of malformed_image.png are "FAIL". That is, the bytes of malformed_image.png are the ASCII characters "FAIL".
No, sorry, I specifically want tests to make sure that onload *does* fire and onerror *does not* fire on a valid image, that <img src="extra/malformed_image.png"> does fire onerror (I don't remember whether onload fires on it too), and to test and document what happens for <img src="extra/big_malformed_image.png"> wrt onload/onerror.
Okay. WRT <object data="big_malformed_image.png">FOO</object>, it looks like this patch introduces a change in behavior. AFAICT, we'd previously fall back to "FOO" if the error occurred within the first decoded-chunk of the image (as much as we can decode in 5ms), but now we'll fall back to "FOO" only if the error occurs within the first 32 bytes of the image. I don't think there's a change in onload/onerror behavior, however.
By the way, exactly what constitutes an "error" is somewhat wishy-washy, and probably depends on the specifics of the decoder. I was able to make a PNG which rendered only halfway, but didn't trigger an error. So a site which relies on onerror being fired for any erroneous image is already screwed.
Joe, is this what you had in mind? These tests pass with and without the patch here.
Attachment #591543 - Flags: review?(joe)
Crap, forgot to send this message: (In reply to Justin Lebar [:jlebar] from comment #96) > Okay. > > WRT <object data="big_malformed_image.png">FOO</object>, it looks like this > patch introduces a change in behavior. AFAICT, we'd previously fall back to > "FOO" if the error occurred within the first decoded-chunk of the image (as > much as we can decode in 5ms), but now we'll fall back to "FOO" only if the > error occurs within the first 32 bytes of the image. > > I don't think there's a change in onload/onerror behavior, however. OK. I'm fine with changing behaviour, but we should make sure we don't accidentally change it further. Be sure to add/modify tests to ensure we know our current state wrt both onload/onerror and fallbacks.
Comment on attachment 591543 [details] [diff] [review] Part 0, v1: onload/onerror tests Review of attachment 591543 [details] [diff] [review]: ----------------------------------------------------------------- This is great for testing onload/onerror.
Attachment #591543 - Flags: review?(joe) → review+
<jrmuizel> jlebar: can you summarize the telemetry change? <jlebar> jrmuizel, So there's IMAGE_DECODE_LAG, or whatever it's called. It measures how long we sit in the event loop before yielding. <jlebar> At the moment, if we decode a bunch of small images, we'll get a bunch of small values added to that histogram. <jlebar> Also, at the moment, size decodes are ignored for the purposes of the histogram. <jlebar> With the patch, the histogram still tells you how long we sit in the event loop before yielding. <jlebar> But we may decode multiple images before yielding. <jlebar> So if you decode a bunch of small images, you may get just one large value put into the histogram. <jlebar> Similarly, we may do some size decodes and some normal decodes in the same trip through the event loop. <jlebar> So I no longer ignore size decodes. <jlebar> IMO, the point of the histogram is to say "Here's how long we blocked the event loop for." So these changes seem fine to me. <jlebar> In fact, it's a much better histogram now, because before, we might queue up 10 workers each blocking the event loop for 5ms at a time. THe histogram would say "5ms", but in fact, the event loop would be blocked for 50ms. <jlebar> Anyway, that's not much of a summary. <jrmuizel> jlebar: feedback+
Attachment #590957 - Flags: feedback?(jmuizelaar)
Are we good to go after this patch? I guess I should push to try once more.
Attachment #591579 - Flags: review?(joe)
Attachment #590957 - Attachment description: Patch v6 → Part 1, v6
Attachment #591579 - Flags: review?(joe) → review+
> + if (!aDecodeType == DECODE_TYPE_BRIEF) { This stray "!" appeared when I first introduced brief decodes, meaning that we were never actually decoding 32 bytes, as I thought; we were always doing 4096. Tests fail when we sync-decode 40 or fewer bytes. 41 seems to work, but I haven't run all the tests. I'm going to try to pinpoint exactly why.
Try run for 357c8e8ee05b is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=357c8e8ee05b Results (out of 61 total builds): exception: 13 success: 34 warnings: 14 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jlebar@mozilla.com-357c8e8ee05b
(In reply to Justin Lebar [:jlebar] from comment #90) > However, I can certainly get rid of mSourceData clause. I'm pretty sure that it's needed for the case where the image isn't finished downloading and all available data has been decoded. RasterImage::IsDecodeFinished() returns false in that case.
Ah, you're right! I had to write out all the boolean logic, but it comes out to: !IsDecodeFinished() iff !(IsSizeDecode && mHasSize) && !(!IsSizeDecode() && mDecoded) && (!mHasSourceData || mBytesDecoded != mSourceData.Length()) !mHasSourceData means "we haven't received all the image data." But we only want to enqueue another decode if we have more source data to decode in hand. That is, we want !IsDecodeFinished() && mBytesDecoded != mSourceData.Length(). > + if (image->mDecoder && > + !image->mError && > + !image->IsDecodeFinished() && > + image->mSourceData.Length() > image->mBytesDecoded) { Joe, I can write this as a separate method, like you asked, but I think this bug demonstrates that it would be better to leave it as above, or even to inline the IsDecodeFinished() call. Too much DeMorgan's law for one codeblock.
Why do we have to test for mDecoder and !mError, though?
Per comment 90, you need mDecoder because it's incorrect to call IsDecodeFinished() with !mDecoder. We might not have had a decoder to begin with, or it might have been shut down when we finished decoding. Similarly, we might have had mError == true before we started decoding, or we might have hit that during the decode.
WRT the 40/41 bytes issue from comment 103: With 40 bytes, the sync decode doesn't end up calling nsPNGDecoder::info_callback, whereas with 41 bytes, we get the info callback. The info callback does a number of things, but in particular, it reads the image's size. (The decode in question is a normal decode, not a size decode.)
(In reply to Joe Drew (:JOEDREW!) from comment #107) > Why do we have to test for mDecoder and !mError, though? Answered another way: Because we have if(mDecoder && !mError) at the beginning of DecodeWorker::DecodeSomeData() -- if DecodeSomeData bails on !mDecoder || mError, we certainly shouldn't re-enqueue the image.
I think I understand what's going on: If we don't brief-decode enough bytes, then we don't read the image size before firing onload, and tests break. This only happened to work in our current code; the decoder can decode synchronously as much or as little as it wants, but it always decodes at least 4092 bytes, which is plenty. I think the solution is to make brief decodes decode until they get the size or they run out of data.
That's what size decodes are for, though.
But it doesn't look like the code is at all set up to do a sync size decode while we're in the middle of a normal decode.
Right, ok. We do size decodes ahead of time for almost all images (except chrome:, res: and multipart/x-mixed-replace), but those size decodes would almost immediately be interrupted by regular decodes for any image that's being drawn. So, I think what you should do is simplify a bit by changing SyncDecodeBeginning to: 1) Return immediately if gDecodeBytesAtATime bytes have already been decoded; 2) Write as many bytes to the decoder as required to have decoded gDecodeBytesAtATime. Alternate #2: just flush gDecodeBytesAtATime bytes to the decoder.
What's the advantage of doing that as opposed to decoding, say, 64 bytes at a time until we have a size or hit gDecodeBytesAtATime? We suspect that 64 should be sufficient for most cases. I ask because I recall that decoding even 4096 bytes of a PNG can take a fair bit of time. Actually...we have telemetry, so we don't have to guess. About 3% of images decode at less than 500KB (compressed) per second, so decoding 4KB could take 1ms. Multiply that by a few images and we're talking real money...
I'm just looking to reduce the amount of code (and hence complexity), if possible. Maybe we can add some telemetry to find out how long a short decode actually takes? I don't feel super-strongly about it, but I'd just rather make things a little prettier.
Try run for 5c8429d15c91 is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=5c8429d15c91 Results (out of 209 total builds): success: 168 warnings: 40 other: 1 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jlebar@mozilla.com-5c8429d15c91
Comment on attachment 591579 [details] [diff] [review] Part 2: Add <object> fallback test. This test no longer applies with part 1 v7.
Attachment #591579 - Attachment is obsolete: true
Attached patch Part 1 v7Splinter Review
Attachment #590957 - Attachment is obsolete: true
Attachment #591823 - Flags: review?(joe)
Comment on attachment 591823 [details] [diff] [review] Part 1 v7 Review of attachment 591823 [details] [diff] [review]: ----------------------------------------------------------------- ::: image/src/RasterImage.cpp @@ +2953,5 @@ > + nsresult rv = DecodeSomeOfImage(aImg, DECODE_TYPE_UNTIL_SIZE); > + > + // This assertion must be non-fatal. DECODE_TYPE_UNTIL_SIZE makes an effort > + // to find the image's size, but makes no guarantees. > + NS_ASSERTION(aImg->mHasSize, "Did not find image's size after SyncDecodeUntilHaveSize."); I claim it should not exist since we cannot guarantee anything. A debug-only printout could work if you're looking to catch bugs. Alternately, abort if false on NS_FAILED(rv) || aImg->mHasSize. @@ +3016,5 @@ > + return rv; > + } > + > + // If we're an UNTIL_SIZE decode and we got the image's size, we're done > + // here. Explicitly document this as an early exit optimization. @@ +3021,5 @@ > + if (aDecodeType == DECODE_TYPE_UNTIL_SIZE && aImg->mHasSize) > + break; > + > + // Figure out if we still have more data. > + haveMoreData = aImg->mSourceData.Length() > aImg->mBytesDecoded; You could remove haveMoreData altogether, and just inline this into the while() check. Though that *would* be a change in behaviour, so it might be best to do as a separate bug with its own regression range and try run. ::: image/src/RasterImage.h @@ +453,5 @@ > + * > + * @return NS_ERROR if an error is encountered, and NS_OK otherwise. (Note > + * that we return NS_OK even when the size was not found.) > + */ > + nsresult SyncDecodeUntilHaveSize(RasterImage* aImg); SyncDecodeForSize()? SyncDecodeUntilSizeAvailable()?
Attachment #591823 - Flags: review?(joe) → review+
> A debug-only printout could work if you're looking to catch bugs. I was hoping to put in something that Jesse could catch with his fuzzer. But tests will fail if we don't have the size, so this is probably ok with no warning. > Explicitly document this as an early exit optimization. You mean in RasterImage.h, right? > You could remove haveMoreData altogether, and just inline this into the while() check. Though that > *would* be a change in behaviour, so it might be best to do as a separate bug with its own > regression range and try run. I'll be gung-ho about it. DecodeSomeData won't do anything if !haveMoreData, so I think we're probably OK. Maybe. :)
(In reply to Justin Lebar [:jlebar] from comment #122) > > Explicitly document this as an early exit optimization. > > You mean in RasterImage.h, right? No, right above the code. Just trying to keep special cases to a minimum, and justify them where they aren't!
1.624 + // If we're an UNTIL_SIZE decode and we got the image's size, we're done 1.625 + // here. 1.626 + if (aDecodeType == DECODE_TYPE_UNTIL_SIZE && aImg->mHasSize) 1.627 + break; You want this comment to say "Early-exit optimization: If we're an UNTIL_SIZE decode ..."? Or you want a comment just above the loop (one's already there)?
The former.
How about I fix this as part of bug 721510?
Sure! Whatever makes most sense.
Depends on: 721589
Target Milestone: --- → mozilla12
Depends on: 721917
Depends on: 723053
I think this needs to be backed out of Aurora and nightly, due to: Bug 723053 (possibly security-sensitive crash) Bug 721917 (intermittent reftest orange) Do I need to get approval for the backout?
(In reply to Justin Lebar [:jlebar] from comment #131) > Do I need to get approval for the backout? yep
[Approval Request Comment] Back out patch and dependent change due to regressions (see comment 131). No string changes, reverting after just a few days to known-good state, so low-risk.
Attachment #593527 - Flags: review?(joe)
Attachment #593527 - Flags: approval-mozilla-aurora?
Why is this a security issue for Aurora, and not the rest of the Nightly Users Audience ?
It is a (potential) security issue for everyone. That's why I said "I think this needs to be backed out of Aurora *and nightly*". I need approval to back out of Aurora, and I asked Joe for a review on the backout from Nightly.
Comment on attachment 593527 [details] Back out this and bug 721510 sure
Attachment #593527 - Flags: review?(joe) → review+
I may have a fix in hand for all our problems in bug 723053 (sorry, for those of you without access). So I'm going to test it out and wait to see whether we need to back out on nightly. Release drivers: We may still want to back out in Aurora, given how many things were broken with this patch. But check out the patch in bug 723053; it's simply adding a strong reference.
Unfortunately the fix for the crasher in bug 723053 does not fix the random oranges.
Comment on attachment 593527 [details] Back out this and bug 721510 [Triage Comment] Approved for Aurora 12 to get us back to a known working state.
Attachment #593527 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla12 → ---
Status: REOPENED → RESOLVED
Closed: 13 years ago13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla12
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 711269
It looks like my try push may fix all of the intermittent reftest failures except image/test/reftest/bmp/bmp-corrupted/wrapper.html?invalid-compression.bmp | image comparison (==)
iinteresting. My suspicion is that a call to FlushInvalidations is causing this failure. So I put printfs before the calls to FlushInvalidations in RasterImage. Sure enough, I saw this in a failing test: XXX RasterImage::DecodeWorker::Run calling FlushInvalidations. mURI=file:///Users/cltbld/talos-slave/test/build/reftest/tests/image/test/reftest/bmp/bmp-corrupted/invalid-compression-RLE4.bmp, mSourceData.Length()=246, mBytesDecoded=246, mDecoded=0, mDecoder=0x112d49300 REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/image/test/reftest/bmp/bmp-corrupted/wrapper.html?invalid-compression-RLE4.bmp | image comparison (==) The FlushInvalidations call is missing in non-failing runs.
Oh...I bet I know what the problem is. When we do an UNTIL_SIZE decode, we should not flush invalidations. https://tbpl.mozilla.org/?tree=Try&rev=7426ebe91a3f
Comment on attachment 595656 [details] [diff] [review] Part 2: Fix crash, randomorange Review of attachment 595656 [details] [diff] [review]: ----------------------------------------------------------------- ::: image/src/RasterImage.cpp @@ +3020,5 @@ > } > > + // Flush invalidations after we've decoded all the chunks we're going to. > + // But if this was an UNTIL_SIZE decode, don't flush invalidations; wait for > + // a real decode before doing that. Slight rewording: For non-size decodes, flush invalidations after we've decoded all the chunks we're going to. Size decodes shouldn't cause paints. @@ +3023,5 @@ > + // But if this was an UNTIL_SIZE decode, don't flush invalidations; wait for > + // a real decode before doing that. > + // > + // Similarly, don't flush invalidations if we have all the source data; this > + // disables progressive display of previously-discarded images (which lets us This might need a rewording since we no longer have the parallelism above.
Attachment #595656 - Flags: review?(joe) → review+
Blocks: 725945
Status: REOPENED → RESOLVED
Closed: 13 years ago13 years ago
Resolution: --- → FIXED
Target Milestone: mozilla12 → mozilla13
Blocks: 726004
No longer blocks: 725945
Depends on: 787766
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: