Closed Bug 583028 Opened 11 years ago Closed 10 years ago

Tp4 regression from async imagelib notification

Categories

(Core :: ImageLib, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- final+

People

(Reporter: joe, Assigned: joe)

References

Details

Attachments

(1 file, 1 obsolete file)

There was a pretty large (1-5%) regression from the async imagelib patch. We should work out whether this happened on only some pages and what caused it.
blocking2.0: --- → final+
http://people.mozilla.org/~jmuizelaar/graph-server/detail.html?host=graphs&old=4894012&new=4896393 is, I believe, an accurate representation of what happened before and after. (Tp4 doesn't seem to get run on all checkins on OS X, for some reason, so this doesn't correspond precisely to before and after my checkin, but it's close enough.) 

So, mostly what I see there is that there's an overall loss of performance, with one or two sites taking a lot longer. I'll investigate this.
blocking2.0: final+ → ---
blocking2.0: --- → final+
Yeah, I'd probably start with livescore.com as something that might be particularly amenable to profiling+debugging.
We spend a ton of time just dispatching events. I'm going to try coalescing events ("do we have an event pending? add this proxy to a list to be notified by it") rather than sending a new one every time.
Huh.  How many events are we talking here for a typical web page?
One per duplicated image. The livescore.com pages have quite a number of duplicated images.
Is "quite a number" dozens, or thousands?
Hundreds.
OK.  It still feels like tens-of-milliseconds processing times for that is a lot, but maybe that's happening....
(In reply to comment #9)
> OK.  It still feels like tens-of-milliseconds processing times for that is a
> lot, but maybe that's happening....

Maybe this is a good profile case for bug 583069?
I should note that all the time was _under_ SyncNotify. Also, I just reprofiled on livescore.com, this time with an internet connection (it turns out that livescore.com, even in our Tp4 page set, contains a reference to a live web page), and SyncNotify and children was down to 1.3% of time on a constantly-refreshing copy of livescore.
> Maybe this is a good profile case for bug 583069?

That can't explain the _Linux_ regression.

> I should note that all the time was _under_ SyncNotify.

OK, then reducing the number of runnables is not likely to help... where was the time actually spent?
In handling OnStopDecode - and whatever is implied by that, including a bunch of JS stuff. All standard.

I have great faith in my plan to mitigate this onload slowdown by not queueing extra tasks when there's already one queued. That way, we won't end up at the end of the event queue, making onload fire much later than it would have otherwise.
Attached patch delay onload less (obsolete) — Splinter Review
Instead of always firing off new runnables, we check to see if a runnable already exists and append a proxy to its list of proxies to notify.

Note that I didn't implement this for NotifyCurrentState because a) it's called a lot less frequently (only for images that come from print preview) and b) it would be very ugly having both a copy of the imgStatusTracker to keep the status you want to update, and a reference to a different imgStatusTracker to update when you actually get run.

This is currently running through try server.
Assignee: nobody → joe
Attachment #463338 - Flags: review?(bobbyholley+bmo)
(In reply to comment #14)
> 
> This is currently running through try server.

Let me know when you've got confirmation that this fixes the regression.
Attachment #463338 - Flags: feedback?(jmuizelaar)
Won't be until tomorrow at the earliest; all trees, including try, are all kinds of broken right now.
http://people.mozilla.org/~jmuizelaar/graph-server/detail.html?host=graphs&old=4896393&new=4976045

Results look very good - big green bars for livescore.com, for example. There are a couple of red bars, but there's also a fair amount of noise in these tests, from comparing different mozilla-central runs.
nsCOMPtr can't hold on to a forward-declared class when we're compiling debug; work around that.
Attachment #463338 - Attachment is obsolete: true
Attachment #463655 - Flags: review?(bobbyholley+bmo)
Attachment #463655 - Flags: feedback?(jmuizelaar)
Attachment #463338 - Flags: review?(bobbyholley+bmo)
Attachment #463338 - Flags: feedback?(jmuizelaar)
Comment on attachment 463655 [details] [diff] [review]
delay onload less, and compile in debug mode

diff --git a/modules/libpr0n/src/imgStatusTracker.cpp b/modules/libpr0n/src/imgStatusTracker.cpp
--- a/modules/libpr0n/src/imgStatusTracker.cpp
+++ b/modules/libpr0n/src/imgStatusTracker.cpp


+  imgRequestNotifyRunnable* runnable = static_cast<imgRequestNotifyRunnable*>(mRequestRunnable.get());
+  if (runnable && runnable->mRequest == request) {

Add a comment here explaining that this is to handle ChangeOwner(), and file a bug for the issue we talked about on IRC where calling ChangeOwner while a notification is pending causes consumers to be notified with the wrong state.


+  // We don't keep track of 

What?

diff --git a/modules/libpr0n/src/imgStatusTracker.h b/modules/libpr0n/src/imgStatusTracker.h
--- a/modules/libpr0n/src/imgStatusTracker.h
+++ b/modules/libpr0n/src/imgStatusTracker.h

   // alive as long as this instance is, because we hold a weak reference to it.
   imgStatusTracker(imgIContainer* aImage);
+  imgStatusTracker(const imgStatusTracker& aOther);

Please declare an explicit operator= here and flag it "not to be implemented" (see layout/style/nsStyleStruct.h) so that we don't accidentally use it.

This patch coalesces things so that we only ever have 1 event per imgRequest. I'd really prefer to coalesce things all the way with a tracker so that we only ever have 1 event per imagelib, but joe doesn't really want to do that right now. r=bholley with the above fixes. If it doesn't fix the regression, I think that's probably the next step.
Attachment #463655 - Flags: review?(bobbyholley+bmo) → review+
http://hg.mozilla.org/mozilla-central/rev/c3d482704ba8
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Let's leave this open until we confirm that the regression was fixed.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
This actually looks to be fixed now.
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Resolution: --- → FIXED
Attachment #463655 - Flags: feedback?(jmuizelaar)
You need to log in before you can comment on or make changes to this bug.