Triggering image loads should be faster

RESOLVED FIXED

Status

()

Core
DOM
P1
normal
RESOLVED FIXED
10 months ago
6 months ago

People

(Reporter: Away for a while, Unassigned)

Tracking

(Blocks: 1 bug, {meta})

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [qf:meta])

(Reporter)

Description

10 months ago
See this test case: <https://jakearchibald.github.io/service-worker-benchmark/>

42ms in this profile is spent in imgLoader::LoadImage: <https://perfht.ml/2nqAyEZ>
73ms in this profile: <https://perfht.ml/2nqDQI8>

Images are pretty common on the web, and this function gets called a lot during page loads on the main thread.  These times are unacceptable.
(Reporter)

Updated

10 months ago
Depends on: 1347369
(Reporter)

Updated

10 months ago
Depends on: 1347379
(Reporter)

Updated

10 months ago
Depends on: 1347400

Updated

10 months ago
Priority: -- → P1
Whiteboard: [qf:p1]
(Reporter)

Updated

10 months ago
Depends on: 1348460
(Reporter)

Updated

10 months ago
Depends on: 1348461
(Reporter)

Updated

10 months ago
Depends on: 1348462
(Reporter)

Updated

10 months ago
Depends on: 1348464
(Reporter)

Updated

10 months ago
Depends on: 1348591

Updated

10 months ago
Keywords: meta
(Reporter)

Updated

9 months ago
Depends on: 1356835
(Reporter)

Updated

9 months ago
Depends on: 1356836
(Reporter)

Updated

9 months ago
Depends on: 1356838
(Reporter)

Updated

9 months ago
Depends on: 1357107
(Reporter)

Comment 1

9 months ago
Here is where we are on Nightly right now.

From a profile of comment 0 we spend 35ms in imgLoader::LoadImage().  Here is a breakdown (I'm not mentioning branches where only a few milliseconds of time was spent):

35ms imgLoader::LoadImage()
 |- 18ms HttpChannelChild::AsyncOpen2()
 |   |- 17ms HttpChannelChild::AsyncOpen()
 |   |   |- 13ms HttpChannelChild::ContinueAsyncOpen()
 |   |   |   |- 9ms PNeckoChild::SendPHttpChannelConstructor()
 |   |   |   |- 2ms LoadInfoToLoadInfoArgs()
 |   |   |   \_ 2ms misc
 |   |   |- 2ms HttpChannelChild::ShouldInterceptURI()
 |   |   \_ 2ms misc
 |   \_ 1ms nsContentSecurityManager::doContentSecurityCheck() (bug 1360971)
 \_ 12ms NewImageChannel()
    |- 6ms NS_NewChannelWithTriggeringPrincipal()
     \_ 3ms HttpChannelChild::SetReferrerWithPolicy()
Depends on: 1360971
(Reporter)

Updated

9 months ago
Depends on: 1360973

Comment 2

9 months ago
(In reply to :Ehsan Akhgari (super long backlog, slow to respond) from comment #1)
> Here is where we are on Nightly right now.
> 
> From a profile of comment 0 we spend 35ms in imgLoader::LoadImage().  Here
> is a breakdown (I'm not mentioning branches where only a few milliseconds of
> time was spent):
> 
> 35ms imgLoader::LoadImage()
>  |- 18ms HttpChannelChild::AsyncOpen2()
>  |   |- 17ms HttpChannelChild::AsyncOpen()
>  |   |   |- 13ms HttpChannelChild::ContinueAsyncOpen()
>  |   |   |   |- 9ms PNeckoChild::SendPHttpChannelConstructor()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I was thinking this might be caused by the long init process in HttpChannelParent [1].

My idea is, we can offload the task being done in PNeckoChild::RecvPHttpChannelConstructor
to other async IPDL message so that the main thread wouldn't be blocked
by the inevitable sync constructor message.

So, I measured its time duration and it turned out it only took less
than one millisecond. Even more, I measured PNeckoChild::SendPHttpChannelConstructor()
and it also took less than one millisecond. This led me to thinking of if I should
measure on some machine other than my powerful mac laptop.

Ehsan, do you have any suggestion for me regarding either my idea or the machine
I should use to do profiling? I am dedicated to helping QF bugs from now and I feel
like I would be helpful on this bug. 

Thanks!


[1] http://searchfox.org/mozilla-central/rev/abe68d5dad139e376d5521ca1d4b7892e1e7f1ba/netwerk/protocol/http/HttpChannelParent.cpp#301


>  |   |   |   |- 2ms LoadInfoToLoadInfoArgs()
>  |   |   |   \_ 2ms misc
>  |   |   |- 2ms HttpChannelChild::ShouldInterceptURI()
>  |   |   \_ 2ms misc
>  |   \_ 1ms nsContentSecurityManager::doContentSecurityCheck() (bug 1360971)
>  \_ 12ms NewImageChannel()
>     |- 6ms NS_NewChannelWithTriggeringPrincipal()
>      \_ 3ms HttpChannelChild::SetReferrerWithPolicy()
Flags: needinfo?(ehsan)
(Reporter)

Comment 3

9 months ago
Henry, firstly thanks a lot for looking into this!

I think one thing may have confused you about my profile.  These weren't running times from a single call to imgLoader::LoadImage(), but they were overall times captured from all of the calls to this function when we run the test case in comment 0 which loads a bunch of images (I was profiling the content process running that page.)

I wouldn't pay too much attention to the absolute running times of the functions here, what matters more is the relative running times, for example, think of things like this: about 18/35 (~51%) of the runtime of imgLoader::LoadImage is HttpChannelChild::AsyncOpen2(), so if you found a way to cut the time of that function in half, you would make LoadImage overall 25% faster.  Whereas based on the same logic, bug 1360971 only has the potential to help a tiny little bit with things overall here.  (That being said, that is still an important improvement since that function is used in many other parts of the code!)

To answer your question though, in general profiling on slower machines is more helpful because it allows for slowness issues to be noticed more easily.  A few people in the Taipei office should now have the Acer Quantum reference hardware machines, including Kan-Ru, Thinker, Evelyn, Bevis and Weicheng I think.  You should feel free to ask one of them to borrow their machine for profiling purposes, that would have the advantage of being the same machine as the ones being used by a few of other people profiling other bugs.  But even without that, if you can find an old loaner machine at the office that could be an option too.  I also tend to borrow laptops of normal users sometimes (basically anyone who doesn't use a super fast machine like us!) for profiling purposes.

Last but not least, on the specific issue of speeding up this constructor, there has been some discussion on it in bug 1342632 too, you may want to take a look there as well.

If I wanted to suggest something that you can start investigating, I would probably recommend looking at bug 1348591.  The profile I captured above was captured from a local build with IPC sentinel checking disabled of course to eliminate the overhead that billm mentioned there but I still see a ton of BufferList overhead in the profile (I'd say most of the cost of the IPC part of sending the constructor message is just the BufferList stuff.)  I haven't spent a ton of time looking into the details of what can be done about it yet, it would be nice if you could have a look to see if you can find out a way to make this faster.  The good thing about any improvement there is that we do pay a little bit of cost for this for all of our IPC traffic on the main thread (and on all threads for that matter) not just this one bug.

Thanks!!
Flags: needinfo?(ehsan)

Comment 4

9 months ago
(In reply to :Ehsan Akhgari (super long backlog, slow to respond) from comment #3)
> Henry, firstly thanks a lot for looking into this!
> 
> I think one thing may have confused you about my profile.  These weren't
> running times from a single call to imgLoader::LoadImage(), but they were
> overall times captured from all of the calls to this function when we run
> the test case in comment 0 which loads a bunch of images (I was profiling
> the content process running that page.)
> 

Noted! 

> I wouldn't pay too much attention to the absolute running times of the
> functions here, what matters more is the relative running times, for
> example, think of things like this: about 18/35 (~51%) of the runtime of
> imgLoader::LoadImage is HttpChannelChild::AsyncOpen2(), so if you found a
> way to cut the time of that function in half, you would make LoadImage
> overall 25% faster.  Whereas based on the same logic, bug 1360971 only has
> the potential to help a tiny little bit with things overall here.  (That
> being said, that is still an important improvement since that function is
> used in many other parts of the code!)
> 

Ah it's like Amdahl's law :)


> To answer your question though, in general profiling on slower machines is
> more helpful because it allows for slowness issues to be noticed more
> easily.  A few people in the Taipei office should now have the Acer Quantum
> reference hardware machines, including Kan-Ru, Thinker, Evelyn, Bevis and
> Weicheng I think.  You should feel free to ask one of them to borrow their
> machine for profiling purposes, that would have the advantage of being the
> same machine as the ones being used by a few of other people profiling other
> bugs.  But even without that, if you can find an old loaner machine at the
> office that could be an option too.  I also tend to borrow laptops of normal
> users sometimes (basically anyone who doesn't use a super fast machine like
> us!) for profiling purposes.
> 

Thanks for all those information. That makes total sense to me.

> Last but not least, on the specific issue of speeding up this constructor,
> there has been some discussion on it in bug 1342632 too, you may want to
> take a look there as well.
> 
> If I wanted to suggest something that you can start investigating, I would
> probably recommend looking at bug 1348591.  The profile I captured above was
> captured from a local build with IPC sentinel checking disabled of course to
> eliminate the overhead that billm mentioned there but I still see a ton of
> BufferList overhead in the profile (I'd say most of the cost of the IPC part
> of sending the constructor message is just the BufferList stuff.)  I haven't
> spent a ton of time looking into the details of what can be done about it
> yet, it would be nice if you could have a look to see if you can find out a
> way to make this faster.  The good thing about any improvement there is that
> we do pay a little bit of cost for this for all of our IPC traffic on the
> main thread (and on all threads for that matter) not just this one bug.
> 

After reading the discussion on that bug, it seems that tackling with 
SendPHttpChannelConstructor is not considered helpful (in terms of UI jank.)
I'll try to be investigating bug 1348591 as a starter. 

Thanks for the information!


> Thanks!!
(Reporter)

Updated

9 months ago
Depends on: 1362791
(Reporter)

Updated

9 months ago
Depends on: 1362806
(Reporter)

Updated

9 months ago
Depends on: 1362822
(Reporter)

Updated

8 months ago
Depends on: 1363243
Given that this is a meta bug, when do we consider this done? After all the
depends-on bugs have been fixed?
Flags: needinfo?(ehsan)
(Reporter)

Comment 6

8 months ago
(In reply to Olli Pettay [:smaug] from comment #5)
> Given that this is a meta bug, when do we consider this done? After all the
> depends-on bugs have been fixed?

I think I have achieved almost all I have hoped for here by removing all of the waste I could find under imgLoader::Load() (a lot of it common to triggering any HTTP channel load), bug 1348591 is the only thing left to look at here, I think.
Flags: needinfo?(ehsan)

Comment 7

8 months ago
Bug 1364722 might be a quick and easy win to reduce some image churn.
Depends on: 1364722
(Reporter)

Comment 8

8 months ago
(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #7)
> Bug 1364722 might be a quick and easy win to reduce some image churn.

Ben, this bug is tracking stuff that shows up under imgLoader::LoadImage() in profiles.  Is that bug really related?

Comment 9

8 months ago
(In reply to :Ehsan Akhgari (super long backlog, slow to respond) from comment #8)
> Ben, this bug is tracking stuff that shows up under imgLoader::LoadImage()
> in profiles.  Is that bug really related?

I guess not, although it occurs to me the best way to make LoadImage() faster is not to call it at all.
No longer depends on: 1364722
Whiteboard: [qf:p1] → [qf:meta]
(Reporter)

Updated

7 months ago
Depends on: 1267075
(Reporter)

Comment 10

6 months ago
I'm going to call this bug fixed.  A lot of work happened here, and more work can be filed as separate bugs.
Status: NEW → RESOLVED
Last Resolved: 6 months ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.