Open Bug 1752859 Opened 3 years ago Updated 2 years ago

[network-markers] Some network markers of the content process don't have an end and don't exist on the parent

Categories

(Core :: Gecko Profiler, defect, P2)

People

(Reporter: florian, Unassigned)

References

(Blocks 1 open bug)

Details

On a recent build, loading https://www.ikea.com/us/en/p/billy-bookcase-white-s59017838/, I have network markers that don't finish in the content process, and don't exist at all in the parent: https://share.firefox.dev/3HiBmnR

Severity: -- → S3
Priority: -- → P2

I can reproduce but I can't debug because of bug 1753043.

Depends on: 1753043

Actually I can capture with rr if I set nostacksampling, so let's try this.

No longer depends on: 1753043

Here is a pernosco session

I clearly see that stop markers are missing for some start markers, but I'm not sure how to continue investigating this...

I'd like to rule out the elephant in the room: this is not related to some requests being canceled, because I see the same issue with tracking protection disabled as well (in that case there are no canceled requests).

Hey Valentin, by chance, would you know from the pernosco session above what could be special about these requests? They are the ones whose aChannelId values are 2815907260792996, 2815907260792997 and 2815907260792998. 2815907260792995 and 2815907260792999 both get START and then STOP or CANCEL, as expected. Thanks!

Flags: needinfo?(valentin.gosu)

I had another idea for how to look at the pernosco session, and I think I found the issue:
these requests go to FailedAsyncOpen, and then HandleAsyncAbort (directly in the child), and I believe this is a code path where we don't handle network markers. This happens when the request has already been canceled in the parent, as we see in [1]. In that case an "abort" is propagated to the child.

In previous bugs such as bug 1697901 I handled these cases in the parent, and I missed this case in the child.

[1] https://searchfox.org/mozilla-central/rev/fe800a7fd291db3c4b3e498cfe12ef2097662290/netwerk/protocol/http/nsHttpChannel.cpp#5768-5771

Flags: needinfo?(valentin.gosu)

We need to look at usages of gHttpHandler->onStopRequest and make sure stop markers are properly added when we have a start marker.
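
To make the "every start marker needs a stop marker" check concrete, here is a minimal sketch in Python. It assumes the markers have already been flattened into (channel id, phase) pairs, which is not the real profile format, and `find_unfinished_channels` is a hypothetical helper name. The sample data is illustrative only and simply mirrors the channel ids mentioned earlier in this bug.

```python
from collections import defaultdict


def find_unfinished_channels(markers):
    """Return, sorted, the channel ids that have a START but no STOP or CANCEL.

    `markers` is an iterable of (channel_id, phase) pairs. This flattened
    shape is an assumption made for the sketch, not the profiler's format.
    """
    phases = defaultdict(set)
    for channel_id, phase in markers:
        phases[channel_id].add(phase)
    return sorted(
        cid
        for cid, seen in phases.items()
        if "START" in seen and not seen & {"STOP", "CANCEL"}
    )


# Illustrative data only: which of the two complete channels ended with
# STOP and which with CANCEL is made up for the example.
markers = [
    (2815907260792995, "START"), (2815907260792995, "STOP"),
    (2815907260792996, "START"),
    (2815907260792997, "START"),
    (2815907260792998, "START"),
    (2815907260792999, "START"), (2815907260792999, "CANCEL"),
]
unfinished = find_unfinished_channels(markers)
```

A check along these lines could probably back an automated test: after a page load, the list of unfinished channels should be empty.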

Very probably our architecture should be cleaner and we should listen to these events instead. However that's more work than I have time for at the moment (especially with my lack of C++ knowledge).

So, for channelId 2815907260792996 it seems we're calling AsyncOpen, after which we get a FailedAsyncOpen message, followed by a call to TrySendDeletingChannel to tear down the IPC connection for the channel.
I think we need to add a closing profiler label in RecvFailedAsyncOpen.
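
The failing path and the proposed fix can be sketched with a toy model. This is not Gecko code: `ToyChildChannel`, its methods, and the `close_on_failed_open` switch are all hypothetical names, and the choice of a CANCEL phase for the closing marker is an assumption. The sketch only shows that when the abort path bypasses the normal stop path, the start marker stays open unless the abort handler closes it itself.

```python
class ToyChildChannel:
    """Toy stand-in for a content-process channel (hypothetical, not Gecko)."""

    def __init__(self, channel_id, markers, close_on_failed_open):
        self.channel_id = channel_id
        self.markers = markers  # shared list of (channel_id, phase) pairs
        self.close_on_failed_open = close_on_failed_open

    def async_open(self):
        # Opening the channel records the start marker.
        self.markers.append((self.channel_id, "START"))

    def on_stop_request(self):
        # Normal completion path: this is where the stop marker is recorded.
        self.markers.append((self.channel_id, "STOP"))

    def recv_failed_async_open(self):
        # Abort propagated from the parent: on_stop_request never runs, so
        # the marker has to be closed here (the proposed fix, modelled by
        # the close_on_failed_open switch).
        if self.close_on_failed_open:
            self.markers.append((self.channel_id, "CANCEL"))


def run(close_on_failed_open):
    """Open a channel, abort it, and return the recorded phases in order."""
    markers = []
    channel = ToyChildChannel(1, markers, close_on_failed_open)
    channel.async_open()
    channel.recv_failed_async_open()
    return [phase for _, phase in markers]


before_fix = run(close_on_failed_open=False)  # start marker left open
after_fix = run(close_on_failed_open=True)    # start marker is closed
```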

Depends on: 1667316
No longer depends on: 1667316

From the pernosco session I see these requests are from an image load (which isn't surprising, this is common for tracking stuff), that's been canceled because the same image changed its src.
(This can be useful to know so that we can add a test for this case.)

Summary: Some network markers of the content process don't have an end and don't exist on the parent → [network-markers] Some network markers of the content process don't have an end and don't exist on the parent