[network markers] Redirections are incorrect on child processes
Categories
(Core :: Gecko Profiler, defect, P2)
Tracking
()
People
(Reporter: julienw, Assigned: julienw)
References
(Blocks 1 open bug)
Details
Attachments
(3 files, 1 obsolete file)
See this profile from a capture of loading https://www.mozilla.org:
https://deploy-preview-1917--perf-html.netlify.com/public/b2a52d4f27616a7baf9fae7449edadb76ff01e88/network-chart/?globalTrackOrder=0-1-2-3-4-5-6-7-8&hiddenGlobalTracks=1-2-3-4-5-6-8&localTrackOrderByPid=3164-1-2-0~12671-0~12605-0~12886-0~12803-0~12665-0~13172-0~13100-0-1~12558-0-1~&thread=8&v=3
There are 3 markers showing up for the loading phase of the HTML page:
- The initial load.
{
"start": 512335.72875789064,
"dur": 3056.424849999952,
"name": "Load 36: https://www.mozilla.org/",
"title": null,
"data": {
"type": "Network",
"startTime": 512335.72875789064,
"endTime": 512337.07070589066,
"id": 56264071577636,
"status": "STATUS_START",
"pri": 0,
"URI": "https://www.mozilla.org/"
},
"incomplete": true
}
- The answer with the redirection
{
"start": 505780.25126389065,
"dur": 6618.132344999933,
"name": "Load 1884: https://www.mozilla.org/",
"title": null,
"data": {
"type": "Network",
"startTime": 505780.25126389065,
"endTime": 512398.3836088906,
"id": 13589276526428,
"status": "STATUS_REDIRECT",
"pri": -20,
"URI": "https://www.mozilla.org/",
"RedirectURI": "https://www.mozilla.org/en-US/",
"fetchStart": 512337.07070589066
},
"incomplete": true
}
- The final answer.
{
"start": 505780.25126389065,
"dur": 6700.485168999992,
"name": "Load 1884: https://www.mozilla.org/en-US/",
"title": null,
"data": {
"type": "Network",
"startTime": 505780.25126389065,
"endTime": 512480.73643289064,
"id": 13589276526428,
"status": "STATUS_STOP",
"pri": -20,
"URI": "https://www.mozilla.org/en-US/",
"fetchStart": 512398.60101289063
},
"incomplete": true
}
In the current process, we don't match these 3 markers. We see that 2 and 3 share their ids, so we should match them, and probably handle the marker 2 as the START for the marker 3. (I'll just do it in a coming patch).
But we don't have any way to match 1 and 2: their ids are different, and their names are different.
I believe this is an important issue as this prevents us from showing the right data to the user.
| Assignee | ||
Comment 1•6 years ago
|
||
In the current process, we don't match these 3 markers. We see that 2 and 3 share their ids, so we should match them, and probably handle the marker 2 as the START for the marker 3. (I'll just do it in a coming patch).
Actually, when looking at the parent process of the same capture [1], we clearly see that the markers on that parent process are different, and seem to be as expected by the existing code.
{
"start": 512372.37945699994,
"dur": 2.3016650000354275,
"name": "Load 36: https://www.mozilla.org/",
"data": {
"type": "Network",
"startTime": 512372.37945699994,
"endTime": 512374.681122,
"id": 56264071577636,
"status": "STATUS_START",
"cache": "Unresolved",
"pri": 0,
"URI": "https://www.mozilla.org/"
},
"title": null
}
{
"start": 512374.681122,
"dur": 21.995926000061445,
"name": "Load 36: https://www.mozilla.org/",
"data": {
"type": "Network",
"startTime": 512374.681122,
"endTime": 512396.67704800004,
"id": 56264071577636,
"status": "STATUS_REDIRECT",
"cache": "Unresolved",
"pri": -20,
"URI": "https://www.mozilla.org/",
"RedirectURI": "https://www.mozilla.org/en-US/"
},
"title": null
}
{
"start": 512396.717695,
"dur": 3.8700159999425523,
"name": "Load 1884: https://www.mozilla.org/en-US/",
"data": {
"type": "Network",
"startTime": 512396.717695,
"endTime": 512400.58771099994,
"id": 13589276526428,
"status": "STATUS_START",
"cache": "Unresolved",
"pri": -20,
"URI": "https://www.mozilla.org/en-US/"
},
"title": null
}
{
"start": 512400.58771099994,
"dur": 8.961450000002515,
"name": "Load 1884: https://www.mozilla.org/en-US/",
"data": {
"type": "Network",
"startTime": 512400.58771099994,
"endTime": 512409.54916099994,
"id": 13589276526428,
"status": "STATUS_STOP",
"cache": "Hit",
"pri": -20,
"count": 152253,
"URI": "https://www.mozilla.org/en-US/"
},
"title": null
}
So we clearly see that all markers are written as pairs as the code expects. Then the problem is only in the child process.
[1] https://deploy-preview-1917--perf-html.netlify.com/public/b2a52d4f27616a7baf9fae7449edadb76ff01e88/network-chart/?globalTrackOrder=0-1-2-3-4-5-6-7-8&hiddenGlobalTracks=1-2-3-4-5-6-8&localTrackOrderByPid=3164-1-2-0~12671-0~12605-0~12886-0~12803-0~12665-0~13172-0~13100-0-1~12558-0-1~&range=8.5255_9.6459&thread=0&v=3
| Assignee | ||
Updated•6 years ago
|
| Assignee | ||
Comment 2•6 years ago
|
||
Hey Randell, is this something you could help with? Thanks!
Comment 3•6 years ago
|
||
I'll take a look. it's possible that the child isn't informed of every network operation that occurs in the parent
| Assignee | ||
Updated•6 years ago
|
| Assignee | ||
Comment 4•6 years ago
|
||
I reproduce 100% of the tries I do. It's super easy to check on mozilla.org or developer.mozilla.org where there's a redirection when accessing the root /.
Randell this is super important because we don't draw a correct chart because of this, but I can't make sense of the problem myself. With some guidance I could probably write the patch but I need some investigation from you or someone else who knows this well.
Comment 5•6 years ago
|
||
(In reply to Julien Wajsberg [:julienw] from comment #4)
I reproduce 100% of the tries I do. It's super easy to check on mozilla.org or developer.mozilla.org where there's a redirection when accessing the root /.
Randell this is super important because we don't draw a correct chart because of this, but I can't make sense of the problem myself. With some guidance I could probably write the patch but I need some investigation from you or someone else who knows this well.
Can you capture a full log with a reproducible minimal test case? Then please share the exact URLs that manifested the problem (at least the first one in the chain). Thanks.
| Assignee | ||
Comment 6•6 years ago
|
||
| Assignee | ||
Comment 7•6 years ago
|
||
| Assignee | ||
Comment 8•6 years ago
|
||
These logs are for the URL https://developer.mozilla.org. It gets redirected to https://developer.mozilla.org/en-US.
As a reminder, the problem happens on the child process: the second marker seems to be a merge of the end of the redirect and the start of the new request.
Comment 9•6 years ago
|
||
Thanks. Now please refer the code where all the markers you present in comment 0 and 1 are taken. Did you capture a profile along with the logs for reference?
| Assignee | ||
Comment 10•6 years ago
|
||
sorry, I didn't capture a profile along, but this one should be fairly similar: https://perfht.ml/2Vw8gIK
Here is where we add the markers:
Parent:
Where we put the start marker: https://searchfox.org/mozilla-central/rev/11cfa0462a6b5d8c5e2111b8cfddcf78098f0141/netwerk/protocol/http/nsHttpChannel.cpp#6347
Where we put the end redirect marker: https://searchfox.org/mozilla-central/rev/11cfa0462a6b5d8c5e2111b8cfddcf78098f0141/netwerk/protocol/http/nsHttpChannel.cpp#5915
Where we put the end stop marker: https://searchfox.org/mozilla-central/rev/11cfa0462a6b5d8c5e2111b8cfddcf78098f0141/netwerk/protocol/http/nsHttpChannel.cpp#8144
Child:
Where we put the start marker: https://searchfox.org/mozilla-central/rev/94c6b5f06d2464f6780a52f32e917d25ddc30d6b/netwerk/protocol/http/HttpChannelChild.cpp#2622
Where we put the end redirect marker: https://searchfox.org/mozilla-central/rev/94c6b5f06d2464f6780a52f32e917d25ddc30d6b/netwerk/protocol/http/HttpChannelChild.cpp#1785
Where we put the end stop marker: https://searchfox.org/mozilla-central/rev/94c6b5f06d2464f6780a52f32e917d25ddc30d6b/netwerk/protocol/http/HttpChannelChild.cpp#1125
The problems we see in the child process are:
- The id changed in the "end redirect" marker but shouldn't have.
- There's no "start" marker for the second request (that should have the new id).
Updated•6 years ago
|
Comment 11•6 years ago
|
||
(In reply to Julien Wajsberg [:julienw] from comment #10)
sorry, I didn't capture a profile along, but this one should be fairly similar: https://perfht.ml/2Vw8gIK
So this one is hard to reason about (perhaps) since it seems that the buffer wrapped at the same point the load started (or perhaps the content process started then, and the profiler took a bit to start capturing and missed the start). Please redo with the content process already alive, then browse.
The earlier capture clearly shows the problem.
Also: what changed? was this a regression from updates to the capture code, or a change in how we display it? I don't think this has always been happening. That can help as well (perhaps we changed the matching criteria, or removed a heuristic to connect them?)
Looking some more
| Assignee | ||
Comment 12•6 years ago
|
||
No it didn't wrap and the content process didn't start then, what happened is that it took ~2sec for the profiler to register. But it's ok because I waited until the profiler started before starting the navigation, so please pay attention only to that part.
I think nothing changed, I remember I've seen problems around this for a long time, but the network panel also had many other problems which hid this. Now that the network panel is cleaner we can see the problems around the marker generation.
I think the problem is around https://searchfox.org/mozilla-central/rev/a887c90ea9c19a0b5529a1f5fa351929944887ba/netwerk/protocol/http/HttpChannelParent.cpp#1997-2005: we get the channel id here and that's the channel id we send with Redirect1Begin. Then in the child's Redirect1Begin handler we use this channel id when pushing the STATUS_REDIRECT marker.
Maybe we could pass both the old and new id, and the old and new URL, and have the child's Redirect1Begin handler push both markers.
| Assignee | ||
Updated•6 years ago
|
Comment 13•6 years ago
|
||
I think this one was discussed in Whistler and figure out!
| Assignee | ||
Comment 14•6 years ago
|
||
Hey Honza, thanks again for your help during the week :)
I tried to fix the issue and I realized I still have another question: it seems to me that AsyncOpen isn't called on the Child process for the new channel, can you please confirm? I try to find the best location to emit the "start" marker for the redirected request.
Comment 15•6 years ago
|
||
Yes, it's not called for it, I somehow missed to tell you that. There is a place:
https://searchfox.org/mozilla-central/rev/928742d3ea30e0eb4a8622d260041564d81a8468/netwerk/protocol/http/HttpChannelChild.cpp#2219
| Assignee | ||
Comment 16•6 years ago
|
||
Thanks, I managed to do something that works now :)
Here is a try build: https://treeherder.mozilla.org/#/jobs?repo=try&revision=779d823d61e5879513112749049d6915b324ea90
| Comment hidden (obsolete) |
| Assignee | ||
Comment 18•6 years ago
|
||
Markers were either incorrect or missing when redirecting a request.
| Assignee | ||
Updated•6 years ago
|
| Assignee | ||
Updated•6 years ago
|
| Assignee | ||
Comment 19•6 years ago
|
||
My previous try had a mistake, here is a new try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=608648890ce147ee844352a55ec29c621e4b4708
| Assignee | ||
Comment 20•6 years ago
|
||
I just pushed the patch to integration, thanks again for your help!
Comment 21•6 years ago
|
||
Comment 22•6 years ago
|
||
| bugherder | ||
Updated•6 years ago
|
Description
•