Closed Bug 1546641 Opened 6 years ago Closed 6 years ago

[network markers] Redirections are incorrect on child processes

Categories

(Core :: Gecko Profiler, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla69
Tracking Status
firefox68 --- wontfix
firefox69 --- fixed

People

(Reporter: julienw, Assigned: julienw)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 1 obsolete file)

See this profile from a capture of loading https://www.mozilla.org:
https://deploy-preview-1917--perf-html.netlify.com/public/b2a52d4f27616a7baf9fae7449edadb76ff01e88/network-chart/?globalTrackOrder=0-1-2-3-4-5-6-7-8&hiddenGlobalTracks=1-2-3-4-5-6-8&localTrackOrderByPid=3164-1-2-0~12671-0~12605-0~12886-0~12803-0~12665-0~13172-0~13100-0-1~12558-0-1~&thread=8&v=3

There are 3 markers showing up for the loading phase of the HTML page:

  1. The initial load.
{
  "start": 512335.72875789064,
  "dur": 3056.424849999952,
  "name": "Load 36: https://www.mozilla.org/",
  "title": null,
  "data": {
    "type": "Network",
    "startTime": 512335.72875789064,
    "endTime": 512337.07070589066,
    "id": 56264071577636,
    "status": "STATUS_START",
    "pri": 0,
    "URI": "https://www.mozilla.org/"
  },
  "incomplete": true
}
  1. The answer with the redirection
{
  "start": 505780.25126389065,
  "dur": 6618.132344999933,
  "name": "Load 1884: https://www.mozilla.org/",
  "title": null,
  "data": {
    "type": "Network",
    "startTime": 505780.25126389065,
    "endTime": 512398.3836088906,
    "id": 13589276526428,
    "status": "STATUS_REDIRECT",
    "pri": -20,
    "URI": "https://www.mozilla.org/",
    "RedirectURI": "https://www.mozilla.org/en-US/",
    "fetchStart": 512337.07070589066
  },
  "incomplete": true
}
  1. The final answer.
{
  "start": 505780.25126389065,
  "dur": 6700.485168999992,
  "name": "Load 1884: https://www.mozilla.org/en-US/",
  "title": null,
  "data": {
    "type": "Network",
    "startTime": 505780.25126389065,
    "endTime": 512480.73643289064,
    "id": 13589276526428,
    "status": "STATUS_STOP",
    "pri": -20,
    "URI": "https://www.mozilla.org/en-US/",
    "fetchStart": 512398.60101289063
  },
  "incomplete": true
}

In the current process, we don't match these 3 markers. We see that 2 and 3 share their ids, so we should match them, and probably handle the marker 2 as the START for the marker 3. (I'll just do it in a coming patch).

But we don't have any way to match 1 and 2: their ids are different, and their names are different.

I believe this is an important issue as this prevents us from showing the right data to the user.

In the current process, we don't match these 3 markers. We see that 2 and 3 share their ids, so we should match them, and probably handle the marker 2 as the START for the marker 3. (I'll just do it in a coming patch).

Actually, when looking at the parent process of the same capture [1], we clearly see that the markers on that parent process are different, and seem to be as expected by the existing code.

{
  "start": 512372.37945699994,
  "dur": 2.3016650000354275,
  "name": "Load 36: https://www.mozilla.org/",
  "data": {
    "type": "Network",
    "startTime": 512372.37945699994,
    "endTime": 512374.681122,
    "id": 56264071577636,
    "status": "STATUS_START",
    "cache": "Unresolved",
    "pri": 0,
    "URI": "https://www.mozilla.org/"
  },
  "title": null
}
{
  "start": 512374.681122,
  "dur": 21.995926000061445,
  "name": "Load 36: https://www.mozilla.org/",
  "data": {
    "type": "Network",
    "startTime": 512374.681122,
    "endTime": 512396.67704800004,
    "id": 56264071577636,
    "status": "STATUS_REDIRECT",
    "cache": "Unresolved",
    "pri": -20,
    "URI": "https://www.mozilla.org/",
    "RedirectURI": "https://www.mozilla.org/en-US/"
  },
  "title": null
}
{
  "start": 512396.717695,
  "dur": 3.8700159999425523,
  "name": "Load 1884: https://www.mozilla.org/en-US/",
  "data": {
    "type": "Network",
    "startTime": 512396.717695,
    "endTime": 512400.58771099994,
    "id": 13589276526428,
    "status": "STATUS_START",
    "cache": "Unresolved",
    "pri": -20,
    "URI": "https://www.mozilla.org/en-US/"
  },
  "title": null
}
{
  "start": 512400.58771099994,
  "dur": 8.961450000002515,
  "name": "Load 1884: https://www.mozilla.org/en-US/",
  "data": {
    "type": "Network",
    "startTime": 512400.58771099994,
    "endTime": 512409.54916099994,
    "id": 13589276526428,
    "status": "STATUS_STOP",
    "cache": "Hit",
    "pri": -20,
    "count": 152253,
    "URI": "https://www.mozilla.org/en-US/"
  },
  "title": null
}

So we clearly see that all markers are written as pairs as the code expects. Then the problem is only in the child process.
[1] https://deploy-preview-1917--perf-html.netlify.com/public/b2a52d4f27616a7baf9fae7449edadb76ff01e88/network-chart/?globalTrackOrder=0-1-2-3-4-5-6-7-8&hiddenGlobalTracks=1-2-3-4-5-6-8&localTrackOrderByPid=3164-1-2-0~12671-0~12605-0~12886-0~12803-0~12665-0~13172-0~13100-0-1~12558-0-1~&range=8.5255_9.6459&thread=0&v=3

Summary: [network markers] Redirections seem to be incorrect (or incorrectly processed) → [network markers] Redirections seem to be incorrect on the child process

Hey Randell, is this something you could help with? Thanks!

Flags: needinfo?(rjesup)

I'll take a look. it's possible that the child isn't informed of every network operation that occurs in the parent

Summary: [network markers] Redirections seem to be incorrect on the child process → [network markers] Redirections are incorrect on child processes

I reproduce 100% of the tries I do. It's super easy to check on mozilla.org or developer.mozilla.org where there's a redirection when accessing the root /.

Randell this is super important because we don't draw a correct chart because of this, but I can't make sense of the problem myself. With some guidance I could probably write the patch but I need some investigation from you or someone else who knows this well.

(In reply to Julien Wajsberg [:julienw] from comment #4)

I reproduce 100% of the tries I do. It's super easy to check on mozilla.org or developer.mozilla.org where there's a redirection when accessing the root /.

Randell this is super important because we don't draw a correct chart because of this, but I can't make sense of the problem myself. With some guidance I could probably write the patch but I need some investigation from you or someone else who knows this well.

Can you capture a full log with a reproducible minimal test case? Then please share the exact URLs that manifested the problem (at least the first one in the chain). Thanks.

Attached file log-firefox.txt.0.gz

These logs are for the URL https://developer.mozilla.org. It gets redirected to https://developer.mozilla.org/en-US.

As a reminder, the problem happens on the child process: the second marker seems to be a merge of the end of the redirect and the start of the new request.

Thanks. Now please refer the code where all the markers you present in comment 0 and 1 are taken. Did you capture a profile along with the logs for reference?

sorry, I didn't capture a profile along, but this one should be fairly similar: https://perfht.ml/2Vw8gIK

Here is where we add the markers:

Parent:
Where we put the start marker: https://searchfox.org/mozilla-central/rev/11cfa0462a6b5d8c5e2111b8cfddcf78098f0141/netwerk/protocol/http/nsHttpChannel.cpp#6347
Where we put the end redirect marker: https://searchfox.org/mozilla-central/rev/11cfa0462a6b5d8c5e2111b8cfddcf78098f0141/netwerk/protocol/http/nsHttpChannel.cpp#5915
Where we put the end stop marker: https://searchfox.org/mozilla-central/rev/11cfa0462a6b5d8c5e2111b8cfddcf78098f0141/netwerk/protocol/http/nsHttpChannel.cpp#8144

Child:
Where we put the start marker: https://searchfox.org/mozilla-central/rev/94c6b5f06d2464f6780a52f32e917d25ddc30d6b/netwerk/protocol/http/HttpChannelChild.cpp#2622
Where we put the end redirect marker: https://searchfox.org/mozilla-central/rev/94c6b5f06d2464f6780a52f32e917d25ddc30d6b/netwerk/protocol/http/HttpChannelChild.cpp#1785
Where we put the end stop marker: https://searchfox.org/mozilla-central/rev/94c6b5f06d2464f6780a52f32e917d25ddc30d6b/netwerk/protocol/http/HttpChannelChild.cpp#1125

The problems we see in the child process are:

  1. The id changed in the "end redirect" marker but shouldn't have.
  2. There's no "start" marker for the second request (that should have the new id).
Flags: needinfo?(honzab.moz)

(In reply to Julien Wajsberg [:julienw] from comment #10)

sorry, I didn't capture a profile along, but this one should be fairly similar: https://perfht.ml/2Vw8gIK

So this one is hard to reason about (perhaps) since it seems that the buffer wrapped at the same point the load started (or perhaps the content process started then, and the profiler took a bit to start capturing and missed the start). Please redo with the content process already alive, then browse.

The earlier capture clearly shows the problem.
Also: what changed? was this a regression from updates to the capture code, or a change in how we display it? I don't think this has always been happening. That can help as well (perhaps we changed the matching criteria, or removed a heuristic to connect them?)

Looking some more

Flags: needinfo?(rjesup)

No it didn't wrap and the content process didn't start then, what happened is that it took ~2sec for the profiler to register. But it's ok because I waited until the profiler started before starting the navigation, so please pay attention only to that part.

I think nothing changed, I remember I've seen problems around this for a long time, but the network panel also had many other problems which hid this. Now that the network panel is cleaner we can see the problems around the marker generation.

I think the problem is around https://searchfox.org/mozilla-central/rev/a887c90ea9c19a0b5529a1f5fa351929944887ba/netwerk/protocol/http/HttpChannelParent.cpp#1997-2005: we get the channel id here and that's the channel id we send with Redirect1Begin. Then in the child's Redirect1Begin handler we use this channel id when pushing the STATUS_REDIRECT marker.
Maybe we could pass both the old and new id, and the old and new URL, and have the child's Redirect1Begin handler push both markers.

I think this one was discussed in Whistler and figure out!

Flags: needinfo?(honzab.moz)

Hey Honza, thanks again for your help during the week :)
I tried to fix the issue and I realized I still have another question: it seems to me that AsyncOpen isn't called on the Child process for the new channel, can you please confirm? I try to find the best location to emit the "start" marker for the redirected request.

Flags: needinfo?(honzab.moz)
Flags: needinfo?(honzab.moz)

Thanks, I managed to do something that works now :)

Here is a try build: https://treeherder.mozilla.org/#/jobs?repo=try&revision=779d823d61e5879513112749049d6915b324ea90

Markers were either incorrect or missing when redirecting a request.

Assignee: nobody → felash
Attachment #9073774 - Attachment is obsolete: true

I just pushed the patch to integration, thanks again for your help!

Pushed by jwajsberg@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/647196edc184 Fix network markers in the redirection case r=mayhemer
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla69
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: