Open Bug 1493579 Opened 6 years ago Updated 2 months ago

Add support for chunks in HAR for HTTP2

Categories

(DevTools :: Netmonitor, enhancement, P3)

61 Branch
enhancement

Tracking

(Not tracked)

People

(Reporter: peter, Unassigned)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [webpagetest] )

Attachments

(2 files, 1 obsolete file)

Attached image h2.png
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36

Steps to reproduce:

Today it's really hard to see what happens in the HAR file when your web server uses HTTP2. The response in the HAR file will signal when the browser first talks to the server and when everything is finished. In most HAR viewers that means we will have a long bar that doesn't actually give us any new information.

The HAR format doesn't support showing more detailed info (yep, I know you created it Jan, it's great :)). Patrick Meenan of WebPageTest has created a workaround where he adds information about when chunks are sent over the network to the HAR file. Every HTTP2 response has these extra fields:
...
 "_chunks": [
 {
   "bytes": 385,
   "ts": 1361.457
},
{
   "bytes": 1300,
   "ts": 1367.293
},
...

And here's a full example: https://gist.github.com/soulgalore/2a45fb2ecce8d8e039fd587ef7cb437c

HAR viewers can then show exactly when data is received by the browser. I've attached a screenshot of how WebPageTest shows that extra info (darker colors in the waterfall).
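For tools consuming this, here is a minimal sketch (plain JavaScript; the field names match the gist above) of walking a HAR file and collecting the chunk timeline for each entry:

    // Minimal sketch: collect the chunk timeline for every entry in a HAR.
    // Entries without the non-standard "_chunks" field are skipped.
    function collectChunkTimelines(har) {
      const timelines = [];
      for (const entry of har.log.entries) {
        const chunks = entry.response._chunks;
        if (!Array.isArray(chunks)) {
          continue;
        }
        timelines.push({
          url: entry.request.url,
          // Total bytes seen in chunks, useful for sanity checks.
          totalBytes: chunks.reduce((sum, c) => sum + c.bytes, 0),
          // Timestamps (ms) at which each chunk arrived.
          arrivals: chunks.map(c => c.ts),
        });
      }
      return timelines;
    }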

I think it would be great if:
* Firefox HAR also included chunks
* The network viewer in Firefox showed those chunks.

I know this isn't standard behavior, but it is so hard today to see and understand when data is sent that I think this would really help me as a developer.
Component: Untriaged → Networking: HTTP
Product: Firefox → Core
Component: Networking: HTTP → Netmonitor
Product: Core → DevTools

Thanks for the report!

Agree, this would be a nice feature.

Honza

Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P3
Whiteboard: [webpagetest]

be sure the chunk ?number? is submitted to telemetry

Is this bug available to work on? I'd like to try tackling it if I may.

I see there are two aspects to this task: adding the additional fields to the HAR export, and displaying HTTP2 chunks in the waterfall. I'm pretty sure I've found the code relevant to the HAR export in src/har/har-builder.js. Displaying chunks in the waterfall might be achievable by enhancing timingBoxes() in src/components/RequestListColumnWaterfall.js (perhaps something along the lines of this: https://jsfiddle.net/tsn1bLkr/)?
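To illustrate the idea (a sketch only, not the actual timingBoxes() code, which I still need to study properly), each chunk could become a segment whose right edge is aligned to the chunk's timestamp:

    // Sketch: compute waterfall segments for a request's chunks.
    // scale converts milliseconds to pixels; chunkWidth is a placeholder,
    // since without bandwidth info we don't know a chunk's true duration.
    function chunkBoxes(chunks, startTime, scale, chunkWidth = 2) {
      return chunks.map(chunk => ({
        // Right edge of the box sits at the chunk's arrival time.
        left: (chunk.ts - startTime) * scale - chunkWidth,
        width: chunkWidth,
        bytes: chunk.bytes,
      }));
    }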

I need to dig further to ascertain whether this timing information is readily available, or if some work needs to be done to expose this data to the front-end.

I guess another thing to consider is whether it is worth dividing this into two tasks and focusing on one or the other for now.

What are your thoughts Jan?

Flags: needinfo?(odvarko)

Sorry for huge delay.

I guess another thing to consider is whether it is worth dividing this into two tasks and focusing on one or the other for now.

Exactly, I think that this bug should be about extending HAR only (and we can file another report for the waterfall).

Still interested?

Honza

Flags: needinfo?(odvarko)

Yes, I'm absolutely still interested :)

I haven't looked at this in a while; I'll post a follow-up when I've got myself back up to speed.

Great, thanks.
Honza

Assignee: nobody → jlogandavison
Status: NEW → ASSIGNED
Attached patch patch.diff (obsolete) — Splinter Review

I've prepared a preliminary patch that adds a _chunks field to the HAR output with the format:

_chunks: [{
    bytes: ...,
    ts: ...
}, ... ]

Not 100% sure about my approach to implementing this. I've essentially added a _chunks field to specs/network-event.js complete with get/add/onEventChunks() methods.

It does feel like a bit of an obtrusive change; simply adding _chunks to the existing event timings data might be a bit cleaner. On the other hand, these chunk arrays can get quite large, and the possibility of lazily fetching them separately from the other timings data may prove beneficial. Thoughts?
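For reference, the rough shape of the spec change (paraphrased from my patch; getEventChunks is the method the patch introduces, not an existing API):

    // Paraphrased sketch of the addition to specs/network-event.js.
    const { generateActorSpec, RetVal } = require("devtools/shared/protocol");

    const networkEventSpec = generateActorSpec({
      typeName: "netEvent",
      methods: {
        // Fetch the chunk list lazily, separate from the other timings,
        // since the array can get quite large.
        getEventChunks: {
          request: {},
          response: { chunks: RetVal("json") },
        },
      },
    });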


On a note related to the format of the _chunks array, I got curious about how exactly WebPagetest calculates the size of the chunks on its waterfall. From the source it appears that the width of a chunk is a function of the bytes in the chunk divided by the maximum bandwidth of the download ($max_bw).

Are there values available in our HAR file that could be used by tools to make a similar calculation?

Flags: needinfo?(odvarko)

Thanks for working on this!

(In reply to jlogandavison from comment #7)

It does feel like a bit of an obtrusive change; simply adding _chunks to the existing event timings data might be a bit cleaner. On the other hand, these chunk arrays can get quite large, and the possibility of lazily fetching them separately from the other timings data may prove beneficial. Thoughts?

I agree, lazily loading the _chunks array seems to be the right way to go.

Are there values available in our HAR file that could be used by tools to make a similar calculation?

By 'maximum bandwidth', do you mean the total size of the response body?
(that is available in the HAR File)

Do you have any test page I could use to try the patch?

Honza

Flags: needinfo?(odvarko) → needinfo?(jlogandavison)

(In reply to Jan Honza Odvarko [:Honza] (always need-info? me) from comment #8)

By 'maximum_bandwith', do you mean the total size of the response body?
(that is available in the HAR File)

WebPagetest runs tests in an environment with a restricted download speed. That speed is 5 Mbps, and I think that is what is being used as their measure of maximum bandwidth.

The chunk timestamps in the HAR file represent the point in time that a chunk was received, so the timestamp only represents the end of the chunk as displayed on the waterfall. WebPagetest then approximates the download duration based on the fastest possible download speed (maximum bandwidth).

In detail, each individual chunk is positioned on the waterfall based on its timestamp (ts), aligned to the end of the chunk. Then chunk_time = bytes / (maximum_bandwidth / 8) approximates the period of time the chunk was downloading for. Finally, chunk_time is subtracted to find the start position of the chunk: start = end - chunk_time.
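In code, my reading of their approach is something like this (a sketch; maxBandwidthBps is in bits per second, ts in milliseconds):

    // Sketch of the WebPagetest-style chunk placement.
    function placeChunk(chunk, maxBandwidthBps) {
      const end = chunk.ts;
      // bytes / (bits-per-second / 8) = seconds; * 1000 for milliseconds.
      const chunkTimeMs = (chunk.bytes / (maxBandwidthBps / 8)) * 1000;
      return { start: end - chunkTimeMs, end };
    }

    // Example: a 1300-byte chunk at 5 Mbps occupies ~2.08 ms.
    placeChunk({ bytes: 1300, ts: 1367.293 }, 5_000_000);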

Of course, in Firefox's case there is no maximum bandwidth to perform these calculations with. Within the HAR file, bodySize and timings.receive can be used to calculate the overall download speed of a request, but I worry that isn't a representative measure (especially with HTTP2, where requests are multiplexed).

Do you have any test page I could use to try the patch?

I have been using the wiki page for Barack Obama used by Peter in the HAR file above. Additionally I've been using the image of Barack Obama directly.

There's also the Akamai HTTP2 demo but that results in a relatively large HAR output.

Flags: needinfo?(jlogandavison)
Attached patch patch.diff — Splinter Review

Updating the patch. There was a bug in _setupHarChunks() because timings.RESPONSE_START is sometimes not set (e.g. in the case of a cached response). I've added an additional guard for this case.

Attachment #9075020 - Attachment is obsolete: true
Flags: needinfo?(odvarko)

Thanks for the update and the explanation!

I am seeing a conflict when applying the patch on the latest m-c. Please make sure you have the latest version (hg pull -u)

Hunk #1 FAILED at 109
1 out of 1 hunks FAILED -- saving rejects to file devtools/client/netmonitor/src/har/har-builder.js.rej
patching file devtools/shared/specs/network-event.js
Hunk #1 succeeded at 71 with fuzz 1 (offset 5 lines).
patch failed, unable to continue (try -v)
? 0 problems (0 errors, 0 warnings)
patch failed, rejects left in working directory

The patch looks good to me and seems to be working, but one question: shouldn't the sum of all chunk.bytes be equal to response.content.bodySize (the transferred amount of bytes)? In my case it seems to be bigger (even though responses with only one chunk match the bodySize exactly).

Honza

Flags: needinfo?(odvarko) → needinfo?(jlogandavison)

I am seeing two test failures:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=143b7c6abdc6a828433dd2f5df13b25865b8c16a&selectedJob=255250166

devtools/shared/webconsole/test/browser/browser_network_longstring.js
devtools/client/netmonitor/src/har/test/browser_net_har_import.js

Running tests locally is possible using: mach test devtools/shared/webconsole/test/browser/browser_network_longstring.js

Honza

Hi jlogandavison. I'm just checking if you are still interested in working on this bug. There hasn't been any activity here for some time. If you just need more time, that's fine. If you, however, are not planning on working on this bug anymore, let us know so it can be made available for others to pick up.

(In reply to Patrick Brosset <:pbro> from comment #13)

Hi, yes I'm still interested in working on this. I've been somewhat AFK for a little while and I'm just about getting to the point where I have time to work on stuff again. I'll post an update once I'm back up to speed.

(In reply to Jan Honza Odvarko [:Honza] (always need-info? me) from comment #11)

Shouldn't the sum of all chunk.bytes be equal to response.content.bodySize (the transferred amount of bytes)? In my case it seems to be bigger (even though responses with only one chunk match the bodySize exactly).

I've had a closer look at this. With regard to the total size of the _chunks array: I don't know what's going on :)
There's something entirely messed up about the results I'm seeing. The total size is inconsistent with both the compressed and uncompressed sizes, and sometimes the _chunks array is missing entirely. I'll need to revisit what I've done here; I'll keep you updated.

I think I've spotted an inconsistency with the recorded bodySize, though. It appears that response.content.bodySize is currently being recorded as equal to the 'Content-Length' header plus the size of the received headers. This is because HarBuilder uses the transferredSize of the response [1], which stores the "size on the wire" (compressed or uncompressed body plus headers) [2].

I'm not sure this is strictly compliant with the HAR spec. Is this a bug? Shouldn't bodySize denote the number of bytes in just the body of the response? And should it be the "size on the wire" or the uncompressed size?

[1]. https://searchfox.org/mozilla-central/source/devtools/client/netmonitor/src/har/har-builder.js#378-382
[2]. https://searchfox.org/mozilla-central/source/devtools/server/actors/network-monitor/network-response-listener.js#146-149
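For reference, my reading of the HAR 1.2 spec is that bodySize and content.size are two distinct numbers (annotated snippet; the values are made up):

    // My reading of HAR 1.2 (illustrative values only):
    "response": {
      // Bytes of the response body as received on the wire (compressed,
      // if the server compressed it), excluding headers; -1 if unknown.
      "bodySize": 11234,
      "content": {
        // Length of the returned content after decompression.
        "size": 35678
      }
    }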

Flags: needinfo?(jlogandavison) → needinfo?(odvarko)
Flags: needinfo?(odvarko)

The bug assignee didn't log in to Bugzilla in the last 7 months.
:Honza, could you have a look please?
For more information, please visit auto_nag documentation.

Assignee: jlogandavison → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(odvarko)

This bug is still something we want to have fixed, but it isn't a priority for the team at the moment.

Flags: needinfo?(odvarko)
Severity: normal → S3

Hey! Long time no speak

Is this still something that needs working on? I'd be happy to jump back on it if it's still something that would be useful and the team has bandwidth to review

Thanks

Flags: needinfo?(odvarko)

Hi, thank you for the ping! I am adding this to our triage list. We'll discuss at our triage meeting this week.

Flags: needinfo?(odvarko)
Whiteboard: [webpagetest] → [webpagetest] [devtools-triage]

Is this still something that needs working on? I'd be happy to jump back on it if it's still something that would be useful and the team has bandwidth to review

Yes, it sounds good to revive the patch and start some reviews here, even if this is a Firefox only feature for now.

Peter, are you still interested in this feature? Do you know if this is something you can also get from other browsers, e.g. Chrome? Thanks

Flags: needinfo?(peter)
Whiteboard: [webpagetest] [devtools-triage] → [webpagetest]

Hi, I’ve been slowly spinning back up on this and so this is just a quick update.

I think what I stalled on last time was some confusion about what a “chunk” represents here, and as such I was struggling to debug code that produced chunks whose byte totals differed from the bodySize of a request.

Looking a bit further into the HTTP/2 spec [1], I think it could be a good idea to base our “chunk” on the payload size of the HTTP/2 frame primitive [1.1]. As far as I understand it, HTTP/2 interleaves at the frame level; there is no sub-frame multiplexing going on (though this may be different for QUIC / HTTP/3?). So the arrival time and size of each frame would be enough info to populate our chunks array in a way that accurately represents multiplexing behaviour. Have I understood the spec correctly? Does this sound like a reasonable approach?

Further from this, what would you think about having some more detailed information about frames in the HAR. Something along the lines of:

_frames: [
  {
    "type": "HEADER",
    "weight, etc…": …,
    …,
    "payloadLength": 1234,
    "paddingLength": 8,
    "timestamp": 327217981
  },
  {
    "type": "DATA",
    "payloadLength": 1235,
    "paddingLength": 8,
    "timestamp": 327217987
  }
]
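If we went this route, a _chunks-style array could even be derived from the frame list by treating each DATA frame's payload as one chunk (a sketch; padding handling is approximate):

    // Sketch: derive _chunks from a _frames array like the one above,
    // counting only DATA frame payloads as body bytes.
    function chunksFromFrames(frames) {
      return frames
        .filter(frame => frame.type === "DATA")
        .map(frame => ({
          // Subtract padding; a real padded frame also spends one byte on
          // the Pad Length field, so this is an approximation.
          bytes: frame.payloadLength - (frame.paddingLength || 0),
          ts: frame.timestamp,
        }));
    }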


I think it would help people with their debugging if they could see this kind of info, but perhaps it’s just TMI? What do you think?

Flags: needinfo?(odvarko)

Redirect a needinfo that is pending on an inactive user to the triage owner.
:Honza, since the bug has recent activity, could you have a look please?

For more information, please visit BugBot documentation.

Flags: needinfo?(peter) → needinfo?(odvarko)

Julian, can you please look at comment #20? Does that make sense?

Flags: needinfo?(odvarko)
Flags: needinfo?(jdescottes)

I think as long as the information can be collected easily (across browsers), this sounds fine.

Flags: needinfo?(jdescottes)