Make "HTTP: Total page load time" more precise

RESOLVED WORKSFORME

Status


Core
Networking
--
enhancement
RESOLVED WORKSFORME
Opened: 6 years ago
Last modified: 6 years ago

People

(Reporter: mayhemer, Assigned: mayhemer)

Tracking

Trunk
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Assignee)

Description

6 years ago
Created attachment 538251 [details] [diff] [review]
v1

This is a followup to bug 658894.

I can see that total page load times are quite coarse.  This patch splits the load time into loads taking less than 1.5 seconds and loads taking more than that, giving the shorter times better histogram granularity.
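The split described here can be sketched roughly as follows. The histogram names and the accumulation function are illustrative only, not Gecko's actual telemetry API; only the 1.5 s threshold comes from the patch description:

```python
# Sketch: split page load times at 1.5 s into two histograms so that
# short loads get finer bucket granularity.  Names are illustrative,
# not the real Gecko telemetry probes.
SPLIT_MS = 1500

short_loads = []   # loads under 1.5 s -> fine-grained histogram
long_loads = []    # loads of 1.5 s and over -> coarse histogram

def accumulate_load_time(ms):
    """Route a single page load measurement to the matching histogram."""
    if ms < SPLIT_MS:
        short_loads.append(ms)
    else:
        long_loads.append(ms)

accumulate_load_time(800)    # typical fast load -> fine histogram
accumulate_load_time(4200)   # slow load -> coarse histogram
```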
Attachment #538251 - Flags: review?(jduell.mcbugs)
(Assignee)

Updated

6 years ago
Severity: normal → enhancement
OS: Windows 7 → All
Hardware: x86 → All
(Assignee)

Comment 1

6 years ago
Created attachment 538279 [details] [diff] [review]
v1.1

An even better patch.  I'm not sure how much load a bucket size of 200 ranges puts on the telemetry server and local memory, but keeping the value lower, at 50, doesn't seem to produce finer graphs.
Assignee: nobody → honzab.moz
Attachment #538251 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #538251 - Flags: review?(jduell.mcbugs)
Attachment #538279 - Flags: review?(tglek)
Attachment #538279 - Flags: review?(jduell.mcbugs)

Comment 2

6 years ago
Comment on attachment 538279 [details] [diff] [review]
v1.1

Can we hold this until bug 661574 lands? I would rather not define new macros only to undo them later?
(Assignee)

Comment 3

6 years ago
(In reply to comment #2)
> Can we hold this until bug 661574 lands? I would rather not define new
> macros only to undo them later?

For sure, there is no rush.  Until we start collecting data from users, I don't think we need this anyway.  I'm mainly using this fine-grained histogram for my own testing now.

Comment 4

6 years ago
Comment on attachment 538279 [details] [diff] [review]
v1.1

taking myself off review here while we wait for 661574
Attachment #538279 - Flags: review?(tglek)

Updated

6 years ago
Depends on: 661574
(Assignee)

Comment 5

6 years ago
I have an idea to change the patch a bit:
1. report all page load times with a range of 10 minutes - the default (coarser granularity), but including all loads
2. then additionally report only loads shorter than 3 seconds (to get better granularity for the majority of loads)

This way we will have the overall time, with an average value computed from all measurements made, and also a precise measurement for pages that load quickly (most of them, IMO, but the numbers will tell).


Another option might be to change the bucket granularity distribution.  What about some kind of power, logarithmic, or Gaussian proportion?  We expect page load times to cluster around some average load time, with rare deviations to longer times that are not that interesting at the histogram's precision.

So, to outline the idea better:
linear buckets:   |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  
power buckets:    |||| | |  |  |   |    |     |       |        | 
logarithmic:      ||||| |   |     |         |                  |
Gaussian:         |  |  | | |||||| | |  |  |   |     |         |

Makes sense?
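A rough way to see the difference between the linear and power/exponential spacings sketched above. The range and bucket count here are arbitrary, chosen only for illustration:

```python
# Sketch: compare linearly and exponentially spaced bucket edges over
# the same range.  Exponential spacing concentrates buckets at the low
# end, where most page loads are expected to fall.
def linear_buckets(lo, hi, n):
    # n evenly spaced edges between lo and hi
    step = (hi - lo) / (n - 1)
    return [round(lo + i * step) for i in range(n)]

def exponential_buckets(lo, hi, n):
    # n log-spaced edges: each bucket is a constant factor wider
    # than the previous one
    ratio = (hi / lo) ** (1 / (n - 1))
    return [round(lo * ratio ** i) for i in range(n)]

print(linear_buckets(100, 10000, 5))       # [100, 2575, 5050, 7525, 10000]
print(exponential_buckets(100, 10000, 5))  # [100, 316, 1000, 3162, 10000]
```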
Comment 6

6 years ago
Is the exponential histogram type mentioned here good enough for this:

http://blog.mozilla.com/tglek/2011/06/22/developers-how-to-submit-telemetry-data/
(Assignee)

Comment 7

6 years ago
Exponential is the default.  Let's see how the telemetry in bug 658894 works first.  I set it with min=100, max=10000, bucket=100.  I don't think we load a significant number of pages quicker than 100 ms.  If we did, we wouldn't need telemetry ;)
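With the parameters mentioned here (min=100, max=10000, 100 buckets), log-spaced buckets stay quite fine at the low end. A quick sketch of the resulting bucket widths, using a generic log-spacing formula (not necessarily the exact edge computation Gecko's exponential histograms use):

```python
# Sketch: bucket widths for a log-spaced histogram with min=100 ms,
# max=10000 ms, and 100 buckets.  Generic formula, assumed to be only
# an approximation of Gecko's actual exponential histogram edges.
MIN_MS, MAX_MS, N_BUCKETS = 100, 10000, 100

# Each edge is a constant factor larger than the previous one.
ratio = (MAX_MS / MIN_MS) ** (1 / (N_BUCKETS - 1))
edges = [MIN_MS * ratio ** i for i in range(N_BUCKETS)]

# The first buckets are only a few ms wide, the last ones hundreds of
# ms wide, so short loads get much finer granularity.
first_width = edges[1] - edges[0]
last_width = edges[-1] - edges[-2]
print(round(first_width, 1), round(last_width, 1))
```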

Comment 8

6 years ago
Comment on attachment 538279 [details] [diff] [review]
v1.1

Honza, how are you feeling about this bug now?  The code has bitrotted, so I'm taking myself off the review.

I see TOTAL_CONTENT_PAGE_LOAD_TIME times are doing ok for granularity (about:telemetry shows a *lot* of buckets, especially at the <3sec range).  If anything, it's now getting hard to parse the results visually because there are so many buckets.

I also see the max value (10000: 10 seconds) gets a huge spike.  Do we have 10s hard-coded somewhere?  Ah, we do.  I'll file a separate bug for that.

Anyway, I suggest we mark this WORKSFORME unless there's some way of presenting the histogram that's better and that we can't get from the standard telemetry view.
Attachment #538279 - Flags: review?(jduell.mcbugs)
(Assignee)

Comment 9

6 years ago
(In reply to Jason Duell (:jduell) from comment #8)
> Comment on attachment 538279 [details] [diff] [review]
> v1.1
> 
> Honza, how are you feeling about this bug now?  The code has bitrotted, so
> I'm taking myself off the review.
> 

I think we may close this bug.  The resolution finally looks pretty good.

> I see TOTAL_CONTENT_PAGE_LOAD_TIME times are doing ok for granularity
> (about:telemetry shows a *lot* of buckets, especially at the <3sec range).
> If anything, it's now getting hard to parse the results visually because
> there are so many buckets.

Do you think that is a serious problem for the telemetry infrastructure?  I believe this particular timing is the most important and most direct indicator of our performance efforts, so it should be precise - but not 'crazy' precise, of course.

This bug -> WORKSFORME
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → WORKSFORME