Closed Bug 1407988 Opened 2 years ago Closed 2 years ago

How to track BHR reports against devtools?

Categories

(DevTools :: General, enhancement, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ochameau, Unassigned)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [bhr])

In bug 1404917, I'm expecting to have an impact on 2s+ BHR reports against devtools, especially the ones about netmonitor. But I would like to know whether that patch actually had an impact. Otherwise, it will be hard to track progress on bug 1403894 and ensure we really improve BHR results.

There is this page that works:
http://arewesmoothyet.com/?category=all&durationSpec=2048_65536&payloadID=11c81065702946eaa83d036fdb10fc1f&search=devtools%2Fclient%2Fnetmonitor%2F&thread=2
But you can only see the last ~10 days, which doesn't make any trend obvious.

There is another page, when you click on "show historical data", but that never loads:
http://arewesmoothyet.com/?category=all&durationSpec=2048_65536&historical=true&thread=0
Harald, do you know if BHR data is available somewhere other than arewesmoothyet? Or do you know how to navigate its UI to get better visualizations? Or who should we ping to unbreak the historical data view?
Flags: needinfo?(hkirschner)
Adding the BHR experts.

For background: the DevTools team wants to track hangs caused by devtools over time as a metric. Hangs of 2s or more are a major issue that we are trying to eliminate completely. Is there a way to visualize a set of hangs (filtered by devtools .js in this case) over time (maybe averaged over 1 week)?
Flags: needinfo?(nika)
Flags: needinfo?(hkirschner)
Flags: needinfo?(dothayer)
Arewesmoothyet started OOMing the browser in historical view because there's too much there, so I split it out by thread, but I still need to update the UI to accommodate this. For now you should be able to get what you need here:

https://arewesmoothyet.com/?category=all&durationSpec=2048_65536_historical_Gecko&payloadID=8527d21007034e64b070384a0d836a4b&search=devtool&thread=0

(and if you need to look at content):
https://arewesmoothyet.com/?category=all&durationSpec=2048_65536_historical_Gecko_Child&payloadID=8527d21007034e64b070384a0d836a4b&search=devtool&thread=0

I'm working now to update the UI, so it should be functioning normally soon.
Flags: needinfo?(dothayer)
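As a rough illustration of how the links above are put together, here is a minimal Python sketch that takes the first historical URL apart and swaps in a narrower search string. The parameter names are copied from the URLs in this bug. The helper name `with_search` is hypothetical, and whether arewesmoothyet accepts every combination built this way is an assumption.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Historical view for the parent-process main thread, as linked above.
URL = (
    "https://arewesmoothyet.com/?category=all"
    "&durationSpec=2048_65536_historical_Gecko"
    "&payloadID=8527d21007034e64b070384a0d836a4b"
    "&search=devtool&thread=0"
)

# Flatten the query string into a plain dict (each parameter appears once).
params = {key: values[0] for key, values in parse_qs(urlsplit(URL).query).items()}


def with_search(params, search):
    """Hypothetical helper: rebuild the URL with a different search filter."""
    updated = dict(params, search=search)
    return "https://arewesmoothyet.com/?" + urlencode(updated)


print(params["durationSpec"])
print(with_search(params, "devtools/client/netmonitor/"))
```

The second print produces a URL whose search parameter is percent-encoded in the same way as the netmonitor link in the first comment.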
Minor clarification, in case anyone was surprised by the OOMing comment: I think it's just failing to allocate the contiguous region needed to parse the >1GB JSON file. And just in case anyone is trying to use Chrome to look at this, I think Chrome's limit on contiguous allocations might be smaller(?), because if you try to load the Gecko_Child payload in Chrome it won't work.
(In reply to Doug Thayer [:dthayer] from comment #3)
> Arewesmoothyet started OOMing the browser in historical view because there's
> too much there, so I split it out by thread, but I still need to update the
> UI to accomodate this. For now you should be able to get what you need here:
> 
> https://arewesmoothyet.com/?category=all&durationSpec=2048_65536_historical_Gecko&payloadID=8527d21007034e64b070384a0d836a4b&search=devtool&thread=0

The timeline is almost empty.
Only a few days are being reported, with very high variance:
  9/5, 9/6, 9/11, 9/12

Did we receive BHR reports about devtools only on these dates?
If yes, I'm not sure we can draw any trend unless we expand the view to many months :/

It looks like the timeline Y axis is the severity of hangs, is it?
If that's the case, maybe the number of hangs would be better?
(In reply to Alexandre Poirot [:ochameau] from comment #5)
> It looks like the timeline Y axis is the severity of hangs, is it?
> If that's the case, may be the number of hangs would be better?

Light blue is the number of hangs. Dark blue is the total time spent hanging across all sessions (i.e. the mean severity per hang would be the dark blue number divided by the light blue number).
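The relationship between the two series can be written out as a small Python sketch. The function name and the sample numbers are made up for illustration, and the use of milliseconds as the unit is an assumption.

```python
def mean_hang_severity_ms(total_hang_ms, hang_count):
    """Mean time per hang: total hang time (dark blue) / number of hangs (light blue)."""
    if hang_count == 0:
        return 0.0  # no hangs reported for this day
    return total_hang_ms / hang_count


# Made-up example: 12 hangs adding up to 30,000 ms of hang time on one day.
print(mean_hang_severity_ms(30_000, 12))  # 2500.0
```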
(In reply to Alexandre Poirot [:ochameau] from comment #5)
> The timeline is almost empty.
> There is only 3 days being reported with very high variance:
>   9/5, 9/6, 9/11, 9/12

Quick note: I think you've selected a node with less data. The graph at the top reflects the currently selected node. That being said, there's still quite a lot of variance once you select the devtools/client/netmonitor node, and the only thing that I can think of to explain this variance is that it really is that volatile. :/
Doug, we would like to see a trend over weeks. As Alex said, if we filter the UI by devtools-related hangs, we don’t have an evenly distributed daily signal, which can probably be fixed by looking at weekly numbers.

I assume BHR data is available for longer and we can generate a trend going back a few months/releases, so this is just a matter of writing a web service that aggregates the filtered hang data by week?
Yeah, but if you want to go back to before the historical view that I linked above, the quality of data is going to go down pretty significantly, since it'll put you before the BHR ping format change, which also means before interleaved stacks. However, if all you want is to graph a single stat over time for stacks that contain a specific string, I should be able to adapt the Spark job I wrote for overall hang stats. That said, it will have all the same problems that chutten brought up (changing Nightly population, etc.)
The ping data change is not an issue. We are only interested in historical data from now onwards, not going back beyond this week.

Given that Nightly is our only source for BHR data, we have to accept the changing population as a fact of life.

To validate the impact of our performance work, we want to track different stacks of devtools-related hangs: 512-2048ms and over 2048ms. The weekly trend for hang aggregate (hangs per usage hour?) would be enough for tracking, as we can use arewesmoothyet for diagnostics.
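A minimal Python sketch of the aggregate described above, assuming hangs arrive as (date, duration in ms, stack string) tuples and that usage hours per ISO week are known separately. This data shape, the function names, and the choice to put a hang of exactly 2048ms in the upper class are all assumptions for illustration, not the BHR ping format.

```python
from collections import defaultdict
from datetime import date


def severity_class(duration_ms):
    """Map a hang duration to one of the two tracked classes (or None)."""
    if duration_ms >= 2048:
        return "2048ms+"
    if duration_ms >= 512:
        return "512-2048ms"
    return None  # shorter hangs are not tracked here


def weekly_hangs_per_usage_hour(hangs, usage_hours_by_week):
    """Count devtools hangs per (ISO week, class), divided by that week's usage hours.

    hangs: iterable of (date, duration_ms, stack) tuples (assumed shape).
    usage_hours_by_week: {(iso_year, iso_week): usage hours} (assumed shape).
    """
    counts = defaultdict(int)
    for day, duration_ms, stack in hangs:
        cls = severity_class(duration_ms)
        if cls is None or "devtools" not in stack:
            continue
        iso = day.isocalendar()
        counts[((iso[0], iso[1]), cls)] += 1
    return {
        (week, cls): n / usage_hours_by_week[week]
        for (week, cls), n in counts.items()
        if usage_hours_by_week.get(week)  # skip weeks with no usage data
    }


# Made-up sample: two devtools hangs in one week with 10 usage hours.
sample = [
    (date(2017, 10, 16), 600, "devtools/client/netmonitor/example.js"),
    (date(2017, 10, 17), 3000, "devtools/client/example.js"),
    (date(2017, 10, 17), 3000, "browser/example.js"),  # not devtools, ignored
]
week = tuple(date(2017, 10, 16).isocalendar())[:2]
print(weekly_hangs_per_usage_hour(sample, {week: 10.0}))
```

Dividing by usage hours is one way to soften the effect of a changing Nightly population on raw counts, though it does not remove it.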
Clearing ni? as Doug seems to have this handled :-).
Flags: needinfo?(nika)
Doug, do you need anything more to have this unblocked? Maybe we should first do a test run to see how stable the data is when aggregated weekly?
Flags: needinfo?(dothayer)
(In reply to :Harald Kirschner :digitarald from comment #10)
> To validate the impact of our performance work, we want to track different
> stacks of devtools-related hangs: 512-2048ms and over 2048ms.

To clarify: you want to track different stacks, or different severity classes? I.e., do you want me to just track all stacks that contain "devtools", but split out into those between 512 and 2048ms, and those greater than 2048? Or do you want me to break out the stacks into subgroups, and also split by severity class?

> The weekly
> trend for hang aggregate (hangs per usage hour?) would be enough for
> tracking, as we can use arewesmoothyet for diagnostics.

Visually, it looks like the weekly values should be somewhat stable for 512-2048ms[1], but given the daily graph for 2048ms+, I can't see it being stable at a week level.

[1]: https://arewesmoothyet.com/?category=all&durationSpec=512_2048_historical_Gecko&payloadID=bf24475d197e4cd1bc97962142e701df&search=devtools&thread=0 (click on the (root) node to see all stacks that have "devtools" in them)
Flags: needinfo?(dothayer) → needinfo?(hkirschner)
"split out into those between 512 and 2048ms, and those greater than 2048" is what we need for tracking; to eliminate the most severe hangs and reduce the less severe ones.

> Visually, it looks like the weekly values should be somewhat stable for 512-2048ms[1], but given the daily graph for 2048ms+, I can't see it being stable at a week level.

WFM with the steps described. Every day has at least some hangs, while some days stand out.
Flags: needinfo?(hkirschner)
I ran an analysis to grab a rolling 7-day average of devtools hangs[1]. If you want to use this instead of arewesmoothyet's historical view, here[2] is the source.

[1]: https://screenshots.firefox.com/KeGTbyJItwfhMRQt/localhost
[2]: https://gist.github.com/squarewave/90dc84a02407216a532501ae665b8b67
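For readers unfamiliar with the technique, a rolling 7-day average can be sketched in a few lines of Python. This is not the code from the gist above. It is a standalone illustration with made-up daily counts, and the function name is hypothetical.

```python
from collections import deque


def rolling_mean(daily_values, window=7):
    """Trailing mean over the last `window` days.

    The first entries average over fewer days, since a full window
    is not available yet.
    """
    recent = deque(maxlen=window)  # oldest value drops out automatically
    averaged = []
    for value in daily_values:
        recent.append(value)
        averaged.append(sum(recent) / len(recent))
    return averaged


# Made-up daily hang counts with the kind of spikes discussed in this bug.
daily = [0, 14, 0, 0, 7, 0, 0, 21]
smoothed = rolling_mean(daily)
print(smoothed[6], smoothed[7])  # 3.0 6.0
```

A trailing window smooths out single-day spikes, at the cost of reacting to a real change only over the following week.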
Doug, pardon my extra questions; devtools doesn’t have a data scientist assigned.

What is the best way for us to use the Python script as a data source? Could this be set up to have weekly CSV drops into an S3 folder that I could aggregate into a graph, or could it even feed into stmo?
Flags: needinfo?(dothayer)
Since I've been getting a few similar requests to this, I decided to just make a reusable system for this stuff which I can make a simple dashboard for. It should be ready soon, if that works.
Flags: needinfo?(dothayer)
Thanks a lot! Is it tracked somewhere so I can follow along with development?
Flags: needinfo?(dothayer)
Added https://github.com/squarewave/bhr.html/issues/24 to track it.
Flags: needinfo?(dothayer)
Priority: -- → P2
Thanks Doug for your dashboard:
  https://arewesmoothyet.com/?mode=track&trackedStat=Devtools%20Hangs

I think this is perfect for tracking BHR related to DevTools!
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Whiteboard: [bhr]
Product: Firefox → DevTools