Closed Bug 1188250 Opened 9 years ago Closed 6 years ago

Consider reporting unsafe CPOW instead of CPOW wallclock time

Categories

(Toolkit :: Performance Monitoring, defect)

defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: Yoric, Unassigned)

References

Details

Currently, we measure CPOW badness by the wallclock duration of CPOW calls during each time interval. However, this proves quite unstable. On other hand, dom/ipc/ContentParent.cpp has code to detect "unsafe CPOW". It might be a better idea to report the number of unsafe CPOWs, rather than their duration.
Does this sound like a good idea to you, Mike?
Flags: needinfo?(mconley)
I'm personally more interested in unsafe CPOW usage time, but I can only speak for myself. I believe blassey mentioned in IRC that wallclock time has advantages, and you responded that wallclock is somehow misleading...

I think you two should probably duke it out here. I personally have no educated opinion on keeping or dropping wallclock.
Flags: needinfo?(mconley) → needinfo?(blassey.bugs)
about:performance currently tells me that "Terms of Service; Didn’t Read" has a highest CPOW time per 15s of 11699000µs, basically since I started Firefox. That's 11s per 15s, which is just abnormal, especially since my Firefox is responsive. I cannot find any obvious bug in the code, which leads me to believe that the issue is actually that we're measuring wallclock time, and that errors add up.

Additionally, wallclock is affected by the rest of the activity on the system, which means that we are going to end up with random false positives depending on what is going on on the computer.

That's two reasons to drop wallclock, as far as I am concerned, and find an alternative.
(additionally, this means that the slow add-on monitor shows me an alert every 15 seconds, for an add-on that doesn't cause any slowdown)
Since we're dealing with hangs due to IPC, wallclock time seems like a perfectly reasonable measurement to me.

(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #4)
> (additionally, this means that the slow add-on monitor shows me an alert
> every 15 seconds, for an add-on that doesn't cause any slowdown)

You could click the button that says ignore permanently. That's what its there for.
Flags: needinfo?(blassey.bugs)
I'm sure that it sounds reasonable. Unfortunately, testing strongly suggests that it doesn't work. See the results in comment 3. I will continue checking if there is an error in the code, but I believe that the core issue is using wallclock time at all.

Incidentally, the implementation of `PR_IntervalNow()` on MacOS X has an interesting comment (saying essentially that it's broken): https://dxr.mozilla.org/mozilla-central/source/nsprpub/pr/src/md/unix/unix.c#3020-3025. The implementation on Windows uses `timeGetTime`, which has a precision of ~5 milliseconds, which is probably at least 100x worse than what we need.

> You could click the button that says ignore permanently. That's what its there for.

Certainly, but how many million users will we lose if we release something like that?
I think I can replace `PR_IntervalNow()` with something that has a better resolution. We'll see how things go.
So let's see about bug 1188616 first.
Depends on: 1188616
Blocks: 1188616
No longer depends on: 1188616
No longer blocks: 1188616
Depends on: 1188616
about:performance is being redesigned; mass closing the bugs related to parts of the current about:performance page that we are not keeping.

Our goals with the redesign are to reduce the overhead caused by having the page opened, increase the reliability of the displayed information, and make the offered information actionable for most users. The back-end work is being tracked in bug 1419681.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.