Closed Bug 1549504 Opened 5 years ago Closed 5 years ago

Assert that mRecursiveMutex is held in AsyncPanZoomController::Metrics()

Categories

(Core :: Panning and Zooming, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
mozilla69
Tracking Status
firefox68 --- wontfix
firefox69 --- fixed

People

(Reporter: botond, Assigned: srujana121, Mentored)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

In bug 1477832, we improved thread safety in AsyncPanZoomController by identifying places Metrics() was called without mRecursiveMutex being held, and acquiring mRecursiveMutex in those places.

We used local testing and a Try run to identify such places, and fixed all the places we identified in this way.

We identified these places by temporarily asserting in Metrics() that the lock was held, but we didn't actually land this assertion.

As a follow-up, we could try to land this assertion as well.

There are two things to watch out for:

  1. The accessor is frequently called, and we don't want to slow the program down too much (even in a debug build).

  2. There is a good chance that there are still places where Metrics() is accessed without holding the lock (and thus where the assertion will fire) that was not caught by local testing and a Try run, that will be triggered "in the wild". This will cause the browser to crash (when running a debug build). We'll want to keep an eye on such crashes and address them quickly (especially if they're frequent), or else back out the assertion.

Type: task → enhancement

Hi,
I'm interested to work on this.

Ok, great! The main thing to do is write a patch that adds back the assertions that we removed from the patch in bug 1477832. We can run some Talos tests on Treeherder to assess the impact of the assertion check on runtime.

So the assert statement should be moved from Framemetrics() to Metrics()?

Yes, that's right.

Changed assertion from x to y. Effect on runtime is to be studied to decide whether the change has to be landed.

(In reply to Botond Ballo [:botond] from comment #2)

We can run some Talos tests on Treeherder to assess the impact of the assertion check on runtime.

I failed to realize when I wrote this that Talos tests only run on optimized builds, while MOZ_ASSERT only does anything in a debug build. So, adding a MOZ_ASSERT and pushing to Talos will not tell us anything interesting.

Instead, just for the purpose of performance testing, let's change the assertions we're adding to MOZ_RELEASE_ASSERT, so that they run in optimized builds as well. This will allow us to assess the performance impact using Talos tests. (However, if we then decide to land the assertions, we would want to land them in MOZ_ASSERT form, not MOZ_RELEASE_ASSERT.)

I have updated the patch and tested ! Let me know if anything else is to be done!
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b6c0517c8c1885b62e358e664bc5b296e8604e02

(In reply to Srujana Peddinti from comment #7)

I have updated the patch and tested ! Let me know if anything else is to be done!
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b6c0517c8c1885b62e358e664bc5b296e8604e02

Thanks! Here's what I did to assess the performance results:

  • In Treeherder, I selected one of the Talos jobs (a job in the category "T"). It doesn't matter which one, or on which platform.
  • In the bottom pane, I clicked "Compare result against another revision", which took me to Perfherder (our tool for analyzing performance test results).
  • In Perfeherder, I left "Base" as the default (mozilla-central) and clicked "Compare", which took me to this view.
  • I clicked "Show only important changes" to hide results that are mostly noise or low-confidence.

The only regression listed there is a 5% regression in "tp5o_scroll", which is a scrolling benchmark, and even there it only occurs on one platform (if you turn off "Show only important changes" and look at the results for "tp5o_scroll" on other platforms, it's a mixed bag with small regressions on some platforms and small improements on others). This suggests to me that the assertion is unlikely to be causing a systemic performance impact, and therefore that it is reasonable to land this, especially as it will only impact debug builds.

So, please go ahead and prepare a version of the patch that uses MOZ_ASSERT for landing.

I have reverted the changes. Please check and if there's any rebasing issues while landing please let me know, So that I can rebase and update the code .
Thank you!

I have updated the patch. Please check.
Thank you

Pushed by bballo@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/cadc4bdcd75a
Assert that AsyncPanZoomController::mRecursiveMutex is held in Metrics() r=botond
Attachment #9066991 - Attachment description: Bug 1549504 - Changed the assertion from Framemetrics() to Metrics() → Bug 1549504 - Assert that AsyncPanZoomController::mRecursiveMutex is held in Metrics()

Thanks!

Please keep in mind this from the bug description:

(In reply to Botond Ballo [:botond] from comment #0)

  1. There is a good chance that there are still places where Metrics() is accessed without holding the lock (and thus where the assertion will fire) that was not caught by local testing and a Try run, that will be triggered "in the wild". This will cause the browser to crash (when running a debug build). We'll want to keep an eye on such crashes and address them quickly (especially if they're frequent), or else back out the assertion.
Assignee: nobody → Srujana.htt121
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla69
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: