Add locking to SurfaceCache entry points that bypass the public API

RESOLVED FIXED in Firefox 37

Status

()

defect
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: seth, Assigned: seth)

Tracking

unspecified
mozilla38
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox35 unaffected, firefox36 unaffected, firefox37+ fixed, firefox38+ fixed, firefox-esr31 unaffected)

Details

Attachments

(1 attachment, 3 obsolete attachments)

Assignee

Description

4 years ago
I messed up in bug 1116733. =( Although I added locking to all of SurfaceCache's public API entry points, I failed to add locking to some entry points that bypass the public API. These are:

- SurfaceTracker::NotifyExpired()
- MemoryPressureObserver::Observe()
- SurfaceCache::CollectReports()

This is almost certainly responsible for bug 1122744, and probably other failures as well.

This will need uplift to Aurora.
Assignee

Comment 1

4 years ago
Here's the patch.

I made a tweak to SurfaceTracker to make it look more like all the other sites
where we do locking, because consistency is wise with this kind of thing.
Attachment #8555794 - Flags: review?(tnikkel)
Assuming I've got my releases correct...

[Tracking Requested - why for this release]: Crashes, probably responsible for bug 1122744, intermittent failures in automation, and likely other things as well.
Attachment #8555794 - Flags: review?(tnikkel) → review+
Assignee

Comment 3

4 years ago
[Tracking Requested - why for this release]: Same as in comment 2, but this is on Aurora also.
Assignee

Comment 4

4 years ago
Pushed despite inbound being closed so this can make the next Nightly:

https://hg.mozilla.org/integration/mozilla-inbound/rev/eedc4452c4c3
Assignee

Comment 5

4 years ago
Comment on attachment 8555794 [details] [diff] [review]
Add locking to SurfaceCache entry points that bypass the public API

Approval Request Comment
[Feature/regressing bug #]: 1116733
[User impact if declined]: Crashes and erratic behavior related to images. Also intermittent oranges due to image surfaces being released when they should have been locked.
[Describe test coverage new/current, TreeHerder]: Green on try. (And pushed to inbound but not yet merged to central.)
[Risks and why]: This is really low risk - just adds locking in some places that were missed in bug 1116733.
[String/UUID change made/needed]: None.
Attachment #8555794 - Flags: approval-mozilla-aurora?
Assignee

Comment 7

4 years ago
Both of those issues were super quick fixes. I just ran through the image tests
locally to make sure nothing else got missed.

I'll submit a new try job too (I thought I submitted a try job for this before, but I can't find it!) and will repush after that's confirmed green.
Assignee

Updated

4 years ago
Attachment #8555794 - Attachment is obsolete: true
Attachment #8555794 - Flags: approval-mozilla-aurora?
Assignee

Comment 8

4 years ago
OK, here's the try job:

https://tbpl.mozilla.org/?tree=Try&rev=5d602f3fa506
Flags: needinfo?(seth)
Assignee

Comment 9

4 years ago
browser_bug666317.js failed because it expected DISCARD to be delivered
synchronously, but we made it async. It's an easy fix: we just need to pump the
event loop to allow the event to be delivered.
Assignee

Updated

4 years ago
Attachment #8556240 - Attachment is obsolete: true
(In reply to Seth Fowler [:seth] from comment #10)
> Should be ready to go now. Pushed:
> 
> https://hg.mozilla.org/integration/mozilla-inbound/rev/e5c85f765f2d

sorry but this turned out to be still failing and so i had to back this out. Test failure was https://treeherder.mozilla.org/logviewer.html#?job_id=6059522&repo=mozilla-inbound
Assignee

Comment 12

4 years ago
Sigh. OK, apparently setTimeout(..., 0) is not a safe way to pump the event
queue, because setTimeout functions run when the next timer event fires, which
might've already been in the event queue before the DISCARD event got put there.
I talked to bz about this and he suggested using window.postMessage instead, so
this new version of the patch uses that approach. It works locally, but then, so
did the old approach, so I've pushed a new bc-only run on all desktop OSes:

https://tbpl.mozilla.org/?tree=Try&rev=0a1e6f6b9b5a
Assignee

Updated

4 years ago
Attachment #8556335 - Attachment is obsolete: true
Assignee

Updated

4 years ago
Blocks: 1120279
Assignee

Comment 13

4 years ago
I talked with KWierso and we concluded that the test failure in the above try push is probably unrelated. Pushed:

https://hg.mozilla.org/integration/mozilla-inbound/rev/f274f6a5616f
https://hg.mozilla.org/mozilla-central/rev/f274f6a5616f
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla38
Assignee

Comment 15

4 years ago
Comment on attachment 8556636 [details] [diff] [review]
Add locking to SurfaceCache entry points that bypass the public API

Approval Request Comment
[Feature/regressing bug #]: 1116733
[User impact if declined]: Crashes and erratic behavior related to images. Also intermittent oranges due to image surfaces being released when they should have been locked.
[Describe test coverage new/current, TreeHerder]: On central.
[Risks and why]: This is really low risk - just adds locking in some places that were missed in bug 1116733. (Some tests had to be updated because a notification was changed from sync to async, but that notification is _only_ used in tests, so this change will not affect users.)
[String/UUID change made/needed]: None.
Attachment #8556636 - Flags: approval-mozilla-aurora?
Comment on attachment 8556636 [details] [diff] [review]
Add locking to SurfaceCache entry points that bypass the public API

Glad to see this patch stick on m-c. Let's get it on Aurora. Aurora+
Attachment #8556636 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee

Updated

4 years ago
Blocks: 1127829
You need to log in before you can comment on or make changes to this bug.