MappableSeekableZStream::ensure can deadlock when profiler is on

RESOLVED FIXED in Firefox 24

Status

()

defect
RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: jchen, Assigned: glandium)

Tracking

unspecified
mozilla25
All
Android
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox23 unaffected, firefox24+ fixed, firefox25 fixed)

Details

Attachments

(1 attachment, 1 obsolete attachment)

MappableSeekableZStream::ensure can deadlock in the following scenario,

1) On-demand decompression causes SIGSEGV
2) SIGSEGV handler calls MappableSeekableZStream::ensure
3) MappableSeekableZStream::ensure locks its mutex
4) Profiler causes SIGPROF
5) SIGPROF handler is not available, causing SIGSEGV again
6) SIGSEGV handler calls MappableSeekableZStream::ensure again
7) MappableSeekableZStream::ensure deadlocks on already-owned mutex

This can conceivably happen with other signals, but I don't know if we do use other signals.

FWIW, I verified that the attached patch fixes the issue by blocking signals during the lock.
Attachment #780648 - Flags: feedback?(mh+mozilla)
Assignee

Comment 1

6 years ago
Comment on attachment 780648 [details] [diff] [review]
Block signals to avoid mutex deadlock in MappableSeekableZStream::ensure

Review of attachment 780648 [details] [diff] [review]:
-----------------------------------------------------------------

Two things I'd like you to check:
- what happens if a segv happens while the lock is held (you can test what happens if you make MappableSeekableZStream::ensure MOZ_CRASH randomly).
- how well the profiler responds to its SIGPROF being delayed (probably fine).
Attachment #780648 - Flags: feedback?(mh+mozilla)
Assignee

Comment 2

6 years ago
(In reply to Jim Chen [:jchen :nchen] from comment #0)
> Created attachment 780648 [details] [diff] [review]
> Block signals to avoid mutex deadlock in MappableSeekableZStream::ensure
> 
> MappableSeekableZStream::ensure can deadlock in the following scenario,
> 
> 1) On-demand decompression causes SIGSEGV
> 2) SIGSEGV handler calls MappableSeekableZStream::ensure
> 3) MappableSeekableZStream::ensure locks its mutex
> 4) Profiler causes SIGPROF
> 5) SIGPROF handler is not available, causing SIGSEGV again
> 6) SIGSEGV handler calls MappableSeekableZStream::ensure again
> 7) MappableSeekableZStream::ensure deadlocks on already-owned mutex

Note there needs something else for that to be triggered: the code setting the SIGPROF handler needs to be in a different page than the handler code itself.
(In reply to Mike Hommey [:glandium] from comment #2)
> 
> Note there needs something else for that to be triggered: the code setting
> the SIGPROF handler needs to be in a different page than the handler code
> itself.

Right. Unfortunately the SIGPROF handler calls code from a few different modules, so the deadlock is quite likely to happen.

(In reply to Mike Hommey [:glandium] from comment #1)
> Comment on attachment 780648 [details] [diff] [review]
> Block signals to avoid mutex deadlock in MappableSeekableZStream::ensure
> 
> Review of attachment 780648 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Two things I'd like you to check:
> - what happens if a segv happens while the lock is held (you can test what
> happens if you make MappableSeekableZStream::ensure MOZ_CRASH randomly).

Seems like the process gets killed immediately without further action.

> - how well the profiler responds to its SIGPROF being delayed (probably
> fine).

I think BenWa is best to answer that.
Flags: needinfo?(bgirard)
Assignee

Comment 4

6 years ago
I think I have a better idea.
Assignee: nobody → mh+mozilla
Assignee

Comment 5

6 years ago
Also, I wonder if that couldn't be the cause of bug 886736, not with the profiler, but with something else on the system that would send signals.
Assignee

Comment 7

6 years ago
Comment on attachment 780883 [details] [diff] [review]
Allow faulty.lib's on-demand decompression to be reentrant

Jim, can you check this fixes it for you? I only tested on desktop.
Attachment #780883 - Flags: feedback?(nchen)
Assignee

Updated

6 years ago
Flags: needinfo?(bgirard)
Attachment #780883 - Flags: review?(nfroyd) → review+
Comment on attachment 780883 [details] [diff] [review]
Allow faulty.lib's on-demand decompression to be reentrant

Review of attachment 780883 [details] [diff] [review]:
-----------------------------------------------------------------

Yep, works fine.
Attachment #780883 - Flags: feedback?(nchen) → feedback+
https://hg.mozilla.org/mozilla-central/rev/4716f738ef0a
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla25
Assignee

Updated

6 years ago
Attachment #780648 - Attachment is obsolete: true
Assignee

Comment 11

6 years ago
Comment on attachment 780883 [details] [diff] [review]
Allow faulty.lib's on-demand decompression to be reentrant

[Approval Request Comment]
Bug caused by bug 848764
User impact if declined: Enabling the profiler freezes Firefox for Android
Testing completed (on m-c, etc.): Landed on m-c a few days ago. Tested by jchen.
Risk to taking this patch (and alternatives if risky): It's possible this could lead to weird problems on some systems, but it shouldn't be more risky than the whole on-demand decompression feature (see bug 886736 for example). On-demand decompression can be disabled per-device, per-system-version, or globally, if it needs to be. This patch can also easily be backed out, too (there won't be any landing that will make that difficult)
String or IDL/UUID changes made by this patch: None
Attachment #780883 - Flags: approval-mozilla-aurora?
(In reply to Mike Hommey [:glandium] from comment #11)
> Comment on attachment 780883 [details] [diff] [review]
> Allow faulty.lib's on-demand decompression to be reentrant
> 
> [Approval Request Comment]
> Bug caused by bug 848764
> User impact if declined: Enabling the profiler freezes Firefox for Android
> Testing completed (on m-c, etc.): Landed on m-c a few days ago. Tested by
> jchen.
> Risk to taking this patch (and alternatives if risky): It's possible this
> could lead to weird problems on some systems,

Do you have testcases in mind that QA can help with ? Also can you please make sure to keep an eye on new regressions if any by this landing ?

> but it shouldn't be more risky
> than the whole on-demand decompression feature (see bug 886736 for example).
> On-demand decompression can be disabled per-device, per-system-version, or
> globally, if it needs to be. This patch can also easily be backed out, too
> (there won't be any landing that will make that difficult)
> String or IDL/UUID changes made by this patch: None
Attachment #780883 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee

Comment 14

6 years ago
(In reply to bhavana bajaj [:bajaj] from comment #12)
> Do you have testcases in mind that QA can help with ?

The best i can think of is stress testing starting fennec and display web pages on plenty of different devices.

> Also can you please make sure to keep an eye on new regressions if any by this landing ?

Sure
You need to log in before you can comment on or make changes to this bug.