Closed Bug 897723 Opened 12 years ago Closed 12 years ago

MappableSeekableZStream::ensure can deadlock when profiler is on

Categories

(Core :: mozglue, defect)

All
Android
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla25
Tracking Status
firefox23 --- unaffected
firefox24 + fixed
firefox25 --- fixed

People

(Reporter: jchen, Assigned: glandium)

References

Details

Attachments

(1 file, 1 obsolete file)

MappableSeekableZStream::ensure can deadlock in the following scenario, 1) On-demand decompression causes SIGSEGV 2) SIGSEGV handler calls MappableSeekableZStream::ensure 3) MappableSeekableZStream::ensure locks its mutex 4) Profiler causes SIGPROF 5) SIGPROF handler is not available, causing SIGSEGV again 6) SIGSEGV handler calls MappableSeekableZStream::ensure again 7) MappableSeekableZStream::ensure deadlocks on already-owned mutex This can conceivably happen with other signals, but I don't know if we do use other signals. FWIW, I verified that the attached patch fixes the issue by blocking signals during the lock.
Attachment #780648 - Flags: feedback?(mh+mozilla)
Comment on attachment 780648 [details] [diff] [review] Block signals to avoid mutex deadlock in MappableSeekableZStream::ensure Review of attachment 780648 [details] [diff] [review]: ----------------------------------------------------------------- Two things I'd like you to check: - what happens if a segv happens while the lock is held (you can test what happens if you make MappableSeekableZStream::ensure MOZ_CRASH randomly). - how well the profiler responds to its SIGPROF being delayed (probably fine).
Attachment #780648 - Flags: feedback?(mh+mozilla)
(In reply to Jim Chen [:jchen :nchen] from comment #0) > Created attachment 780648 [details] [diff] [review] > Block signals to avoid mutex deadlock in MappableSeekableZStream::ensure > > MappableSeekableZStream::ensure can deadlock in the following scenario, > > 1) On-demand decompression causes SIGSEGV > 2) SIGSEGV handler calls MappableSeekableZStream::ensure > 3) MappableSeekableZStream::ensure locks its mutex > 4) Profiler causes SIGPROF > 5) SIGPROF handler is not available, causing SIGSEGV again > 6) SIGSEGV handler calls MappableSeekableZStream::ensure again > 7) MappableSeekableZStream::ensure deadlocks on already-owned mutex Note there needs something else for that to be triggered: the code setting the SIGPROF handler needs to be in a different page than the handler code itself.
(In reply to Mike Hommey [:glandium] from comment #2) > > Note there needs something else for that to be triggered: the code setting > the SIGPROF handler needs to be in a different page than the handler code > itself. Right. Unfortunately the SIGPROF handler calls code from a few different modules, so the deadlock is quite likely to happen. (In reply to Mike Hommey [:glandium] from comment #1) > Comment on attachment 780648 [details] [diff] [review] > Block signals to avoid mutex deadlock in MappableSeekableZStream::ensure > > Review of attachment 780648 [details] [diff] [review]: > ----------------------------------------------------------------- > > Two things I'd like you to check: > - what happens if a segv happens while the lock is held (you can test what > happens if you make MappableSeekableZStream::ensure MOZ_CRASH randomly). Seems like the process gets killed immediately without further action. > - how well the profiler responds to its SIGPROF being delayed (probably > fine). I think BenWa is best to answer that.
Flags: needinfo?(bgirard)
I think I have a better idea.
Assignee: nobody → mh+mozilla
Also, I wonder if that couldn't be the cause of bug 886736, not with the profiler, but with something else on the system that would send signals.
Comment on attachment 780883 [details] [diff] [review] Allow faulty.lib's on-demand decompression to be reentrant Jim, can you check this fixes it for you? I only tested on desktop.
Attachment #780883 - Flags: feedback?(nchen)
Flags: needinfo?(bgirard)
Attachment #780883 - Flags: review?(nfroyd) → review+
Comment on attachment 780883 [details] [diff] [review] Allow faulty.lib's on-demand decompression to be reentrant Review of attachment 780883 [details] [diff] [review]: ----------------------------------------------------------------- Yep, works fine.
Attachment #780883 - Flags: feedback?(nchen) → feedback+
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla25
Attachment #780648 - Attachment is obsolete: true
Comment on attachment 780883 [details] [diff] [review] Allow faulty.lib's on-demand decompression to be reentrant [Approval Request Comment] Bug caused by bug 848764 User impact if declined: Enabling the profiler freezes Firefox for Android Testing completed (on m-c, etc.): Landed on m-c a few days ago. Tested by jchen. Risk to taking this patch (and alternatives if risky): It's possible this could lead to weird problems on some systems, but it shouldn't be more risky than the whole on-demand decompression feature (see bug 886736 for example). On-demand decompression can be disabled per-device, per-system-version, or globally, if it needs to be. This patch can also easily be backed out, too (there won't be any landing that will make that difficult) String or IDL/UUID changes made by this patch: None
Attachment #780883 - Flags: approval-mozilla-aurora?
(In reply to Mike Hommey [:glandium] from comment #11) > Comment on attachment 780883 [details] [diff] [review] > Allow faulty.lib's on-demand decompression to be reentrant > > [Approval Request Comment] > Bug caused by bug 848764 > User impact if declined: Enabling the profiler freezes Firefox for Android > Testing completed (on m-c, etc.): Landed on m-c a few days ago. Tested by > jchen. > Risk to taking this patch (and alternatives if risky): It's possible this > could lead to weird problems on some systems, Do you have testcases in mind that QA can help with ? Also can you please make sure to keep an eye on new regressions if any by this landing ? > but it shouldn't be more risky > than the whole on-demand decompression feature (see bug 886736 for example). > On-demand decompression can be disabled per-device, per-system-version, or > globally, if it needs to be. This patch can also easily be backed out, too > (there won't be any landing that will make that difficult) > String or IDL/UUID changes made by this patch: None
Attachment #780883 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
(In reply to bhavana bajaj [:bajaj] from comment #12) > Do you have testcases in mind that QA can help with ? The best i can think of is stress testing starting fennec and display web pages on plenty of different devices. > Also can you please make sure to keep an eye on new regressions if any by this landing ? Sure
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: