Closed
Bug 1362626
Opened 7 years ago
Closed 7 years ago
The lock in ChannelEventQueue::RunOrEnqueue() can be contended
Categories
(Core :: Networking, defect)
Core
Networking
Tracking
()
RESOLVED
DUPLICATE
of bug 1354455
Performance Impact | high |
People
(Reporter: ehsan.akhgari, Assigned: schien)
References
Details
(Whiteboard: [necko-active])
See this profile: https://perfht.ml/2pjFHPq This lock is grabbed here: <https://searchfox.org/mozilla-central/rev/6580dd9a4eb0224773897388dba2ddf5ed7f4216/netwerk/ipc/ChannelEventQueue.h#150> The other place where the lock is held is <https://searchfox.org/mozilla-central/rev/6580dd9a4eb0224773897388dba2ddf5ed7f4216/netwerk/ipc/ChannelEventQueue.cpp#47> where events are run in a loop. I'm not familiar with this code and don't know if this loop can have an expected upper bound... It seems prone to contention at least. This bug blocks a quantum release criteria use case. This same thing also shows up in another profile of the same use case: https://perfht.ml/2pjx8UX See https://docs.google.com/spreadsheets/d/1Kxn850VasyaG_WfRg3pMInW0hbIT8LP7pRPBYTIpdbM/edit#gid=622178261 for how to reproduce.
Assignee | ||
Comment 1•7 years ago
|
||
The mRunningMonitor is grabbed before the flush/suspension check because there can be a race condition that an event might be enqueued but not dequeued during the end of flushing operation. One solution is to find a lock-free algorithm to avoid that issue above. Another workaround is to release and re-grab the mRunningMonitor after processing a fixed number of events in queue, so that we can avoid the lock contention.
Assignee | ||
Comment 2•7 years ago
|
||
During the investigation, the solution to bug 1354455 might be able to fix the lock contention as well. With a correct fix of the lock-free algorithm we can remove the mRunningMonitor dependency between RunOrEnqueue and FlushQueue.
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → schien
Updated•7 years ago
|
Whiteboard: [necko-active]
Reporter | ||
Updated•7 years ago
|
Whiteboard: [necko-active] → [necko-active][qf]
Updated•7 years ago
|
Whiteboard: [necko-active][qf] → [necko-active][qf:p1]
Assignee | ||
Comment 3•7 years ago
|
||
As comment #2 mentioned, the problematic code is removed after bug 1354455 is landed. So, we should be able to close this issue as well. @Jean is there anyone can help to verify the result on current code base? I tried to follow the test procedure from the google doc but could not figure out how to get the numbers for comparison.
Flags: needinfo?(jgong)
Assignee | ||
Comment 4•7 years ago
|
||
Close this issue based on comment #2. feel free to file another bug if there is still lock contention issue found during test.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → DUPLICATE
Updated•7 years ago
|
Blocks: BackButton_Amazon
Comment 5•7 years ago
|
||
(In reply to Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz) from comment #3) > As comment #2 mentioned, the problematic code is removed after bug 1354455 > is landed. So, we should be able to close this issue as well. > > @Jean is there anyone can help to verify the result on current code base? I > tried to follow the test procedure from the google doc but could not figure > out how to get the numbers for comparison. Duplicate bug resolved.
Flags: needinfo?(jgong)
Updated•2 years ago
|
Performance Impact: --- → P1
Whiteboard: [necko-active][qf:p1] → [necko-active]
You need to log in
before you can comment on or make changes to this bug.
Description
•