Closed
Bug 1364624
Opened 7 years ago
Closed 6 years ago
Consider using SRWLOCK instead of CRITICALSECTION for our windows Mutex impl
Categories
(Core :: XPCOM, defect)
Core
XPCOM
Tracking
()
RESOLVED
FIXED
mozilla62
People
(Reporter: erahm, Assigned: erahm)
References
(Depends on 1 open bug)
Details
Attachments
(2 files, 2 obsolete files)
4.44 KB,
patch
|
erahm
:
review+
RyanVM
:
approval-mozilla-beta+
RyanVM
:
approval-mozilla-esr52-
jcristau
:
approval-mozilla-esr60+
|
Details | Diff | Splinter Review |
2.61 KB,
patch
|
erahm
:
review+
RyanVM
:
approval-mozilla-beta+
RyanVM
:
approval-mozilla-esr52-
jcristau
:
approval-mozilla-esr60+
|
Details | Diff | Splinter Review |
Apparently there are perf benefits. This should be doable now that we've dropped support for XP. Chromium did this a year ago [1], it looks pretty straightforward. [1] https://codereview.chromium.org/1864323002
Assignee | ||
Comment 1•7 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=76db7e81673a0eb02b6ad9df98331f4140369d6c
Assignee | ||
Comment 2•7 years ago
|
||
MozReview-Commit-ID: 6JpGEQyUFz
Attachment #8868236 -
Flags: review?(nfroyd)
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → erahm
Status: NEW → ASSIGNED
Assignee | ||
Comment 3•7 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4e741e44d70914cd1f42c3bee967dab0a1399543
Comment 4•7 years ago
|
||
Comment on attachment 8868236 [details] [diff] [review] Switch from CRITICALSECTION to SRWLOCK Review of attachment 8868236 [details] [diff] [review]: ----------------------------------------------------------------- This works for me, thanks. I have seen http://stackoverflow.com/a/37357503 as a suggestion that SRW locks might not be that much faster, but I have also seen https://github.com/rust-lang/rust/pull/20367/commits/ee1ca88213133a58f0a9d234f03babbebeb7c5d8#diff-907be25f650d12cff49637dde8c9ae2eR28 suggesting SRW locks are faster. The measurements in the Google bug was the first actual numbers I had seen...
Assignee | ||
Comment 5•7 years ago
|
||
Talos run pending, 20 retriggers with CRITICALSECTION vs SRWLOCK: https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=026bbbf2e1514950803e591d1d42054c6b887a04&newProject=try&newRevision=24ea3a59c4cd5a9b3e1a86438badb8a278e2b2f5&framework=1&showOnlyImportant=0
Comment 6•7 years ago
|
||
Comment on attachment 8868236 [details] [diff] [review] Switch from CRITICALSECTION to SRWLOCK Review of attachment 8868236 [details] [diff] [review]: ----------------------------------------------------------------- Don't know why my previous review didn't flip the r+...
Attachment #8868236 -
Flags: review?(nfroyd) → review+
Assignee | ||
Comment 7•7 years ago
|
||
Nothing super obvious either way in the talos results, I guess lets land this and see what happens.
Assignee | ||
Comment 8•7 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/5b6d169feb92d98032768d4d7d1a611668c60954 Bug 1364624 - Switch from CRITICALSECTION to SRWLOCK. r=froydnj
Comment 9•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/5b6d169feb92
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
status-firefox55:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Assignee | ||
Comment 10•6 years ago
|
||
Lets try this again: https://treeherder.mozilla.org/#/jobs?repo=try&revision=582a138a76616f49dfacce7de96384ecfa62e3e7
Comment 11•6 years ago
|
||
Pushed by erahm@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/3802f86e1bd1 Switch from CRITICALSECTION to SRWLOCK. r=froydnj
Assignee | ||
Comment 12•6 years ago
|
||
Comment on attachment 8868236 [details] [diff] [review] Switch from CRITICALSECTION to SRWLOCK Approval Request Comment [Feature/Bug causing the regression]: CRITICAL_SECTION has poorer perf characteristics and a higher memory overhead than SRWLOCK. [User impact if declined]: Continued perf impact and memory overhead. [Is this code covered by automated tests?]: Yes. [Has the fix been verified in Nightly?]: Landed previously for 5 weeks. [Needs manual test from QE? If yes, steps to reproduce]: No. [List of other uplifts needed for the feature/fix]: N/A. [Is the change risky?]: Low-med risk. [Why is the change risky/not risky?]: Baked for 5 weeks before and we didn't see any clear issues, possible SRWLOCK can expose incorrect locking assumptions. [String changes made/needed]: N/A
Attachment #8868236 -
Flags: approval-mozilla-beta?
Assignee | ||
Comment 13•6 years ago
|
||
Comment on attachment 8868236 [details] [diff] [review] Switch from CRITICALSECTION to SRWLOCK [Approval Request Comment] If this is not a sec:{high,crit} bug, please state case for ESR consideration: User impact if declined: Possible perf issues. Fix Landed on Version: 61 Risk to taking this patch (and alternatives if risky): Low-med risk. Baked for 5 weeks before and we didn't see any clear issues, possible SRWLOCK can expose incorrect locking assumptions. String or UUID changes made by this patch: See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
Attachment #8868236 -
Flags: approval-mozilla-esr52?
Comment 14•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/3802f86e1bd1
Updated•6 years ago
|
Comment 15•6 years ago
|
||
Comment on attachment 8868236 [details] [diff] [review] Switch from CRITICALSECTION to SRWLOCK We try to keep ESR changes to critical security patches or major stability problems. With the ESR 60 release coming up in a couple of weeks and only 2 more ESR 52 cycles to go, I don't think we should take this patch for esr52.
Attachment #8868236 -
Flags: approval-mozilla-esr52? → approval-mozilla-esr52-
Comment 16•6 years ago
|
||
Comment on attachment 8868236 [details] [diff] [review] Switch from CRITICALSECTION to SRWLOCK On further discussion, the gains look significant and so let's take this for ESR52 as well as beta 60.
Attachment #8868236 -
Flags: approval-mozilla-esr52-
Attachment #8868236 -
Flags: approval-mozilla-esr52+
Attachment #8868236 -
Flags: approval-mozilla-beta?
Attachment #8868236 -
Flags: approval-mozilla-beta+
Comment 17•6 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/e21076f36898
Comment 18•6 years ago
|
||
Backed out changeset 3802f86e1bd1 (bug 1364624) for shutdown hangs on reftests. a=backout Log: https://treeherder.mozilla.org/logviewer.html#?job_id=175581647&repo=mozilla-inbound&lineNumber=3798 INFO - REFTEST TEST-PASS | file:///Z:/task_1524684969/build/tests/reftest/tests/layout/reftests/css-ui-invalid/select/select-novalidate.html == file:///Z:/task_1524684969/build/tests/reftest/tests/layout/reftests/css-ui-invalid/select/select-required-ref.html | image comparison, max difference: 0, number of differing pixels: 0 19:52:39 INFO - REFTEST TEST-END | file:///Z:/task_1524684969/build/tests/reftest/tests/layout/reftests/css-ui-invalid/select/select-novalidate.html == file:///Z:/task_1524684969/build/tests/reftest/tests/layout/reftests/css-ui-invalid/select/select-required-ref.html 19:52:39 INFO - REFTEST INFO | Slowest test took 147ms (file:///Z:/task_1524684969/build/tests/reftest/tests/layout/reftests/css-ui-invalid/select/select-invalid-reset.html) 19:52:39 INFO - REFTEST INFO | Total canvas count = 4 19:52:39 INFO - [Parent 4952, Gecko_IOThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346 19:52:39 INFO - [Child 1204, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346 19:52:39 INFO - JavaScript error: chrome://reftest/content/reftest.jsm, line 1538: NS_ERROR_NOT_AVAILABLE: Component returned failure code: 0x80040111 (NS_ERROR_NOT_AVAILABLE) [nsIPropertyBag2.getPropertyAsAString] 19:52:39 INFO - [Child 1204, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346 19:52:39 INFO - [Child 5236, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346 19:52:39 INFO - [Child 5236, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346 19:52:39 INFO - JavaScript error: chrome://reftest/content/reftest.jsm, line 1538: NS_ERROR_NOT_AVAILABLE: Component returned failure code: 0x80040111 (NS_ERROR_NOT_AVAILABLE) [nsIPropertyBag2.getPropertyAsAString] 19:52:39 INFO - [Parent 4952, Gecko_IOThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346 19:52:39 INFO - JavaScript error: chrome://reftest/content/reftest.jsm, line 1538: NS_ERROR_NOT_AVAILABLE: Component returned failure code: 0x80040111 (NS_ERROR_NOT_AVAILABLE) [nsIPropertyBag2.getPropertyAsAString] 19:52:39 INFO - *** UTM:SVC TimerManager:registerTimer called after profile-before-change notification. Ignoring timer registration for id: telemetry_modules_ping 19:52:39 INFO - 1524685959764 Marionette DEBUG Received observer notification xpcom-will-shutdown 19:52:39 INFO - 1524685959764 Marionette DEBUG New connections will no longer be accepted 19:52:39 INFO - [GPU 5532, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346 19:53:43 WARNING - TEST-UNEXPECTED-FAIL | file:///Z:/task_1524684969/build/tests/reftest/tests/layout/reftests/css-ui-invalid/select/select-novalidate.html | application terminated with exit code 1 19:53:43 INFO - REFTEST INFO | Downloading symbols from: https://queue.taskcluster.net/v1/task/JveFTRtOSwmgk8cAyW-aeg/artifacts/public/build/target.crashreporter-symbols.zip 19:53:47 INFO - REFTEST INFO | Copy/paste: Z:\task_1524684969\build\win32-minidump_stackwalk.exe c:\users\task_1524684969\appdata\local\temp\tmpiq1ocr.mozrunner\minidumps\e1282faf-67f8-40e3-8098-edafe5b906de.dmp c:\users\task_1524684969\appdata\local\temp\tmp1eglrk 19:54:01 INFO - REFTEST INFO | Saved minidump as Z:\task_1524684969\build\blobber_upload_dir\e1282faf-67f8-40e3-8098-edafe5b906de.dmp 19:54:01 INFO - REFTEST INFO | Saved app info as Z:\task_1524684969\build\blobber_upload_dir\e1282faf-67f8-40e3-8098-edafe5b906de.extra 19:54:01 ERROR - REFTEST PROCESS-CRASH | file:///Z:/task_1524684969/build/tests/reftest/tests/layout/reftests/css-ui-invalid/select/select-novalidate.html | application crashed [@ MOZ_CrashOOL] 19:54:01 INFO - Crash dump filename: c:\users\task_1524684969\appdata\local\temp\tmpiq1ocr.mozrunner\minidumps\e1282faf-67f8-40e3-8098-edafe5b906de.dmp 19:54:01 INFO - Operating system: Windows NT 19:54:01 INFO - 6.1.7601 Service Pack 1 19:54:01 INFO - CPU: x86 19:54:01 INFO - GenuineIntel family 6 model 45 stepping 7 19:54:01 INFO - 8 CPUs 19:54:01 INFO - GPU: UNKNOWN 19:54:01 INFO - Crash reason: EXCEPTION_BREAKPOINT 19:54:01 INFO - Crash address: 0x73e9e2bb 19:54:01 INFO - Process uptime: 69 seconds 19:54:01 INFO - Thread 46 (crashed) Backout: https://hg.mozilla.org/mozilla-central/rev/e33e5049d9f35e16893f5ba4ec823f69f04972dd https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://queue.taskcluster.net/v1/task/R_ncxU7xQVujC6PizsER8A/runs/0/artifacts/public/logs/live_backing.log&only_show_unexpected=1
Status: RESOLVED → REOPENED
status-firefox60:
fixed → ---
Flags: needinfo?(erahm)
Resolution: FIXED → ---
Assignee | ||
Comment 19•6 years ago
|
||
Okay so this is shutdown crash due to a dom worker hang detector triggering a crash: > 19:54:01 INFO - Thread 46 (crashed) > 19:54:01 INFO - 0 mozglue.dll!MOZ_CrashOOL [Assertions.cpp:51c27d9d25eaca9db40ef561bd3f9cdab63bf9b0 : 33 + 0x0] > 19:54:01 INFO - eip = 0x73e9e2bb esp = 0x1ab4f818 ebp = 0x1ab4f818 ebx = 0x00000003 > 19:54:01 INFO - esi = 0x0a703800 edi = 0x0a780520 eax = 0x0a780520 ecx = 0x0088f00c > 19:54:01 INFO - edx = 0x726f7272 efl = 0x00000212 > 19:54:01 INFO - Found by: given as instruction pointer in context > 19:54:01 INFO - 1 xul.dll!mozilla::dom::workerinternals::RuntimeService::CrashIfHanging() [RuntimeService.cpp:51c27d9d25eaca9db40ef561bd3f9cdab63bf9b0 : 2018 + 0x16] > 19:54:01 INFO - eip = 0x5ecc2601 esp = 0x1ab4f820 ebp = 0x1ab4f870 > 19:54:01 INFO - Found by: call frame info > 19:54:01 INFO - 2 xul.dll!static void mozilla::`anonymous namespace'::RunWatchdog(void *) [nsTerminator.cpp:51c27d9d25eaca9db40ef561bd3f9cdab63bf9b0 : 213 + 0x5] We can see xpcom shutting down and blocking on grabbing the IPC monitor [1]: > 19:54:01 INFO - Thread 0 > 19:54:01 INFO - 0 ntdll.dll!KiFastSystemCallRet + 0x0 > 19:54:01 INFO - eip = 0x77c170b4 esp = 0x0029f368 ebp = 0x0029f3c0 ebx = 0x0029f3d0 > 19:54:01 INFO - esi = 0xe5e5e5e5 edi = 0x1247bbb0 eax = 0xe5e5e5e5 ecx = 0x00000040 > 19:54:01 INFO - edx = 0x124de080 efl = 0x00000247 > 19:54:01 INFO - Found by: given as instruction pointer in context > 19:54:01 INFO - 1 ntdll.dll!ZwWaitForKeyedEvent + 0xc > 19:54:01 INFO - eip = 0x77c169f4 esp = 0x0029f36c ebp = 0x0029f3c0 > 19:54:01 INFO - Found by: call frame info > 19:54:01 INFO - 2 ntdll.dll!RtlpWaitCouldDeadlock + 0x35 > 19:54:01 INFO - eip = 0x77bed9f3 esp = 0x0029f370 ebp = 0x0029f3c0 > 19:54:01 INFO - Found by: call frame info > 19:54:01 INFO - 3 mozglue.dll!static void arena_dalloc(void *, unsigned int, struct arena_t *) [mozjemalloc.cpp:51c27d9d25eaca9db40ef561bd3f9cdab63bf9b0 : 3510 + 0x7] > 19:54:01 INFO - eip = 0x73e8401c esp = 0x0029f3a8 ebp = 0x0029f3c0 > 19:54:01 INFO - Found by: stack scanning > 19:54:01 INFO - 4 mozglue.dll!mozilla::detail::MutexImpl::lock() [Mutex_windows.cpp:51c27d9d25eaca9db40ef561bd3f9cdab63bf9b0 : 27 + 0x7] > 19:54:01 INFO - eip = 0x73e8d097 esp = 0x0029f3c8 ebp = 0x0029f3f8 > 19:54:01 INFO - Found by: call frame info > 19:54:01 INFO - 5 mozglue.dll!mozilla::detail::MutexImpl::lock() [Mutex_windows.cpp:51c27d9d25eaca9db40ef561bd3f9cdab63bf9b0 : 27 + 0x7] > 19:54:01 INFO - eip = 0x73e8d097 esp = 0x0029f3d8 ebp = 0x0029f3f8 > 19:54:01 INFO - Found by: call frame info > 19:54:01 INFO - 6 xul.dll!mozilla::ipc::MessageChannel::Close() [MessageChannel.cpp:51c27d9d25eaca9db40ef561bd3f9cdab63bf9b0 : 2716 + 0x8] > 19:54:01 INFO - eip = 0x5db0ef3c esp = 0x0029f3e0 ebp = 0x0029f3f8 > 19:54:01 INFO - Found by: call frame info > 19:54:01 INFO - 7 xul.dll!mozilla::ShutdownXPCOM(nsIServiceManager *) [XPCOMInit.cpp:51c27d9d25eaca9db40ef561bd3f9cdab63bf9b0 : 843 + 0x5] The stack is a bit wonky though. I'm not seeing anything super obvious right now. [1] https://searchfox.org/mozilla-central/rev/78dbe34925f04975f16cb9a5d4938be714d41897/ipc/glue/MessageChannel.cpp#2705
Flags: needinfo?(erahm)
Assignee | ||
Comment 20•6 years ago
|
||
Okay so we're actually hanging when MonitorAutoUnlock grabs the lock again when it goes out of scope [1], which probably explains the intermittent nature of this. [1] https://searchfox.org/mozilla-central/rev/78dbe34925f04975f16cb9a5d4938be714d41897/ipc/glue/MessageChannel.cpp#2716
Comment 21•6 years ago
|
||
Backed out changeset e21076f36898 (bug 1364624) for shutdown hangs on reftests. a=backout from beta Backout: https://hg.mozilla.org/releases/mozilla-beta/rev/e30deae7bb9432539f3ab66be4c5517392d9cff6
Comment 22•6 years ago
|
||
This patch also caused extreme breakage of accessibility on Windows when AccessibleHandler.dll is registered system wide (which it is by the installer and the updater, as this is a COM component needed for acceptable accessibility performance). See bug 1456991. 1. I built latest central (which has this patch backed out as per comment 18). 2. Registered AccessibleHandler.dll, ran Firefox and tested with the NVDA screen reader. All works as expected. 3. I re-applied 3802f86e1bd1. 4. Ran Firefox and tested with the NVDA screen reader. Bustage!
Updated•6 years ago
|
status-firefox55:
fixed → ---
status-firefox60:
--- → affected
status-firefox-esr52:
--- → affected
Target Milestone: mozilla55 → ---
Assignee | ||
Comment 23•6 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=731c1000f0f3b25d8ca3d869bc8fa4afdd0da4c5
Comment 24•6 years ago
|
||
Wontfix for 60.0 as we're entering RC week.
Comment 25•6 years ago
|
||
Comment on attachment 8868236 [details] [diff] [review] Switch from CRITICALSECTION to SRWLOCK Resetting uplift approval flags as this was backed out.
Attachment #8868236 -
Flags: approval-mozilla-esr52+
Attachment #8868236 -
Flags: approval-mozilla-beta+
Assignee | ||
Comment 26•6 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=8abe49246936d69c64c15761cdc1d93bf312c48c
Assignee | ||
Comment 27•6 years ago
|
||
aklotz, do you think you could help us debug what went horribly wrong with AccessibleHandler.dll (comment 22)? dmajor, I have an usubstantiated hunch that the code generation got weird for `MessageChannel::Close` [1] on opt/pgo builds. Any chance you could help take a look at that? [1] https://searchfox.org/mozilla-central/rev/78dbe34925f04975f16cb9a5d4938be714d41897/ipc/glue/MessageChannel.cpp#2700-2745
Flags: needinfo?(dmajor)
Flags: needinfo?(aklotz)
Assignee | ||
Comment 28•6 years ago
|
||
Switching from an AutoLock to manually calling lock/unlock seems to have fixed the shutdown hang: https://treeherder.mozilla.org/#/jobs?repo=try&revision=8abe49246936d69c64c15761cdc1d93bf312c48c Switching to clang-cl also seems to have fixed the shutdown hang: https://treeherder.mozilla.org/#/jobs?repo=try&revision=ed1126240a049ac302c5f7eb6769d7c810e95b03 That seems to support the codegen hypothesis, but it's also possible this issue doesn't repro on try or I need more than 5 runs per reftest.
Assignee | ||
Comment 29•6 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=c5c9eaa78ae889796b2a604a37285ce27f956e91
Assignee | ||
Comment 30•6 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=256620673a9a53c59439a147cd32af7eee25ab29
Assignee | ||
Comment 31•6 years ago
|
||
Okay and just to double check I've got a build with the origin changes going to see if this is some sort of try issue: https://treeherder.mozilla.org/#/jobs?repo=try&revision=529b9cc4a84a547d20274767de18306bec083436 And build that is a straight conversion of the auto(un)locks to manual (un)locks: https://treeherder.mozilla.org/#/jobs?repo=try&revision=87fc33cf7daa67c1b7f698cf0b8e016c55dc0182 Those should both fail, if only the first one fails I'd feel more comfortable pointing to a codegen issue. If neither fails I'm not sure what we can do.
Comment 32•6 years ago
|
||
I can easily repro the a11y problem, but there is no smoking gun. I do see some debug spew that points to some problems, but it is going to take some time for me to figure out where that is coming from.
Assignee | ||
Comment 33•6 years ago
|
||
(In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment #31) > Okay and just to double check I've got a build with the origin changes going > to see if this is some sort of try issue: > > https://treeherder.mozilla.org/#/ > jobs?repo=try&revision=529b9cc4a84a547d20274767de18306bec083436 The original patch does repro on try.
Comment 34•6 years ago
|
||
(In reply to Aaron Klotz [:aklotz] from comment #32) > I can easily repro the a11y problem, but there is no smoking gun. I do see > some debug spew that points to some problems, but it is going to take some > time for me to figure out where that is coming from. I found it! There is a Mutex reentry. I have filed bug 1459085 to fix this.
Depends on: 1459085
Flags: needinfo?(aklotz)
Assignee | ||
Comment 35•6 years ago
|
||
(In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment #31) > And build that is a straight conversion of the auto(un)locks to manual > (un)locks: > > https://treeherder.mozilla.org/#/ > jobs?repo=try&revision=87fc33cf7daa67c1b7f698cf0b8e016c55dc0182 This fails as well at roughly the same spot (so when we lock again when scoping out of the if). To summarize from the various pushes: if we remove the extra lock (scoping out of if) / unlock (scoping out of the function) pair things work fine. We're clearly unlocking before calling `NotifyMaybeChannelError`, so again either there's something weird in the codegen optimizing out the unlock or possibly we're leaving the lock locked in the `NotifyMaybeChannelError` codepath, but if that were the case I'd expect to see an assertion in the deadlock detector on debug builds.
Comment 36•6 years ago
|
||
(In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment #35) > (In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment > #31) > > And build that is a straight conversion of the auto(un)locks to manual > > (un)locks: > > > > https://treeherder.mozilla.org/#/ > > jobs?repo=try&revision=87fc33cf7daa67c1b7f698cf0b8e016c55dc0182 > > This fails as well at roughly the same spot (so when we lock again when > scoping out of the if). I think this is the key. The RAII objects result in a "superfluous" Lock+Unlock pair at line 1.32-1.36 that you had optimized out in your earlier translation to manual locking code. https://hg.mozilla.org/try/rev/256620673a9a#l1.32 I am still of the opinion that we should try to explain this with run-of-the-mill code misbehavior before turning to codegen problems. What if the Unlock at line 1.28 woke somebody up, and they took some action, that either made the Lock at line 1.32 take a long time, or be unsafe/incorrect somehow?
Flags: needinfo?(dmajor)
Assignee | ||
Comment 37•6 years ago
|
||
(In reply to David Major [:dmajor] from comment #36) > I am still of the opinion that we should try to explain this with > run-of-the-mill code misbehavior before turning to codegen problems. What if > the Unlock at line 1.28 woke somebody up, and they took some action, that > either made the Lock at line 1.32 take a long time, or be unsafe/incorrect > somehow? In theory we should see that in the CRITICAL_SECTION version as well, but SRWLock isn't fair so maybe there's a different ordering. OTOH the odd looking stack definitely makes this look like a re-entrant lock.
Assignee | ||
Comment 38•6 years ago
|
||
This switches over to manaually managing the locking in MessageChannel::Close in order to avoid a deadlock on msvc opt builds. It has the added benefit of avoid a superfluos lock/unlock pair.
Attachment #8973275 -
Flags: review?(nfroyd)
Assignee | ||
Comment 39•6 years ago
|
||
Assignee | ||
Updated•6 years ago
|
Attachment #8868236 -
Attachment is obsolete: true
Assignee | ||
Comment 40•6 years ago
|
||
Comment on attachment 8973276 [details] [diff] [review] Part 2: Switch from CRITICALSECTION to SRWLOCK Carry forward r+ (this is just a title change)
Attachment #8973276 -
Flags: review+
Assignee | ||
Comment 41•6 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f23f79d6dfce57f59fb1231e35390c56f25495a3
Comment 42•6 years ago
|
||
Comment on attachment 8973275 [details] [diff] [review] Part 1: Manually manage locks in MessageChannel::Close Review of attachment 8973275 [details] [diff] [review]: ----------------------------------------------------------------- Nit: "superfluous" and "manually managing" in the commit message. ::: ipc/glue/MessageChannel.cpp @@ +2709,5 @@ > + mMonitor->Lock(); > + > + // Instead just use a ScopeExit to manage the unlock. > + RefPtr<RefCountedMonitor> monitor(mMonitor); > + auto exit = MakeScopeExit([monitor] () { Should this be &monitor instead, so we're not taking a separate strong reference? Or [monitor = Move(monitor)] or something like that?
Attachment #8973275 -
Flags: review?(nfroyd) → review+
Assignee | ||
Comment 43•6 years ago
|
||
This switches over to manually managing the locking in MessageChannel::Close in order to avoid a deadlock on msvc opt builds. It has the added benefit of avoid a superfluous lock/unlock pair.
Assignee | ||
Updated•6 years ago
|
Attachment #8973275 -
Attachment is obsolete: true
Assignee | ||
Comment 44•6 years ago
|
||
Comment on attachment 8973341 [details] [diff] [review] Part 1: Manually manage locks in MessageChannel::Close Updated for typos in patch message, switched to Move() per review. Carrying forward r+.
Attachment #8973341 -
Flags: review+
Assignee | ||
Comment 45•6 years ago
|
||
Given the code freeze, should I hold off until next week? I've got a few hundred reftest runs in so I feel okay about this being fixed. The a11y issue seems to be fixed and landed, but I can't guarantee their won't be another weird issue that doesn't get tested in our infra. Given the timing maybe just getting this queued up for the next dot-releases is the best option.
Flags: needinfo?(ryanvm)
Comment 46•6 years ago
|
||
Yes, this missed the boat for previous cycle. We're into the Fx62 nightly cycle now.
Flags: needinfo?(ryanvm)
Comment 47•6 years ago
|
||
(In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment #37) > maybe there's a different ordering. OTOH the odd > looking stack definitely makes this look like a re-entrant lock. I guess mutex re-entry wasn't the problem in this case. However, it certainly was for accessibility (bug 1459085). Is there any reason we don't assert when re-entering a mutex?
Assignee | ||
Comment 48•6 years ago
|
||
(In reply to James Teh [:Jamie] from comment #47) > (In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment > #37) > > maybe there's a different ordering. OTOH the odd > > looking stack definitely makes this look like a re-entrant lock. > > I guess mutex re-entry wasn't the problem in this case. However, it > certainly was for accessibility (bug 1459085). Is there any reason we don't > assert when re-entering a mutex? The deadlock detector should assert in debug builds. From what I can tell, the accessibility DLL code isn't tested in automation.
Comment 49•6 years ago
|
||
(In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment #48) > The deadlock detector should assert in debug builds. From what I can tell, > the accessibility DLL code isn't tested in automation. Indeed it isn't. I just wanted to make sure there was an assert at the right point in the stack which would have allowed us to catch this easily in debug builds.
Assignee | ||
Comment 50•6 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/3c9daea7564eff41710975ff34511ea14156c130 Bug 1364624 - Part 1: Manually manage locks in MessageChannel::Close. r=froydnj https://hg.mozilla.org/integration/mozilla-inbound/rev/fa9c21a4ab9443872a79617802039974539412d6 Bug 1364624 - Part 2: Switch from CRITICALSECTION to SRWLOCK. r=froydnj
Comment 51•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/3c9daea7564e https://hg.mozilla.org/mozilla-central/rev/fa9c21a4ab94
Status: REOPENED → RESOLVED
Closed: 7 years ago → 6 years ago
status-firefox62:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
Assignee | ||
Comment 52•6 years ago
|
||
Comment on attachment 8973341 [details] [diff] [review] Part 1: Manually manage locks in MessageChannel::Close Lets do this again... Approval Request Comment [Feature/Bug causing the regression]: CRITICAL_SECTION has poorer perf characteristics and a higher memory overhead than SRWLOCK. [User impact if declined]: Continued perf impact and memory overhead. [Is this code covered by automated tests?]: Yes. [Has the fix been verified in Nightly?]: Baked on Nightly for 4 days. [Needs manual test from QE? If yes, steps to reproduce]: No. [List of other uplifts needed for the feature/fix]: Bug 1459085, but that's already in 61. [Is the change risky?]: Low-med risk. [Why is the change risky/not risky?]: Baked for 5 weeks before, now on Nightly for a few days and we didn't see any clear issues. It's possible SRWLOCK can expose incorrect locking assumptions and/or deadlocks due to reentrancy. We think those bugs have been fixed. [String changes made/needed]: N/A
Attachment #8973341 -
Flags: approval-mozilla-beta?
Assignee | ||
Comment 53•6 years ago
|
||
Comment on attachment 8973276 [details] [diff] [review] Part 2: Switch from CRITICALSECTION to SRWLOCK Approval Request Comment See patch 1.
Attachment #8973276 -
Flags: approval-mozilla-beta?
Assignee | ||
Comment 54•6 years ago
|
||
Comment on attachment 8973341 [details] [diff] [review] Part 1: Manually manage locks in MessageChannel::Close Re-requesting ESR approval. [Approval Request Comment] If this is not a sec:{high,crit} bug, please state case for ESR consideration: User impact if declined: Possible perf issues. Fix Landed on Version: 60, uplift requested for 61. Risk to taking this patch (and alternatives if risky): Low-med risk. Baked for 5 weeks before, 4 days on Nightly now and we didn't see any clear issues, possible SRWLOCK can expose incorrect locking assumptions. String or UUID changes made by this patch: N/A This needs to land with bug 1459085 which is also nominated for ESR60.
Attachment #8973341 -
Flags: approval-mozilla-esr60?
Assignee | ||
Comment 55•6 years ago
|
||
Comment on attachment 8973276 [details] [diff] [review] Part 2: Switch from CRITICALSECTION to SRWLOCK See patch 1.
Attachment #8973276 -
Flags: approval-mozilla-esr60?
Comment 56•6 years ago
|
||
Comment on attachment 8973341 [details] [diff] [review] Part 1: Manually manage locks in MessageChannel::Close Taking this for 61.0b6 so we can see how it fares with a wider audience.
Attachment #8973341 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Updated•6 years ago
|
Attachment #8973276 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 57•6 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/9c083d037ce2 https://hg.mozilla.org/releases/mozilla-beta/rev/7a8b8efc8098
Comment 58•6 years ago
|
||
Comment on attachment 8973341 [details] [diff] [review] Part 1: Manually manage locks in MessageChannel::Close locking change for 60.1esr
Attachment #8973341 -
Flags: approval-mozilla-esr60? → approval-mozilla-esr60+
Updated•6 years ago
|
Attachment #8973276 -
Flags: approval-mozilla-esr60? → approval-mozilla-esr60+
Comment 59•6 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-esr60/rev/539d7238171d https://hg.mozilla.org/releases/mozilla-esr60/rev/54937944b922
Comment 60•6 years ago
|
||
Please nominate this for ESR52 also when you get a chance.
Flags: needinfo?(erahm)
Assignee | ||
Comment 61•6 years ago
|
||
Comment on attachment 8973276 [details] [diff] [review] Part 2: Switch from CRITICALSECTION to SRWLOCK [Approval Request Comment] If this is not a sec:{high,crit} bug, please state case for ESR consideration: User impact if declined: Possible perf issues. Fix Landed on Version: 60, uplift approved for 61 and esr 60. Risk to taking this patch (and alternatives if risky): Low-med risk. Baked for 5 weeks before, 4 days on Nightly now and we didn't see any clear issues, possible SRWLOCK can expose incorrect locking assumptions. String or UUID changes made by this patch: N/A This needs to land with bug 1459085.
Flags: needinfo?(erahm)
Attachment #8973276 -
Flags: approval-mozilla-esr52?
Assignee | ||
Comment 62•6 years ago
|
||
Comment on attachment 8973341 [details] [diff] [review] Part 1: Manually manage locks in MessageChannel::Close see patch 2 request
Attachment #8973341 -
Flags: approval-mozilla-esr52?
Assignee | ||
Comment 63•6 years ago
|
||
(In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment #61) > Comment on attachment 8973276 [details] [diff] [review] > Part 2: Switch from CRITICALSECTION to SRWLOCK > > [Approval Request Comment] > If this is not a sec:{high,crit} bug, please state case for ESR > consideration: > User impact if declined: Possible perf issues. > Fix Landed on Version: 60, uplift approved for 61 and esr 60. > Risk to taking this patch (and alternatives if risky): Low-med risk. Baked > for 5 weeks before, 4 days on Nightly now and we didn't see any clear > issues, possible SRWLOCK can expose incorrect locking assumptions. > String or UUID changes made by this patch: N/A > > This needs to land with bug 1459085. Please ni aklotz again on bug 1459085 to verify that it isn't a blocker for this bug, he indicated it's not but I'm not 100% sure of that.
Comment 64•6 years ago
|
||
https://bugzilla.mozilla.org/show_bug.cgi?id=1459085#c6 sounds like we're fine for esr52.
Comment 65•6 years ago
|
||
Comment on attachment 8973276 [details] [diff] [review] Part 2: Switch from CRITICALSECTION to SRWLOCK approved for 52.9esr
Attachment #8973276 -
Flags: approval-mozilla-esr52? → approval-mozilla-esr52+
Updated•6 years ago
|
Attachment #8973341 -
Flags: approval-mozilla-esr52? → approval-mozilla-esr52+
Assignee | ||
Comment 67•6 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #66) > Part 2 needs rebasing for ESR52. Oh right, we reworked the mutex code since then. I'm on PTO for the next week, Nathan do you think you can take a look? Otherwise I'll take a look when I'm back.
Flags: needinfo?(nfroyd)
Comment 68•6 years ago
|
||
(In reply to Out until June 4 - Eric Rahm [:erahm] (please no mozreview requests) from comment #67) > (In reply to Ryan VanderMeulen [:RyanVM] from comment #66) > > Part 2 needs rebasing for ESR52. > > Oh right, we reworked the mutex code since then. I'm on PTO for the next > week, Nathan do you think you can take a look? Otherwise I'll take a look > when I'm back. Sure, np. Actually, yes problem. Because this requires taking bug 1312086 and bug 1312087 as well. I can do/have done the port already, but how comfortable are we porting all of that to ESR52 at this stage? Those changes have gone through a lot of testing at this point, but that is a lot of churn for a...ah, neglible (?) performance win.
Flags: needinfo?(nfroyd) → needinfo?(ryanvm)
Comment 69•6 years ago
|
||
No, not worth it.
Updated•6 years ago
|
Attachment #8973341 -
Flags: approval-mozilla-esr52+ → approval-mozilla-esr52-
Updated•6 years ago
|
Attachment #8973276 -
Flags: approval-mozilla-esr52+ → approval-mozilla-esr52-
You need to log in
before you can comment on or make changes to this bug.
Description
•