Crash in [@ PR_MD_PR_POLL | mozilla::net::nsSocketTransportService::Run]
Categories
(Core :: Networking, defect, P2)
Tracking
()
People
(Reporter: gsvelto, Assigned: kershaw)
References
(Blocks 1 open bug)
Details
(Keywords: crash, sec-high, Whiteboard: [necko-triaged][necko-priority-review][post-critsmash-triage][adv-main107+r][adv-esr102.5+r])
Crash Data
Attachments
(2 files, 1 obsolete file)
48 bytes,
text/x-phabricator-request
|
dmeehan
:
approval-mozilla-beta+
tjr
:
sec-approval+
|
Details | Review |
48 bytes,
text/x-phabricator-request
|
dmeehan
:
approval-mozilla-esr102+
|
Details | Review |
Crash report: https://crash-stats.mozilla.org/report/index/1eef1d6e-094f-4bea-8e8c-0c4440220422
Reason: EXCEPTION_ACCESS_VIOLATION_READ
Top 10 frames of crashing thread:
0 nss3.dll PR_MD_PR_POLL nsprpub/pr/src/md/windows/w32poll.c:273
1 xul.dll mozilla::net::nsSocketTransportService::Run netwerk/base/nsSocketTransportService2.cpp:1150
2 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:467
3 xul.dll mozilla::ipc::MessagePumpForNonMainThreads::Run ipc/glue/MessagePump.cpp:300
4 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:324
5 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:306
6 xul.dll static nsThread::ThreadFunc xpcom/threads/nsThread.cpp:389
7 nss3.dll _PR_NativeRunThread nsprpub/pr/src/threads/combined/pruthr.c:399
8 nss3.dll pr_root nsprpub/pr/src/md/windows/w95thred.c:139
9 ucrtbase.dll thread_start<unsigned int , 1>
This appears to be some kind of wild-pointer access. The crashing address is either in rcx
or rdx
. When we're hitting rdx
it's usually a near-NULL access, but when we're hitting rcx
it's almost always a non-canonical address and it most likely looks like some kind of integer data. In at least one case - the crash above - rcx
also contains the poison pattern. So it doesn't look like a straight UAF and there are crashes which are stack overflows so we may be stomping over memory that's on the stack.
Assignee | ||
Comment 1•3 years ago
|
||
This looks like bug 1760611, which already has a patch ready to land.
Let's see if bug 1760611 can fix this.
Updated•3 years ago
|
Waiting for release of this patch to assess if has resolved in production release 104 2022-08-23.
Comment 3•3 years ago
|
||
Fix landed with bug 1782740 in 104.0b8 - no crashes since then.
Assignee | ||
Comment 4•3 years ago
|
||
(In reply to Valentin Gosu [:valentin] (he/him) from comment #3)
Fix landed with bug 1782740 in 104.0b8 - no crashes since then.
Let's close this one.
Comment 5•3 years ago
|
||
Uh, there seems to be a crash in 104 🤔
https://crash-stats.mozilla.org/report/index/188a5c0f-7aed-41c2-8335-2eef10220825
Updated•3 years ago
|
Assignee | ||
Updated•3 years ago
|
Assignee | ||
Comment 6•3 years ago
|
||
Hi Jed,
Since you fixed a similar issue in bug 1760611, do you probably have an idea about this one?
The crashes in this bug seem to be all on windows platform, but I can't figure out how to fix this.
It'd be really appreciated if you can provide some insights.
Thanks.
Comment 7•3 years ago
|
||
Windows select
seems to work differently: fd_set
is an array of fds, according to Microsoft's documentation, so it's not a bitset like typical Unix implementations and it won't overflow from being given a numerically large fd. (And, according to comments in NSPR, FD_SET
will silently fail if there are more fds than can fit in the array.)
The crash in comment #5 has the crash reason EXCEPTION_GUARD_PAGE
(and crashed in _chkstk
at the top of PR_MD_PR_POLL
), so that might be a stack overflow; are we setting the SocketTransportService thread's stack size to something that's too small?
And the crash linked in the initial bug report blames the call to PR_GetIdentitiesLayer
, so that's from dealing with NSPR's data structures, not fd_set
s.
So, I don't think this is related to bug 1760611 in a useful way.
Assignee | ||
Comment 8•3 years ago
|
||
(In reply to Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧ from comment #7)
Windows
select
seems to work differently:fd_set
is an array of fds, [according to Microsoft's documentation][ms-doc], so it's not a bitset like typical Unix implementations and it won't overflow from being given a numerically large fd. (And, according to [comments in NSPR][nspr-comment],FD_SET
will silently fail if there are more fds than can fit in the array.)The crash in comment #5 has the crash reason
EXCEPTION_GUARD_PAGE
(and crashed in_chkstk
at the top ofPR_MD_PR_POLL
), so that might be a stack overflow; are we setting the SocketTransportService thread's stack size to something that's too small?
Thanks for the feedback.
I think trying to increase the stack size might be a good way to go.
Assignee | ||
Comment 9•3 years ago
|
||
Assignee | ||
Comment 10•3 years ago
|
||
Comment on attachment 9297245 [details]
Bug 1767920 - Increase thread stack size on windows, r=#necko
Security Approval Request
- How easily could an exploit be constructed based on the patch?: Unknown. This patch only increases socket thread's stack size.
- Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: No
- Which older supported branches are affected by this flaw?: all
- If not all supported branches, which bug introduced the flaw?: None
- Do you have backports for the affected branches?: No
- If not, how different, hard to create, and risky will they be?: I think we don't have to uplift this.
- How likely is this patch to cause regressions; how much testing does it need?: Low. I think increasing the stack size should be fine.
- Is Android affected?: No
Comment 11•3 years ago
|
||
Comment on attachment 9297245 [details]
Bug 1767920 - Increase thread stack size on windows, r=#necko
Approved to land and uplift after beta branches
Assignee | ||
Updated•3 years ago
|
![]() |
||
Comment 12•3 years ago
|
||
Increase thread stack size on windows, r=necko-reviewers,dragana
https://hg.mozilla.org/integration/autoland/rev/99eab50330ff5667772aece28f8d78f1c4fc0ed9
https://hg.mozilla.org/mozilla-central/rev/99eab50330ff
Comment 13•3 years ago
|
||
This is going to need a rebased patch for ESR102 uplift. It grafts cleanly to Beta as-landed.
Updated•3 years ago
|
Assignee | ||
Comment 14•3 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #13)
This is going to need a rebased patch for ESR102 uplift. It grafts cleanly to Beta as-landed.
I think we don't need to uplift this to ESR, since the crash rate is low. Also, this patch does not really fix the crash. It only increases the stack size, since we still don't understand the root cause.
Comment 15•3 years ago
|
||
The patch landed in nightly and beta is affected.
:kershaw, is this bug important enough to require an uplift?
- If yes, please nominate the patch for beta approval.
- If no, please set
status-firefox107
towontfix
.
For more information, please visit auto_nag documentation.
Comment 16•3 years ago
|
||
(In reply to Kershaw Chang [:kershaw] from comment #14)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #13)
This is going to need a rebased patch for ESR102 uplift. It grafts cleanly to Beta as-landed.
I think we don't need to uplift this to ESR, since the crash rate is low. Also, this patch does not really fix the crash. It only increases the stack size, since we still don't understand the root cause.
In general, sec-high bugs are always uplifted (and comment 11 seems to imply that as well). Please confirm with Tom and the security team that they're OK with not taking this fix on branches.
Assignee | ||
Comment 17•3 years ago
|
||
In general, sec-high bugs are always uplifted (and comment 11 seems to imply that as well). Please confirm with Tom and the security team that they're OK with not taking this fix on branches.
Hi Tom,
What do you think? Should we uplift this patch to beta and ESR?(In reply to Ryan VanderMeulen [:RyanVM] from comment #16)
Updated•2 years ago
|
Comment 18•2 years ago
|
||
Sorry, I remember typing a reply but I guess I never submitted it. I don't see why we shouldn't uplift it. Although it may not fix the issue, it makes the behavior universal at least, and if it can be exploited at a lower size but is more difficult at the higher size we don't look bad in not doing so.
Assignee | ||
Comment 19•2 years ago
|
||
Comment on attachment 9297245 [details]
Bug 1767920 - Increase thread stack size on windows, r=#necko
Beta/Release Uplift Approval Request
- User impact if declined: Possible crash.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: N/A
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): This patch only increases the stack size of socket thread.
- String changes made/needed: N/A
- Is Android affected?: Yes
Comment 20•2 years ago
|
||
Comment on attachment 9297245 [details]
Bug 1767920 - Increase thread stack size on windows, r=#necko
Approved for 107.0b9
Comment 21•2 years ago
|
||
uplift |
Assignee | ||
Comment 22•2 years ago
|
||
Assignee | ||
Comment 23•2 years ago
|
||
Assignee | ||
Comment 24•2 years ago
|
||
Comment on attachment 9301594 [details]
Bug 1767920 - Increase thread stack size on windows (for ESR 102), r=#necko
ESR Uplift Approval Request
- If this is not a sec:{high,crit} bug, please state case for ESR consideration:
- User impact if declined: Possible crash.
- Fix Landed on Version: 108
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): This patch only incrrease the stack size of socket thread.
Updated•2 years ago
|
Comment 25•2 years ago
|
||
Comment on attachment 9301594 [details]
Bug 1767920 - Increase thread stack size on windows (for ESR 102), r=#necko
Approved for 102.5esr.
Comment 26•2 years ago
|
||
uplift |
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Description
•