Closed Bug 1323150 Opened 8 years ago Closed 8 years ago

Crash [@ ReadDBEntry ]

Tracking

(firefox51+ wontfix, firefox52- fixed, firefox53+ fixed)

Status:

RESOLVED FIXED

Milestone:

3.29

Tracking Flags:

Tracking

Status

firefox51

wontfix

firefox52

fixed

firefox53

fixed

People

(Reporter: cbook, Assigned: franziskus)

References

(
URL
)

Details

(Keywords: crash, Whiteboard: [adv-main52+])

Attachments

(2 files)

bughunter stack 8 years ago Carsten Book [:Tomcat] 141.12 KB, text/plain		Details
bug1323150.patch 8 years ago Franziskus Kiefer [:franziskus] 1.40 KB, patch	mt : review+ rrelyea : feedback+ franziskus : feedback? wtc	Details \| Diff \| Splinter Review

Carsten Book [:Tomcat]

Reporter

Description

•

8 years ago

Attached file bughunter stack — Details

found via bughunter and reproduced on latest windows aurora opt build (also bughunter reports this for trunk -> beta on opt builds for windows). Seems a windows only crash Steps to reproduce -> Load http://jobs.work-in-bavaria.de/ --> Crash Bughunter flagged this exploitable high. Opt crash id : https://crash-stats.mozilla.com/report/index/4802573c-18ef-452d-adac-e00b22161213

Carsten Book [:Tomcat]

Reporter

Comment 1

•

8 years ago

[Tracking Requested - why for this release]: Bughunter opt crash on real world website Franziskus, Kai, i guess this something for you guys. Could you take a look ?

status-firefox51: --- → affected

status-firefox52: --- → affected

status-firefox53: --- → affected

tracking-firefox51: --- → ?

tracking-firefox52: --- → ?

tracking-firefox53: --- → ?

Flags: needinfo?(kaie)

Flags: needinfo?(franziskuskiefer)

Gerry Chang [:gchang]

Comment 2

•

8 years ago

Track 51+ as this affected opt build.

tracking-firefox51: ? → +

Marcia Knous [:marcia]

Comment 3

•

8 years ago

Makes sense to track this reproducible crash for 52/53.

tracking-firefox52: ? → +

tracking-firefox53: ? → +

Tyson Smith [:tsmith]

Comment 4

•

8 years ago

I could not repro with a x86_64 build on Windows or Linux, x86 only maybe?

Hardware: Unspecified → x86

Daniel Veditz [:dveditz]

Updated

•

8 years ago

Group: core-security → crypto-core-security

Kai Engert [:KaiE:]

Comment 5

•

8 years ago

This bug was reported on Dec 13. The most recent change of NSS on aurora was an update to this NSS revision: https://hg.mozilla.org/projects/nss/rev/6c26f0cd19ba The uplift was made on Nov 28: https://hg.mozilla.org/releases/mozilla-aurora/file/071d92d89379/security/nss/TAG-INFO https://bugzilla.mozilla.org/show_bug.cgi?id=1305970#c30 If the NSS change is responsible for this crash, then all builds between Nov 29 and Dec 13 should crash, can this be confirmed? Can it be confirmed that Aurora builds before Nov 28 didn't crash? Before Nov 28, aurora was using NSS revision 5d2424d699a0 from Nov 11: https://hg.mozilla.org/projects/nss/rev/5d2424d699a0 https://hg.mozilla.org/releases/mozilla-aurora/file/3f1dcd04bc47/security/nss/TAG-INFO If the above assumptions are correct, the code causing the crash must have landed between the above two NSS revisions. That was a rather big diff, 14000 lines, to view it run (inside an NSS tree): hg diff -r 5d2424d699a0 -r 6c26f0cd19ba However, at first glance, I cannot see any changes to the code that's at the top in the call stack (dbm/softokn) If the above NSS revision really introduced the crash, it must have been caused by the changes to lib/ssl

Flags: needinfo?(kaie)

Kai Engert [:KaiE:]

Comment 6

•

8 years ago

Carsten, would be able to help confirming which builds crash, and which don't, see my previous comment? Thank you.

Flags: needinfo?(cbook)

Franziskus Kiefer [:franziskus]

Assignee

Comment 7

•

8 years ago

I'm looking into this. I can produce a crash with that site. A different one though. There's definitely something going on here that we have to fix.

Carsten Book [:Tomcat]

Reporter

Comment 8

•

8 years ago

seeem franziskus got a reproducible crash now , so letting him comment

Flags: needinfo?(cbook)

Martin Thomson [:mt:]

Comment 9

•

8 years ago

I'm not getting a crash on trunk. The cert has lots of subjectAltNames. Those that I tested don't show the same issue for me either. There are changes to certificate handling in the range that Kai provided, but this is a TLS 1.2 server and those changes are TLS 1.3 only. Franziskus, I would look at the changes to ssl3_CompleteHandleCertificate() to start with.

Franziskus Kiefer [:franziskus]

Assignee

Comment 10

•

8 years ago

This crashes reproducibly on Win10 32-bit. I don't think this is caused by any changes we did. The intermediate certificate on the page is new (Dec 2) and as mt points out pretty large. This seems to be a bug in the legacydb. Trying to map the file to memory at [1] fails. One solution would be to always emulate the mapping here, which seems to solve the issue for me. I'm not entirely sure yet what exactly is going wrong here and there might be a better fix for this but I don't think it's worth spending too much time on this considering that it's the legacydb (Firefox should finally do the switch). [1] https://searchfox.org/nss/rev/8f7b205fa456a79d9b387ccf1717c1a8c64c36b1/lib/softoken/legacydb/dbmshim.c#347

Assignee: nobody → nobody

Component: Security: PSM → Libraries

Flags: needinfo?(franziskuskiefer)

Product: Core → NSS

Version: unspecified → trunk

Franziskus Kiefer [:franziskus]

Assignee

Comment 11

•

8 years ago

Attached patch bug1323150.patch — Details — Splinter Review

This is the fix I'd suggest. This might influence performance though as we always read the file instead of mapping it to memory. Unfortunately I can't find the reason this is failing and only handle that specific case. I think input from Bob or Wan-Teh could help here.

Assignee: nobody → franziskuskiefer

Attachment #8822144 - Flags: feedback?(wtc)

Attachment #8822144 - Flags: feedback?(rrelyea)

Attachment #8822144 - Flags: feedback?(martin.thomson)

Martin Thomson [:mt:]

Comment 12

•

8 years ago

Comment on attachment 8822144 [details] [diff] [review] bug1323150.patch Review of attachment 8822144 [details] [diff] [review]: ----------------------------------------------------------------- It's possible that we're running into memory limits. win32 in particular is hard on RAM and adding a large memory mapped file to that probably chews up a lot. Let's see what Bob has to add.

Wan-Teh Chang

Comment 13

•

8 years ago

Comment on attachment 8822144 [details] [diff] [review] bug1323150.patch Review of attachment 8822144 [details] [diff] [review]: ----------------------------------------------------------------- ::: lib/softoken/legacydb/dbmshim.c @@ +333,5 @@ > goto loser; > } > > len = dbs_getBlobSize(data); > + /* PR_MemMap fails on Windows for larger certificates. Let's always use We should at least add a link to this bug report in the comment "PR_MemMap fails on Windows for larger certificates." What is the return value of PR_GetError() after PR_MemMap() returns NULL? This info should be in the bug report. It seems that this is the paging file size limitation mentioned in MSDN: https://msdn.microsoft.com/en-us/library/windows/desktop/aa366761(v=vs.85).aspx If a file mapping object is backed by the paging file (CreateFileMapping is called with the hFile parameter set to INVALID_HANDLE_VALUE), the paging file must be large enough to hold the entire mapping. If it is not, MapViewOfFile fails. The initial contents of the pages in a file mapping object backed by the paging file are 0 (zero). It would be good to confirm that.

Robert Relyea

Comment 14

•

8 years ago

Comment on attachment 8822144 [details] [diff] [review] bug1323150.patch Review of attachment 8822144 [details] [diff] [review]: ----------------------------------------------------------------- ::: lib/softoken/legacydb/dbmshim.c @@ +335,5 @@ > > len = dbs_getBlobSize(data); > + /* PR_MemMap fails on Windows for larger certificates. Let's always use > + * the emulated map, i.e. read the file. */ > + addr = dbs_EmulateMap(filed, len); yes, I wondered what WTC does as well. My first thought is to simply always fall back to dbs_EmulateMap() if the map failed (so the majority of the cases we still get the "benefit" of file mapping). OTOH, this is yet another reason to avoid legacydb. It's a time bomb ready to go off (the code is question predates even me, so I have no idea why we are using memory mapping rather than just reading the file). bob

Franziskus Kiefer [:franziskus]

Assignee

Comment 15

•

8 years ago

> We should at least add a link to this bug report in the comment "PR_MemMap fails on Windows for larger certificates." yep! > What is the return value of PR_GetError() after PR_MemMap()returns NULL? This info should be in the bug report. > simply always fall back to dbs_EmulateMap() if the map failed The problem is that PR_MemMap does NOT return NULL and doesn't set an error code (not entirely sure about the error code but I don't have the right VM with me to check). That's why I have trouble catching an error and only emulate the map in the error case. > OTOH, this is yet another reason to avoid legacydb. It's a time bomb ready to go off (the code is question predates even me, so I have no idea why we are using memory mapping rather than just reading the file). I totally agree. We should press Firefox harder to get off the legacydb.

Martin Thomson [:mt:]

Comment 16

•

8 years ago

Comment on attachment 8822144 [details] [diff] [review] bug1323150.patch Review of attachment 8822144 [details] [diff] [review]: ----------------------------------------------------------------- OK, it looks like you have the feedback you need. When this change lands in m-c, it might pay to watch for Talos regressions though. Please include the requested comments, and maybe a link to the MSDN article Wan-Teh found.

Attachment #8822144 - Flags: feedback?(martin.thomson) → review+

Robert Relyea

Comment 17

•

8 years ago

Comment on attachment 8822144 [details] [diff] [review] bug1323150.patch feedback given, patch looks fine based on Franziscus's comments.

Attachment #8822144 - Flags: feedback?(rrelyea) → feedback+

Franziskus Kiefer [:franziskus]

Assignee

Comment 18

•

8 years ago

https://hg.mozilla.org/projects/nss/rev/d402449b07fdbef2f9e66854dc8424f065ef53ee

Status: NEW → RESOLVED

Closed: 8 years ago

Resolution: --- → FIXED

Target Milestone: --- → 3.29

Ryan VanderMeulen [:RyanVM]

Comment 19

•

8 years ago

Too late for 51 at this point. Are we going to want to backport an updated NSS release to 52 for this? Also, can we assign this bug a security rating?

status-firefox51: affected → wontfix

Flags: needinfo?(franziskuskiefer)

Daniel Veditz [:dveditz]

Updated

•

8 years ago

Group: crypto-core-security → core-security-release

Franziskus Kiefer [:franziskus]

Assignee

Comment 20

•

8 years ago

I don't see a big security risk here. The bug crashes firefox because of Win32 memory limits (which might be a reason to uplift). This landed in 3.29, so will be in FF53. If we want this in 52 and 51, we could make another point release of 3.28 (which is in both versions).

Flags: needinfo?(jcristau)

Flags: needinfo?(gchang)

Flags: needinfo?(franziskuskiefer)

Ryan VanderMeulen [:RyanVM]

Updated

•

8 years ago

status-firefox53: affected → fixed

Julien Cristau [:jcristau]

Comment 21

•

8 years ago

How about we keep this one in mind if we need to bump nss in 52 for another reason, but don't do a point release just for this?

status-firefox52: affected → fix-optional

Flags: needinfo?(jcristau)

Flags: needinfo?(gchang)

Ritu Kothari (:ritu) (Inactive, please n-i to RyanVM, jcristau, or pascal)

Comment 22

•

8 years ago

Tracking "-" this one since it's now tagged as fix-optional. Please let me know if you have any concerns.

tracking-firefox52: + → -

Martin Thomson [:mt:]

Updated

•

8 years ago

Blocks: 1337580

Ryan VanderMeulen [:RyanVM]

Comment 23

•

8 years ago

Fixed for Fx52 via bug 1337580.

status-firefox52: fix-optional → fixed

Al Billings [:abillings - ex-MoCo]

Updated

•

8 years ago

Whiteboard: [adv-main52+]

Daniel Veditz [:dveditz]

Updated

•

7 years ago

Group: core-security-release

Paul Donohue

Comment 24

•

3 years ago

dbs_EmulateMap() causes a memory leak. Before this bug fix, dbs_EmulateMap() was only used on platforms that don't support memory mapping files. However, after this bug fix, dbs_EmulateMap() is used unconditionally, which leads to this memory leak applying to all systems that use legacydb.

More specifically:
Memory for the "data" DBT object that is populated by db->get() is normally managed by the DB library, and callers of db->get() do not need to free this memory. However, dbs_EmulateMap() in dbmshim.c uses PORT_Alloc() to allocate memory for the blob data, then returns this data from a wrapper that emulates db->get(). After that, nothing ever frees this memory; The caller assumes that this memory is managed by the library and doesn't need to be freed, the underlying database library has no knowledge of this memory allocated to hold the blob, and nothing in dbmshim.c ever subsequently frees it.

Given that the legacydb format is effectively dead, I'm not advocating that this memory leak be fixed ... But since I just wasted an afternoon hunting it down, I wanted to throw this out there somewhere that others might be able to find it.

You need to log in before you can comment on or make changes to this bug.