Closed Bug 1314496 Opened 3 years ago Closed 3 years ago

Uplift libcubeb allocator mismatch fixes

Categories

(Core :: Audio/Video: cubeb, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla52
Tracking Status
firefox50 --- wontfix
firefox51 --- fixed
firefox52 --- fixed

People

(Reporter: kinetik, Assigned: kinetik)

References

Details

Attachments

(3 files)

We need these in m-c as soon as possible, and they should probably be uplifted because they may cause leaks and/or memory corruption.

audiounit: https://github.com/kinetiknz/cubeb/commit/090483ea6a627213ce9285a5ad961b305a6b3efa

wasapi (not yet reviewed/merged): https://github.com/kinetiknz/cubeb/pull/181/commits/6523bd76cbee7e5703ee853485c8914e5600d6c1

Regressed in bug 1251502 (WASAPI) and bug 1285541 (AudioUnit).
Rank: 15
Priority: -- → P1
Summary: Uplift libcubeb allocator mismatch fixes → Uplift libcubeb allocator mismatch and leak fixes
Attachment #8807402 - Flags: review?(padenot)
Note: doesn't include the leak fix in PR 183 because it's still waiting on review.
Approval Request Comment
[Feature/regressing bug #]: bug 1251502, bug 1285541
[User impact if declined]: potential memory leak and/or memory corruption
[Describe test coverage new/current, TreeHerder]: n/a
[Risks and why]: low risk, changes are very simple
[String/UUID change made/needed]: none

Note that the patches have already been reviewed in the upstream Git repo, so the r? on the update patch is simply to rubber stamp the library update.  The versions for aurora and beta are restricted uplifts of only the allocator fixes.
Attachment #8807403 - Flags: approval-mozilla-aurora?
Approval Request Comment
[Feature/regressing bug #]: bug 1251502, bug 1285541
[User impact if declined]: potential memory leak and/or memory corruption
[Describe test coverage new/current, TreeHerder]: n/a
[Risks and why]: low risk, changes are very simple
[String/UUID change made/needed]: none

Note that the patches have already been reviewed in the upstream Git repo, so the r? on the update patch is simply to rubber stamp the library update.  The versions for aurora and beta are restricted uplifts of only the allocator fixes.
Attachment #8807404 - Flags: approval-mozilla-beta?
Attachment #8807402 - Flags: review?(padenot) → review+
Hi Kinetik, Jean-Yves, I am extremely hesitant to take this fix. I know that Beta50 crash rates were really high until we fixed a bug in a similar code path in bug 1308418. I would prefer not to take this in 50 at all, but rather let it ride the trains.

Can you please quantify the extent of mem leaks and also the likelihood of the meam leak happening?
Flags: needinfo?(kinetik)
Flags: needinfo?(jyavenard)
Comment on attachment 8807403 [details] [diff] [review]
aurora uplift patch

Fixes mem leaks, let's uplift to Aurora51.
Attachment #8807403 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
It's a malloc vs new operator with non matching delete vs free. 
The behaviour is undefined and I can understand why kinetic would want those uplifted asap...
Flags: needinfo?(jyavenard)
Of the two regressing bugs mentioned in comment 0, only bug 1285541 is new in 50, the other bug was fixed in 48. Also, dcamp mention some memory bloating issues on OS X. Perhaps this might be a possible reason.
The exact result of this is officially undefined; however in VS and in GCC you can predict the results (and I presume similarly in clang). This is a good article on the implications of the error:
http://web.archive.org/web/20080703153358/http://taossa.com/index.php/2007/01/03/attacking-delete-and-delete-in-c
and also https://blogs.msdn.microsoft.com/oldnewthing/20040203-00/?p=40763/#66782 and the continuation https://blogs.msdn.microsoft.com/oldnewthing/20040204-00/?p=40743

Note: in this case, I think all (or almost all) the instances are "delete/free of ptr allocated with new []" (the last case in the article -- "Attacking delete").

I'll note that both articles are quite old, and the allocators undoubtedly have more safety checks now (and even then the MS allocator was largely safe due to ptr-alignment checks (if it wasn't linked with MSVCRT.DLL)and would just leak the memory). If they have enough checks, this is merely a leak (and not a huge one unless this is done often, which I doubt).  It might still be worse on GCC/Clang, however (closer to the old analysis in the page I linked), which would lead to possible memory list corruptions.

Kinetik/jya - do you agree that in most cases these would occur infrequently (and thus be ok to not fix), if it merely leaks the memory? 

NI-ing glandium and dveditz and jseward who may have more experience with the real-world impacts in other bugs of delete vs delete[].

At this point, I would assume unless proven otherwise that this can cause memory list corruptions, and would take the patches.
Flags: needinfo?(mh+mozilla)
Flags: needinfo?(jseward)
Flags: needinfo?(dveditz)
I wanted to check if these allocation issues were solely an issue with full_duplex=true for mac and windows (since it's false in 50 for them).  However, it appears that on mac at least one of the mismatches can be hit due to GraphDriver registering for collection changes.  I haven't verified in a debugger however.
This is just the allocator mismatches.  The leak fix mentioned in comment 3 can go in separately once it's reviewed; it's less serious being a simple leak with no potential heap corruption and only triggered on only input and capture streams, which is hidden behind the full_duplex pref that's only enabled on aurora and central.
Flags: needinfo?(kinetik)
Summary: Uplift libcubeb allocator mismatch and leak fixes → Uplift libcubeb allocator mismatch fixes
See Also: → 1315495
(In reply to Randell Jesup [:jesup] from comment #12)
> I wanted to check if these allocation issues were solely an issue with
> full_duplex=true for mac and windows (since it's false in 50 for them). 
> However, it appears that on mac at least one of the mismatches can be hit
> due to GraphDriver registering for collection changes.  I haven't verified
> in a debugger however.

Assuming:
- full-duplex is prefed off on beta
  OK: https://hg.mozilla.org/releases/mozilla-beta/file/tip/modules/libpref/init/all.js#l490
- cubeb_stream_init is never passed a non-NULL cubeb_devid on non-full-duplex paths
  OK: AudioStream always passes nullptr
      GraphDriver won't pass a non-NULL input_id with full-duplex disabled, and mOutputDeviceID is always -1 so will never pass a non-NULL output_id
- cubeb_enumerate_devices/cubeb_device_collection_destroy/cubeb_device_info_destroy is never called on non-full-duplex paths
  OK: only paths to these are through AudioInputCubeb which is only instantiated for full-duplex
- cubeb_register_device_collection_changed is never called on non-full-duplex paths
  OK: GraphDriver is calling cubeb_stream_register_device_changed_callback, which is safe
      and cubeb_register_device_collection_changed is unused in beta

...then we may be clear to avoid uplifting this to beta if the cost of uplifting (bearing in mind how low risk the fix is) is higher than the risk of our analysis being incorrect or incomplete.
full_duplex is off for Mac and windows (and Android) in 50.  It's enabled for Linux, but none of these paths in question are for Linux.
Thanks Kinetik and Jesup for the deep-dive analysis on this one. It is really appreciated! Glad to hear that this may be something we can avoid uplifting in RC week, at this point we are only taking fixes for release blocking issues. Let's stabilize this fix on Beta 51 (post merge) for a few weeks and plan to uplift in 50.1.0.
https://hg.mozilla.org/mozilla-central/rev/e80ad115dbb8
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
In practice, in Firefox, operator new is a wrapper around moz_xmalloc, which is a wrapper around malloc that crashes when malloc returns NULL. operator delete is a wrapper for free, which means, in practice, operator new/free, malloc/operator delete combinations are not a problem.

They will, however, usually make ASan, valgrind, etc. unhappy.
Flags: needinfo?(mh+mozilla)
Indeed, for valgrind testing of Gecko I disable mismatch-checking
(--show-mismatched-frees=no) since it generates a lot of noise otherwise.

I am unclear how serious the real-world impact of such mismatches is.
I had the vague impression that they haven't been a source of crashes
for C++ on Unixes since the early 2000s, but that could be wrong, and
from now reading the links in comment #11, I'm even more unclear.
Flags: needinfo?(jseward)
In many flavors of Unixes, operator new uses the same underlying allocator as malloc (often by just calling malloc).

On top of that, Firefox actually doesn't uses the system operator new, but has its own goop to make it redirected to moz_xmalloc.
Comment on attachment 8807404 [details] [diff] [review]
beta uplift patch

Since the merge day is passed and this patch has been in 51, there is no need to keep approval‑mozilla‑beta? flag. I change the approval‑mozilla‑beta? to approval‑mozilla‑release?.
Attachment #8807404 - Flags: approval-mozilla-beta? → approval-mozilla-release?
Comment on attachment 8807404 [details] [diff] [review]
beta uplift patch

I don't think we need to uplift it.  Clearing req.
Attachment #8807404 - Flags: approval-mozilla-release?
Flags: needinfo?(dveditz)
You need to log in before you can comment on or make changes to this bug.