Fix native resource management in NativeZip

RESOLVED FIXED in Firefox 50

Status

()

Firefox for Android
General
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: esawin, Assigned: esawin)

Tracking

51 Branch
Firefox 51
All
Android
Points:
---

Firefox Tracking Flags

(firefox50 fixed, firefox51 fixed)

Details

Attachments

(6 attachments, 5 obsolete attachments)

1.27 KB, patch
kats
: review+
Details | Diff | Splinter Review
2.36 KB, patch
esawin
: review+
Details | Diff | Splinter Review
768 bytes, patch
kats
: review+
Details | Diff | Splinter Review
2.19 KB, patch
Details | Diff | Splinter Review
712 bytes, patch
Details | Diff | Splinter Review
1.27 KB, patch
Details | Diff | Splinter Review
(Assignee)

Description

2 years ago
Bug 1291424 affects the startup timings which results in intermittent crashes when trying to access unmapped memory.

I think this is due to incorrect handling of the native Zip resources in NativeZip.

1. We allow for 1-to-N relationships between a NativeZip and InputStreams via NativeZip::getInputStream, but each returned InputStream releases the parent NativeZip when closed.
2. We allow for NativeZip construction based on an a given InputStream for children jar files without keeping the original NativeZip referenced which holds the memory mapping.
3. ZipCollection is not thread-safe.
4. GeckoJarReader::getStream may return null, we don't check for it.
5. lib_mapping may be accessed before initialization, handled in bug 1301665.
(Assignee)

Comment 1

2 years ago
Created attachment 8790420 [details] [diff] [review]
0001-Bug-1302189-1.1-Release-native-zip-only-on-finalize-.patch

Don't allow manual closing of ByteBufferInputStream and NativeZip as the referenced native resources may be shared across streams.

If we want to allow for manual closing, we would need to implement custom ref count management (a la https://hg.mozilla.org/try/rev/acd6a97dab4c9325c31b9214782110aeeb9ed28f).
Attachment #8790420 - Flags: review?(mh+mozilla)
(Assignee)

Comment 2

2 years ago
Created attachment 8790423 [details] [diff] [review]
0002-Bug-1302189-2.1-Keep-named-zip-referenced-in-childre.patch

The named zips handle the memory mapping and need to be referenced by the children zips to prevent gc.
Attachment #8790423 - Flags: review?(mh+mozilla)
(Assignee)

Comment 3

2 years ago
Created attachment 8790424 [details] [diff] [review]
0003-Bug-1302189-3.1-Add-mutex-locking-to-ZipCollection-z.patch

Mutex-locking in ZipCollection.
Attachment #8790424 - Flags: review?(mh+mozilla)
(Assignee)

Comment 4

2 years ago
Created attachment 8790425 [details] [diff] [review]
0004-Bug-1302189-4.1-Check-for-null-when-retrieving-input.patch

Check for null return from GeckoJarReader::getStream.
Attachment #8790425 - Flags: review?(mh+mozilla)
(Assignee)

Comment 5

2 years ago
This is interesting, apparently the patches above fix the crash tests [1] but not the rest of the tests [2].
For comparison, the custom ref count implementation (on top of the patches) passes the tests [3].

I must have missed something.

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=852827c4af7c
[2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=c1494af2fb22
[3] https://treeherder.mozilla.org/#/jobs?repo=try&revision=33c1a5d24a0e
Comment on attachment 8790424 [details] [diff] [review]
0003-Bug-1302189-3.1-Add-mutex-locking-to-ZipCollection-z.patch

Review of attachment 8790424 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozglue/linker/Zip.cpp
@@ +188,5 @@
>  already_AddRefed<Zip>
>  ZipCollection::GetZip(const char *path)
>  {
> +  {
> +    AutoLock lock(&sZipCollectionMutex);

Hopefully, this doesn't cause deadlocks on fork...

::: mozglue/linker/Zip.h
@@ +15,4 @@
>  #include "mozilla/RefCounted.h"
>  #include "mozilla/RefPtr.h"
>  
> +class AutoLock {

Please put this in Utils.h instead.
Attachment #8790424 - Flags: review?(mh+mozilla) → review+
Please find another reviewer for the Java code.
Flags: needinfo?(esawin)
(that is, I don't know who a suitable reviewer is, I'm not)
(Assignee)

Comment 9

2 years ago
Comment on attachment 8790424 [details] [diff] [review]
0003-Bug-1302189-3.1-Add-mutex-locking-to-ZipCollection-z.patch

Moving patch 3 to bug 1302516.
Attachment #8790424 - Attachment is obsolete: true
(Assignee)

Comment 10

2 years ago
Created attachment 8790910 [details] [diff] [review]
0002-Bug-1302189-2.2-Keep-named-zip-referenced-in-its-chi.patch

To mimic the behavior of the custom ref counting [1], we need to keep the named zip reference around in ByteBufferInputStream to prevent it from getting gc'ed.

kats, looks like you've been the original reviewer, can you take the reviews of these patches?

[1] https://hg.mozilla.org/try/rev/acd6a97dab4c9325c31b9214782110aeeb9ed28f
[2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=0218fd871b0c
Attachment #8790423 - Attachment is obsolete: true
Attachment #8790423 - Flags: review?(mh+mozilla)
Flags: needinfo?(esawin)
Attachment #8790910 - Flags: review?(bugmail)
(Assignee)

Updated

2 years ago
Attachment #8790420 - Flags: review?(mh+mozilla) → review?(bugmail)
(Assignee)

Updated

2 years ago
Attachment #8790425 - Flags: review?(mh+mozilla) → review?(bugmail)
Comment on attachment 8790420 [details] [diff] [review]
0001-Bug-1302189-1.1-Release-native-zip-only-on-finalize-.patch

Review of attachment 8790420 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/geckoview/src/main/java/org/mozilla/gecko/mozglue/NativeZip.java
@@ +42,1 @@
>      }

I think it's better to just get rid of this function entirely and the two callsites of it in GeckoJarReader. We shouldn't have an API that misleads users into thinking it does something.
Attachment #8790420 - Flags: review?(bugmail) → review+
Comment on attachment 8790910 [details] [diff] [review]
0002-Bug-1302189-2.2-Keep-named-zip-referenced-in-its-chi.patch

Review of attachment 8790910 [details] [diff] [review]:
-----------------------------------------------------------------

Not sure I fully understand why this is needed. My impression is that this scenario occurs when:
- Somebody creates a NativeZip
- Then creates a ByteBufferInputStream to read a file from that zip
- Then creates another NativeZip for the file being read (which is a nested zip)
- Then creates another ByteBufferInputStream to read a file from the nested zip

And somehow the root zip gets released. However, the nested ByteBufferInputStream holds a ref to the nested zip, and the nested zip holds a ref to the root ByteBufferInputStream (in NativeZip::mInput) which also holds a ref to the root zip. So if the NativeZip is now only released in finalize(), because of the part 1 patch, this chain should remain valid and nothing should get released until the nested ByteBufferInputStream is GC'd. Am I missing something?
Attachment #8790910 - Flags: review?(bugmail)
Attachment #8790425 - Flags: review?(bugmail) → review+
(Assignee)

Comment 13

2 years ago
Created attachment 8791398 [details] [diff] [review]
0001-Bug-1302189-1.2-Release-native-zip-only-on-finalize-.patch

Addressed comments.
Attachment #8790420 - Attachment is obsolete: true
Attachment #8791398 - Flags: review+
(Assignee)

Comment 14

2 years ago
Created attachment 8791407 [details] [diff] [review]
0002-Bug-1302189-2.3-Prevent-ProGuard-from-optimizing-out.patch

Further inspection has revealed that ProGuard is optimizing out NatizeZip::mInput which breaks the reference chain to the root zip.

Adding the @JNITarget annotation will prevent this and implicitly document that we need the reference to manage the native resource lifecycle.

I think this is a clearer solution than adding this specific case to proguard.cfg, which usually holds general exclusion patterns otherwise.
Attachment #8790910 - Attachment is obsolete: true
Attachment #8791407 - Flags: review?(bugmail)
(Assignee)

Comment 15

2 years ago
Created attachment 8791409 [details] [diff] [review]
0002-Bug-1302189-2.3-Prevent-ProGuard-from-optimizing-out.patch

Removed redundant import.
Attachment #8791407 - Attachment is obsolete: true
Attachment #8791407 - Flags: review?(bugmail)
Attachment #8791409 - Flags: review?(bugmail)
Comment on attachment 8791409 [details] [diff] [review]
0002-Bug-1302189-2.3-Prevent-ProGuard-from-optimizing-out.patch

Review of attachment 8791409 [details] [diff] [review]:
-----------------------------------------------------------------

Sure
Attachment #8791409 - Flags: review?(bugmail) → review+

Comment 17

2 years ago
Pushed by esawin@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/df4352625311
[1.2] Release native zip only on finalize to prevent premature unmapping of referenced memory. r=kats
https://hg.mozilla.org/integration/mozilla-inbound/rev/0e08d54ad9f6
[2.3] Prevent ProGuard from optimizing out native resource references. r=kats
https://hg.mozilla.org/integration/mozilla-inbound/rev/48cca0d90707
[4.1] Check for null when retrieving input streams from GeckoJarReader. r=kats
I wish we would've not used @JNITarget, since it's not a JNI target. We could consider adding a new annotation like @NoStrip or something for stuff like this. As it is, if you wanted to look at JNI entry points via annotations, you'd have stuff like this showing up too.

Comment 19

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/df4352625311
https://hg.mozilla.org/mozilla-central/rev/0e08d54ad9f6
https://hg.mozilla.org/mozilla-central/rev/48cca0d90707
Status: NEW → RESOLVED
Last Resolved: 2 years ago
status-firefox51: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 51
(Assignee)

Comment 20

2 years ago
Comment on attachment 8791398 [details] [diff] [review]
0001-Bug-1302189-1.2-Release-native-zip-only-on-finalize-.patch

Approval Request Comment
[Feature/regressing bug #]: None
[User impact if declined]: Intermittent crashes on startup
[Describe test coverage new/current, TreeHerder]: Nightly, crashtests
[Risks and why]: Low, it just prevents objects from being released too early
[String/UUID change made/needed]: None
Attachment #8791398 - Flags: approval-mozilla-beta?
Attachment #8791398 - Flags: approval-mozilla-aurora?

Updated

2 years ago
status-firefox50: --- → affected
Comment on attachment 8791398 [details] [diff] [review]
0001-Bug-1302189-1.2-Release-native-zip-only-on-finalize-.patch

Crash fix, Beta50+ 

(already on Aurora51)
Attachment #8791398 - Flags: approval-mozilla-beta?
Attachment #8791398 - Flags: approval-mozilla-beta+
Attachment #8791398 - Flags: approval-mozilla-aurora?
Needs rebasing for Beta.
Flags: needinfo?(esawin)
(Assignee)

Comment 23

2 years ago
Created attachment 8798087 [details] [diff] [review]
[beta] 0001-Bug-1302189-1.2-Release-native-zip-only-on-finalize-.patch

Beta rebase.
Flags: needinfo?(esawin)
(Assignee)

Comment 24

2 years ago
Created attachment 8798088 [details] [diff] [review]
[beta] 0002-Bug-1302189-2.3-Prevent-ProGuard-from-optimizing-out.patch

Beta rebase.
(Assignee)

Comment 25

2 years ago
Created attachment 8798089 [details] [diff] [review]
[beta] 0003-Bug-1302189-4.1-Check-for-null-when-retrieving-input.patch

Beta rebase.
You need to log in before you can comment on or make changes to this bug.