Closed
Bug 1239789
Opened 9 years ago
Closed 9 years ago
crash in zxx_stream::Alloc
Categories
(Core :: mozglue, defect)
Tracking
()
RESOLVED
FIXED
mozilla48
People
(Reporter: snorp, Assigned: droeh)
Details
(Keywords: crash)
Crash Data
Attachments
(3 files, 1 obsolete file)
760 bytes,
patch
|
glandium
:
review-
|
Details | Diff | Splinter Review |
2.85 KB,
patch
|
snorp
:
review+
glandium
:
review-
|
Details | Diff | Splinter Review |
3.10 KB,
patch
|
glandium
:
review+
lizzard
:
approval-mozilla-aurora+
lizzard
:
approval-mozilla-beta-
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-80a6baf9-8677-4797-b5df-7836f2160110.
=============================================================
Looks like perhaps a null pointer for the zxx_stream in Alloc()?
Comment 2•9 years ago
|
||
The crash is on
http://hg.mozilla.org/releases/mozilla-beta/annotate/d4f5cfb02dda/mozglue/linker/Zip.h#l114
which means zlib is requesting an unexpected buffer size.
Flags: needinfo?(mh+mozilla)
Reporter | ||
Comment 3•9 years ago
|
||
Ah, forgot you have to dig deeper for line numbers sometimes. Alright, let's add some logging or something in that case so we can add the new buffer. I guess someone is shipping a weird gzip.
Assignee: nobody → snorp
Reporter | ||
Comment 4•9 years ago
|
||
This probably fixes it
Attachment #8708449 -
Flags: review?(mh+mozilla)
Comment 5•9 years ago
|
||
Comment on attachment 8708449 [details] [diff] [review]
Allow for gzip window allocations that are smaller than the maximum
Review of attachment 8708449 [details] [diff] [review]:
-----------------------------------------------------------------
Considering the zlib caller is the allocation of the state buffer, that would suggest both buffers are already allocated. The question becomes why are both buffers allocated and yet a new zlib call wants to allocate them? Especially when the inflateInit2 and inflateEnd calls are in mutex-protected section.
Attachment #8708449 -
Flags: review?(mh+mozilla) → review-
Reporter | ||
Comment 6•9 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #5)
> Comment on attachment 8708449 [details] [diff] [review]
> Allow for gzip window allocations that are smaller than the maximum
>
> Review of attachment 8708449 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Considering the zlib caller is the allocation of the state buffer, that
> would suggest both buffers are already allocated. The question becomes why
> are both buffers allocated and yet a new zlib call wants to allocate them?
> Especially when the inflateInit2 and inflateEnd calls are in mutex-protected
> section.
We don't know if that's happening at all, do we? Couldn't it just be failing the 'else if' for the window allocation because we are using a direct equality comparison? The comment we have states that the windowBuf is the maximum allowable size, so surely a smaller buffer is possible and should be allowed.
Flags: needinfo?(mh+mozilla)
Comment 7•9 years ago
|
||
That answer is easily answered by looking at the zlib source code, and the zlib source code says the only two allocations are:
state = (struct inflate_state FAR *)
ZALLOC(strm, 1, sizeof(struct inflate_state));
(where struct inflate_state is what is < 10k)
state->window = (unsigned char FAR *)
ZALLOC(strm, 1U << state->wbits,
sizeof(unsigned char));
Where state->wbits is essentially the absolute value of windowBits in the szip headser, and that's the 16th byte in the szipped file. And that is fixed to -SzipCompress::winSizeLog in szip.cpp, which value is 15.
So yes, in the theoretical case a szip header would advertize a different window size, the test wouldn't work, so in that sense, it's possibly worth changing, but it's not going to do any good for this crash.
Flags: needinfo?(mh+mozilla)
Reporter | ||
Comment 8•9 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #7)
Oops, you're right. Interesting. Not sure what's going on here then.
Reporter | ||
Updated•9 years ago
|
Assignee: snorp → droeh
tracking-fennec: --- → ?
Reporter | ||
Updated•9 years ago
|
tracking-fennec: ? → 45+
Assignee | ||
Comment 9•9 years ago
|
||
A patch to add logcat output when StaticAllocator::Alloc fails.
Attachment #8726300 -
Flags: review?(snorp)
Reporter | ||
Comment 10•9 years ago
|
||
Comment on attachment 8726300 [details] [diff] [review]
Logging patch
Review of attachment 8726300 [details] [diff] [review]:
-----------------------------------------------------------------
Seems ok to me with nits, but let's make sure glandium agrees
::: mozglue/linker/Zip.cpp
@@ +18,5 @@
> + return stateBuf.get();
> + } else if (items * size == windowBuf.size) {
> + return windowBuf.get();
> + } else {
> + ERROR("StaticAllocator failed with items: %i, size: %i", items, size);
This should be %u for unsigned values
Attachment #8726300 -
Flags: review?(snorp)
Attachment #8726300 -
Flags: review?(mh+mozilla)
Attachment #8726300 -
Flags: review+
Comment 11•9 years ago
|
||
Comment on attachment 8726300 [details] [diff] [review]
Logging patch
Review of attachment 8726300 [details] [diff] [review]:
-----------------------------------------------------------------
What is this expected to diagnose that we don't already know? We have a backtrace that can tell us which of the two allocations I mentioned in comment 7 it is that leads to the crash (hint: it's https://github.com/android/platform_external_zlib/blob/master/src/inflate.c#L208 ). Again this is not a problem of the requested allocation being too big, the problem is that we've already allocated both buffers. *That* is what needs sorting out.
Attachment #8726300 -
Flags: review?(mh+mozilla) → review-
Assignee | ||
Comment 12•9 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #11)
I don't see how this is caused by already having allocated both buffers; we know the crash is happening on http://hg.mozilla.org/releases/mozilla-beta/annotate/d4f5cfb02dda/mozglue/linker/Zip.h#l114 (as you pointed out in comment 2). If the crash was caused by both buffers already being allocated, surely we would be hitting http://hg.mozilla.org/releases/mozilla-beta/annotate/d4f5cfb02dda/mozglue/linker/Zip.h#l81 instead?
Comment 14•9 years ago
|
||
So, let's take a step back: this is what zxx_stream::Alloc (where zxx_stream::StaticAllocator::Alloc is inlined) looks like in 44.0b7, which is the version where the crash from comment 0 occurred:
24a34: 6b80 ldr r0, [r0, #56] ; 0x38
loads zStream->allocator into $r0
24a36: b1d8 cbz r0, 24a70 <_ZN10zxx_stream5AllocEPvjj+0x3c>
if $0 is zero, go to 24a70
24a38: 2901 cmp r1, #1
24a3a: d108 bne.n 24a4e <_ZN10zxx_stream5AllocEPvjj+0x1a>
if items is not 1, go to 24a4e
24a3c: f5b2 5f40 cmp.w r2, #12288 ; 0x3000
24a40: d805 bhi.n 24a4e <_ZN10zxx_stream5AllocEPvjj+0x1a>
if size is > 0x3000, go to 24a4e
24a42: f500 5340 add.w r3, r0, #12288 ; 0x3000
24a46: 781a ldrb r2, [r3, #0]
24a48: b97a cbnz r2, 24a6a <_ZN10zxx_stream5AllocEPvjj+0x36>
if this->stateBuf.inUse, go to 24a6a
24a4a: 7019 strb r1, [r3, #0]
set this->stateBuf.inUse to 1
24a4c: 4770 bx lr
return
24a4e: 434a muls r2, r1
24a50: f5b2 4f00 cmp.w r2, #32768 ; 0x8000
24a54: d109 bne.n 24a6a <_ZN10zxx_stream5AllocEPvjj+0x36>
if items * size != 0x8000, go to 24a6a
24a56: f500 5040 add.w r0, r0, #12288 ; 0x3000
24a5a: 3001 adds r0, #1
24a5c: f500 4300 add.w r3, r0, #32768 ; 0x8000
24a60: 781a ldrb r2, [r3, #0]
24a62: b912 cbnz r2, 24a6a <_ZN10zxx_stream5AllocEPvjj+0x36>
if this->windowBuf.inUse, go to 24a6a
24a64: 2201 movs r2, #1
24a66: 701a strb r2, [r3, #0]
set this->windowBuf.inUse to 1
24a68: 4770 bx lr
return
24a6a: 2300 movs r3, #0
24a6c: 601b str r3, [r3, #0]
MOZ_CRASH
24a6e: deff udf #255 ; 0xff
undefined instruction. Not sure why it's there. Maybe to align the next instruction
24a70: fb02 f001 mul.w r0, r2, r1
24a74: f7f7 ba28 b.w 1bec8 <_Znaj>
return operator new(items*size)
The crash is on 24a6c. There are three paths leading to it. One of them is http://hg.mozilla.org/releases/mozilla-beta/annotate/d4f5cfb02dda/mozglue/linker/Zip.h#l114 and two of them are http://hg.mozilla.org/releases/mozilla-beta/annotate/d4f5cfb02dda/mozglue/linker/Zip.h#l81. Debug info can't know which is which, because they're all at the same address, and apparently, http://hg.mozilla.org/releases/mozilla-beta/annotate/d4f5cfb02dda/mozglue/linker/Zip.h#l114 is the one that is attributed. But it's a red herring.
Looking at the register values on the crash, you can see that r1 and r2 are both 0x1. If we really followed the path leading to http://hg.mozilla.org/releases/mozilla-beta/annotate/d4f5cfb02dda/mozglue/linker/Zip.h#l114, r1 would be items, and r2 would be items * size. If items is 1 and items * size is 1, then it means size would have been 1, and the first condition should have matched, and there should have been no crash. So this is not the code path that was followed. That leaves the two paths that lead to http://hg.mozilla.org/releases/mozilla-beta/annotate/d4f5cfb02dda/mozglue/linker/Zip.h#l81. In both code paths, r2 is inUse from one of the ZStreamBuf, so it could be either. Both code paths are using r3 for the address of inUse, and r3 is overwritten for the MOZ_CRASH, so it's not possible to tell which is which from r3, but in the case of windowBuf, r0 would have an odd value because of the adds r0, #1, and in the crash dump, it is even, which points to the stateBuf.get() code path being the one that is happening.
Conclusion: the crash is in http://hg.mozilla.org/releases/mozilla-beta/annotate/d4f5cfb02dda/mozglue/linker/Zip.h#l81 called from http://hg.mozilla.org/releases/mozilla-beta/annotate/d4f5cfb02dda/mozglue/linker/Zip.h#l110 called from https://github.com/android/platform_external_zlib/blob/master/src/inflate.c#L208
Flags: needinfo?(mh+mozilla)
Assignee | ||
Comment 15•9 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #14)
OK, I see what you're saying. Thanks for your help.
Assignee | ||
Comment 16•9 years ago
|
||
So, I think reentrancy is the culprit here. Comments in Mappable.cpp suggest that this shouldn't be a problem, but do not seem to take into account the possibility that upon reentering, the static buffer we need will already be in use. This patch is a simple but not particularly elegant fix: create two static allocators and use the second as a fallback if the first fails. Obviously this isn't a perfect solution, as we could potentially reenter twice and once again find ourselves without an available buffer, but I don't see any general way of avoiding this possibility.
Attachment #8731688 -
Flags: review?(mh+mozilla)
Reporter | ||
Updated•9 years ago
|
tracking-fennec: 45+ → 46+
Comment 17•9 years ago
|
||
(In reply to Dylan Roeh (:droeh) from comment #16)
> Created attachment 8731688 [details] [diff] [review]
> Proposed patch
>
> So, I think reentrancy is the culprit here. Comments in Mappable.cpp suggest
> that this shouldn't be a problem
Please can you be more specific?
Assignee | ||
Comment 18•9 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #17)
> Please can you be more specific?
https://dxr.mozilla.org/mozilla-central/source/mozglue/linker/Mappable.cpp#508
The comment notes that the lock can be re-acquired by the thread already holding it, but goes on to suggest that the worst outcome is a possible minor performance hit. I don't think that's the case when using the static allocator, though -- we may find the needed buffer is already in use.
Comment 20•9 years ago
|
||
Comment on attachment 8731688 [details] [diff] [review]
Proposed patch
Review of attachment 8731688 [details] [diff] [review]:
-----------------------------------------------------------------
Sorry for the delay, I had to do some digging and think about it thoroughly. The comment in Mappable.cpp is much older than the StaticAllocator code, so it was correct when it was written. The StaticAllocator changed matters, and, as it appears, broke corner cases. I think the work around makes sense, although I have a nit on the implementation, see below. I'll add that it doesn't matter that it wouldn't work if it re-entered for the third time, because in that situation, bad things would already happen because we use an alternative stack (sigaltstack). If we were re-entering again, we'd be overwriting our stack already.
::: mozglue/linker/Zip.h
@@ +136,5 @@
> }
> }
>
> ZStreamBuf<0x3000> stateBuf; // 0x3000 is an arbitrary size above 10K.
> ZStreamBuf<1 << MAX_WBITS> windowBuf;
I'd rather add two buffers here than make the caller handle two allocators.
Attachment #8731688 -
Flags: review?(mh+mozilla)
Updated•9 years ago
|
Flags: needinfo?(mh+mozilla)
Assignee | ||
Comment 21•9 years ago
|
||
Alright, here's a version that adds buffers to the static allocator rather than adding a second allocator.
Attachment #8731688 -
Attachment is obsolete: true
Attachment #8733898 -
Flags: review?(mh+mozilla)
Comment 22•9 years ago
|
||
Comment on attachment 8733898 [details] [diff] [review]
Proposed patch (updated)
Review of attachment 8733898 [details] [diff] [review]:
-----------------------------------------------------------------
::: mozglue/linker/Zip.h
@@ +106,5 @@
> public:
> void *Alloc(uInt items, uInt size)
> {
> + if (items == 1 && size <= stateBuf1.size) {
> + char* res = stateBuf1.get();
A shorter form would be:
char *res = stateBuf1.get() || stateBuf2.get();
MOZ_RELEASE_ASSERT(res, "ZStreamBuf already in use");
return res;
same for windowBuf.
Attachment #8733898 -
Flags: review?(mh+mozilla) → review+
Comment 23•9 years ago
|
||
Comment 24•9 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 9 years ago
status-firefox48:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Assignee | ||
Comment 25•9 years ago
|
||
Comment on attachment 8733898 [details] [diff] [review]
Proposed patch (updated)
Approval Request Comment
[Feature/regressing bug #]:1059797
[User impact if declined]:Periodic crashes
[Describe test coverage new/current, TreeHerder]:Nightly
[Risks and why]:Low risk - behavior should be identical except when reentering (in which case this should prevent a crash.)
[String/UUID change made/needed]:None
Attachment #8733898 -
Flags: approval-mozilla-beta?
Attachment #8733898 -
Flags: approval-mozilla-aurora?
Comment 26•9 years ago
|
||
Comment on attachment 8733898 [details] [diff] [review]
Proposed patch (updated)
Crash fix for android, let's take this on aurora and verify the fix there as best possible. I don't see high volume of crashes with this signature so I'd like to avoid uplifting this to beta. If you disagree let me know and we could try taking it for beta 8.
Attachment #8733898 -
Flags: approval-mozilla-beta?
Attachment #8733898 -
Flags: approval-mozilla-beta-
Attachment #8733898 -
Flags: approval-mozilla-aurora?
Attachment #8733898 -
Flags: approval-mozilla-aurora+
Comment 27•9 years ago
|
||
bugherder uplift |
status-firefox47:
--- → fixed
You need to log in
before you can comment on or make changes to this bug.
Description
•