Crash in brotli WriteRingBuffer on Alpine Linux build (against musl c-library)

RESOLVED FIXED in Firefox 48

Status

()

RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: timo.teras, Assigned: natanael.copa)

Tracking

({crash})

46 Branch
mozilla49
crash
Points:
---

Firefox Tracking Flags

(firefox48 fixed, firefox49 fixed)

Details

(Whiteboard: [necko-active])

Attachments

(1 attachment)

(Reporter)

Description

2 years ago
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:46.0) Gecko/20100101 Firefox/46.0
Build ID: 20160513151215

Steps to reproduce:

Using firefox 46 (46.0 and 46.0.1 tested). Alpine Linux build (against musl c-library).
Open certain web-pages. Many articles of iltalehti.fi for example are affected. After quick test one directly link to test with is:
http://www.iltalehti.fi/sisustus/2016051221550617_sz.shtml


Actual results:

Firefox crashes, with backtrace:

Thread 31 "firefox" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 31952]
memcpy () at src/string/x86_64/memcpy.s:18
18    src/string/x86_64/memcpy.s: No such file or directory.
(gdb) where
#0  memcpy () at src/string/x86_64/memcpy.s:18
#1  0x00006c115e145e31 in memcpy (__n=48615, __os=<optimized out>, __od=<optimized out>) at /usr/include/fortify/string.h:51
#2  WriteRingBuffer (available_out=available_out@entry=0x6c115050e248, next_out=next_out@entry=0x6c115050e240, 
    total_out=total_out@entry=0x1801851ace0, s=s@entry=0x180185198e0)
    at /home/tteras/aports/testing/firefox/src/firefox-46.0/modules/brotli/dec/decode.c:1161
#3  0x00006c115e149c2b in BrotliDecompressStream (available_in=available_in@entry=0x6c115050e250, next_in=next_in@entry=0x6c115050e238, 
    available_out=available_out@entry=0x6c115050e248, next_out=next_out@entry=0x6c115050e240, total_out=0x1801851ace0, s=0x180185198e0)
    at /home/tteras/aports/testing/firefox/src/firefox-46.0/modules/brotli/dec/decode.c:2244
#4  0x00006c115c7bebf3 in mozilla::net::nsHTTPCompressConv::BrotliHandler (stream=<optimized out>, closure=0x1800f040fc0, 
    dataIn=0x1800e9f0924 "", aAvail=13252, countRead=0x6c115052e2ec)
    at /home/tteras/aports/testing/firefox/src/firefox-46.0/netwerk/streamconv/converters/nsHTTPCompressConv.cpp:188
#5  0x00006c115c6fb9e0 in nsInputStreamTee::WriteSegmentFun (aIn=<optimized out>, aClosure=0x1800f041160, 
    aFromSegment=0x1800e9ed560 "\a\363^\021\225\275\037.\242\242\325\303\"\022\363\001\320HY8\177\177\021\030\067\361\261\316\363=\365\373U\365\277?_\025\031\321\333\335\321X\360\366QS}\241\202\212\242\250\340\263\254eRH \025H\314L\024\254\253?\337f\365\365\253\322V\315^\n\350\364\301\234\344\352c\222\236L\207d\217\220r=\354\a(\226%\267$C\230.\374\377l\276:I\347\t\227\033j\312_\264\331\252\060T\236\062\320\356\333\267\006A\261?\270?h \232\035IfX\243\024\330\360\030Y\024\306Jq\231T9\275~\316\265\214\334t)\313_\352\344\377\337\253z\270\324\065\t\\\343\360M\313\326Bk\322{\337}\340\177Y@\262\r\276\244\016\331Y\313\v"..., aOffset=<optimized out>, aCount=<optimized out>, aWriteCount=0x6c115052e2ec)
    at /home/tteras/aports/testing/firefox/src/firefox-46.0/xpcom/io/nsInputStreamTee.cpp:200
#6  0x00006c115c704577 in nsPipeInputStream::ReadSegments (this=0x1800bcbf9e0, 
    aWriter=0x6c115c6fb9cc <nsInputStreamTee::WriteSegmentFun(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*)>, 
    aClosure=0x1800f041160, aCount=13252, aReadCount=0x6c115052e374)
    at /home/tteras/aports/testing/firefox/src/firefox-46.0/xpcom/io/nsPipe3.cpp:1283
#7  0x00006c115c7bf538 in mozilla::net::nsHTTPCompressConv::OnDataAvailable (this=0x1800f040fc0, request=0x1800bc85c48, aContext=0x0, 
    iStr=0x1800f041160, aSourceOffset=0, aCount=13252)
    at /home/tteras/aports/testing/firefox/src/firefox-46.0/netwerk/streamconv/converters/nsHTTPCompressConv.cpp:444
#8  0x00006c115c78664e in nsStreamListenerTee::OnDataAvailable (this=0x1800f0410c0, request=0x1800bc85c48, context=0x0, input=<optimized out>, 
    offset=0, count=13252) at /home/tteras/aports/testing/firefox/src/firefox-46.0/netwerk/base/nsStreamListenerTee.cpp:93
#9  0x00006c115c8702f0 in mozilla::net::nsHttpChannel::OnDataAvailable (this=0x1800bc85c00, request=<optimized out>, ctxt=<optimized out>, 
    input=0x1800bcbf9e0, offset=<optimized out>, count=13252)
    at /home/tteras/aports/testing/firefox/src/firefox-46.0/netwerk/protocol/http/nsHttpChannel.cpp:6092
#10 0x00006c115c779c84 in nsInputStreamPump::OnStateTransfer (this=this@entry=0x1800e1e5400)
    at /home/tteras/aports/testing/firefox/src/firefox-46.0/netwerk/base/nsInputStreamPump.cpp:603
#11 0x00006c115c779df3 in nsInputStreamPump::OnInputStreamReady (this=0x1800e1e5400, stream=<optimized out>)
    at /home/tteras/aports/testing/firefox/src/firefox-46.0/netwerk/base/nsInputStreamPump.cpp:430
#12 0x00006c115c6fe8ce in nsInputStreamReadyEvent::Run (this=0x1800f041120)
    at /home/tteras/aports/testing/firefox/src/firefox-46.0/xpcom/io/nsStreamUtils.cpp:94
#13 0x00006c115c7122c3 in nsThread::ProcessNextEvent (this=0x1800a351160, aMayWait=<optimized out>, aResult=0x6c115052e707)
---Type <return> to continue, or q <return> to quit---
    at /home/tteras/aports/testing/firefox/src/firefox-46.0/xpcom/threads/nsThread.cpp:995
#14 0x00006c115c72dc87 in NS_ProcessNextEvent (aThread=<optimized out>, aMayWait=aMayWait@entry=true)
    at /home/tteras/aports/testing/firefox/src/firefox-46.0/xpcom/glue/nsThreadUtils.cpp:297
#15 0x00006c115c92731e in mozilla::ipc::MessagePumpForNonMainThreads::Run (this=0x1800a356240, aDelegate=0x1800a352240)
    at /home/tteras/aports/testing/firefox/src/firefox-46.0/ipc/glue/MessagePump.cpp:355
#16 0x00006c115c917212 in MessageLoop::RunHandler (this=<optimized out>)
    at /home/tteras/aports/testing/firefox/src/firefox-46.0/ipc/chromium/src/base/message_loop.cc:227
#17 MessageLoop::Run (this=this@entry=0x1800a352240)
    at /home/tteras/aports/testing/firefox/src/firefox-46.0/ipc/chromium/src/base/message_loop.cc:201
#18 0x00006c115c715876 in nsThread::ThreadFunc (aArg=0x1800a351160)
    at /home/tteras/aports/testing/firefox/src/firefox-46.0/xpcom/threads/nsThread.cpp:401
#19 0x00006c115af4229e in ?? () from /usr/lib/libnspr4.so
#20 0x00006c11670b5487 in start (p=0x6c115052eab0) at src/thread/pthread_create.c:145
#21 0x00006c11670b7138 in __clone () at src/thread/x86_64/clone.s:21
Backtrace stopped: frame did not save the PC
(gdb) disassemble 
Dump of assembler code for function memcpy:
   0x00006c11670b3b3d <+0>:    mov    %rdi,%rax
   0x00006c11670b3b40 <+3>:    cmp    $0x8,%rdx
   0x00006c11670b3b44 <+7>:    jb     0x6c11670b3b5a <memcpy+29>
   0x00006c11670b3b46 <+9>:    test   $0x7,%edi
   0x00006c11670b3b4c <+15>:    je     0x6c11670b3b5a <memcpy+29>
   0x00006c11670b3b4e <+17>:    movsb  %ds:(%rsi),%es:(%rdi)
   0x00006c11670b3b4f <+18>:    dec    %rdx
   0x00006c11670b3b52 <+21>:    test   $0x7,%edi
   0x00006c11670b3b58 <+27>:    jne    0x6c11670b3b4e <memcpy+17>
   0x00006c11670b3b5a <+29>:    mov    %rdx,%rcx
   0x00006c11670b3b5d <+32>:    shr    $0x3,%rcx
=> 0x00006c11670b3b61 <+36>:    rep movsq %ds:(%rsi),%es:(%rdi)
   0x00006c11670b3b64 <+39>:    and    $0x7,%edx
   0x00006c11670b3b67 <+42>:    je     0x6c11670b3b6e <memcpy+49>
   0x00006c11670b3b69 <+44>:    movsb  %ds:(%rsi),%es:(%rdi)
   0x00006c11670b3b6a <+45>:    dec    %edx
   0x00006c11670b3b6c <+47>:    jne    0x6c11670b3b69 <memcpy+44>
   0x00006c11670b3b6e <+49>:    retq   
End of assembler dump.
(gdb) info registers 
rax            0x6c115050e258    118821617721944
rbx            0xbde7    48615
rcx            0x207    519
rdx            0xbde7    48615
rsi            0x6c1144a09dc8    118821421620680
rdi            0x6c1150519000    118821617766400
rbp            0xbde7    0xbde7
rsp            0x6c115050e148    0x6c115050e148
r8             0x6c11449ff020    118821421576224
r9             0x3a    58
r10            0x300000000000000    216172782113783808
r11            0x1800e9f0917    1649512745239
r12            0x6c115050e248    118821617721928
r13            0x6c115050e240    118821617721920
r14            0x1801851ace0    1649675447520
r15            0x180185198e0    1649675442400
rip            0x6c11670b3b61    0x6c11670b3b61 <memcpy+36>
eflags         0x10203    [ CF IF RF ]
cs             0x33    51
ss             0x2b    43
ds             0x0    0
es             0x0    0
fs             0x0    0
gs             0x0    0

I filed also bug against Alpine Linux at:
https://bugs.alpinelinux.org/issues/5559


Expected results:

Expected web site to open and show properly. This used to work with firefox 45, so seems to be regression for 46.

Comment 1

2 years ago
Does it crash with the official build?

Updated

2 years ago
Keywords: crash
Summary: crash in brotli WriteRingBuffer → Crash in brotli WriteRingBuffer on Alpine Linux build (against musl c-library)
(Assignee)

Comment 2

2 years ago
(In reply to Loic from comment #1)
> Does it crash with the official build?

There are no official builds with musl libc.
(Assignee)

Comment 3

2 years ago
Is there some way to build firefox against a shared system brotli? It would make it simpler to git bisect brotli to find what introduced this, without needing to rebuild firefox.
Component: Untriaged → Networking
Product: Firefox → Core
Whiteboard: [necko-would-take]
Hi Natanael, thanks for the bug report.

There is no provision for a system brotli.

The brotli library in 46 was upgraded from 45 - so its possible there is a deficiency in that transition. This is interesting. Make sure you have the BROTLI_BUILD_PORTABLE flag defined when building (46 should be including it, I mention just to be sure).

you might also want to follow bug 1275256
it would be interesting for you to substitute the newest brotli lib (which you can find on github) and see if the problem is still reproducible. That would definitely help.
(Reporter)

Comment 6

2 years ago
The issue happens with both BROTLI_BUILD_PORTABLE and without it.

Is something more required then just substituting the brotli directory with the github version?
(Assignee)

Comment 7

2 years ago
Could it be that it simply runs out of stackspace?
https://hg.mozilla.org/releases/mozilla-release/annotate/0b8492c110be/netwerk/streamconv/converters/nsHTTPCompressConv.cpp#l169

It looks like it allocates 128kb on the stack. musl set default stack space to 80k, so you need to pthread_attr_setstacksize for this to work. (I have not checked if firefox does that for its threads)

It also looks like windows set default stack size to 1MB. (https://msdn.microsoft.com/en-us/library/windows/desktop/ms686774%28v=vs.85%29.aspx), which may explain why we see crashes on windows in bug 1275256.

GNU libc set default stack size to 8MB, which might explain why we don't see this crash on GNU/Linux.
(Assignee)

Comment 8

2 years ago
I can confirm that the problem is that it runs out of stack. The following patch solved the mentioned test page:
diff --git a/netwerk/streamconv/converters/nsHTTPCompressConv.cpp b/netwerk/streamconv/converters/nsHTTPCompressConv.cpp
index 1193529..9355ff8 100644
--- a/netwerk/streamconv/converters/nsHTTPCompressConv.cpp
+++ b/netwerk/streamconv/converters/nsHTTPCompressConv.cpp
@@ -165,7 +165,7 @@ nsHTTPCompressConv::BrotliHandler(nsIInputStream *stream, void *closure, const c
   nsHTTPCompressConv *self = static_cast<nsHTTPCompressConv *>(closure);
   *countRead = 0;
 
-  const uint32_t kOutSize = 128 * 1024; // just a chunk size, we call in a loop
+  const uint32_t kOutSize = 16 * 1024; // just a chunk size, we call in a loop
   unsigned char outBuffer[kOutSize];
   unsigned char *outPtr;
   size_t outSize;


If we really need 128kb buffer then we should malloc it:

diff --git a/netwerk/streamconv/converters/nsHTTPCompressConv.cpp b/netwerk/streamconv/converters/nsHTTPCompressConv.cpp
index 1193529..25ffe60 100644
--- a/netwerk/streamconv/converters/nsHTTPCompressConv.cpp
+++ b/netwerk/streamconv/converters/nsHTTPCompressConv.cpp
@@ -166,7 +166,7 @@ nsHTTPCompressConv::BrotliHandler(nsIInputStream *stream, void *closure, const c
   *countRead = 0;
 
   const uint32_t kOutSize = 128 * 1024; // just a chunk size, we call in a loop
-  unsigned char outBuffer[kOutSize];
+  unsigned char *outBuffer;
   unsigned char *outPtr;
   size_t outSize;
   size_t avail = aAvail;
@@ -176,6 +176,9 @@ nsHTTPCompressConv::BrotliHandler(nsIInputStream *stream, void *closure, const c
     *countRead = aAvail;
     return NS_OK;
   }
+  outBuffer = malloc(kOutSize);
+  if (outBuffer == nullptr)
+    return self->mBrotli->mStatus = NS_ERROR_OUT_OF_MEMORY;
 
   do {
     outSize = kOutSize;
@@ -192,6 +195,7 @@ nsHTTPCompressConv::BrotliHandler(nsIInputStream *stream, void *closure, const c
 
     if (res == BROTLI_RESULT_ERROR) {
       LOG(("nsHttpCompressConv %p marking invalid encoding", self));
+      free(outBuffer);
       self->mBrotli->mStatus = NS_ERROR_INVALID_CONTENT_ENCODING;
       return self->mBrotli->mStatus;
     }
@@ -202,6 +206,7 @@ nsHTTPCompressConv::BrotliHandler(nsIInputStream *stream, void *closure, const c
       MOZ_ASSERT(!avail);
       if (avail) {
         LOG(("nsHttpCompressConv %p did not consume all input", self));
+        free(outBuffer);
         self->mBrotli->mStatus = NS_ERROR_UNEXPECTED;
         return self->mBrotli->mStatus;
       }
@@ -214,6 +219,7 @@ nsHTTPCompressConv::BrotliHandler(nsIInputStream *stream, void *closure, const c
                                              outSize);
       LOG(("nsHttpCompressConv %p BrotliHandler ODA rv=%x", self, rv));
       if (NS_FAILED(rv)) {
+        free(outBuffer);
         self->mBrotli->mStatus = rv;
         return self->mBrotli->mStatus;
       }
@@ -222,11 +228,13 @@ nsHTTPCompressConv::BrotliHandler(nsIInputStream *stream, void *closure, const c
     if (res == BROTLI_RESULT_SUCCESS ||
         res == BROTLI_RESULT_NEEDS_MORE_INPUT) {
       *countRead = aAvail;
+      free(outBuffer);
       return NS_OK;
     }
     MOZ_ASSERT (res == BROTLI_RESULT_NEEDS_MORE_OUTPUT);
   } while (res == BROTLI_RESULT_NEEDS_MORE_OUTPUT);
 
+  free(outBuffer);
   self->mBrotli->mStatus = NS_ERROR_UNEXPECTED;
   return self->mBrotli->mStatus;
 }


This can likely be done nicer with a goto.

PS. there is another malloc in the same file that will not free the memory when there are error.
natanael - thanks!

If you want to work up a patch using fallible (otherwise the malloc wrapper intentionally crashes when the allocation fails instead of returning null) and uniquePtr (which will auto free the memory when it goes out of scope) I would be happy to review.
(Assignee)

Comment 10

2 years ago
will UniquePtr work for outBuffer? The code does this in a do {} while loop:

  const uint32_t kOutSize = 128 * 1024; // just a chunk size, we call in a loop
  unsigned char outBuffer[kOutSize];
  ...
  do {
    outSize = kOutSize;
    outPtr = outBuffer;
    ...
  } while (res == BROTLI_RESULT_NEEDS_MORE_OUTPUT);


As I understand UniquePtr, it cannot be copied, so the above will not work?
(Assignee)

Comment 11

2 years ago
I made a diff between commit 33aa40220b96cf95ad2b9ba61dc8d7fd2f964f2c (which mozilla 46.0.1 uses) and brotli git master in dec/ subdir.

It does not solve the problem.
I would have to fuss with the compiler to get the right syntax, but I believe you can fudge it with .get() to find a pointer for inside the loop.. ala https://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttp.cpp#343
(Assignee)

Comment 13

2 years ago
Created attachment 8757698 [details]
MozReview Request: Bug 1274732 - Avoid allocate 128k buffer on stack r?mcmanus

Move the 128k buffer for brotli decompression to the heap instead of stack.
This fixes crash with musl libc which provides 80k stack by default.

Review commit: https://reviewboard.mozilla.org/r/56090/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/56090/
Attachment #8757698 - Flags: review?(mcmanus)
Comment on attachment 8757698 [details]
MozReview Request: Bug 1274732 - Avoid allocate 128k buffer on stack r?mcmanus

https://reviewboard.mozilla.org/r/56090/#review53096

thanks!
Attachment #8757698 - Flags: review?(mcmanus) → review+
Assignee: nobody → natanael.copa
Whiteboard: [necko-would-take] → [necko-active]
Duplicate of this bug: 1275256
Hi Natanael - do you need help getting the try run and getting this landed?

https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/How_to_Submit_a_Patch

thanks again for your help here
Flags: needinfo?(natanael.copa)

Comment 17

2 years ago
Pushed by mcmanus@ducksong.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5bc75c0144b5
Avoid allocate 128k buffer on stack r=mcmanus
I landed this because I wanted it to hit gecko 49. thanks again.
Flags: needinfo?(natanael.copa)
Target Milestone: --- → mozilla49

Comment 19

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/5bc75c0144b5
Status: UNCONFIRMED → RESOLVED
Last Resolved: 2 years ago
status-firefox49: --- → fixed
Resolution: --- → FIXED
Comment on attachment 8757698 [details]
MozReview Request: Bug 1274732 - Avoid allocate 128k buffer on stack r?mcmanus

Approval Request Comment
[Feature/regressing bug #]: brotli - 46
[User impact if declined]: possible crasher
[Describe test coverage new/current, TreeHerder]: tree herder
[Risks and why]: low - been in 49 and 50 for a while
[String/UUID change made/needed]:

fix an occasional crasher - as we see more brotli (esp from fb) I would expect rate of it to climb. This is a safe change.
Attachment #8757698 - Flags: approval-mozilla-beta?
status-firefox48: --- → affected
Comment on attachment 8757698 [details]
MozReview Request: Bug 1274732 - Avoid allocate 128k buffer on stack r?mcmanus

Fix a crash, taking it.
Attachment #8757698 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Comment 22

2 years ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-beta/rev/5205b34086e2
status-firefox48: affected → fixed
You need to log in before you can comment on or make changes to this bug.