Closed
Bug 1496581
Opened 7 years ago
Closed 7 years ago
Crash in arena_dalloc | nsLocalFile::Release
Categories
(Core :: DOM: Core & HTML, defect, P2)
Tracking
()
RESOLVED
FIXED
mozilla64
| Tracking | Status | |
|---|---|---|
| firefox-esr60 | --- | unaffected |
| firefox62 | --- | unaffected |
| firefox63 | --- | unaffected |
| firefox64 | + | fixed |
People
(Reporter: calixte, Assigned: baku)
References
(Blocks 1 open bug)
Details
(Keywords: crash, regression)
Crash Data
Attachments
(1 file, 5 obsolete files)
|
65.16 KB,
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-01b4d072-4d31-42bb-84dd-40d490181004.
=============================================================
Top 10 frames of crashing thread:
0 mozglue.dll static void arena_dalloc memory/build/mozjemalloc.cpp:3527
1 xul.dll nsLocalFile::Release xpcom/io/nsLocalFileWin.cpp:862
2 xul.dll nsFileStreamBase::DoOpen netwerk/base/nsFileStreams.cpp:378
3 xul.dll nsresult nsFileInputStream::SeekInternal netwerk/base/nsFileStreams.cpp:548
4 xul.dll mozilla::net::HttpBaseChannel::EnsureUploadStreamIsCloneable netwerk/protocol/http/HttpBaseChannel.cpp:928
5 xul.dll bool mozilla::net::HttpChannelParent::DoAsyncOpen netwerk/protocol/http/HttpChannelParent.cpp:729
6 xul.dll mozilla::net::HttpChannelParent::Init netwerk/protocol/http/HttpChannelParent.cpp:139
7 xul.dll mozilla::net::NeckoParent::RecvPHttpChannelConstructor netwerk/ipc/NeckoParent.cpp:323
8 xul.dll mozilla::net::PNeckoParent::OnMessageReceived ipc/ipdl/PNeckoParent.cpp:991
9 xul.dll void mozilla::ipc::MessageChannel::DispatchMessageW ipc/glue/MessageChannel.cpp:2175
=============================================================
There are 8 crashes (from 2 installations) in nightly 64 with buildid 20181004100222. In analyzing the backtrace, the regression may have been introduced by patch [1] to fix bug 1494176.
[1] https://hg.mozilla.org/mozilla-central/rev?node=f2ff07352cbe
Flags: needinfo?(amarchesini)
Comment 1•7 years ago
|
||
I also got this crash in latest Nightly in one of my mobile.twitter.com tabs, when switching to it from another one. The one that crashes is in the main container, the one that works prior to that is in a Private container tab.
Crash ID: bp-09bbc02b-320b-48e9-b779-563420181005
QA Contact: overholt
| Reporter | ||
Updated•7 years ago
|
Crash Signature: [@ arena_dalloc | nsLocalFile::Release] → [@ arena_dalloc | nsLocalFile::Release]
[@ nsFileStreamBase::DoOpen]
[@ <name omitted> | nsFileStreamBase::DoOpen]
Comment 2•7 years ago
|
||
Yes, I can reliably reproduce this when just opening a tab in the standard context, or the Private context I had previously also logged into Twitter, and bring up https://mobile.twitter.com. The main Firefox process then immediately crashes. I marked further submitted reports with my handle, comments start with MarcoZ: .
| Assignee | ||
Comment 3•7 years ago
|
||
Thanks Marco. Can you tell me the correct STR? Are you able to reproduce it on linux or mac?
Flags: needinfo?(amarchesini)
| Assignee | ||
Updated•7 years ago
|
Flags: needinfo?(mzehe)
Comment 4•7 years ago
|
||
All I have to do is:
1. Press CTRL+T to open a new tab.
2. Go to https://mobile.twitter.com and press Enter.
The crash occurs immediately upon page load.
I currently have no Mac or Linux at my disposal to try, sorry.
Flags: needinfo?(mzehe)
| Assignee | ||
Comment 5•7 years ago
|
||
I'm not able to reproduce it.
Marion, do you know anybody who can help here?
Flags: needinfo?(mdaly)
Comment 6•7 years ago
|
||
Perry, can you try to reproduce this? Once you get it reproduced, consult with Andrew on best next steps.
Flags: needinfo?(mdaly) → needinfo?(perry)
Updated•7 years ago
|
QA Contact: overholt
| Comment hidden (obsolete) |
Comment 9•7 years ago
|
||
I'm unable to reproduce this on Windows 10, macOS (High Sierra), and Linux (Ubuntu 18.04).
Comment 10•7 years ago
|
||
## Repro-Wise ##
If I go to https://mobile.twitter.com logged in and do "Work offline" from the hamburger menu's "More" sub-menu to force Firefox offline, I find that a number of things get stored in the "sync" IndexedDB data's "queue" object store.
These end up looking like the following devtools console output:
"""
body: Blob
size: 162
type: "application/x-www-form-urlencoded"
<prototype>: BlobPrototype { slice: slice(), size: Getter, type: Getter, … }
cache: "no-store"
credentials: "include"
headers: Object { accept: "*/*", "accept-language": "en-US,en;q=0.5", authorization: "ELIDED", … }
method: "POST"
mode: "cors"
redirect: "follow"
referrer: "https://mobile.twitter.com/compose/tweet"
url: "https://api.twitter.com/1.1/jot/client_event.json"
"""
Note that this is manually from doing the following in the console; the devtools' storage tab is great but hides the fact that the "body" is a Blob because it just JSONs the object or something like that.
"""
mdbReq = indexedDB.open('sync', 1)
mdb = mdbReq.result
mdbAll = mdb.transaction('queue').objectStore('queue').getAll()
mdbAll.result
"""
The contents of the Blob are URL-encoded JSON that seem to be metadata about all the videos I was potentially shown. The Blobs can become as large as 20KiB if I scroll around a lot. If I had let Twitter download more tweets before going offline, it's possible the Blob got huge.
I haven't been able to quite repro on this. See below for a theory about what's going on. To maximize repro, I expect you'd want to browse to mobile.twitter.com, switch to be offline and let the IndexedDB stuff really pile up, then close the site. Then go online again and go back to the site.
## Theory-wise ##
Looking at the crash signatures, it looks like half of the crashes happen on the StreamTransportService's thread-pool in InputStreamLengthHelper while the main thread is still in HttpChannelParent::DoAsyncOpen in InputStreamLengthHelper-related logic. This suggests that there may be a race that involves calling methods on the underlying file stream concurrently from the main thread and the STS thread.
Comment 11•7 years ago
|
||
Hm, actually, maybe don't "work offline" with twitter open too long. My mobile twitter is now broken for some reason, possible due to the massive number of backlogged data? Not sure.
Comment 12•7 years ago
|
||
(In reply to Andrew Sutherland [:asuth] from comment #11)
> Hm, actually, maybe don't "work offline" with twitter open too long. My
> mobile twitter is now broken for some reason, possible due to the massive
> number of backlogged data? Not sure.
Then yours is at the same state as mine now. I can't open it without immediately crashing, either. And I didn't even work offline, at least not intentionally.
Comment 13•7 years ago
|
||
Here are a few more of mine:
bp-c618ac5b-a7ce-497b-8a62-a06750181005
bp-24c8114d-03ba-47f7-a8e6-af2550181005
Comment 15•7 years ago
|
||
Same issue here, though it seems a bit less consistent with today's Nightly - I can load Twitter and scroll a bit sometimes, and then it crashes.
Comment 16•7 years ago
|
||
Today's Nightly also crashes on Linux for me: https://crash-stats.mozilla.com/report/index/0dadecf7-77c7-4e40-a907-5d3df0181007
Updated•7 years ago
|
tracking-firefox64:
--- → +
| Assignee | ||
Comment 17•7 years ago
|
||
nsFileInputStream can be easily broken because it pretends to be thread-safe, but it doesn't protect its members. This crash seems to be a race condition. I'm going to add a mutex and some security checks as first step.
| Assignee | ||
Comment 18•7 years ago
|
||
This patch adds a mutex to protect internal members for nsFileStreams.
What I suspect it's a race condition, and, probably, this patch would help to understand more what is going on.
Assignee: nobody → amarchesini
Attachment #9015497 -
Flags: review?(honzab.moz)
Comment 19•7 years ago
|
||
Comment on attachment 9015497 [details] [diff] [review]
lock.patch
Review of attachment 9015497 [details] [diff] [review]:
-----------------------------------------------------------------
There is one major bad thing about this patch: you are doing likely blocking I/O (various PR_* calls) under the lock. When main thread happens to try to acquire the lock as well, we have a jank.
You will have to have a "ref counter" for mFD, similarly to
https://searchfox.org/mozilla-central/rev/80ac71c1c54af788b32e851192dfd2de2ec18e18/netwerk/base/nsSocketTransport2.cpp#2045
https://searchfox.org/mozilla-central/rev/80ac71c1c54af788b32e851192dfd2de2ec18e18/netwerk/base/nsSocketTransport2.cpp#2115
Attachment #9015497 -
Flags: review?(honzab.moz) → review-
Updated•7 years ago
|
Priority: -- → P2
| Assignee | ||
Comment 20•7 years ago
|
||
Attachment #9015497 -
Attachment is obsolete: true
Attachment #9016229 -
Flags: review?(honzab.moz)
Comment 21•7 years ago
|
||
Comment on attachment 9016229 [details] [diff] [review]
lock.patch
dropping r after an irc discussion.
Attachment #9016229 -
Flags: review?(honzab.moz)
Comment 22•7 years ago
|
||
I get this one regularly (bp-4a210d4a-a3ac-45f3-8cff-536600181014 etc), although I'm not sure how to reproduce it in an isolated environment.
Just now, I got a deadlock opening mobile.twitter.com, and trying to send SIGABRT deadlocked again on breakpad. The backtrace doesn't look the same but I hope it's related:
(gdb) bt
#0 0x00007febbcc91edc in __lll_lock_wait () at /usr/lib/libpthread.so.0
#1 0x00007febbcc8b298 in pthread_mutex_lock () at /usr/lib/libpthread.so.0
#2 0x00007febb3262e19 in google_breakpad::ExceptionHandler::SignalHandler(int, siginfo*, void*) () at /home/dx/test/asd/homer/firefox-nightly/libxul.so
#3 0x00007febb327506f in nsProfileLock::FatalSignalHandler(int, siginfo*, void*) () at /home/dx/test/asd/homer/firefox-nightly/libxul.so
#4 0x00007febbcc933c0 in <signal handler called> () at /usr/lib/libpthread.so.0
#5 0x00007febbcc91edc in __lll_lock_wait () at /usr/lib/libpthread.so.0
#6 0x00007febbcc8b3f3 in pthread_mutex_lock () at /usr/lib/libpthread.so.0
#7 0x00005558aa5d28f9 in arena_dalloc(void*, unsigned long, arena_t*) ()
#8 0x00007febb369da47 in mozilla::Runnable::Release() () at /home/dx/test/asd/homer/firefox-nightly/libxul.so
#9 0x00007febb09db949 in mozilla::net::HttpChannelParent::DoAsyncOpen(mozilla::ipc::URIParams const&, mozilla::ipc::OptionalURIParams const&, mozilla::ipc::OptionalURIParams const&, mozilla::ipc::OptionalURIParams const&, unsigned int const&, mozilla::ipc::OptionalURIParams const&, mozilla::ipc::OptionalURIParams const&, unsigned int const&, nsTArray<mozilla::net::RequestHeaderTuple> const&, nsTString<char> const&, mozilla::ipc::OptionalIPCStream const&, bool const&, short const&, unsigned int const&, unsigned char const&, bool const&, unsigned int const&, bool const&, unsigned long const&, nsTString<char> const&, bool const&, nsTString<char> const&, bool const&, bool const&, bool const&, unsigned int const&, mozilla::net::OptionalLoadInfoArgs const&, mozilla::net::OptionalHttpResponseHead const&, nsTString<char> const&, unsigned int const&, unsigned long const&, mozilla::net::OptionalCorsPreflightArgs const&, unsigned int const&, bool const&, bool const&, bool const&, nsTString<char> const&, unsigned int const&, unsigned int const&, unsigned long const&, nsTString<char16_t> const&, unsigned long const&, nsTString<char> const&, unsigned long const&, mozilla::TimeStamp const&, mozilla::TimeStamp const&, mozilla::TimeStamp const&, mozilla::TimeStamp const&, mozilla::TimeStamp const&, mozilla::TimeStamp const&, bool const&, mozilla::TimeStamp const&) () at /home/dx/test/asd/homer/firefox-nightly/libxul.so
#10 0x00007febb09da57e in mozilla::net::HttpChannelParent::Init(mozilla::net::HttpChannelCreationArgs const&) () at /home/dx/test/asd/homer/firefox-nightly/libxul.so
#11 0x00007febb0a4a0bc in mozilla::net::NeckoParent::RecvPHttpChannelConstructor(mozilla::net::PHttpChannelParent*, mozilla::dom::PBrowserOrId const&, IPC::SerializedLoadContext const&, mozilla::net::HttpChannelCreationArgs const&) ()
at /home/dx/test/asd/homer/firefox-nightly/libxul.so
#12 0x00007febb0c39f44 in mozilla::net::PNeckoParent::OnMessageReceived(IPC::Message const&) () at /home/dx/test/asd/homer/firefox-nightly/libxul.so
#13 0x00007febb3bf3847 in mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message&&) () at /home/dx/test/asd/homer/firefox-nightly/libxul.so
#14 0x00007febb3bf43fe in mozilla::ipc::MessageChannel::MessageTask::Run() () at /home/dx/test/asd/homer/firefox-nightly/libxul.so
#15 0x00007febb3a3aeb2 in nsThread::ProcessNextEvent(bool, bool*) () at /home/dx/test/asd/homer/firefox-nightly/libxul.so
#16 0x00007febb3a3d06a in NS_ProcessNextEvent(nsIThread*, bool) () at /home/dx/test/asd/homer/firefox-nightly/libxul.so
#17 0x00007febb3bf55b9 in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) () at /home/dx/test/asd/homer/firefox-nightly/libxul.so
#18 0x00007febb3be4bc5 in MessageLoop::Run() () at /home/dx/test/asd/homer/firefox-nightly/libxul.so
#19 0x00007febb251c7f9 in nsBaseAppShell::Run() () at /home/dx/test/asd/homer/firefox-nightly/libxul.so
#20 0x00007febb31f1be3 in nsAppStartup::Run() () at /home/dx/test/asd/homer/firefox-nightly/libxul.so
#21 0x00007febb3280f3a in XREMain::XRE_mainRun() () at /home/dx/test/asd/homer/firefox-nightly/libxul.so
#22 0x00007febb3281fc7 in XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&) () at /home/dx/test/asd/homer/firefox-nightly/libxul.so
#23 0x00007febb32826a9 in XRE_main(int, char**, mozilla::BootstrapConfig const&) () at /home/dx/test/asd/homer/firefox-nightly/libxul.so
#24 0x00005558aa5c0973 in do_main(int, char**, char**) ()
#25 0x00005558aa5c077a in main ()
While typing this comment, i got bp-4a210d4a-a3ac-45f3-8cff-536600181014 - this time mobile.twitter.com wasn't even focused but it did open earlier in the same session.
| Assignee | ||
Comment 23•7 years ago
|
||
Instead of looking for a workaround, here I split ::Seek() and ::Tell() in 2 different interfaces. nsPipeInputStream implements only nsITellableStream; all the other interfaces implement nsISeekableStream + nsITellableStream.
This is nice because it fixes the issue we have, plus, it removes the hack around the implementation of ::Seek(), which causes I/O on main-thread.
Attachment #9016229 -
Attachment is obsolete: true
Attachment #9017152 -
Flags: review?(nfroyd)
Attachment #9017152 -
Flags: feedback?(honzab.moz)
Comment 24•7 years ago
|
||
Comment on attachment 9017152 [details] [diff] [review]
tell_1.patch
Review of attachment 9017152 [details] [diff] [review]:
-----------------------------------------------------------------
I actually like the "tellable" name :)
::: xpcom/io/nsITellableStream.idl
@@ +16,5 @@
> + /**
> + * tell
> + *
> + * This method reports the current offset, in bytes, from the start of the
> + * stream.
nit, a bit richer comments would be good and trailing ws
Attachment #9017152 -
Flags: feedback?(honzab.moz) → feedback+
Comment 25•7 years ago
|
||
Comment on attachment 9017152 [details] [diff] [review]
tell_1.patch
Review of attachment 9017152 [details] [diff] [review]:
-----------------------------------------------------------------
Please address Honza's comment.
Please also write a commit message that explains what the problem is and why doing this addresses it. I can see that the refactoring is correct, but I'm completely in the dark as to why it might be the correct thing to do to add more indirection here.
::: netwerk/base/nsInputStreamPump.cpp
@@ +573,5 @@
> // an infinite loop. we do our best here to try to catch
> // such an error. (see bug 189672)
>
> // in most cases this QI will succeed (mAsyncStream is almost always
> // a nsPipeInputStream, which implements nsISeekableStream::Tell).
Nit: comment needs updating here.
Attachment #9017152 -
Flags: review?(nfroyd) → review-
| Assignee | ||
Comment 26•7 years ago
|
||
In this patch comments are updated and improved.
I don't think it's nice to write a comment explaining that initially, nsISeekableStream and nsITellableStream were the same class, because after X years, that comment would be nonsense.
I can tell more why this patch is needed here on bugzilla:
In the current code there are 3 main issues:
1. nsFileStream is not really thread-safe. There is nothing to protect the internal members and we see crashes.
2. nsPipeInputStream doesn't implement ::Seek() method and that caused issues in devtools when a nsHttpChannel sends POST data using a pipe. In order to fix this, bug 1494176 added a check in nsHttpChannel: if the stream doesn't implement ::Seek(), let's clone it. This was an hack around nsPipeInputStream, and it's bad.
3. When nsHttpChannel sends POST data using a file stream, nsFileStream does I/O on main-thread because of the issue 2. Plus, ::Seek() is called on the main-thread causing issue 1.
Note that nsPipeInputStream implements only ::Tell(), of the nsISeekableStream methods. It doesn't implement ::Seek() and it doesn't implement ::SetEOF().
With this patch I want to fix point 2 and point 3 (and consequentially issue 1 - but we need a separate fix for it - follow up). The patch does:
1. it splits nsISeekableStream in 2 interfaces: nsITellableStream and nsISeekableStream.
2. nsPipeInputStream implements only nsITellableStream. Doing this, we don't need the ::Seek() check for point 2 in nsHttpChannel: a simple QI check is enough.
3. Because we don't call ::Seek() in nsHttpChannel, nsFileStream doesn't do I/O on the main-thread, and we don't crash doing so.
Attachment #9017152 -
Attachment is obsolete: true
Attachment #9017802 -
Flags: review?(nfroyd)
Comment 27•7 years ago
|
||
A commit message seems like exactly the right place for those explanations...
| Assignee | ||
Comment 28•7 years ago
|
||
Attachment #9017883 -
Flags: review?(nfroyd)
| Assignee | ||
Comment 29•7 years ago
|
||
Attachment #9017802 -
Attachment is obsolete: true
Attachment #9017883 -
Attachment is obsolete: true
Attachment #9017802 -
Flags: review?(nfroyd)
Attachment #9017883 -
Flags: review?(nfroyd)
Attachment #9017892 -
Flags: review?(nfroyd)
Comment 30•7 years ago
|
||
Comment on attachment 9017892 [details] [diff] [review]
patch
Review of attachment 9017892 [details] [diff] [review]:
-----------------------------------------------------------------
Commit message final paragraph looks like it needs some line breaks to maintain the bullet points.
r=me with the changes below, thank you!
::: xpcom/io/nsISeekableStream.idl
@@ +11,5 @@
> *
> + * This class inherits nsITellableStream in order to let streams to implement
> + * ::Tell() without implementing the whole nsISeekableStream interface. Callers
> + * can QI the stream to know what is implemented. This is mainly done for
> + * nsPipeInputStream.
I feel like this kind of comment--slightly rewritten, of course, really belongs in nsITellableStream, 'cause it's not really obvious why nsITellableStream is a standalone interface.
::: xpcom/io/nsITellableStream.idl
@@ +19,5 @@
> + /**
> + * tell
> + *
> + * This method reports the current offset, in bytes, from the start of the
> + * stream.
Nit: trailing whitespace on these two lines.
Attachment #9017892 -
Flags: review?(nfroyd) → review+
Comment 31•7 years ago
|
||
(In reply to Julien Cristau [:jcristau] from comment #27)
> A commit message seems like exactly the right place for those explanations...
Don't want to start political discussion here, but I find it way more useful to have an explanation in bugzilla (maybe as well?). Commit messages just pollute hg logs and when I first find the bug (not the changeset) when hunting for explanations, I will never look at the commit messages, I will just give up with "there is no explanation!"
Comment 32•7 years ago
|
||
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/2da6504c901e
Split nsISeekableStream in 2 classes: nsISeekableStream and nsITellableStream, f=mayhemer, r=froydnj
Comment 33•7 years ago
|
||
| bugherder | ||
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
Comment 35•7 years ago
|
||
Some notes from chat with baku:
there are 3 issues here:
1. I/O on main-thread
2. the crash in nsFileStream (because of concurrent access to mOpenParams.localFile)
3. the bug in nsPipeInputStream not implementing ::Seek()
#1 and #3 from the above listed issues have been fixed by this bug, but we definitely need a followup for the issue #2. it should at least add diagnostic assertions on single thread access to mOpenParams.localFile (which is a nsCOMPtr<nsIFile>.
Updated•6 years ago
|
Component: HTML: Form Submission → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•