Closed Bug 1396395 Opened 7 years ago Closed 7 years ago

Firefox crashes when submitting form

Categories

(Core :: Networking: HTTP, defect)

52 Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla58
Tracking Status
firefox-esr52 --- wontfix
firefox56 --- wontfix
firefox57 --- wontfix
firefox58 --- fixed

People

(Reporter: bugzilla-mozilla, Assigned: schien)

References

Details

Crash Data

Attachments

(3 files, 3 obsolete files)

User Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:52.0) Gecko/20100101 Firefox/52.0
Build ID: 20170802111520

Steps to reproduce:

1. Open http://sagamusix.untergrund.net/fileupload.htm in Firefox 52 ESR.
2. Select a small file and hit the submit button.



Actual results:

Firefox crashes when trying to upload the file. Trying to upload to the same domain as the HTML file resides on interestingly does not trigger the crash, nor when opening the file from the file:// protocol.


Expected results:

Firefox should certainly not crash when trying to submit a form.
Note: The crash apparently only happens if the uBlock Origin add-on is installed.
After further observation, the crash happens only if uBlock Origin and either one of the following legacy add-ons is installed:
* BetterStop 1.0.1
* Image Zoom (ugly fixes) 0.0.1
* Web Developer 1.2.13
Did you report the crashes to us? Did it crash the browser or just the tab on that page?

Please open about:crashes and give us the crash IDs. If they show up in the "unsubmitted" section (at the top, if you have any) then click the link to make sure it's sent first (the ID will change when assigned by our server).
Flags: needinfo?(bugzilla-mozilla)
Group: firefox-core-security → dom-core-security
Component: Untriaged → DOM
Product: Firefox → Core
Yes, I submitted those crashes. Since I am not using the new process separation feature, the entire browser crashed.
The submitted crashes are:

bp-7571746f-4690-4f61-b08d-3eee50170906
bp-ec70578b-e71c-4e0c-a82f-6ca530170906
bp-0ed5794b-82c5-452e-8b1c-81c620170903
bp-93ce9787-1369-4b38-a483-c1e340170903
bp-40f7d74f-f29b-452b-a79f-1882d0170903
bp-1ffade32-4e8c-4e09-a406-79c1a0170903
bp-7b13c0ab-f6f0-435c-b4cd-665810170903
Flags: needinfo?(bugzilla-mozilla)
Thanks for the crash reports. Those look like null pointer crashes, so I'm unhiding it. Henri, these are HTML parser crashes. Do you have any ideas here?
Group: dom-core-security
Component: DOM → HTML: Parser
Flags: needinfo?(hsivonen)
bug 943150 is an older bug filed on this issue, but now there are STR at least.
I have a crash (not sure if the crash) in rr. For starters:

[22452] ###!!! ASSERTION: Got data on wrong stream.: 'mRequest == aRequest', file /opt/Projects/gecko/parser/html/nsHtml5StreamParser.cpp, line 1163
[22452] ###!!! ASSERTION: Got Stop on wrong stream.: 'mRequest == aRequest', file /opt/Projects/gecko/parser/html/nsHtml5StreamParser.cpp, line 1068
[22452] ###!!! ASSERTION: DoDataAvailable called when stream not open.: 'STREAM_BEING_READ == mStreamState', file /opt/Projects/gecko/parser/html/nsHtml5StreamParser.cpp, line 1085
Assertion failure: mRawPtr != nullptr (You can't dereference a NULL RefPtr with operator->().), at /opt/Projects/gecko/obj-x86_64-pc-linux-gnu/dist/include/mozilla/RefPtr.h:319
Assignee: nobody → hsivonen
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(hsivonen)
(And, yes, it the reported crash.)
Findings so far: The synchronous XHR on the test case page runs a nested event loop, and the nested loop manages to deliver OnDataAvailable to the parser without delivering OnStartRequest first.
In nsURILoader.cpp we have:
  rv = DispatchContent(request, aCtxt);

  LOG(("  After dispatch, m_targetStreamListener: 0x%p, rv: 0x%08X", m_targetStreamListener.get(), rv));

  NS_ASSERTION(NS_SUCCEEDED(rv) || !m_targetStreamListener,
               "Must not have an m_targetStreamListener with a failure return!");

  NS_ENSURE_SUCCESS(rv, rv);
  
  if (m_targetStreamListener)
    rv = m_targetStreamListener->OnStartRequest(request, aCtxt);

DispatchContent() ends up calling nsDocumentViewer::PageHide, which causes sync XHR to be run and has this gem: AutoDontWarnAboutSyncXHR disableSyncXHRWarning;.

It's a particularly dangerous point to be running a nested event loop, because the line in nsURILoader.cpp that calls OnStartRequest hasn't run yet, but Necko can already place OnDataAvailable runnables in the event queue.
While I could add counter-measures to the HTML parser, it wouldn't fix other stream listeners.

mcmanus, would it be feasible to add some mechanism to Necko that would prevent OnDataAvailable and OnStopRequest runnables from being enqueued before OnStartRequest has returned?
Flags: needinfo?(mcmanus)
I bet :schien has a good insight into this
Flags: needinfo?(mcmanus) → needinfo?(schien)
I can also reproduce this issue in non-e10s window with latest nightly build, using the STR in description.

In e10s mode, HttpChannelChild already guarantees that OnDataAvailable/OnStopRequest will not be enqueued on target thread before OnStartRequest has returned, with the ChannelEventQueue mechanism.

in non-e10s mode, nsHttpChannel uses on nsInputStreamPump to control the sequence of OnStartRequest/OnDataAvailable/OnStopRequest. Need to capture corresponding moz log to analyze the callback order. Keep the ni? before providing the investigation result.
See Also: → 1396396
Based on my analysis in comment #14, nsHttpChannel should not resume the nsInputStreamPump before |mCallOnResume| is complete, to ensure that no input stream event can interrupt the resumed call stack before it finished.
Steal from @hsvionen because we should fix it in nsHttpChannel. Test case is on the way and I'd like my patch to be reviewed in parallel.
Assignee: hsivonen → schien
Component: HTML: Parser → Networking: HTTP
comment #14 is partially wrong, it is the suspension during "http-on-examine-response" that causes the problem, not "http-on-modify-request".

1. OnStartRequest callback chain is interrupted by add-on during the "http-on-examine-response" observer event. Therefore, nsInputStreamPump think OnStartRequest is finished.

2. After resuming http channel, nsHttpChannel asynchronously continue the OnStartRequest procedure and synchronously resume the nsInputStreamPump.

3. Before nsDocumentOpenInfo invoke the next OnStartRequest on the listener chain, sync XHR in web content is executed on the call stack. This will spin main thread event queue and will eventually callback OnDataAvailable/OnStopRequest on the same call stack.
Attachment #8907439 - Attachment is obsolete: true
Comment on attachment 8908543 [details]
Bug 1396395 - Part 1, resume input pump after callback finished.

https://reviewboard.mozilla.org/r/180208/#review186614

Nice analysis.
Attachment #8908543 - Flags: review?(dd.mozilla) → review+
Comment on attachment 8909708 [details]
Bug 1396395 - Part 2, add test case.

https://reviewboard.mozilla.org/r/181226/#review186620
Attachment #8909708 - Flags: review?(dd.mozilla) → review+
Attached patch esr52-bug1396395-p1.patch (obsolete) — Splinter Review
for esr52 uplift
Attached patch esr52-bug1396395-p2.patch (obsolete) — Splinter Review
for esr52 uplift
Pushed by schien@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8979c55c1a4b
Part 1, resume input pump after callback finished. r=dragana
https://hg.mozilla.org/integration/autoland/rev/d7c3a31373a6
Part 2, add test case. r=dragana
https://hg.mozilla.org/mozilla-central/rev/8979c55c1a4b
https://hg.mozilla.org/mozilla-central/rev/d7c3a31373a6
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Will the patches be uplifted to 56 and 57 too?

This bug is can easily be triggered by usual user actions.
For a self-contained and reliable STR see https://bugzilla.mozilla.org/show_bug.cgi?id=1401516#c5
This is too late for 52 & 56 but we could take a fix to 57 if it is really safe.
(In reply to Sylvestre Ledru [:sylvestre] from comment #31)
> This is too late for 52 & 56 but we could take a fix to 57 if it is really
> safe.

I understand we won't take this into 56.0.x, but why not for 52.x.0 ESR?
Flags: needinfo?(sledru)
Because we only take security fixes in ESR after the version 2. (or real super important crashes)
With ESR, we focus on security and stability of the product, not crashes fixes.

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process#What_should_land_on_mozilla-esr52.2Fmozilla-esr59 for more information
Flags: needinfo?(sledru)
(In reply to Sylvestre Ledru [:sylvestre] from comment #33)
> With ESR, we focus on security and stability of the product, not crashes
> fixes.

OK. I thought a bug like this would have qualified as "stability". (I find the characterization "stability [...] not crash fixes" very confusing.)
Sorry for the bad wording. By stability, I meant that we commit in not changing anything for the sysadmin and users.

Not crashes
Comment on attachment 8908543 [details]
Bug 1396395 - Part 1, resume input pump after callback finished.

Approval Request Comment
[Feature/Bug causing the regression]: maybe 392837
[User impact if declined]: crash might happen while viewing web content if e10s is disabled and particular extension is installed
[Is this code covered by automated tests?]: yes
[Has the fix been verified in Nightly?]: yes
[Needs manual test from QE? If yes, steps to reproduce]: no, covered by auto test
[List of other uplifts needed for the feature/fix]: n/a
[Is the change risky?]: low
[Why is the change risky/not risky?]:
[String changes made/needed]: no
Flags: needinfo?(schien)
Attachment #8908543 - Flags: approval-mozilla-beta?
Comment on attachment 8909708 [details]
Bug 1396395 - Part 2, add test case.

Approval Request Comment
[Feature/Bug causing the regression]: maybe 392837
[User impact if declined]: crash might happen while viewing web content if e10s is disabled and particular extension is installed
[Is this code covered by automated tests?]: yes
[Has the fix been verified in Nightly?]: yes
[Needs manual test from QE? If yes, steps to reproduce]: no, covered by auto test
[List of other uplifts needed for the feature/fix]: n/a
[Is the change risky?]: not risky
[Why is the change risky/not risky?]: add test case
[String changes made/needed]: no
Attachment #8909708 - Flags: approval-mozilla-beta?
(In reply to Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz) from comment #36)
> Comment on attachment 8908543 [details]
> Bug 1396395 - Part 1, resume input pump after callback finished.
> 
> Approval Request Comment
> [Is the change risky?]: low
> [Why is the change risky/not risky?]:
only adjusting runnable sequence
Comment on attachment 8910107 [details] [diff] [review]
esr52-bug1396395-p1.patch

too late for esr52
Attachment #8910107 - Attachment is obsolete: true
Attachment #8910108 - Attachment is obsolete: true
Comment on attachment 8909708 [details]
Bug 1396395 - Part 2, add test case.

This is not a new regression and a fix for non-e10s which is not our default setting. I'd prefer to limit the risk to beta57 by letting this fix ride the 58 train.
Attachment #8909708 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Attachment #8908543 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
I really thought this would be fixed for FF57. I see many reports of such crashes, including for Android:

- https://github.com/gorhill/uBlock/issues/3171 (uBlock Origin)
- https://github.com/gorhill/uBlock/issues/3087 (uBlock Origin)
- https://github.com/pfn/passifox/issues/612 (PassIFox)
- https://github.com/pfn/passifox/issues/620 (PassIFox/Privacy Badger)

The issue occurs for any extension using a `blocking` webRequest.onHeadersReceived listener. For example, it also happens with Privacy Badger and Adguard AdBlocker -- those I happened to have tested, surely there is many more on AMO.
ni? @ritu to see if this matched with any of the uplifting criteria, since this bug might affecting the stability of Fennec 57 based on comment #41.
Flags: needinfo?(rkothari)
Hi Schien, Thanks for the ping. I looked at the crash volume for the following signature "nsHtml5StreamParser::ParseAvailableData" for the past 3 months and see ~40 hits on Fennec56 and <10 on Fennec Beta57. This seems to be a very low volume crash and I am not convinced this needs to be uplifted to Beta57.

NI Kevin, Marcia to correct me if I am wrong about the crash sign or volume on this one.
Flags: needinfo?(rkothari)
Flags: needinfo?(mozillamarcia.knous)
Flags: needinfo?(kbrosnan)
I can corroborate what Ritu says in Comment 43. I don't see very high crash volume for 57 on Fennec, and the crash hasn't been seen since 57.0b7 (I am searching nsHtml5StreamParser::ParseAvailableData)
Crash Signature: [@ nsHtml5StreamParser::ParseAvailableData]
Flags: needinfo?(mozillamarcia.knous)
Flags: needinfo?(kbrosnan)
Depends on: 1502055
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: