Closed
Bug 1470355
Opened 6 years ago
Closed 6 years ago
Firefox is unresponsive when visiting https://www.youtube.com/premium
Categories
(Core :: Networking, defect, P2)
Tracking
()
VERIFIED
FIXED
mozilla63
People
(Reporter: github, Assigned: baku)
References
()
Details
(Keywords: nightly-community, regression, Whiteboard: [necko-triaged])
Attachments
(3 files, 1 obsolete file)
168.13 KB,
text/plain
|
Details | |
851 bytes,
patch
|
valentin
:
review+
lizzard
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
1.02 KB,
patch
|
lizzard
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
Just tested it with a fresh profile under macOS 10.13.3. STR: Open https://www.youtube.com/premium What happens: Under macOS the Spinning Rainbow Wheel appears as the page is loaded partly, only a hard kill with the Activity Manager (Task Manager) helps. As I'm running into some issues with the regression tool it's hard for me to point to a exact regressions. I definitely opened that page some days ago and it worked then (I'm always on the last builds)
Reporter | ||
Updated•6 years ago
|
Keywords: nightly-community
Comment 1•6 years ago
|
||
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:63.0) Gecko/20100101 Firefox/63.0 Build ID: 20180626220124 I didn't managed to reproduce this issue on the latest nightly build 63.0a1 (2018-06-26). Could you please provide additional steps to the STR? Also, could you provide your hardware information? (model and year) And please test if the issue is reproducible in safe mode, here is a link that can help you: https://support.mozilla.org/en-US/kb/troubleshoot-firefox-issues-using-safe-mode
Flags: needinfo?(github)
Reporter | ||
Comment 2•6 years ago
|
||
Unfortunately there's no more STR; I just type in the URL, press enter and wait till Firefox becomes unresponsive. I've just tested it with 63.0a1 (2018-06-27) and 63.0a1 (2018-06-26) in a fresh profile and with safe mode on; still the same issue. One thing that might would have an effect on this is the Country your IP's is from in combination with the language? Youtube might deliver different pages for different countries. I'm currently in the US with en_US I also tried to safe the source code with Safari and than open it with Nightly but there I don't have any issue with loading the page. I've also tried it with 60.0b14 and 62.0b3 and both time I don't run into the issue.
Flags: needinfo?(github) → needinfo?(george.craciun)
Comment 3•6 years ago
|
||
I can confirm the issue on macOS 10.13 and Firefox Nightly. Firefox freezes after opening the link from comment #0.
Reporter | ||
Comment 4•6 years ago
|
||
After asking for confirmation in a forum I think the issue is not specific to a language (so far crash for German and English site is confirmed) and Linux is probably also affected but Windows (10) definitely not.
Comment 5•6 years ago
|
||
User Agent Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:63.0) Gecko/20100101 Firefox/63.0 Build ID 20180627222831 Finally I've managed to reproduce this issue on the latest Nightly build. The issue also occurs on Linux.
Status: UNCONFIRMED → NEW
Component: Untriaged → JavaScript Engine
Ever confirmed: true
Product: Firefox → Core
Comment 6•6 years ago
|
||
This sounds a little bit similar to bug 1471409. Is this a perma-hang? If someone here is able to hit it reliably in macOS, can you please then open up the Activity Monitor, choose Firefox in the list of processes, and collect a process sample? (The gear icon near the top of the Activity Monitor should let you do that). Once a process sample is gathered, can it then be attached to this bug please? That way, hopefully, we'll get some visibility into what the hang is, well, hanging on.
See Also: → 1471409
Reporter | ||
Comment 7•6 years ago
|
||
Here we go, attached the sample of the non-responding process.
Comment 8•6 years ago
|
||
+ 222 nsBaseAppShell::NativeEventCallback() (in XUL) + 113 [0x10d277cb1] + 222 NS_ProcessPendingEvents(nsIThread*, unsigned int) (in XUL) + 107 [0x10af821bb] + 222 nsThread::ProcessNextEvent(bool, bool*) (in XUL) + 1800 [0x10af83d78] + 222 nsInputStreamReadyEvent::Run() (in XUL) + 43 [0x10af5de5b] + 222 nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream*) (in XUL) + 265 [0x10b01a5c9] + 222 nsInputStreamPump::OnStateTransfer() (in XUL) + 510 [0x10b01aa5e] + 222 mozilla::net::nsHttpChannel::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned long long, unsigned int) (in XUL) + 923 [0x10b32d1cb] + 222 mozilla::net::HttpChannelParent::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned long long, unsigned int) (in XUL) + 352 [0x10b2f9290] + 222 NS_ReadInputStreamToString(nsIInputStream*, nsTSubstring<char>&, long long, unsigned long long*) (in XUL) + 136 [0x10b023c68] + 222 NS_ReadInputStreamToBuffer(nsIInputStream*, void**, long long, unsigned long long*) (in XUL) + 1200 [0x10b023a00] + 222 mozilla::detail::ConditionVariableImpl::wait(mozilla::detail::MutexImpl&) (in libmozglue.dylib) + 28 [0x101f1268c] + 222 _pthread_cond_wait (in libsystem_pthread.dylib) + 732 [0x7fff4ffc5662] + 222 __psynch_cvwait (in libsystem_kernel.dylib) + 10 [0x7fff4fe88cee] Interesting - we're waiting on a lock somewhere in Necko? Hey Dragana, any idea what might be going on here?
Flags: needinfo?(dd.mozilla)
Comment 9•6 years ago
|
||
There has been a recent change in nsNetUtil.cpp that has changed some locks in the BufferWriter class. Let's ask the author, I think the date the patch is pushed corresponds to this hang. Andrea can you take a look if this is caused by bug 1469290.
Flags: needinfo?(dd.mozilla) → needinfo?(amarchesini)
Assignee | ||
Comment 10•6 years ago
|
||
> Interesting - we're waiting on a lock somewhere in Necko? We use the lock only when the inputStream->ReadSegments() returns NS_BASE_STREAM_WOULD_BLOCK. When this happens, we call ::AsyncWait() and the reading will continue on a separate thread. The question is: why input->ReadSegments() is not able to read data from the stream immediately? We read what has been asked here: https://searchfox.org/mozilla-central/rev/d2966246905102b36ef5221b0e3cbccf7ea15a86/netwerk/base/nsNetUtil.cpp#1658,1664 Where mCount is what HttpChannelParent::OnDataAvailable receives as argument here: https://searchfox.org/mozilla-central/rev/d2966246905102b36ef5221b0e3cbccf7ea15a86/netwerk/protocol/http/HttpChannelParent.cpp#1594,1613,1621-1623,1638 So, in theory, that amount of data should be available in the stream. I could change NS_ReadInputStreamToBuffer() to avoid blocking, but that seems a bad way to hide a bug. Dragana, does it sound reasonable? Is there an STR to reproduce this issue?
Flags: needinfo?(amarchesini) → needinfo?(dd.mozilla)
Updated•6 years ago
|
Component: JavaScript Engine → Networking
Comment 11•6 years ago
|
||
HttpChannelParent::OnDataAvailable is called from here: https://dxr.mozilla.org/mozilla-central/source/netwerk/base/nsInputStreamPump.cpp#597 and count is set to the actual available amount of data. So this data should be available. This code has always worked like this. please do not change NS_ReadInputStreamToBuffer. I have try to reproduce it using https://www.youtube.com/premium but I could not. Maybe log for nsHttp:5,nsSocketTransport:5,nsStreamPump:5 would help
Flags: needinfo?(dd.mozilla)
Comment 12•6 years ago
|
||
Cristian, can you try to make a http log: https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging use modules timestamp,nsHttp:5,nsSocketTransport:5,nsStreamPump:5.
Reporter | ||
Comment 13•6 years ago
|
||
Was my first time but here the link for the archive with logs: (both links are for the same file) https://send.firefox.com/download/206e828cfc/#kFqSHTH-pYJGSVSnlE6vwg https://nofile.io/f/R0JgKMgCGu2/http-logs.zip
Flags: needinfo?(george.craciun)
Comment 14•6 years ago
|
||
The stream is coming from the cache. Michal, has something change in the cache recently? Can you take a look? There is 262144 available on the stream, but we end up waiting for data... (the log is missing cache log :( )
Flags: needinfo?(michal.novotny)
Comment 15•6 years ago
|
||
(In reply to Dragana Damjanovic [:dragana] from comment #14) > The stream is coming from the cache. Michal, has something change in the > cache recently? Can you take a look? There is 262144 available on the > stream, but we end up waiting for data... > > (the log is missing cache log :( ) AFAIK, there was no recent change in the cache streams. Can you point me to an exact lines in the log?
Flags: needinfo?(michal.novotny) → needinfo?(dd.mozilla)
Reporter | ||
Comment 16•6 years ago
|
||
(In reply to Dragana Damjanovic [:dragana] from comment #14) > (the log is missing cache log :( ) I was using the timestamp,nsHttp:5,nsSocketTransport:5,nsStreamPump:5 modules as you said, is there anything else that I would had to include? Or is there something I missed? Was my first time that I was using it.
Comment 17•6 years ago
|
||
Please use timestamp,sync,nsHttp:5,nsSocketTransport:5,nsStreamPump:5,cache2:5
Reporter | ||
Comment 18•6 years ago
|
||
Was finally able to upload it somewhere; https://nofile.io/f/apb1wmnZUPW/http-logs-with-cache.tar.bz2 If this log is somehow different than all the other one, please let me know. The unresponsiveness happened at a different stage of the website loading.
Comment 19•6 years ago
|
||
(In reply to Andrea Marchesini [:baku] from comment #10) > > Interesting - we're waiting on a lock somewhere in Necko? > > We use the lock only when the inputStream->ReadSegments() returns > NS_BASE_STREAM_WOULD_BLOCK. When this happens, we call ::AsyncWait() and the > reading will continue on a separate thread. This approach seems to be wrong. What if the stream needs to run an event on this thread to get more data? > The question is: why input->ReadSegments() is not able to read data from the > stream immediately? We read what has been asked here: > https://searchfox.org/mozilla-central/rev/ > d2966246905102b36ef5221b0e3cbccf7ea15a86/netwerk/base/nsNetUtil.cpp#1658,1664 > > Where mCount is what HttpChannelParent::OnDataAvailable receives as argument > here: > > https://searchfox.org/mozilla-central/rev/ > d2966246905102b36ef5221b0e3cbccf7ea15a86/netwerk/protocol/http/ > HttpChannelParent.cpp#1594,1613,1621-1623,1638 > > So, in theory, that amount of data should be available in the stream. > I could change NS_ReadInputStreamToBuffer() to avoid blocking, but that > seems a bad way to hide a bug. > Dragana, does it sound reasonable? The amount is available in the stream. The problem is that BufferWriter::WriteAsync calls ReadSegments (with length == 0) again even if it reads all available data and that CacheFileInputStream::ReadSegments returns NS_BASE_STREAM_WOULD_BLOCK if aCount is 0 and there is no data. > Is there an STR to reproduce this issue? It happens when there are 2 HTTP channels, one writes the data to the cache entry and the other reads from the entry. The bug seems to occur on chunk boundary which is 256KB.
Flags: needinfo?(dd.mozilla)
Comment 20•6 years ago
|
||
Attachment #8989112 -
Flags: review?(valentin.gosu)
Comment 21•6 years ago
|
||
Baku, the patch should fix this particular problem, but BufferWriter really shouldn't call ReadSegments with aCount=0. And blocking the thread when NS_BASE_STREAM_WOULD_BLOCK is returned is IMO a potential deadlock.
Flags: needinfo?(amarchesini)
Updated•6 years ago
|
Assignee: nobody → michal.novotny
Priority: -- → P2
Whiteboard: [necko-triaged]
Assignee | ||
Comment 22•6 years ago
|
||
(In reply to Michal Novotny (:michal) from comment #21) > Baku, the patch should fix this particular problem, but BufferWriter really > shouldn't call ReadSegments with aCount=0. And blocking the thread when > NS_BASE_STREAM_WOULD_BLOCK is returned is IMO a potential deadlock. I'll file a follow up bug and work on it.
Flags: needinfo?(amarchesini)
Assignee | ||
Comment 23•6 years ago
|
||
> > We use the lock only when the inputStream->ReadSegments() returns
> > NS_BASE_STREAM_WOULD_BLOCK. When this happens, we call ::AsyncWait() and the
> > reading will continue on a separate thread.
>
> This approach seems to be wrong. What if the stream needs to run an event on
> this thread to get more data?
This is definitely an issue in NS_ReadInputStreamToBuffer() when used to read the full content of the stream in a buffer.
Currently NS_ReadInputStreamToBuffer() is used in 2 different scenarios:
1. to read what is available, without blocking and when we are sure that that amount of data data is available in the stream. This the scenario of OnDataAvailable(). No lock should be used.
2. When we want to read the full content of the stream in memory. This definitely has the issue you are describing. We should try to remove all the uses of NS_ReadInputStreamToBuffer() for this scenario. Fortunately, there are no streams that needs to
dispatch runnables on the current thread, as far as I know.
Probably, a good first step would be to split the 2 scenarios in 2 different methods. This will allow us to add assertions and do a proper audit of how/when NS_ReadInputStreamToBuffer() is used.
Assignee | ||
Comment 24•6 years ago
|
||
Actually, it's probably better to land my patch with yours.
Attachment #8989120 -
Flags: review?(michal.novotny)
Updated•6 years ago
|
Attachment #8989112 -
Flags: review?(valentin.gosu) → review+
Updated•6 years ago
|
Attachment #8989120 -
Flags: review?(michal.novotny) → review+
Updated•6 years ago
|
Keywords: checkin-needed
Comment 25•6 years ago
|
||
Pushed by dluca@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/198a4644c09a Firefox is unresponsive when visiting https://www.youtube.com/premium, r=valentin https://hg.mozilla.org/integration/mozilla-inbound/rev/50b0bfa64b2c BufferWriter should stop reading data when the wanted amount has been reached, r=michal
Keywords: checkin-needed
Comment 26•6 years ago
|
||
Backed out 2 changesets (bug 1470355) for causing multiple failures Log: https://treeherder.mozilla.org/logviewer.html#?job_id=185941953&repo=mozilla-inbound Failure push: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=50b0bfa64b2c631dfd146657207c04cb2ca31f65&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-classifiedState=unclassified&selectedJob=185941953 Backout: https://hg.mozilla.org/integration/mozilla-inbound/rev/f53f375f3ecec0dbb331574d4aa6f059d52f6ed2
Flags: needinfo?(michal.novotny)
Flags: needinfo?(amarchesini)
Assignee | ||
Comment 27•6 years ago
|
||
Assignee: michal.novotny → amarchesini
Attachment #8989120 -
Attachment is obsolete: true
Flags: needinfo?(amarchesini)
Comment 28•6 years ago
|
||
Pushed by amarchesini@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/d5e2a4d678f0 BufferWriter should stop reading data when the wanted amount has been reached, r=michal https://hg.mozilla.org/integration/mozilla-inbound/rev/715cc1bda8a2 Firefox is unresponsive when visiting https://www.youtube.com/premium, r=valentin
Comment 29•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/d5e2a4d678f0 https://hg.mozilla.org/mozilla-central/rev/715cc1bda8a2
Status: NEW → RESOLVED
Closed: 6 years ago
status-firefox63:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
Comment 30•6 years ago
|
||
Comment on attachment 8989112 [details] [diff] [review] patch v1 - don't return NS_BASE_STREAM_WOULD_BLOCK from CacheFileInputStream::ReadSegments when aCount is 0 Approval Request Comment [Feature/Bug causing the regression]: bug 1469290 [User impact if declined]: possible deadlock on a main thread [Is this code covered by automated tests?]: the code is executed by all tests using disk cache [Has the fix been verified in Nightly?]: no [Needs manual test from QE? If yes, steps to reproduce]: no, it's not easy to trigger the bug [List of other uplifts needed for the feature/fix]: the patch should land together with the second patch in this bug [Is the change risky?]: no [Why is the change risky/not risky?]: simple change [String changes made/needed]: none
Flags: needinfo?(michal.novotny)
Attachment #8989112 -
Flags: approval-mozilla-beta?
Comment 31•6 years ago
|
||
Comment on attachment 8989151 [details] [diff] [review] stream.patch Approval Request Comment [Feature/Bug causing the regression]: bug 1469290 [User impact if declined]: possible deadlock on a main thread [Is this code covered by automated tests?]: the code is executed by all tests using disk cache [Has the fix been verified in Nightly?]: no [Needs manual test from QE? If yes, steps to reproduce]: no, it's not easy to trigger the bug [List of other uplifts needed for the feature/fix]: the patch should land together with the second patch in this bug [Is the change risky?]: no [Why is the change risky/not risky?]: simple change [String changes made/needed]: none
Attachment #8989151 -
Flags: approval-mozilla-beta?
Updated•6 years ago
|
Blocks: 1469290
Keywords: regression
Updated•6 years ago
|
OS: Mac OS X → All
Comment 32•6 years ago
|
||
Patrick, does the problem seem to be fixed for you in the latest Nightly build? I realized you may not be able to reproduce the issue 100% of the time but it sounds like you hit it fairly often, originally.
Flags: needinfo?(github)
Comment 33•6 years ago
|
||
Comment on attachment 8989151 [details] [diff] [review] stream.patch Fix for a new regression in 62, has some test coverage. Let's try this on beta 7. I'd still love to see any verification that the issue is fixed (either in nightly now, or in next week's beta)
Attachment #8989151 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Reporter | ||
Comment 34•6 years ago
|
||
I can confirm that the issue is no longer appearing with 2018-07-06 nightly builds. Website can be loaded and is responsive. (tested with my old profile and a new one)
Flags: needinfo?(github)
Comment 35•6 years ago
|
||
Fabulous, thank you!
Comment 36•6 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/14485670c079 https://hg.mozilla.org/releases/mozilla-beta/rev/db0c2c7b3631
Comment 37•6 years ago
|
||
Comment on attachment 8989112 [details] [diff] [review] patch v1 - don't return NS_BASE_STREAM_WOULD_BLOCK from CacheFileInputStream::ReadSegments when aCount is 0 Fix for a hang with YouTube, verified in Nightly. Let's uplift for beta 7.
Attachment #8989112 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Updated•6 years ago
|
Flags: qe-verify+
Comment 38•6 years ago
|
||
I have managed to reproduce this issue on Ubuntu using Fx buildID: 20180626220124 and macOS 10.13.6 using Fx buildID: 20180621220127. The issue is verified as fixed using Fx 63.0b7 buildID: 20180917143811 and Fx 62.0 buildID: 20180830143136 on both Ubuntu 16.04 x64 and macOS 10.13.6 environments. The browser no longer becomes unresponsive when loading the www.youtube.com/premium webpage.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
You need to log in
before you can comment on or make changes to this bug.
Description
•