Closed Bug 1591343 Opened 5 years ago Closed 2 years ago

CPU time utilization tab hangs. The tab does not respond.

Categories

(Core :: XML, defect, P3)

70 Branch
defect

Tracking

()

RESOLVED INVALID
Performance Impact medium

People

(Reporter: sergey.matushevsky, Unassigned)

Details

(Keywords: hang, perf:responsiveness)

Attachments

(7 files)

Attached image Firefox 70 (64 bit).jpg

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36

Steps to reproduce:

We are working with an Oracle ADF application (Oracle ADF version 11.3.2.1). At some points the tab hangs, and the process manager shows that one of the processes constantly loads the CPU at 20-30%.
The tab does not respond and the console cannot be opened (F12). We could only attach to the process and view the stack (stack screenshot attached).

Actual results:

The tab does not respond

Severity: normal → critical
Keywords: hang

Hi Sergey,

Is this issue something you've only encountered recently or after a recent update (Fx 70)?

Could you check if this also occurs on the latest Firefox Nightly?
You can find it here: https://www.mozilla.org/en-US/firefox/channel/desktop/

Thank you!

Component: Untriaged → General
Flags: needinfo?(sergey.matushevsky)
Product: Firefox → Core
Attached image screen.png

Hi Peter

Tried version 72.0a1 (2019-10-28) (64-bit).
At first it seemed that everything worked without errors, but after working a little longer it hung again. The stack is in the attached screen.png.

Flags: needinfo?(sergey.matushevsky)

Hi Sergey,

Is this issue something you've only encountered recently or after a recent update (Fx 70)?

We tried older versions back to version 58; the tab hangs in all of them. Interestingly, it does not freeze in Chrome or Internet Explorer, no matter what we tried. The investigation is complicated by the fact that the process simply loads the CPU, but there are no errors in either the application or Firefox, so it is unclear what to look at.

There is a process dump made by the Microsoft system Task Manager.
firefox.DMP
https://drive.google.com/open?id=1-vWsU1JKCfVO0XV1vVMEDrxih90u-0Tw

Hi Peter.
What else can we try to help resolve this issue?

I'm going to set a component in order to involve the development team in reviewing this issue.
If this does not seem like the correct one, please feel free to set it to a more suitable one.

Thanks for the update Sergey!

Status: UNCONFIRMED → NEW
Component: General → Performance
Ever confirmed: true

Hi Peter_M
Are there any possible solutions from the development team? Because of this problem, we are not able to work in firefox.

Firefox is now hanging on an external site as well, outside our infrastructure. A screenshot of the process memory is attached (firefox_061119).

Attached image firefox_061119.png

Hi Sergey,

Could you try to capture a performance profile while reproducing the issue?

In order to do this you will need to install and use the Cleopatra add-on
You can find information about how to do this by going to:
https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Profiling_with_the_Built-in_Profiler
https://perf-html.io/

Please also note that this add-on only works on Firefox Nightly.

Flags: needinfo?(sergey.matushevsky)

As I understand it, Cleopatra is a profile viewer. Does the profile itself need to be created using firefox? (F12 on the Performance tab)

Flags: needinfo?(sergey.matushevsky)
Attached file profile_nigh.json

The hang does not reproduce as often on external sites.
I am attaching a profile from our production environment.
profile_nigh.json

Hi Mike,

Could you please take a look over this?

Thank you Sergey!

Flags: needinfo?(mconley)

Here is the last function called according to the profile. It is taken from the Oracle ADF libraries.

AdfDataTransferService.prototype._doStreamingTransfer = function(requestItem) {
  this._streamingMsgQueue = null;
  var doc = this._getDomDocument();
  // Reuse the hidden container DIV if it exists, otherwise create it.
  var div = doc.getElementById(AdfDataTransferService._ADF_STREAMING_IFRAME_ID);
  if (!div) {
    div = doc.createElement("DIV");
    div.id = AdfDataTransferService._ADF_STREAMING_IFRAME_ID;
    div.style.display = "none";
    doc.body.appendChild(div);
  }
  var actionUrl = requestItem._actionURL + "&" + AdfDataTransferService._WINDOW_PARAM + "=" + this._window.name;
  // Create the streaming IFRAME by assigning markup; its onload handler reports completion to the parent page.
  div.innerHTML = '<IFRAME src="' + actionUrl + '" onload="AdfPage.PAGE.getDataTransferService().processStreamingResponse(' + "'parent.AdfPage.PAGE.streamingResponseComplete();');\"></IFRAME>";
  this._streamingFrame = div.firstChild;
  AdfAssert.assertDomElement(this._streamingFrame);
  AdfAssert.assert(this._streamingFrame.tagName == "IFRAME");
};
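
For anyone trying to build a reduced test case outside ADF, the markup that the function above assigns to innerHTML can be reproduced with a small helper. This is only a sketch: buildStreamingIframeMarkup and injectStreamingIframe are hypothetical names, not part of Oracle ADF, and the window parameter name is passed in because the value of AdfDataTransferService._WINDOW_PARAM is not shown in this bug.

```javascript
// Hypothetical helper for a reduced test case; not part of Oracle ADF.
// It mirrors the string concatenation in _doStreamingTransfer.
function buildStreamingIframeMarkup(actionUrl, windowParam, windowName) {
  // Same URL construction as the ADF code: append the window parameter.
  const src = actionUrl + "&" + windowParam + "=" + windowName;
  return (
    '<IFRAME src="' + src + '" onload="' +
    "AdfPage.PAGE.getDataTransferService().processStreamingResponse(" +
    "'parent.AdfPage.PAGE.streamingResponseComplete();');" +
    '"></IFRAME>'
  );
}

// Hypothetical browser-side injection, following the same steps as ADF:
// hidden DIV, markup assigned through innerHTML, IFRAME read back.
function injectStreamingIframe(doc, markup) {
  const div = doc.createElement("DIV");
  div.style.display = "none";
  doc.body.appendChild(div);
  div.innerHTML = markup;
  return div.firstChild;
}
```

As in the original, the URL is not escaped before being placed in the attribute, so the sketch keeps that behavior rather than correcting it.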

Sent a crash report ID a40de1cb-4e9b-42d2-b25b-a30160512656

This crash report appeared only after I killed the process in the task manager.

Could you suggest any about:config settings to change for testing?

Sergey, comment 8 says the hang happens on an external site too. Would it be possible to share the URL to that so that one could test and profile this?

Flags: needinfo?(sergey.matushevsky)
Priority: -- → P3
Whiteboard: [qf:investigate]

Unfortunately, this happened only once, and I was not able to reproduce it again. Here is the page.
https://vc.ru/flood/24957-design-data-tables

Flags: needinfo?(sergey.matushevsky)

We found empirically that when working over HTTP, running the browser in a single process, and turning off hardware acceleration, the tab is very unlikely to hang.

Ok. Can you perhaps try to reprofile the issue then on the site you see this being a problem?
(Install the add-on from https://profiler.firefox.com/, profile, and then share the URL by clicking "publish".
If you don't want to share the link with everybody, feel free to send the link to me.)

Or is there some way I could access the problematic site?

Hi Olli Pettay!

We have installed the profiler and will try to capture the hang with it.
Unfortunately, we can’t provide access to the site, as this is within the corporate structure.

Reproduced a hang; a profile is attached.

We seem to have found a temporary workaround. When the cache is off (browser.cache.disk.enable = false; browser.cache.memory.enable = false), everything works without freezes. We ran autotests on 400 tabs and everything worked without a single error. Maybe this will help you solve the problem.
If at least one of the caches is left on, the freeze occurs somewhere between the 1st and the 20th tab.
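
For anyone who wants to repeat this test, the two preferences can be set persistently rather than through about:config. A minimal sketch, assuming a user.js file placed in the Firefox profile directory; only the preference names and values come from this comment.

```javascript
// user.js in the Firefox profile directory (assumed location; see about:profiles).
// Disables both the disk cache and the memory cache, as described above.
user_pref("browser.cache.disk.enable", false);
user_pref("browser.cache.memory.enable", false);
```

Remove the lines (and reset the prefs in about:config) to return to the default behavior.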

Interesting. The profile hints that there is some ongoing network request being processed. But then, there is also that HangMonitor runnable which somehow calls HTMLContentSink. That part of the stack can't be right.

Honza, does this ring any bells, the cache bits of this?
(this doesn't seem to be recent regression per https://bugzilla.mozilla.org/show_bug.cgi?id=1591343#c3)

Component: Performance → Networking: Cache
Flags: needinfo?(honzab.moz)

Are there any solutions to this problem?

(In reply to Olli Pettay [:smaug] from comment #26)

Interesting. The profile hints that there is some ongoing network request being processed. But then, there is also that HangMonitor runnable which somehow calls HTMLContentSink. That part of the stack can't be right.

Honza, does this ring any bells, the cache bits of this?

No. It seems we are repeatedly blocked inside calls to nsresult nsExpatDriver::ConsumeToken and int nsScanner::Mark calling down to void nsScannerString::DiscardPrefix which is all in the html parser.

I believe we are looping here (looking at the Stack Chart of the span the main thread is so hugely blocked), something exponential happening:
https://searchfox.org/mozilla-central/rev/cac340f10816707e91c46db6d285f80259420f07/parser/htmlparser/nsParser.cpp#952

W/o a reproducible case this will be hard to figure out.

The HangMonitor is probably just an optimization code sharing and a red herring.

Component: Networking: Cache → DOM: HTML Parser
Flags: needinfo?(honzab.moz)

The component has been changed since the backlog priority was decided, so we're resetting it.
For more information, please visit auto_nag documentation.

Priority: P3 → --

(In reply to Sergey from comment #25)

We seem to have found a temporary workaround. When the cache is off (browser.cache.disk.enable = false; browser.cache.memory.enable = false), everything works without freezes. We ran autotests on 400 tabs and everything worked without a single error. Maybe this will help you solve the problem.
If at least one of the caches is left on, the freeze occurs somewhere between the 1st and the 20th tab.

This is interesting. I wonder if perhaps the disk we're attempting to cache to in this environment is slow to read and write from?

Hi Sergey, if you re-enable the caches, and then visit about:cache, for disk, where is the Storage Disk Location pointed to? Can we presume that it's pointed to a local directory on the disk that Firefox is running on?

Flags: needinfo?(mconley) → needinfo?(sergey.matushevsky)

Hi Mike.

Storage Disk Location is the same as the firefox startup disk.
C:\Users\<username>\AppData\Local\Mozilla\Firefox\Profiles\9z8ebwti.default-nightly\cache2
Samsung SSD 850 PRO

Flags: needinfo?(sergey.matushevsky)

Despite the source path saying html, nsScannerString is no longer an HTML parser thing but an XML parser thing.

Component: DOM: HTML Parser → XML
Priority: -- → P3

Greetings.
Unfortunately, we have hit this browser freezing problem again, on another site, and disabling the cache options did not help.
The problem appeared after switching from IBM WebSphere to Oracle WebLogic.
Installed Firefox: 68.9.0esr (64-bit)
Firefox Nightly did not improve the situation; hangs still occur.
How can we fix the problem?

Hi!
We have a critical situation with the browser freezing. We have to move clients to other browsers, but since development targets Firefox specifically, not everything works as smoothly there as we would like. We need your help to resolve the problem. What else can we provide to move this issue forward?

Could you try to capture a new profile using Nightly?
https://nightly.mozilla.org/
https://profiler.firefox.com/

Flags: needinfo?(sergey.matushevsky)

Closing as incomplete until profile information is provided as requested in comment #35. Please reopen with the information if this problem is still occurring in current release builds.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → INCOMPLETE

I'm sorry that I did not respond, and for my English.
We can now reproduce the tab hang directly. I saved two profiles, with one caveat: when the tab hangs, the console also becomes almost unresponsive, so you cannot stop recording the profile, but you can save it.

Status: RESOLVED → REOPENED
Flags: needinfo?(sergey.matushevsky)
Resolution: INCOMPLETE → ---
Attached file profile.json
Attached file profile1.json

Hey Sergey,
Can you still reproduce this issue or should we close it?

Flags: needinfo?(sergey.matushevsky)

Marking this as Resolved > Incomplete as per reporter's lack of response.
If anyone can still reproduce this issue re-open it or file a new bug.

Status: REOPENED → RESOLVED
Closed: 3 years ago → 3 years ago
Resolution: --- → INCOMPLETE

Good day! I apologize for the long silence. We can now reproduce this bug with the ability to provide remote access as needed. The profile still cannot be recorded because when the tab is hung like this, the console hangs with it (F12).
Currently reproduced in the following versions:
94.0b8 64bit
93.0 64 bit
78.15.0esr 32 bit
Will there be an opportunity to help with a solution?

Status: RESOLVED → REOPENED
Flags: needinfo?(sergey.matushevsky)
Resolution: INCOMPLETE → ---
Performance Impact: --- → ?
Whiteboard: [qf:investigate]

Hey Sergey,

Sorry again for the delay as we're retriaging our older bugs.
On this page you'll find detailed steps to capture a profile using the new profiler, that doesn't use the devtools. Hopefully this could work better:
https://firefox-source-docs.mozilla.org/performance/reporting_a_performance_problem.html

If the hangs prevent capturing the profile, another solution could be using what we call a "shutdown" profile.
You can find some information about that at the end of the page https://profiler.firefox.com/docs/#/./guide-startup-shutdown?id=shutdown. Basically you'll need to specify an environment variable with a file name, then start the profiler using the popup like described above, and hopefully when leaving firefox it will write there the json file.
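
As a rough illustration of the shutdown-profile setup, here is a small Node.js launcher sketch. It assumes the environment variable is MOZ_PROFILER_SHUTDOWN, as described in the profiler documentation linked above; the Firefox path, the output path, and the helper names are placeholders, not something taken from this bug.

```javascript
// Sketch: start Firefox so that a profile is written to a file when it exits.
const { spawn } = require("child_process");

// Returns a copy of baseEnv with the shutdown-profile variable added.
// MOZ_PROFILER_SHUTDOWN is assumed to take the output file name.
function buildShutdownProfileEnv(profilePath, baseEnv = process.env) {
  return { ...baseEnv, MOZ_PROFILER_SHUTDOWN: profilePath };
}

// firefoxPath is a placeholder; adjust it to the local installation.
// The profiler itself still has to be started from the popup after launch.
function launchFirefox(firefoxPath, profilePath) {
  return spawn(firefoxPath, [], {
    env: buildShutdownProfileEnv(profilePath),
    stdio: "inherit",
  });
}
```

Calling launchFirefox with the local Firefox path and a writable file name, starting the profiler from the popup, reproducing the hang, and then quitting Firefox should leave the JSON file at that location, if the variable behaves as documented.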

Hopefully we'll manage to get more information about your issue this way.

Thanks again for your patience!

Flags: needinfo?(sergey.matushevsky)
Performance Impact: ? → P2

Redirect a needinfo that is pending on an inactive user to the triage owner.
:peterv, since the bug has high severity and recent activity, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(sergey.matushevsky) → needinfo?(peterv)

(In reply to Sergey from comment #42)

We can now reproduce this bug with the ability to provide remote access as needed.

Remote access would be very helpful too, if you can!

Flags: needinfo?(peterv) → needinfo?(sergey.matushevsky)

Redirect a needinfo that is pending on an inactive user to the triage owner.
:peterv, since the bug has high severity and recent activity, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(sergey.matushevsky) → needinfo?(peterv)

Let's reopen once we get some new information.
Sergey, I'm so sorry this took so long, but we can't move forward currently.

Status: REOPENED → RESOLVED
Closed: 3 years ago → 2 years ago
Flags: needinfo?(peterv)
Resolution: --- → INVALID