Closed Bug 1236745 Opened 8 years ago Closed 8 years ago

Firefox 42+ hangs when loading iboostimmunity.ca/pictures

Categories

(Core :: Layout, defect)

42 Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla47
Tracking Status
firefox42 --- wontfix
firefox43 --- wontfix
firefox44 - wontfix
firefox45 + fixed
firefox46 + fixed
firefox47 --- fixed

People

(Reporter: dylanmccall+1, Assigned: dbaron)

References

Details

(Keywords: regression)

Attachments

(3 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36

Steps to reproduce:

With Firefox 42 / 43, browse to http://iboostimmunity.ca/pictures (or most other pages at iboostimmunity.ca).


Actual results:

Before rendering iboostimmunity.ca/pictures, Firefox pegs a single CPU at 100% and the entire application hangs. It very quickly becomes unresponsive.

This appears to be a regression introduced between version 41 and 42. We have reproduced the problem in Firefox 42 and 43 on OS X, Fedora 23, and Windows. Here is a crash report created in Fedora 23 by following the steps at https://developer.mozilla.org/en-US/docs/How_to_Report_a_Hung_Firefox, with Firefox in safe mode, after it had been hanging for 5 minutes:

https://crash-stats.mozilla.com/report/index/df7efc49-c802-446b-8336-832e72160105

I'm happy to poke around with GDB if you more information will help :)
I should add that the page renders correctly with Javascript disabled.
On my Win 7, the hang doesn't appear at the 1st load of the website but just after clicking on the menu entry called "Booster U".

The funny thing is that's plugin-container which eats CPU (25%) and hangs firefox with bad builds.
With good builds, CPU use of plugin-container spikes but comes back to normal (near 0%) after loading the webpage.

Reg range:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=9f4c7aae16b1f7f6b60fe403ec334d7413a472e6&tochange=b002f72aa0515d0a22c9da9dee03d949215f9653

Suspected bug 1207157.
Blocks: 1207157
Status: UNCONFIRMED → NEW
Component: Untriaged → Layout: Floats
Ever confirmed: true
Flags: needinfo?(dbaron)
Keywords: regression
Product: Firefox → Core
A simplified testcase would be useful.  It wouldn't surprise me if it's a preexisting bug that the changes in bug 1207157 caused to affect this page.
Keywords: testcase-wanted
Summary: Firefox 42+ crashes when loading iboostimmunity.ca/pictures → Firefox 42+ hangs when loading iboostimmunity.ca/pictures
Not quite a simplified testcase, but here's a copy of an offending page retrieved with `wget -k -p`, with some excess image and font files stripped out. Hope it helps! Thanks for taking a look at this so quickly :)
Okay, I wish I could provide something easier to work with, but I poked around a little searching for a workaround. There's an element on the page with id="block-views-booster-shots-block". If we remove that element or change its ID, Firefox no longer hangs.

In the attached copy of the page, there is a generated CSS file called iboostimmunity.ca/sites/default/files/css/css_76qF6v7xYrdd-Y4IIN3nkoXHQUbdL9qkw7KVIL-016A.css. Somewhere in there lies this CSS rule:

    #block-views-booster-shots-block .views-field-field-html-image{margin:10px;overflow:hidden;-moz-border-radius:5px;-webkit-border-radius:5px;border-radius:5px;-khtml-border-radius:5px;}

If we remove `overflow: hidden` or change it to `overflow: visible`, Firefox also works fine.

Sorry to make you deal with minified CSS. Drupal pulls it together from a bunch of pieces:

 * http://iboostimmunity.ca/sites/all/themes/custom/bccdcv3/css/bootstrap.css
 * http://iboostimmunity.ca/sites/all/themes/custom/bccdcv3/css/base.css
 * http://iboostimmunity.ca/sites/all/themes/custom/bccdcv3/css/layout.css
 * http://iboostimmunity.ca/sites/all/themes/custom/bccdcv3/css/theme.css
 * http://iboostimmunity.ca/sites/all/themes/custom/bccdcv3/css/state.css
 * http://iboostimmunity.ca/sites/all/themes/custom/bccdcv3/css/module.css

The CSS rule I found is in module.css.

Trouble is there are various other interdependencies in that page which I'm finding tricky to unravel, but hopefully this will help? :)
I tried to reduce the webpage showing the issue, by removing useless CSS/JS libs/class. The reduced testcase is a sliding carousel displaying images.

Load the testcase in Firefox, then refresh the page or restart the browser and reload the page, it should hang immediately.
Forget my comment #2, the regression range I gave is wrong.
Indeed, the testcase hangs Firefox 42+.

New reg range:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=838a7d98c1d41c2bf67497d0a80cca98a6001899&tochange=5dcb38c7f1b84d7fc4d0255a2b10165dfaceb941

But it's always your patches, David. ;)
No longer blocks: 1207157
Component: Layout: Floats → Layout
Daniel, I thought I'd start with you to see if this is something worth investigating, as Loic already has some interesting leads. Also, as such I do not think this issue should be release blocking for 44, I know we shipped 43 with it. But my worry is if it is masking a bigger issue that manifests itself on more than just this website. Thoughts?
Flags: needinfo?(dholbert)
I'm not able to reproduce any hangs at the original URL in comment 0, or the "Booster U" section in comment 2.

However, I *am* able to reproduce with Loic's attached testcase, though I had to do two special things to make it hang:
 * I had to disable mixed-content protection (to allow served-over-HTTP) resources. Click the lock in URLbar, then the rightarrow at edge of dialog, then "disable protection for now".)
 * I had to load it (and then reload) in a smallish window (846px wide,  691px tall). It didn't hang for me when it was in a large Firefox window, or on first load in a smallish window.

I'll poke around a bit when I get a chance (leaving needinfo open), though this may want to wait for dbaron's eyes, since he's the one in the regression range & hence may be better-able to figure out the change that broke this.

In any case, I agree this shouldn't block any particular release, given that we've already shipped 2 other releases with it.  (We should definitely strive to fix it, but it doesn't make sense as a blocker.)
Yes, ofc, you need to disable mixed content security when loading the testcase (due to JS libs/CSS files loaded from the 3rd-party website given in the original message).
The hang appears when you refresh the page after the 1st load or after restarting the browser when your reload the page.

"Forget the website" (here BMO with the testcase) or clearing the browser cache is enough to make the hang disappear.
Loic, if you'd be up for further-reducing the testcase (in particular, saving local copies of the CSS & ripping out everything that's not needed, and then just including the remaining CSS directly in an HTML file), that would definitely help isolate the issue.

If you don't get a chance, though, no worries; that's probably one of the first things I'll plan on doing when I take a closer look at this.
Thanks Daniel for the investigation. It seems this one is already too late for Fx44.
Tracking for 45+ since this is likely a recent regression (from 42 or 43, not certain yet)
What's happening here is that we enter an infinite loop by oscillating
between two states.  The code assumes that (a) the available space will
never grow, only stay the same or shrink, and (b) that we should break
out of the loop if it stays the same.  This also means we hit the
assertion about the available space growing every other time through the
loop.

This is in the inner loop in nsBlockFrame::ReflowBlockFrame that was
introduced in https://hg.mozilla.org/mozilla-central/rev/80ef9bb2c2e9 .

The problem is fundamentally a logic error in that code.  The makes the
assumption that if you reduce the width available to a block formatting
context or replaced block-level element, its height does not shrink.
(The "replaced block" (really block formatting context) in this case, as
in the original testcase, is a scroll frame.  I didn't debug the
original testcase enough to figure out what caused its sizing
characteristics, although a percentage-width image does seem like the
most likely candidate.)

Without the patch, the reftest test (but not reference) hangs, as does
the semi-simplified test in the bug (given a narrow window).

With the patch, neither the semi-simplified test in the bug nor the
reference hangs, and the reftest passes.

Review commit: https://reviewboard.mozilla.org/r/32811/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/32811/
Attachment #8713778 - Flags: review?(dholbert)
Attachment #8713778 - Flags: review?(dholbert)
Comment on attachment 8713778 [details]
MozReview Request: Bug 1236745 - Fix infinite loop resulting from block formatting context entering resize oscillation due to considering floats over its whole height when sizing it.  r?dholbert

https://reviewboard.mozilla.org/r/32811/#review30079

::: layout/generic/nsBlockFrame.cpp:944
(Diff revision 1)
> +  return aNewAvailableSpace.IStart(aWM) > aOldAvailableSpace.IStart(aWM) ||

Superficially, this return statement doesn't actually seem like it checks for shrinking (since it checks each side independently and returns true if either side moved inwards).  It's only valid in light of all the assertions above it (which is clear after reading through all the assertions, but not otherwise clear).

Consider adding a comment to make this return statement's reasoning/justification clearer, maybe something like the following:
  // If available space shrunk on one side or the other, then it also shrunk
  // overall. (The assertions above guarantee this.)

::: layout/generic/nsBlockFrame.cpp:3403
(Diff revision 1)
> -                                floatAvailableSpace.mRect)) {
> +                                floatAvailableSpace.mRect, true)) {

I'm not clear on why we're passing "true" for aCanGrow here, and "false" at the other callsite below.  (Your comment 16 says that space \*does\* sometimes grow here, but I don't know this section of code well enough to intuitively see why it can grow here vs. why it can't grow in the other spot.)

Could you explain why growth is allowed here but not at the other callsite? (perhaps in code comments, to make this clearer for other readers as well)
Flags: needinfo?(dholbert)
Comment on attachment 8713778 [details]
MozReview Request: Bug 1236745 - Fix infinite loop resulting from block formatting context entering resize oscillation due to considering floats over its whole height when sizing it.  r?dholbert

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/32811/diff/1-2/
Comment on attachment 8713778 [details]
MozReview Request: Bug 1236745 - Fix infinite loop resulting from block formatting context entering resize oscillation due to considering floats over its whole height when sizing it.  r?dholbert

https://reviewboard.mozilla.org/r/32811/#review30113

Thanks for adding those comments! Much easier to follow now.

r=me, with nits below addressed however you see fit.

::: layout/generic/nsBlockFrame.cpp:915
(Diff revision 2)
> + * additional floats.

This last phrase here seems misleading ("because its height caused..." -- we don't actually check anything about the height here).

Maybe:
"its height caused" --> "its height might've grown & caused"

(Also, technically this should have s/height/block-size/, though I'm also fine leaving it with "height" for readability.)

::: layout/generic/nsBlockFrame.cpp:3407
(Diff revision 2)
> +      // caused the block to become taller.  Note that since we're

s/has caused/could've caused/, I think? (since we don't actually know if it caused the block to be taller or not)

Also: Maybe extend this comment slightly to hint at why we're breaking out of the loop?  Right now it just describes what we're checking (& why we need to check it), but doesn't explain our reaction to the check.
Attachment #8713778 - Flags: review?(dholbert) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/65e84ada79e41a3afdcb0d8148a84a7152888df1
Bug 1236745 - Fix infinite loop resulting from block formatting context entering resize oscillation due to considering floats over its whole height when sizing it.  r=dholbert
[Canceling needinfo=dbaron from comment 2; pretty sure that needinfo has been addressed, now that he's written & landed a patch.]
Flags: needinfo?(dbaron)
https://hg.mozilla.org/mozilla-central/rev/65e84ada79e4
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
Comment on attachment 8713778 [details]
MozReview Request: Bug 1236745 - Fix infinite loop resulting from block formatting context entering resize oscillation due to considering floats over its whole height when sizing it.  r?dholbert

Approval Request Comment
[Feature/regressing bug #]: bug 538194
[User impact if declined]: firefox hangs on certain web pages
[Describe test coverage new/current, TreeHerder]: reftest (testing not only for non-hang, but also for correct behavior) is in the tree
[Risks and why]: low risk; it's adjusting a test to check more carefully whether either edge has changed position, rather than doing the check in a way that assumes that edges can only move inwards and only a size check is needed
[String/UUID change made/needed]: no
Attachment #8713778 - Flags: approval-mozilla-beta?
Attachment #8713778 - Flags: approval-mozilla-aurora?
Comment on attachment 8713778 [details]
MozReview Request: Bug 1236745 - Fix infinite loop resulting from block formatting context entering resize oscillation due to considering floats over its whole height when sizing it.  r?dholbert

Fix a hang, taking it.
Should be in 45 beta 4.
Attachment #8713778 - Flags: approval-mozilla-beta?
Attachment #8713778 - Flags: approval-mozilla-beta+
Attachment #8713778 - Flags: approval-mozilla-aurora?
Attachment #8713778 - Flags: approval-mozilla-aurora+
Removing the "testcase-wanted" keyword as it was given in comment 6.
Keywords: testcase-wanted
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: