VERIFIED FIXED in mozilla13

Status

()

Core
Layout
--
critical
VERIFIED FIXED
5 years ago
4 years ago

People

(Reporter: Swarnava, Assigned: mats)

Tracking

({hang})

Trunk
mozilla13
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [inbound], URL)

Attachments

(6 attachments)

(Reporter)

Description

5 years ago
Firefox seems to hang permanently when I load the Blue Dog Coalition page on Wikipedia. I tried loading it with JavaScript disabled and in Safe Mode, to no avail. It loads fine in IE8. I tried saving the page from IE8 and then opening the html file in Fx, freeze. 
i can reproduce this issue on Firefox 10 and 11 beta 1
(Reporter)

Updated

5 years ago
(Reporter)

Updated

5 years ago
Severity: normal → critical
OS: Windows 7 → All
Hardware: x86 → All

Comment 1

5 years ago
Reproduced:
Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.2.26) Gecko/20120128 Firefox/3.6.26
Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20100101 Firefox/10.0
Mozilla/5.0 (X11; Linux x86_64; rv:11.0) Gecko/20100101 Firefox/11.0
Mozilla/5.0 (X11; Linux x86_64; rv:12.0a2) Gecko/20120208 Firefox/12.0a2
Mozilla/5.0 (X11; Linux x86_64; rv:13.0a1) Gecko/20120208 Firefox/13.0a1
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7

WFM:
Opera/9.80 (X11; Linux x86_64; U; en) Presto/2.10.229 Version/11.61

Observation:
Firefox hangs with 100% on one CPU core, but the amount allocated memory does no gallop.
Keywords: hang
The windbg log points to Layout
Component: General → Layout
Product: Firefox → Core
QA Contact: general → layout
Created attachment 595469 [details]
windbg stacktrace
Created attachment 595472 [details]
Saved source to reproduce the hang in case the page changes
The page is using column styles, but only in Gecko and WebKit.  That's what causes the hang, so it's not surprising there's no hang in IE8: it's getting a different page.

WebKit hangs on this page just like we do.

Looking at a profile, the callstack shows that we're spending all our time deep inside reflow of a columnset.  This columnset has another columnset as its ancestor, and so on, at least 6 levels deep.  I will bet money all of these are auto-height-balancing.

Reflow of a height-balancing of a columnset is O(N log(K)), iirc where N is the number of kids and K is the height (because we have to do a binary search to find the right height, basically).  Once you start nesting columnsets you can easily get to situations where the innermost block is being reflowed a huge number of times.  In this case that reflow is particularly expensive because list renumbering is involved; that's 30% of the profile right there.

roc, thoughts?
We could probably optimize this by caching stuff on column frames --- say a map of (available height, available width) to balance height --- but it would be complex and fragile and I'm struggling to care.
(Assignee)

Comment 7

5 years ago
Created attachment 595541 [details] [diff] [review]
wip

Perhaps we should simply refuse to balance more than N nested column sets?
Comment on attachment 595541 [details] [diff] [review]
wip

Review of attachment 595541 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/generic/nsColumnSetFrame.cpp
@@ +375,5 @@
> +        ++cnt;
> +      }
> +    }
> +    if (cnt == MAX_NESTED_COLUMN_BALANCING) {
> +      numColumns = 1;

Why don't you just set isBalancing to false?
(Assignee)

Comment 9

5 years ago
Created attachment 595562 [details]
log

> Why don't you just set isBalancing to false?

I tried three versions:
1. numColumns = 1
2. isBalancing = false
3. isBalancing = false; numColumns = 1;

The page layout looks normal with 1.
2. also fixes the bug, but it looks like the deep columns have zero width
so they have a single word per line for the affected content.
3. does not fix the bug

I've attached the output from the printf at the end of ChooseColumnStrategy
for comparison of the three versions.

Note that for 1. colWidth=71940 or thereabout,
for 2. some have colWidth=1 or very low,
for 3. we have colHeight=0
Comment on attachment 595541 [details] [diff] [review]
wip

Review of attachment 595541 [details] [diff] [review]:
-----------------------------------------------------------------

ok
Attachment #595541 - Flags: review+
(Assignee)

Comment 11

5 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/1cd8c9b40035
Whiteboard: [inbound]
Target Milestone: --- → mozilla13
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/0c1fdf386ac1 since I wasn't sure which reftest failures and which crashtest assertion count failures went with which cset in the push.
Whiteboard: [inbound]
Target Milestone: mozilla13 → ---
(Assignee)

Comment 13

5 years ago
Thanks philor, sorry about the orange.
(Assignee)

Comment 14

5 years ago
Created attachment 596673 [details] [diff] [review]
addendum

I forgot to initialize 'mIsColumnBalancing' in the ctor.
(I'll fold this into the earlier patch before landing but I figured it be
easier to review separately)

Try run is green this time:
https://tbpl.mozilla.org/?tree=Try&rev=a8e4006a6a88
Assignee: nobody → matspal
Attachment #596673 - Flags: review?(roc)
Attachment #596673 - Flags: review?(roc) → review+
(Assignee)

Comment 15

5 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/748efe64d073

QA: wikipedia has now fixed the page, use the "Saved source to reproduce the hang..."
attachment to verify.  (Good thinking, Matti!)
Whiteboard: [inbound]
Target Milestone: --- → mozilla13
https://hg.mozilla.org/mozilla-central/rev/748efe64d073
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Whiteboard: [inbound]

Comment 17

5 years ago
(In reply to Mats Palmgren [:mats] from comment #15)
rv. 62695656d7bd hangs on both
Using http://en.wikipedia.org/w/index.php?title=Blue_Dog_Coalition&oldid=475569463
and
the saved source.

Comment 18

5 years ago
It takes about 30 s until Fx regains responsiveness.
(Assignee)

Comment 19

5 years ago
> It takes about 30 s until Fx regains responsiveness.

On which platform?  On what kind of hardware?  I assume you're testing
with Mozilla Nightly builds?

Comment 20

5 years ago
(In reply to Mats Palmgren [:mats] from comment #19)
> On which platform?  On what kind of hardware?

Linux x86_64

>  I assume you're testing with Mozilla Nightly builds?

No.  I have compiled the sources of the m-c tree revision 62695656d7bd.  I just confirmed it using a fresh build of revision 748efe64d073.
I also still see the hang using  Mozilla/5.0 (Windows NT 6.1; rv:13.0a1) Gecko/20120214 Firefox/13.0a1 SeaMonkey/2.10a1, Built from http://hg.mozilla.org/mozilla-central/rev/60edf587f4af
(Assignee)

Comment 22

5 years ago
(In reply to Stefan from comment #20)
> (In reply to Mats Palmgren [:mats] from comment #19)
> > On which platform?  On what kind of hardware?
> 
> Linux x86_64

Thanks, but I was unclear, I need to know if this a phone/laptop/desktop/super-computer?
which brand/model of CPU(s), memory etc is it using?
"30 seconds" have no meaning otherwise.

> No.  I have compiled the sources of the m-c tree revision 62695656d7bd.  I
> just confirmed it using a fresh build of revision 748efe64d073.

OK, I appreciate the feedback, but I would prefer numbers from official Mozilla Nightly
builds so I know how the binary was built.   http://nightly.mozilla.org/
(Assignee)

Comment 23

5 years ago
(In reply to Matthias Versen (Matti) from comment #21)
> I also still see the hang using  Mozilla/5.0 (Windows NT 6.1; rv:13.0a1)
> Gecko/20120214 Firefox/13.0a1 SeaMonkey/2.10a1, Built from
> http://hg.mozilla.org/mozilla-central/rev/60edf587f4af

That build doesn't have the fix as far as I can tell, it's built from mozilla-central
cset 86714, the fix is cset 86760.

If you re-test later, I'd like to know: For how long does it hang? on what hardware?

Comment 24

5 years ago
(In reply to Mats Palmgren [:mats] from comment #22)
> Thanks, but I was unclear, I need to know if this a
> phone/laptop/desktop/super-computer?

Desktop

> which brand/model of CPU(s), memory etc is it using?

intel E6300, 4 GiB core

> > No.  I have compiled the sources of the m-c tree revision 62695656d7bd.  I
> > just confirmed it using a fresh build of revision 748efe64d073.
> 
> OK, I appreciate the feedback, but I would prefer numbers from official
> Mozilla Nightly
> builds so I know how the binary was built.   http://nightly.mozilla.org/

The latest nightly seems to be

http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2012/02/2012-02-14-03-12-27-mozilla-central/

and is build from revision 60edf587f4af. This does not seem to contain the 748efe64d073 changeset, right?
(Assignee)

Comment 25

5 years ago
> intel E6300, 4 GiB core

Thanks, that gives me a data point I can use.

> and is build from revision 60edf587f4af

Correct, so it's not in Nightly yet.  If you can verify the result when the
fix becomes available in Nightly (probably the next one) that would be great.

Comment 26

5 years ago
This nightly contains 748efe64d073 and hangs:

http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2012/02/2012-02-15-03-11-55-mozilla-central/firefox-13.0a1.en-US.linux-x86_64.tar.bz2
linux64-ix-slave13
x86_64-unknown-linux-gnu
Built from http://hg.mozilla.org/mozilla-central/rev/d45c7d7b0079
I get a 20s hang with a Core i3 with 2.1Ghz CPU and todays nightly
(Assignee)

Comment 28

5 years ago
I get a 14s hang using a Linux x86-64 Nightly on a Core i7 870 @ 2.93 GHz.
Hmm, perhaps it's still a bit too long, especially on lower-end CPUs...

MAX_NESTED_COLUMN_BALANCING=4 in a local build cuts the time in half, as expected.

I think the goal we want is to recover from the hang before the user kills the
application...  I think less than 10s is probably good enough for that purpose.

roc, what's the minimum nesting level we must support in your opinion?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I've never seen anyone use nested columns in a real way. So I think supporting 2 would be enough.
(Assignee)

Comment 30

5 years ago
Created attachment 598116 [details] [diff] [review]
max 2 nested balancing levels

OK, let's see if someone complains ;-)

Reftests are green on Try:
https://tbpl.mozilla.org/?tree=Try&rev=448fc22cd215
Attachment #598116 - Flags: review?(roc)
Attachment #598116 - Flags: review?(roc) → review+
(Assignee)

Comment 31

5 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/f0ac2d720c7b
Whiteboard: [inbound]
https://hg.mozilla.org/mozilla-central/rev/f0ac2d720c7b
Status: REOPENED → RESOLVED
Last Resolved: 5 years ago5 years ago
Resolution: --- → FIXED
verified fixed with  Mozilla/5.0 (Windows NT 6.1; rv:13.0a1) Gecko/13.0a1 Firefox/13.0a1 SeaMonkey/2.10a1
The rendering is instant now on my system. I hope that nobody complains about the limit :-)

Mats: Thank you very much for this fix !
Status: RESOLVED → VERIFIED

Updated

4 years ago
Depends on: 822053
You need to log in before you can comment on or make changes to this bug.