Open Bug 1164189 Opened 6 years ago Updated 1 year ago

Text layout is not interruptible within a line

Categories

(Core :: Layout, defect)

x86_64
Windows 7
defect
Not set
normal

Tracking

()

Tracking Status
e10s - ---
firefox41 --- affected

People

(Reporter: jimm, Unassigned)

References

(Blocks 2 open bugs)

Details

Steps:
1) Make sure the that the dom.ipc.processCount is set to the default setting of 1.
2) Open 2 tabs and in the second tab open [1].
3) Wait about 10 seconds for the page to load (You may or may not see stuff printed to the screen).
4) Close the second tab that has [1] open in it.

[1] http://trowbotham.com/bug_tests/iloop.php

Actual Results:
After closing the tab, the spinner shows until the PHP script times out.  The browser chrome appears to be responsive, however, the browser may as well be hung since the spinner makes the browser useless.  If you try to exit the browser completely, it will hang until the PHP script has timed out.

Expected results:
No spinner. No browser hang when closing the browser.

I tested setting dom.ipc.processCount = 20 and I no longer get a spinner nor does the browser hang when trying to exit it completely.
I think a good place to start looking at why this happens is how Firefox handles text.  I have about 15 crash reports all related to testing this page.  I never actually see the crash reporter, so presumably these crashes occur during tab close or sometime afterwards.  Here is a small sample of the crash reports:

bp-a00db9e2-6a52-42d9-a938-4d1282150512
bp-3ef6dbde-df29-4132-b723-e438e2150509
bp-a6260521-93bb-43c0-aa87-1e3972150509
bp-69695406-32fa-4bae-9d7f-292602150509
bp-da09434a-21b6-420f-b5f6-cd4aa2150509

See also: Bug 1162698
Assignee: nobody → mrbkap
Flags: needinfo?(mrbkap)
I think Trevor is right on. My gdb profiling shows us hanging out in gfxFontGroup::ComputeRanges for ever and ever. Someone should probably re-run this under a profiler with an optimized build, but looking at that function, it's an uninterruptible loop right in the middle of layout, which causes the child process to appear to hang. The parent, of course, is unaffected unless it blocks on the child for any reason.

I don't think we can block e10s on this. Once we have multiple child processes this will be less of a problem and this is really a layout (text rendering) perf bug.
Flags: needinfo?(mrbkap)
I agree that this probably should not block e10s.  I tried to profile this on today's nightly, but it appears that the symbol server is still down for Windows.

Based on what we see with this test case, which is on the extreme end of the scale, being an infinite loop and all, I'm willing to bet that, on a smaller scale, this may be a (large?) contributing factor to Bug 1135719.  Does that seem plausible?

FWIW, Chrome and Opera seem to handle the test case quite well, but IE chokes with both its content and chrome being locked up.  I don't have a Mac to test on Safari.
Flags: needinfo?(mrbkap)
Yeah, bugs like this definitely would cause that.

I tested in an opt build and we are hanging in text layout. We could make it better by making text layout interruptible, but I'm not convinced that's really worth it now.
Flags: needinfo?(mrbkap)
OK, this sounds like it is not e10s specific. NI'ing Bill to file a follow up bug about dropping a slow page dialog when this happens.
Flags: needinfo?(wmccloskey)
Summary: [e10s] Browser gets bogged down loading infinite page → Browser gets bogged down loading infinite page
Filed bug 1171700 to create a hang monitor that would detect this.
Flags: needinfo?(wmccloskey)
See Also: → 1256279, 1256710, 1261506, 1299750
(In reply to Blake Kaplan (:mrbkap) (new baby, should be back before 3/21) from comment #2)
> I think Trevor is right on. My gdb profiling shows us hanging out in
> gfxFontGroup::ComputeRanges for ever and ever. Someone should probably
> re-run this under a profiler with an optimized build, but looking at that
> function, it's an uninterruptible loop right in the middle of layout, which
> causes the child process to appear to hang. The parent, of course, is
> unaffected unless it blocks on the child for any reason.
> 
> I don't think we can block e10s on this. Once we have multiple child
> processes this will be less of a problem and this is really a layout (text
> rendering) perf bug.

Well, it will still block at least 1/4 of your tabs, and cause other fun effects if something tries to talk to that Content process.  Tabs in other content processes generally remain performant.

There are several phases to the almost-iloop.  One of them is ComputeRanges()
Also: SplitAndInitTextRun(), SetupBreakSinksForTextRun(), and nsTextFragment::Append() (30 seconds locked there in my profile!)

Are there any bugs (designs?) for making this interruptible? (mrbkap isn't accepting NI; falling back to jimm to poke the right people)
Flags: needinfo?(jmathies)
I think comment 8 is a question for layout people rather than e10s people. I ended up with this as an artifact of how we were triaging bugs a couple of years (!) ago.

Jet, do you know who might want to look at making text layout interruptible or otherwise speeding things up here?
Assignee: mrbkap → nobody
Flags: needinfo?(bugs)
(In reply to Randell Jesup [:jesup] from comment #8)
> (In reply to Blake Kaplan (:mrbkap) (new baby, should be back before 3/21)
> from comment #2)
> > I think Trevor is right on. My gdb profiling shows us hanging out in
> > gfxFontGroup::ComputeRanges for ever and ever. Someone should probably
> > re-run this under a profiler with an optimized build, but looking at that
> > function, it's an uninterruptible loop right in the middle of layout, which
> > causes the child process to appear to hang. The parent, of course, is
> > unaffected unless it blocks on the child for any reason.
> > 
> > I don't think we can block e10s on this. Once we have multiple child
> > processes this will be less of a problem and this is really a layout (text
> > rendering) perf bug.
> 
> Well, it will still block at least 1/4 of your tabs, and cause other fun
> effects if something tries to talk to that Content process.  Tabs in other
> content processes generally remain performant.
> 
> There are several phases to the almost-iloop.  One of them is ComputeRanges()
> Also: SplitAndInitTextRun(), SetupBreakSinksForTextRun(), and
> nsTextFragment::Append() (30 seconds locked there in my profile!)
> 
> Are there any bugs (designs?) for making this interruptible? (mrbkap isn't
> accepting NI; falling back to jimm to poke the right people)

This has come up before, I remember build logs causing problems like this prior to e10s. Definitely not an e10s specific issue, just a sucky bug that e10s couldn't fix. I think moving this to text layout is the right direction.
Flags: needinfo?(jmathies)
I've been told that Maire is actually the correct person to ask. Please see comment 9.
Flags: needinfo?(bugs) → needinfo?(mreavy)
Bobby - can you answer the question in comment 9?  Thanks!
Flags: needinfo?(mreavy) → needinfo?(bobbyholley)
So the issue here is that interruptible reflow seems not to be granular enough to check for interrupt within a given line (specifically during text run construction), leading to unbounded time in layout for long lines of text. Here's a profile [1].

This testcase is obviously contrived, but the issue does bite us in the real world, in cases like bug 1466704. So I think we want to fix this. I'm adding it to the docket of layout performance work.

Once we fix this specific bug, we should look into any other cases where we may need other interrupt checkpoints. I'll file a separate bug about that.

Given that this happens during lazy text run construction, it's worth noting that we're talking about changing that setup in bug 1471309. These two issues don't need to block each other per se, but we should keep one in mind when working on the other.

[1] https://perfht.ml/2KlwjJ9
Blocks: layout-perf
Component: General → Layout
Flags: needinfo?(bobbyholley)
Summary: Browser gets bogged down loading infinite page → Text layout is not interruptible within a line
Blocks: 1466704
Blocks: 1471689
No longer blocks: 1466704
Duplicate of this bug: 1357305
You need to log in before you can comment on or make changes to this bug.