Open Bug 1725781 Opened 3 years ago Updated 7 months ago

SAP application very slow since 72.0b1, with deeply nested percent-height tables

Tracking

()

Status:

NEW

Project Flags:

Performance Impact

medium

People

(Reporter: joerg.goetting, Unassigned)

References

(Depends on 1 open bug)

Details

(4 keywords)

Attachments

(4 files)

Saved page of SAP application "Lean Request" 3 years ago Jörg Götting 1.31 MB, application/x-zip-compressed		Details
testcase 1 (somewhat reduced) 3 years ago Daniel Holbert [:dholbert] 127.93 KB, text/html		Details
testcase 2 (further reduced) 3 years ago Daniel Holbert [:dholbert] 120.77 KB, text/html		Details
testcase 3 (just nested 100%-height table/td; hangs for ~6 seconds on dholbert's Thinkpad) 3 years ago Daniel Holbert [:dholbert] 490 bytes, text/html		Details

Jörg Götting

Reporter

Description

•

3 years ago

Attached file Saved page of SAP application "Lean Request" — Details

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0

Steps to reproduce:

Attached example html (saved state of an SAP application) runs very slow in current Firefox versions.
Tests with older versions show that the important change occurred in 72.0b1. All older versions including 71.0 do fine. In all versions starting from 72.0b1 including Beta 92.0b3 (32-Bit), Dev 92.0b3 (64-Bit) and Nightly 93.0a1 (2021-08-13) (64-Bit) rendering is very slow.

Actual results:

In the live application, Firefox shows message "The page slows down Firefox ...".
With the saved test html file it takes more than a minute to render the page.

Current Edge and Chrome display the page immediately.
Edge, Chrome had a quite similar performance problem with the same SAP application two weeks ago in the new version 92. The problem was fixed with this bug (included in Chromium 92 in the meantime):
https://bugs.chromium.org/p/chromium/issues/detail?id=1232804
("Issue 1232804: Performance regression in Version 92.0.4515.107 - Page layout with nested flexbox")

Expected results:

The page should be displayed as quickly as up to version 71.0.

BugBot [:suhaib / :marco/ :calixte]

Comment 1

•

3 years ago

The Bugbug bot thinks this bug should belong to the 'Core::Layout: Flexbox' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.

Component: Untriaged → Layout: Flexbox

Product: Firefox → Core

Mayank Bansal

Comment 2

•

3 years ago

regressin : https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=9e3f44e87a1a2be927c7fab135653c6b86b982c9&tochange=88db9bea4580df16dc444668f8c2cddbb3414318

Prfie : https://share.firefox.dev/2VNA3Lb

Status: UNCONFIRMED → NEW

Ever confirmed: true

Daniel Holbert [:dholbert]

Comment 3

•

3 years ago

(In reply to Jörg Götting from comment #0)

Edge, Chrome had a quite similar performance problem with the same SAP application two weeks ago in the new version 92. The problem was fixed with this bug (included in Chromium 92 in the meantime):
https://bugs.chromium.org/p/chromium/issues/detail?id=1232804
("Issue 1232804: Performance regression in Version 92.0.4515.107 - Page layout with nested flexbox")

This could conceivably be a regression from bug 1591925, then (one of the bugs in comment 2's pushlog).

I don't see a lot of flexbox in the stacks for comment 2's profile, though -- so I'm also wondering if this might be a regression from bug 1595199.

Daniel Holbert [:dholbert]

Updated

•

3 years ago

Flags: needinfo?(dholbert)

Whiteboard: [qf]

Jonathan Kew [:jfkthame]

Comment 4

•

3 years ago

Quite an old regression, it seems, but would be good to get to the bottom of this; presumably there must be some number of other pages/sites also being affected.

Severity: -- → S3

Has Regression Range: --- → yes

Keywords: regression

Bas Schouten (:bas.schouten)

Comment 5

•

3 years ago

I'm marking this p1. The reason being that this seems to constitute an indefinite hang. By valid webcontent. That Chrome and Edge have addressed.

Whiteboard: [qf] → [qf:p1:responsiveness]

Daniel Holbert [:dholbert]

Comment 6

•

3 years ago

Attached file testcase 1 (somewhat reduced) — Details

This standalone testcase (reduced from previously-attached SAP zip file) takes around 1 or 2 seconds to render for me in Firefox Nightly 94.0a1 (2021-09-22), whereas it renders with no noticeable delay in Chrome.

Profile: https://share.firefox.dev/2ZhQQHn (including a 1.7 second reflow)

Daniel Holbert [:dholbert]

Comment 7

•

3 years ago

Attached file testcase 2 (further reduced) — Details

Flags: needinfo?(dholbert)

Daniel Holbert [:dholbert]

Comment 8

•

3 years ago

I just double-checked the regression range locally (using the original testcase to be on the safe side), and I got tighter range than in comment 2 -- perhaps my platform has finer-grained builds vs. those that were available for comment 2:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=9e3f44e87a1a2be927c7fab135653c6b86b982c9&tochange=35436d4e7917bf9d9b96a6173201ca001a8ff7bc

This does not include any flexbox changes, so this definitely isn't a flexbox bug. I suspect this belongs in tables, given that the testcases here are just deeply-nested tables, basically.

Component: Layout: Flexbox → Layout: Tables

Daniel Holbert [:dholbert]

Comment 9

•

3 years ago

Attached file testcase 3 (just nested 100%-height table/td; hangs for ~6 seconds on dholbert's Thinkpad) — Details

This is a further-reduced testcase, which I made from scratch after noticing lots of height:100% on nested table/td in the original testcase (and noticing that the load time rapidly improves if I reduce those).

The layout time here seems to increase exponentially with the level of nesting, so I can make this testcase arbitrarily-bad by just adding a few more levels of <table><td>.

This loads ~instantly in Chrome, FWIW.

Also notable, though -- this is slow in builds from before this bug's regression range. So this particular testcase is a broader / longer-standing bit of pain. I suspect that perhaps the regression here was that we were avoiding this slow path for some common case, and now we are not. (But nonetheless, it was & remains a slow path in Firefox which happens to not be a slow path in Chrome.)

Daniel Holbert [:dholbert]

Comment 10

•

3 years ago

Here's a profile of testcase 3: https://share.firefox.dev/3u2r3OR
It takes ~6 seconds of layout on my machine, with 20 levels of nested tables (and I can make it arbitrarily worse by adding a few additional levels).

In Chrome, it loads with no noticeable delay.

Daniel Holbert [:dholbert]

Comment 11

•

3 years ago

Possibly the same core issue as bug 792508 (with the wrinkle that we had a regression around 2019-11-13 where we changed to fall into this slow path in a case where we previously avoided doing so).

Updated

•

3 years ago

Attachment #9242594 - Attachment description: testcase 3 (just nested 100%-height table/td) → testcase 3 (just nested 100%-height table/td; hangs for ~6 seconds on dholbert's Thinkpad)

Daniel Holbert [:dholbert]

Comment 12

•

3 years ago

•

Edited

Also: I can make older builds (before the regression range) reproduce the issue, with the original testcase, if I add

* { overflow: visible !important; }

So in old builds, we were being "saved" by some scrollable element effectively breaking a nested chain of height:100% elements. And bug 1595199 correctly fixed that by propagating a percent-height-resolution flag through scrollable elements, which means the scrollframe doesn't "save" us anymore.

So, the proximal regression here was from bug 1595199, which was a correctness fix that made us propagate a flag that we should have been propagating all along. And the real lower-level issue here is long-time bug 792508.

Frank Griffith Jr

Comment 13

•

3 years ago

Can we retest this after Fission?

Flags: needinfo?(dholbert)

Daniel Holbert [:dholbert]

Comment 14

•

3 years ago

This isn't related to fission. It seems to be a long-standing inefficiency in our <table> layout code.

But in any case, I confirmed it's still slow in Nightly where I have Fission active (confirmed in about:support). Testcase 3 takes ~6 seconds to load on my thinkpad, which matches the behavior observed before.

Flags: needinfo?(dholbert)

Frank Griffith Jr

Comment 15

•

3 years ago

Just wanted to understand if this was still an issue post-Fission, not necessarily related to Fission. Thanks!

Daniel Holbert [:dholbert]

Comment 16

•

3 years ago

•

Edited

[edit: inadvertently submitted a partially-written comment]

We might want to dupe this and bug 792508, since they're the same core issue (modulo comment 12).

Whatever we do, testcase 3 here is a nice minimal testcase for whoever works on this to start with; and bug 792508 comment 2 - 3 has some analysis/thoughts from Mats 10 years ago, too.

Summary: SAP application very slow since 72.0b1 → SAP application very slow since 72.0b1, with deeply nested percent-height tables

Dave Hunt [:davehunt] [he/him] ⌚BST

Updated

•

3 years ago

Performance Impact: --- → P1

Keywords: perf:responsiveness

Whiteboard: [qf:p1:responsiveness]

Frank Doty [:fdoty]

Comment 17

•

2 years ago

The Performance Priority Calculator has determined this bug's performance priority to be P1.

Platforms: [x] Windows [x] macOS [x] Linux [x] Android
Impact on site: Causes noticeable jank
Configuration: Rare
Page load impact: Some
[x] Affects major website
[x] Able to reproduce locally
[x] Bug affects multiple sites

Keywords: regression → perf:pageload, reproducible, top50

Daniel Holbert [:dholbert]

Updated

•

2 years ago

Keywords: regression

Daniel Holbert [:dholbert]

Updated

•

2 years ago

Comment 18

•

2 years ago

The severity field for this bug is set to S3. However, the Performance Impact field flags this bug as having a high impact on the performance.
:dholbert, could you consider increasing the severity of this performance-impacting bug? Alternatively, if you think the performance impact is lower than previously assessed, could you request a re-triage from the performance team by setting the Performance Impact flag to ??

For more information, please visit auto_nag documentation.

Flags: needinfo?(dholbert)

Daniel Holbert [:dholbert]

Comment 19

•

2 years ago

•

Edited

Replying to comment 17 comment 18: In terms of prioritization, I'm not sure the Affects major website / top50 assessment is accurate. I would guess/hope that this is rare in the wild.

SAP is itself a prominent web property, and this bug was reported for a SAP-hosted web application, but importantly there's no indication that this bug affects any substantial portion of SAP applications in general. (If it did, that would be important to know & would be quite concerning, and a severity-bump would totally be merited.)

This perf issue would still be great to fix, but it probably won't happen right away; the fix won't be trivial and will likely include some substantial risk of breakage (and hence will need careful thought/testing/vetting), given this is a long-standing inefficiency with deeply nested <table> elements, which are a pretty-old/crufty part of our codebase.

I'll take the bot's suggestion to reset Performance Priority Calculator to ? (and I'll remove the top50 keyword). On the Performance Priority Calculator, I think the appropriate Websites affected: value is Rare (though Configuration should be General, since this doesn't require any special configuration on the client side). This lowers the perf impact assessment a bit, to "low" when I click through the calculator tool. (If we have signals that this is actually a common issue, I'm of course happy to be corrected on this.)

Performance Impact: high → ?

Flags: needinfo?(dholbert)

Keywords: top50

Dave Hunt [:davehunt] [he/him] ⌚BST

Comment 20

•

2 years ago

Thanks @dholbert, we've recalculated the impact as medium.

The Performance Impact Calculator has determined this bug's performance impact to be medium. If you'd like to request re-triage, you can reset the Performance Impact flag to "?" or needinfo the triage sheriff.

Platforms: Windows
Impact on site: Causes noticeable jank
Page load impact: Some
Websites affected: Rare
[x] Able to reproduce locally

Performance Impact: ? → medium

Emilio Cobos Álvarez (:emilio)

Updated

•

7 months ago

Depends on: 1883699

You need to log in before you can comment on or make changes to this bug.