Closed Bug 371867 Opened 18 years ago Closed 17 years ago

Severe performance regression on FAA flight status page

Tracking

()

Status:

VERIFIED FIXED

People

(Reporter: bzbarsky, Assigned: vlad)

References

(
URL
)

Details

(Keywords: perf, regression)

Attachments

(3 files, 2 obsolete files)

testcase 18 years ago Andrew Schultz 3.67 KB, text/html		Details
seamonkey stack 18 years ago Andrew Schultz 7.95 KB, text/plain		Details
apply same-pixel optimization to all images, not just 1x1 17 years ago Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond) 1.74 KB, patch		Details \| Diff \| Splinter Review
updated 17 years ago Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond) 2.29 KB, patch		Details \| Diff \| Splinter Review
optimize same-color images 17 years ago Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond) 2.29 KB, patch	pavlov : review+	Details \| Diff \| Splinter Review

Boris Zbarsky [:bzbarsky]

Reporter

Description

•

18 years ago

BUILD: Current trunk build (Seamonkey and Firefox have the same behavior; I just don't have old Firefox builds).

STEPS TO REPRODUCE:
1)  Load http://www.fly.faa.gov/flyfaa/usmap.jsp
2)  Click on the "Select a Major Airport" combobox on the top right
3)  Try to use the keyboard to scroll.

ACTUAL RESULTS: Combobox takes 3-5 seconds to show up.  Scrolling takes 2-4 seconds per item.  The page is basically unusable, in addition to which it makes other apps unusable too (see below).

EXPECTED RESULTS:  Do all that much faster.

I see the expected results in a 2007-01-31-01 build, and the regression in a 2007-02-01-01 build.  In between those, Seamonkey flipped on cairo.  Given that the CPU usage seems to be in X, not in Gecko (so jprof shows nothing useful, even with -sync), this seems more likely than any of the actual checkins to cause this problem...

Note that this is likely to be Linux-only and depend on the exact X server version.  If there's any more info I can provide, I would be happy to do that.

Also note that it's clearly _possible_ to be a lot faster here.  We were on Jan 31...

Flags: blocking1.9?

Andrew Schultz

Comment 1

•

18 years ago

Attached file testcase — Details

It seems that absolutely positioned transparent images are the cause of the slowdown.  The original page has 43 (this testcase has only 6).

With a cairo build on linux, the page tends to take a few seconds to respond to most user actions (selecting the dropdown, scrolling), although once I start scrolling, it scrolls quickly.

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Assignee

Comment 2

•

18 years ago

Which X server version and driver?

Andrew Schultz

Comment 3

•

18 years ago

I'm running FC6's Xorg 1.1.1 and ATI Radeon driver 6.6.3

Boris Zbarsky [:bzbarsky]

Reporter

Comment 4

•

18 years ago

Andrew, could you possibly install the debuginfo RPMs for X and try sysprof on these testcases?

And thank you for confirming that it's not just me!  ;)

Andrew Schultz

Comment 5

•

18 years ago

Sysprof indicates that ~90% of the time is spent in fbCompositeSrc_8888x8888mmx (with my testcase).

Boris Zbarsky [:bzbarsky]

Reporter

Comment 6

•

18 years ago

With what callstack?

Andrew Schultz

Comment 7

•

18 years ago

/me figures out how sysprof works

The full stack (back to Xorg's main) where all the fun happens is:
fbCompositeSrc_8888x8888mmx
fbComposite
XAAComposite
cwComposite
damageComposite
CompositePicture
ProcRenderComposite
ProcRenderDispatch
Dispatch
main

Boris Zbarsky [:bzbarsky]

Reporter

Comment 8

•

18 years ago

Hmm...  If you run Mozilla with -sync, do you get a stack back into mozilla code?

Boris Zbarsky [:bzbarsky]

Reporter

Comment 9

•

18 years ago

I guess we're recompositing all those images on top of each other... but I wonder why that ends up being so slow on the X side.  6 images is not exactly a lot.

Andrew Schultz

Comment 10

•

18 years ago

> Hmm...  If you run Mozilla with -sync, do you get a stack back into
> mozilla code?

No, it's always back to Xorg.

Andrew Schultz

Comment 11

•

18 years ago

Attached file seamonkey stack — Details

Actually, the Mozilla side is pretty easy.  I just started with --sync and broke in with gdb while it was repainting.

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Assignee

Updated

•

18 years ago

Flags: blocking1.9? → blocking1.9+

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Assignee

Updated

•

18 years ago

Target Milestone: --- → mozilla1.9beta1

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Assignee

Comment 12

•

17 years ago

Try adding

  Option "XAANoOffscreenPixmaps" "true"

to the Device section in xorg.conf.

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Assignee

Updated

•

17 years ago

Depends on: 392934

Boris Zbarsky [:bzbarsky]

Reporter

Comment 13

•

17 years ago

For what it's worth, vlad's suggestion improved performance here a good bit for me (less than 0.5 seconds to change combobox items, instead of 3-4 seconds).

Still slower than pre-cairo, but at least it's usable....

Stuart Parmenter

Comment 14

•

17 years ago

not sure what else we can really do here.  this sounds like a cairo/X server bug more than anything.

Priority: -- → P5

Boris Zbarsky [:bzbarsky]

Reporter

Comment 15

•

17 years ago

I'll buy that it's a cairo bug, perhaps.  In which case we should try to get them to fix it, if at all possible.  It's really nice to be able to get flight status and all.

In general, I seem to recall us working around library or OS bugs that led to serious issues for our users.  Not sure how feasible that is in this case.

Stuart Parmenter

Updated

•

17 years ago

Target Milestone: mozilla1.9 M8 → ---

Stuart Parmenter

Comment 16

•

17 years ago

this is a little slow on my fc6 under parallels but certainly usable.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 17

•

17 years ago

How does it compare to 1.8 performance?  Note that your hardware is likely faster than mine (dunno how it compares to our typical user)...  and that I bet scrolling to the bottom of the list is still a lot more painful than on 1.8, even on your machine.

Mike Schroepfer

Comment 18

•

17 years ago

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9a9pre) Gecko/2007110604 Minefield/3.0a9pre ID:2007110604

Can confirm that trunk perf is noticbly worse than branch on Mac (build above)

Boris Zbarsky [:bzbarsky]

Reporter

Comment 20

•

17 years ago

Copying the higher priority from the dup bug.  Though it's not quite the same steps to reproduce, so it's remotely possible they're separate issues.  At first blush, I doubt they are.

Priority: P5 → P2

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 21

•

17 years ago

Note that the duplicate bug has a regression date that appears NOT to be when cairo was turned on.  (Though before when cairo was turned on for SeaMonkey nightlies.)

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 22

•

17 years ago

Er, never mind, it does match when cairo was enabled (although about:buildconfig seems to be the only way to tell).

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Assignee

Updated

•

17 years ago

Assignee: nobody → vladimir

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Assignee

Comment 23

•

17 years ago

Okay; the way that this is structured is a map image, and then about 2 dozen of the same image layered on top of it, which are just used for imagemaps.  That image, you might notice, is http://www.fly.faa.gov/images/pacing_green.gif -- which is a 692x455 fully transparent gif image.  The reason why this worked quickly before is that we were doing 1 bit mask transparency rendering before for GIFs.  That's one piece of functionality that we lost (there's no notion of 1-bit transparency in the new world), so we end up doing full 8-bit alpha blending for each image.  On linux especially, this has the potential to be really expensive.

I can fix this and similar cases by extending our image detection stuff -- we already detect when ARGB32 images should be kept as RGB24 (in the JPG case); I can extend this to also detect when the images are a constant color, and kick in an optimization in that case.  Does that seem useful?  It would fix this particular testcase, but I don't know how general that approach is on the web, and it wouldn't kick in if, say, you had just 1 pixel that was different than the others on an image.

Then again, given imagemaps have to go on images, fully transparent images are probably not that uncommon.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 24

•

17 years ago

Fully transparent spacer GIFs sized to a pretty large size are not that uncommon.

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Assignee

Comment 25

•

17 years ago

Attached patch apply same-pixel optimization to all images, not just 1x1 (obsolete) — Details — Splinter Review

Patch.  Gonna do some perf tests to make sure the effect isn't noticeable on normal images, but I don't think it will be -- we should abort pretty early on out of that loop on real images.

Stuart Parmenter

Comment 26

•

17 years ago

Comment on attachment 299957 [details] [diff] [review]
apply same-pixel optimization to all images, not just 1x1

given that the image data is guaranteed-aligned you could just loop through the whole image rather than going line by line....  you could also do this in parallel with some mmoy love.

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Assignee

Comment 27

•

17 years ago

Attached patch updated (obsolete) — Details — Splinter Review

Updated to do single loop; I'm not sure if using MMX/SSE here is worth it, because I think this type of thing should already be very fast -- and to get it faster we'd have to complicate this code a good chunk (at least two paths).  This patch gives me a very visible win on the testcase in this bug.

Attachment #299957 - Attachment is obsolete: true

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Assignee

Comment 28

•

17 years ago

Attached patch optimize same-color images — Details — Splinter Review

Ok, updated, now also correctly frees old surfaces so this could be a good memory win as well.

Attachment #299966 - Attachment is obsolete: true

Attachment #300523 - Flags: review?(pavlov)

Stuart Parmenter

Updated

•

17 years ago

Attachment #300523 - Flags: review?(pavlov) → review+

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Assignee

Updated

•

17 years ago

Blocks: 415854

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Assignee

Comment 29

•

17 years ago

Checked in.  Reopen please if this is still bad, but I expect it to be significantly better.

Status: NEW → RESOLVED

Closed: 17 years ago

Resolution: --- → FIXED

Asaf Romano (gone)

Comment 30

•

17 years ago

Backed this out (twice or maybe even trice) to figure out whether it's the cause for the orange, please re-land this once the tree is green again.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Assignee

Comment 31

•

17 years ago

Fixed; reftest failures were unexpected passes on win32, see my comments in bug 371316 comment 8.

Status: REOPENED → RESOLVED

Closed: 17 years ago → 17 years ago

Resolution: --- → FIXED

Boris Zbarsky [:bzbarsky]

Reporter

Comment 32

•

17 years ago

That patch helps a ton.  It's quite usable now.  Thank you!

Status: RESOLVED → VERIFIED

philippe (part-time)

Updated

•

17 years ago

Depends on: 429915

You need to log in before you can comment on or make changes to this bug.