Closed Bug 636585 Opened 13 years ago Closed 11 years ago

1 CPU core maxed while viewing BMC remedy (uses a flash plugin)

Categories

(Core Graveyard :: Plug-ins, defect)

x86
Windows 7
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: jack, Unassigned)

User-Agent:       Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b12pre) Gecko/20110216 Firefox/4.0b12pre
Build Identifier: http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-win32/1297896459/firefox-4.0b12pre.en-US.win32.installer.exe

We have this enterprise ticketing system at work, and it started spiking my CPU with recent minefield builds. Since I'm on a dual core, it goes to 50% and stays there, and there is a big delay when trying to switch to another tab or open any of firefox's menus, or using any of the form widgets in the ticketing system. I narrowed it down using tinderbox builds to find out roughly when it started.

The last build that worked fine was this:
http://hg.mozilla.org/mozilla-central/rev/52246c1b1799 - 1297892857

And the first one (and every one since) that spikes the CPU is:
http://hg.mozilla.org/mozilla-central/rev/0f777e59d48c - 1297896459
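
The trailing numbers on the two builds above look like Unix timestamps taken from the tinderbox build directory names. That reading is an assumption; the report itself does not say so. Under that assumption, a minimal Python sketch shows how narrow the window between the last good build and the first bad build is (the function names are illustrative only):

```python
from datetime import datetime, timezone

# Build IDs copied from the report. Treating them as Unix epoch seconds
# is an assumption, not something the report states.
LAST_GOOD = 1297892857   # rev 52246c1b1799
FIRST_BAD = 1297896459   # rev 0f777e59d48c


def build_time(build_id: int) -> datetime:
    """Interpret a tinderbox build ID as seconds since the Unix epoch (UTC)."""
    return datetime.fromtimestamp(build_id, tz=timezone.utc)


def regression_window_seconds(good: int, bad: int) -> int:
    """Length of the window in which the regressing change must have landed."""
    return bad - good


if __name__ == "__main__":
    print("last good:", build_time(LAST_GOOD).isoformat())
    print("first bad:", build_time(FIRST_BAD).isoformat())
    print("window (s):", regression_window_seconds(LAST_GOOD, FIRST_BAD))
```

If the IDs really are timestamps, the two builds are about an hour apart, which keeps the list of candidate changesets short.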

This is a really unsatisfying bug report, but I figured it couldn't hurt too much to try sharing what I have seen, and hope maybe someone has some advice on collecting info on how to reproduce the problem.

I think this is related to plugins, because I noticed, on one of the broken pages, that the CPU goes back to normal when I click a button to hide this empty box that supposedly holds an invisible Flash widget. The hg log shows changes related to bug 629799 and bug 626602 during the time between good build 52246c1b1799 and CPU-spiking build 0f777e59d48c, so that seems to add up, for what it's worth.

It's a really huge HTML page with ridiculous quantities of completely unreadable javascript that still won't run without this 2nd data channel back to the system, so I can't save a page to show you. I think a screenshot would be worthless, because the visible portion of the page looks kind of normal and harmless, and the flash widgets that are embedded in the page are invisible unless I uninstall or misconfigure flash, in which case I get the gray box showing the plugin couldn't be started.

This web app doesn't spike the CPU in older builds of minefield, firefox, IE, chrome or opera, so I don't think it's likely a "broken as designed" kind of situation. Otherwise, I'd be inclined to blame this ticket system rather than firefox.

Reproducible: Always

Steps to Reproduce:
1. ?? I can reproduce it every time, but I don't know how anyone else outside my company could.
This is from bug 606602. Can you paste the graphics information from about:support? I'd like to know what kind of acceleration we're using.

Also, what size is the plugin in question? If it's really huge (larger than a single page), we might be having some problems with painting more than we need to.
bug 626602, that is.
Blocks: 626602
Is this what you were looking for?

Adapter Description: ATI Radeon HD 4800 Series
Vendor ID: 1002
Device ID: 9442
Adapter RAM: 1024
Adapter Drivers: atiu9p64 aticfx64 aticfx64 atiu9pag aticfx32 aticfx32 atiumd64 atidxx64 atidxx64 atiumdag atidxx32 atidxx32 atiumdva atiumd6a atitmm64
Driver Version: 8.712.0.0
Driver Date: 3-2-2010
Direct2D Enabled: Blocked on your graphics driver. Try updating your graphics driver to version 10.6 or newer.
DirectWrite Enabled: false (6.1.7601.17514, font cache n/a)
WebGL Renderer: Google Inc. -- ANGLE -- OpenGL ES 2.0 (ANGLE 0.0.0.541)
GPU Accelerated Windows: 0/1

I'm surprised that it's so far out of date. I thought I'd updated more recently than that.

I am really unclear on what exactly the flash plugin even does. I think one of these ticket pages has one (the "Change" form, where I can work around the CPU problem by hiding the flash plugin), and it's a small area, maybe 250x80, devoted to it. The other one ("Incident form") has three or more, and I can't even see where they are; I think they're hidden on another layer or something strange. A few months ago, when I was having some kind of flash problem, the gray error boxes were about 300x200, but before and since then, when it's been working, there's nothing visible except the same HTML form widgets in that space.
I didn't update my graphics driver (yet) in case you might find it useful to have a willing tester with a relatively easily-breakable configuration, but I did confirm that 
gfx.direct2d.force-enabled = true 
is a viable workaround to the CPU issue.
Version: unspecified → Trunk
At this point I suspect that we aren't going to fix this issue, whatever it is. The only thing I'm worried about is that the non-accelerated readback path is very slow, or constantly looping, in ways that the accelerated readback path isn't. cjones/bas, what do you know about the non-accelerated readback path? Is there some sort of logging that Jack can do with a release build that would help us figure out whether this is something important?

Jack, any way to get a reduced/public testcase would be useful. I'm presuming that there is an <embed type="...flash" wmode="transparent"> on the page, which you could probably verify using DOM inspector.
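
For anyone who would rather check a saved copy of the page than click through DOM Inspector, here is a rough Python sketch of the same check. It only looks at static markup, so it will miss plugins that the page's javascript inserts at runtime. Matching on the substring "flash" in the type attribute is an assumption made for this sketch (the full type string is elided above), and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class FlashEmbedFinder(HTMLParser):
    """Collect the attributes of <embed> tags whose type mentions flash."""

    def __init__(self):
        super().__init__()
        self.matches = []  # one attribute dict per matching <embed>

    def handle_starttag(self, tag, attrs):
        if tag != "embed":
            return
        # Attribute names arrive lowercased; values may be None.
        attr_map = {name: (value or "") for name, value in attrs}
        # Assumption: any type containing "flash" counts as a Flash plugin.
        if "flash" in attr_map.get("type", "").lower():
            self.matches.append(attr_map)


def find_flash_embeds(html_text):
    """Return attribute dicts for every flash <embed> in html_text."""
    finder = FlashEmbedFinder()
    finder.feed(html_text)
    finder.close()
    return finder.matches


def is_transparent(attr_map):
    """True when the embed asks for wmode="transparent" (case-insensitive)."""
    return attr_map.get("wmode", "").lower() == "transparent"
```

Calling find_flash_embeds() on the saved HTML and then is_transparent() on each result would show whether any plugin on the page is using transparent mode, and the width and height attributes in the same dict give its size.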
Jack, when the CPU is spiking, which specific process(es) are using the most?  In the task manager, you should see a firefox-bin.exe which is the main browser process, and then plugin-container.exe process(es) in which plugin code runs.  We need to know which is using more CPU.

Beyond that, you'd need to install a debug build like [1] for us to get diagnostic information.  A couple of things that would be helpful are
 - load firefox with NSPR_LOG_MODULES="IPCPlugins:5;Layers:5" in the environment, and then open the page that's slow for a brief period of time.  This will generate a ton of output so you'd want to redirect it to a file.  Then, send us the file and we can analyze.
 - if you're comfortable with profiling tools, follow the instructions here [2] to get some approximate profiling data.  (Approximate because it's a debug build.)

[1] http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-win32-debug/1298660553/firefox-4.0b13pre.en-US.win32.installer.exe
[2] https://developer.mozilla.org/En/Profiling_with_Xperf
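
The logging step above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and both paths are hypothetical placeholders, and it assumes the debug build writes its NSPR log output to stdout/stderr so that redirecting those streams captures it.

```python
import os
import subprocess

# Module list quoted from the instructions above.
LOG_MODULES = "IPCPlugins:5;Layers:5"


def logging_env(base_env=None, modules=LOG_MODULES):
    """Return a copy of the environment with NSPR_LOG_MODULES set."""
    env = dict(os.environ if base_env is None else base_env)
    env["NSPR_LOG_MODULES"] = modules
    return env


def run_with_logging(firefox_path, log_path):
    """Start the browser with logging on and send its output to log_path.

    firefox_path and log_path are placeholders supplied by the caller.
    Returns the Popen object so the caller can wait for the browser to exit.
    """
    log_file = open(log_path, "wb")
    try:
        return subprocess.Popen(
            [firefox_path],
            env=logging_env(),
            stdout=log_file,
            stderr=subprocess.STDOUT,  # merge stderr into the same file
        )
    finally:
        # The child keeps its own copy of the handle; the parent's can go.
        log_file.close()
```

Because the output is large, it is worth closing the browser soon after the slow page has been open for a short while, then attaching the resulting file to the bug.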
But yeah, a small-ish testcase would be most useful of all.
Status: UNCONFIRMED → NEW
Ever confirmed: true
I'll start by apologizing because I haven't done any of the things you said, at least not yet, but I did find out a couple things.

1) The CPU usage is in firefox.exe (no -bin on windows), while plugin-container's is about normal for a flash video playing, or slightly less due to contention.

2) I found a public website that can give me the CPU & responsiveness problem.
http://www.funnyordie.com/videos/ef668caf14/drunk-history-vol-6-w-john-c-reilly-crispin-glover

3) I found out that I can simulate the bug behavior even after updating my driver (I had been putting off the optional windows update for it), by setting layers.acceleration.disabled = true. Just to make sure that it's the same thing, I tried that with a build from before the changes on the 16th, and the video flickers but the CPU & unresponsiveness does not occur -- so I think it's safe to say it's sort of the same thing. (disabling direct2d works around that flickering, btw.)
When I was using gfx.direct2d.force-enabled with the old driver, I had a pretty bad lock up, and I have no way of knowing whether there was any causal relationship -- for all I know, that's why firefox doesn't accelerate with that older driver. I've had veeeery rare system problems on this PC, but not zero.

4) I found that the severity of the CPU problem depends on how much other stuff is on the page. Before I spotted the same problem on funnyordie, I was trying to make a smaller test case from my work's ticket system, by using Aardvark and DOM Inspector to delete page elements, and as I deleted them, the CPU usage went down. If I clear out everything but the video player on funnyordie.com, the CPU gets down under 5% and firefox's menus are only slightly less snappy than usual.

I'm about to read through the links you gave me about debug builds and profiling. I presume it's something I'll be able to do.

My problem is solved, by updating my driver so acceleration can be enabled. Would it be correct to presume that there are likely to be many others that would be affected?
Just to be clear, you still see the bug with layers.acceleration.disabled AND with direct2d disabled as well? (Accelerated layers disabled and direct2d enabled is not a supported configuration.)
So much for my public example - funnyordie is fine with both disabled, but my original problem, the ticket system, is still doing the same thing whether direct2d is disabled or not, as long as layers.acceleration is disabled. I guess it's not the same thing.

I'm still working on preparing an uploadable test case as I have time. I have a pretty small example myself, but I think I still have to re-create something similar because that ticket system's HTML & CSS is "confidential information" and I am bound by license terms.
(In reply to Jack Eidsness from comment #10)
> I'm still working on preparing an uploadable test case as I have time. I
> have a pretty small example myself, but I think I still have to re-create
> something similar because that ticket system's HTML & CSS is "confidential
> information" and I am bound by license terms.

(caught my eye because I used Remedy in the past)

Jack, were you able to create a testcase?
If the problem is gone, please close the bug as WORKSFORME.
Flags: needinfo?(jack)
Keywords: testcase-wanted
I don't have access to a remedy server anymore, so if this bug still exists, it's somebody else's problem. The funnyordie video loads fine.
Status: NEW → RESOLVED
Closed: 11 years ago
Flags: needinfo?(jack)
Resolution: --- → WORKSFORME
Product: Core → Core Graveyard