Open
Bug 599616
Opened 14 years ago
Updated 2 years ago
Chrome Experiment: Depth of Field runs slower in Minefield than Chromium
Categories
(Core :: Graphics, defect)
Tracking
()
NEW
People
(Reporter: kpangilinan, Unassigned)
References
()
Details
(Keywords: perf, Whiteboard: [painting-perf][chromeexperiments] )
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:2.0b7pre) Gecko/20100916 Firefox/4.0b7pre Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.10) Gecko/20100914 Firefox/3.6.10 There is a significant performance issue with Minefield: Ball movements are very slow. Changing the angle with the mouse is unresponsive. Moving the window around will stall computer for a few seconds. Reproducible: Always Steps to Reproduce: 1. Open Depth of Field experiment 2. Try changing the view angle 3. Try moving the window around Actual Results: 1. Balls are very slow 2. Unresponsive to angle change 3. Computer stalls for a few seconds Expected Results: Smooth ball movements, responsive to angle change, and computer should not stall/freeze when moving the window around. In Chromium this experiement works perfectly.
Comment 1•14 years ago
|
||
Does this happen with Firefox trunk? JS has been majorly improved there. ftp://ftp.mozilla.org/pub/firefox/nightly/latest-mozilla-central/
Version: unspecified → 1.9.2 Branch
I *think* Ken was using the sept 16 build, from the User-Agent indicator, but the Build Identifier line is a little confusing in that context, it's true. I certainly see it with Firefox trunk as of yesterday's nightly. Tyler: do you not see it with your builds?
Comment 3•14 years ago
|
||
I am seeing variable performance. Some spots, such as the end, run at full speed, others don't run very well. However, I has several tabs and windows open, and this is not a fresh profile, so it's not the best platform for testing ATM
Comment 4•14 years ago
|
||
Shark shows that a large part of this is painting. 49.1% under gfxSurfaceDrawable::Draw.15% of total time in sseCGSBlendXXXX8888. Can somebody verify this?
Updated•14 years ago
|
Status: UNCONFIRMED → NEW
Ever confirmed: true
Whiteboard: [chromeexperiments]
Comment 5•14 years ago
|
||
Moving to Core:Graphics, as painting seems to take at least 60% on OS X. This demo runs much faster (smoother) in both Chrome and Safari.
Assignee: general → nobody
Component: JavaScript Engine → Graphics
QA Contact: general → thebes
Whiteboard: [chromeexperiments] → [painting-perf][chromeexperiments]
Comment 6•14 years ago
|
||
On OS X this is also a bit jerky in Chrome, but in Safari it's very fast. I wonder how this performs on Direct2D/IE9...
Comment 7•14 years ago
|
||
Performs poorly with D3D9 layers. It performs reasonable using D2D in Firefox, but has some artifacts. It performs better in IE9. I'll investigate this.
Comment 8•14 years ago
|
||
I might note IE9 does not oscillate the ball sizes, so it's not completely correct.
Comment 9•14 years ago
|
||
With Direct2D with D3D10, this performs 'alright' but not smooth! About 50-60% of the time is spent in JS, particularly in js_NativeSet(some 20%). Some 20% in total is spent inside nsCSSFrameConstructor::RestyleElement, of which the majority in ComputeStyleChangeFor. Some 30% is spent in painting, over half of which is spent in BuildContainerLayerFor.
Comment 10•14 years ago
|
||
> particularly in js_NativeSet(some 20%).
In or under? Over here I see time under it, in particular under nsDOMCSSDeclaration::ParsePropertyValue. That would be consistent with the page, say, setting style.top and style.left on DOM nodes. No obvious hotspots under here other than the silly before/after attr change notifications we have to send here and the actual parsing of the number, as usual.
Given that, the style reresolution you're seeing also makes sense....
The remaining "JS" time seems to be stubs::add (usual thing of doing |n + "px"| which involves number to string conversions), stubs::Mul (calling js::ValueToNumberSlow which calls js_strtod), getting .style off nodes, and some stuff like floor() and the like.
For me, (mac, testing without GL layers on for various reasons) painting is over 60% of the total time.
Comment 11•14 years ago
|
||
Note that that inline style stuff is something we've been optimizing for a while now. Webkit can do some of it faster because it just does it wrong in edg-y cases (see "attr change notifications" above).
Comment 12•14 years ago
|
||
(In reply to comment #10) > > particularly in js_NativeSet(some 20%). > > In or under? Over here I see time under it, in particular under > nsDOMCSSDeclaration::ParsePropertyValue. That would be consistent with the > page, say, setting style.top and style.left on DOM nodes. No obvious hotspots > under here other than the silly before/after attr change notifications we have > to send here and the actual parsing of the number, as usual. > > Given that, the style reresolution you're seeing also makes sense.... > > The remaining "JS" time seems to be stubs::add (usual thing of doing |n + "px"| > which involves number to string conversions), stubs::Mul (calling > js::ValueToNumberSlow which calls js_strtod), getting .style off nodes, and > some stuff like floor() and the like. > > For me, (mac, testing without GL layers on for various reasons) painting is > over 60% of the total time. What you're seeing seems to agree largely with what I'm seeing. The reason for the higher painting weight in your case is probably just D2D.
Comment 13•13 years ago
|
||
I profiled this again today, it seems that in current D2D builds this is pretty slow, we're spending a relatively high amount of time creating intermediate surfaces(and so textures) in something called 'CreateSamplingRestrictedDrawable' (gfxUtils.cpp:394). This probably needs some looking into if there's something better we can do there. That codepath is quite bad for any accelerated system.
Comment 14•13 years ago
|
||
If I comment out that codepath (using 'incorrect' sampling) I get much better perf, but it's still worse than IE9 or Chromium.
Comment 15•13 years ago
|
||
The other observation here is that we spend a lot of time creating new textures and copying stuff around, because rather than a ball being retained as a layer, it seems to get redrawn all the time (more often than I believe needed due to different sampling sizes), and we're getting different texture rects every frame causing texture recreation and copying around lots of data. See ThebesLayerD3D10.cpp:222, which is the path being hit.
Comment 16•13 years ago
|
||
This demo is using a single images with multiple pictures of the ball, each with varying levels of focus. It's then setting the style.background to this image and using style.backgroundPosition and style.width/height to only show one of the ball images. I think if we could get this to result in an ImageLayer, with the image only being a single texture, and just drawn with different texture coordinates, we would get a massive performance improvement. FrameLayerBuilder.cpp:ContainerState::ThebesLayerData::Accumulate and FrameLayerBuilder.cpp:ContainerState::ThebesLayerData::CanOptimizeImageLayer handle doing something similar to this for nsDisplayImage, and we'd need to extend this to also handle nsDisplayBackground. This could also probably be implemented by letting nsDisplayBackground directly create it's own layer, but I'm not sure how to detect that it is the only item in the ThebesLayer.
We could extend the nsDisplayImage code to handle other kinds of images. Or we could create an nsDisplayImage (or an extended version thereof) directly instead of nsDisplayBackground in certain cases. However, we'll run into the problem again that we need to restrict sampling of the source image to a particular rectangle and we don't have a good way to do that in layers :-(.
Comment 18•13 years ago
|
||
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #17) > However, we'll run into the problem again that we need to restrict sampling > of the source image to a particular rectangle and we don't have a good way > to do that in layers :-(. What's the complication here? As far as I know modifying ImageLayer::SetCurrentImage to include a source rectangle parameter should be easy. Unless I'm missing something, we already do this for PlanerYCbCrImages, where we upload the entire frame and only sample from the picture rect.
Comment 19•13 years ago
|
||
*Planar Never mind, forgot about resamping entirely :)
(In reply to Matt Woodrow (:mattwoodrow) from comment #18) > Unless I'm missing something, we already do this for PlanerYCbCrImages, > where we upload the entire frame and only sample from the picture rect. The difference is that with video images, we can assume that the pixels outside the sample rect are "pretty close" to the pixels inside the sample rect. With arbitrary CSS background images, we can't.
We could do something hairy like add a source rectangle parameter to SetCurrentImage and keep a cache of extracted subrectangles, by suballocating areas of a single large texture with thick EXTEND_PAD borders. Or something like that. Or, Bas could think of some magical way to make the GPU handle this for us :-).
Updated•2 years ago
|
Severity: normal → S3
You need to log in
before you can comment on or make changes to this bug.
Description
•