Closed Bug 606672 Opened 14 years ago Closed 5 years ago

Fennec scrolling code is too expensive.

Categories

(Firefox for Android Graveyard :: Panning/Zooming, defect)

ARM
Maemo
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: romaxa, Unassigned)

References

(Depends on 1 open bug)

Details

Attachments

(21 files, 2 obsolete files)

200.49 KB, text/plain
Details
1.74 MB, text/plain
Details
2.15 KB, patch
Details | Diff | Splinter Review
1.18 KB, text/plain
Details
208.99 KB, text/plain
Details
8.77 KB, text/plain
Details
466.88 KB, text/plain
Details
7.02 KB, text/plain
Details
215.73 KB, text/plain
Details
211.10 KB, text/plain
Details
1.50 KB, patch
mfinkle
: review+
Details | Diff | Splinter Review
2.92 KB, patch
mfinkle
: review+
Details | Diff | Splinter Review
4.25 KB, patch
Details | Diff | Splinter Review
1.31 KB, patch
mfinkle
: review+
Details | Diff | Splinter Review
2.00 KB, patch
Details | Diff | Splinter Review
71.04 KB, text/plain
Details
4.49 KB, text/plain
Details
51.22 KB, text/plain
Details
774.49 KB, text/plain
Details
528.20 KB, image/svg+xml
Details
263.06 KB, image/svg+xml
Details
I've measured scrolling speed on latest fennec (mc:a2f52283e534, mb:1fc62351ecc8) with layers, enabled methodjit for chrome.

I was building GTK fennec.

and found that fennec scrolling still lot more expensive comparing to microb (default browser).

Fennec  ~ 35-39 FPS
MicroB  ~ 50-52 FPS

for test was used:
fennec -url file:///usr/share 

and special program-wrapper which is measuring XSHMPutImage frequency.
With default browser we have lot of samples in CPU idle and maximum FPS rate, and for fennec we have lower FPS rate and almost no free CPU

samples %      image name              app name              symbol name
Default Browser MICROB
2711  30.1692  vmlinux-2.6.28-omap1    vmlinux-2.6.28-omap1  omap3_enter_idle
1791  19.9310  libtablet-browser-view  browser               scale_image_16(short*, int, int, int, sh
768    8.5466  libc-2.5.so             Xorg                  memcpy
164    1.8251  oprofiled               oprofiled             /usr/bin/oprofiled
77     0.8569  libpixman-1.so.0.17.3   browser               pixman_rasterize_edges
58     0.6454  vmlinux-2.6.28-omap1    oprofiled             __link_path_walk
55     0.6121  vmlinux-2.6.28-omap1    vmlinux-2.6.28-omap1  tick_nohz_stop_sched_tick
54     0.6009  vmlinux-2.6.28-omap1    vmlinux-2.6.28-omap1  tick_nohz_restart_sched_tick
52     0.5787  busybox                 busybox               /bin/busybox
51     0.5675  vmlinux-2.6.28-omap1    Xorg                  unix_poll
50     0.5564  oprofile                oprofile              /oprofile

Fennec Browser
666    6.9274  libc-2.5.so             Xorg                  memcpy
648    6.7402  libxul.so               fennec                pixman_composite_src_0565_0565_asm_neon
424    4.4102  libxul.so               fennec                pixman_composite_src_n_0565_asm_neon
183    1.9035  oprofiled               oprofiled             /usr/bin/oprofiled
98     1.0193  libxul.so               fennec                .plt
95     0.9881  vmlinux-2.6.28-omap1    vmlinux-2.6.28-omap1  omap3_enter_idle
91     0.9465  vmlinux-2.6.28-omap1    Xorg                  unix_poll
83     0.8633  vmlinux-2.6.28-omap1    Xorg                  do_select
73     0.7593  fennec                  fennec                malloc
64     0.6657  vmlinux-2.6.28-omap1    fennec                schedule
63     0.6553  oprofile                oprofile              /oprofile
Attached file Flat oprofile
checking fennec profile I  found some overhead from JS (mjit), .plt, and lot of other functions 0.5% each.. 

InlineGetProp(js::VMFrame&
js::mjit::stubs::UncachedCallHelper
js::Interpret
....
Attached file Callgraph profile
question is would it be possible to have scrolling/zooming code without JS interpret/methodjit overhead and which bugs need to be fixed before we can get that?

Does it make sense to keep critical functionality in native code?
Depends on: 606730
Attached patch Scroll testSplinter Review
we are loosing ~6 % only in JS code, and ~ 4% in wasting BgColor rendering

This patch is emulate scrolling without bunch of JS code, just scroll by 1 pixel up/down (activating with long tap)

with this test we have:
56 fps and ~16 % CPU idle 
56 fps and ~24 % CPU idle  - with bg color rendering disabled
Trace log
Attachment #485557 - Attachment description: trace log TMFLAGS=aborts → trace log TMFLAGS=aborts (x86)
This is oprofile with methodjit.chrome = false (only tracejit = true for content and chrome).

by some reason I  don't see any trace function in this build..
here is my mozconfig:
***************************
ac_add_options --enable-application=mobile
ac_add_options --enable-optimize
ac_add_options --disable-crashreporter
ac_add_options --enable-update-channel=nightly
ac_add_options --enable-update-packaging
ac_add_options --disable-debug
ac_add_options --enable-tests
ac_add_options --enable-codesighs
ac_add_options --enable-chrome-format=flat
ac_add_options --enable-optimize=" -g -O2 -fno-omit-frame-pointer "
ac_add_options --enable-default-toolkit=cairo-gtk2
export CFLAGS="-gdwarf-2"
export CXXFLAGS="-gdwarf-2"
export MOZ_DEBUG_SYMBOLS=1
ac_add_options --enable-debug-symbols="-gdwarf-2"
export MOZILLA_OFFICIAL=1
mk_add_options PROFILE_GEN_SCRIPT=@TOPSRCDIR@/build/profile_pageloader.pl
#mk_add_options MOZ_MAKE_FLAGS="-j4"
mk_add_options MOZ_OBJDIR="obj-fn-gtk-arm5"
ac_add_options --with-maemo-version=5
ac_add_options --disable-pedantic --with-arm-kuser
***************************

looks like we have only JS-interpretation in current GTK n900 builds
this trace log from arm.
trace log show only one new entry:

trace stopped: 12876: object used as index
Abort recording of tree resource://gre/modules/XPCOMUtils.jsm:127@19 at resource://gre/modules/XPCOMUtils.jsm:127@31: getelem.

which is pointing to:
http://mxr.mozilla.org/mozilla-central/source/js/src/xpconnect/loader/XPCOMUtils.jsm#127
generateQI: function XPCU_generateQI(interfaces) {
return makeQI([Ci[i].name for each (i in interfaces) if (Ci[i])]);
full TMFLAGS log
I've checked backtrace with simple scrolltest patch applied (https://bugzilla.mozilla.org/attachment.cgi?id=485556&action=edit).
breakpoint on js::Interpret show that interpretation done for scrollTimeout: function().
even if I don't do any scrolling, I  have js::Interpret breakpoint working for
nsUpdateTimerManager.js: notify: function TM_notify(timer)

also
nsPlacesDBFlush.js:notify: function DBFlush_timerCallback
nsPlacesDBFlush.js:XPCOMUtils.defineLazyGetter(this, "_db", function()

for normal (use scrolling) I have js::Interpret working at
chrome://browser/content/input.js", lineno = 118
MouseModule.prototype = {
  handleEvent: function handleEvent(aEvent) {


something really goes wrong here, and tracing does not work here, and I  don't understand why....
(In reply to comment #6)
> Created attachment 485557 [details]
> trace log TMFLAGS=aborts (x86)
> 
> Trace log

Looks like the code in the scrolling test should find a way to move the QI out of the loop:

frmldr =  getBrowser().QueryInterface(Ci.nsIFrameLoaderOwner).frameLoader;
(In reply to comment #11)
> even if I don't do any scrolling, I  have js::Interpret breakpoint working for
> nsUpdateTimerManager.js: notify: function TM_notify(timer)

Needed for update checking

> nsPlacesDBFlush.js:notify: function DBFlush_timerCallback
> nsPlacesDBFlush.js:XPCOMUtils.defineLazyGetter(this, "_db", function()

I believe this timer is fired to update the tempdb tables for Places

> for normal (use scrolling) I have js::Interpret working at
> chrome://browser/content/input.js", lineno = 118
> MouseModule.prototype = {
>   handleEvent: function handleEvent(aEvent) {
> 
> 
> something really goes wrong here, and tracing does not work here, and I  don't
> understand why....

Do you have an abort log for this one?
tracking-fennec: --- → ?
it is not about removing flush timer... it is about why this code not jitted? and why trace abort does not show any warnings...

does anyone have profile stuff on android? to verify that it is not my local problem?
tracking-fennec: ? → ---
ok, thanks... after landing bug 605415 and enabling methodjit for chrome jsInterpret has disappeared from profile...

Question is what else prevent us from enabling method jit by default?
Depends on: 605415
this and previous profile taken from latest central build, with disabled color-layer rendering and applied QT-RENDER_DIRECT patch.

approximately with and without mjit I have ~48-49 FPS scrolling
without mjit we have js::Interpret = 4.1337%
with mjit we have:
0.8303  anon (tgid:8125 range:0x49f00000-0x49f49000)
0.4940 InlineGetProp
0.4940 js::mjit::stubs::CallProp
0.4414 anon (tgid:8125 range:0x48b8c000-0x48c00000
0.3573 js::Interpret
~2.6%


should it be like this?
I have checked why js::Interpret with mjit enabled and gdb show this:

Breakpoint 3, js::Interpret (cx=0x43f77300, entryFrame=0x46900290, inlineCallCount=0, interpMode=JSINTERP_NORMAL) at /home/romaxa/mozdev/mozillahg/mozilla-central/js/src/jsinterp.cpp:2168
2168	Interpret(JSContext *cx, JSStackFrame *entryFrame, uintN inlineCallCount, JSInterpMode interpMode)
(gdb) bt
#0  js::Interpret (cx=0x43f77300, entryFrame=0x46900290, inlineCallCount=0, interpMode=JSINTERP_NORMAL) at /home/romaxa/mozdev/mozillahg/mozilla-central/js/src/jsinterp.cpp:2168
#1  0x413da7e8 in js::RunScript (cx=0x43f77300, script=<value optimized out>, fp=0x46900290) at /home/romaxa/mozdev/mozillahg/mozilla-central/js/src/jsinterp.cpp:637
#2  0x413db544 in js::Invoke (cx=0x43f77300, argsRef=<value optimized out>, flags=0) at /home/romaxa/mozdev/mozillahg/mozilla-central/js/src/jsinterp.cpp:740
#3  0x41526ce4 in js::mjit::stubs::SlowCall (f=@0xbeef8090, argc=1095920868) at /home/romaxa/mozdev/mozillahg/mozilla-central/js/src/methodjit/InvokeHelpers.cpp:197
#4  0x41520304 in SlowCallFromIC (f=@0x43f77300, ic=<value optimized out>) at /home/romaxa/mozdev/mozillahg/mozilla-central/js/src/methodjit/MonoIC.cpp:417
#5  0x4dafbf2c in ?? ()
Cannot access memory at address 0x1ffff8
(gdb) frame 2
#2  0x413db544 in js::Invoke (cx=0x43f77300, argsRef=<value optimized out>, flags=0) at /home/romaxa/mozdev/mozillahg/mozilla-central/js/src/jsinterp.cpp:740
740	        ok = RunScript(cx, script, fp);
(gdb) p *script
...
 filename = 0x48abf151 "chrome://browser/content/browser.js", lineno = 765, 

browser.js", lineno = 765, pointing to:
http://mxr.mozilla.org/mobile-browser/source/chrome/content/browser.js#794
I check after some investigation I found that if just call 
frameLoader.scrollViewportBy(x, y);
somewhere in beginning of browser.js:"dragMove:"  function,  and with disabled this._waitingForPaint handler check

then I  have ~60FPS

full fennec scrolling give us only ~36 FPS

so we just loosing half of scrolling performance in JS mouse move/kinetic handler code.

For example this 2 lines:
x = Math.floor(Math.max(0, Math.min(docWidth - viewportWidth, frameLoader.viewportScrollX + x)) - frameLoader.viewportScrollX);
y = Math.floor(Math.max(0, Math.min(docHeight - viewportHeight, frameLoader.viewportScrollY + y)) - frameLoader.viewportScrollY);

drop FPS rate by 5 FPS 55->50...
also I  found that we have very deep JS stack in scrolling code:
handleEvent -> _onMouseMove -> _doDragMove -> _dragBy -> dragMove -> _panScroller -> scrollBy(browser.js) -> scrollBy (browser.xml)

and each stack level remove ~1 FPS from our 60FPS range

so if I just move with hack frmldr.scrollViewportBy into handleEvent or _onMouseMove
I getting 60 FPS without any problem
Also one mega super slow functions is 
_panControlsAwayOffset:
panControlsAwayOffset - calculation and document.getElementById calls drops FPS from 59->55
hmm very strange but I tried to minimize function calls amount and include whole Browser.contentScrollboxScroller scroll functionality in one function, and found that this piece of code cost us ~12 FPS... don't know why.
*************************
      var x = {};
      var y = {};
      var x1 = {};
      var y1 = {};
      Browser.contentScrollboxScroller.getPosition(x, y);
      ///  frameloader scrollViewportBy
      Browser.contentScrollboxScroller.getPosition(x1, y1);
      doffset.subtract(x1 - x0, y1 - y0);
*************************
also commenting out of "doffset.subtract" line give us lot of FPS...

Also I  found that doffset.subtract even more expensive for JS with methodjit comparing to JS it tracejit.
OS: Linux → Maemo
Hardware: x86 → ARM
Ok I've separated subtract into next:
1)
var x2 = {};
var y2 = {};
2)
x2 = x1 - x0;
y2 = y1 - y0;
3)
doffset.x -= x2;
doffset.y -= y2;

and found that with 2) we have FPS drop
if 2) commented out and only 3) available, then perf is ok
(In reply to comment #25)
> Ok I've separated subtract into next:
> 1)
> var x2 = {};
> var y2 = {};

I assume this could be written as: var x2, y2;

We shouldn't need the {} in this change.
Ok, I  replaced:
- Browser.contentScrollboxScroller.getPosition(x, y);
+ let scroll = getBrowser().getPosition();
- Browser.contentScrollboxScroller.getPosition(x1, y1);
+ let scroll1 = getBrowser().getPosition();

doffset.subtract(scroll1.x - scroll.x, scroll1.y - scroll.y);

and it works fast
Attachment #490568 - Flags: review?(mark.finkle)
Comment on attachment 490568 [details] [diff] [review]
Scrolling pipeline optimize part1


>   /** Pan scroller by the given amount. Updates doffset with leftovers. */
>   _panScroller: function _panScroller(scroller, doffset) {
>-    let { x: x0, y: y0 } = Browser.getScrollboxPosition(scroller);
>+    let scroll = Browser.getScrollboxPosition(scroller);

use "scroll0", just so it's easier to tell them apart
Attachment #490568 - Flags: review?(mark.finkle) → review+
Attached patch Part 2Splinter Review
Attachment #490581 - Flags: review?(mark.finkle)
Comment on attachment 490581 [details] [diff] [review]
Part 2

>diff -r 673c3b2feb98 chrome/content/browser-ui.js

>   ["controls",           "browser-controls"],
>   ["panelUI",            "panel-container"],
>   ["viewBuffer",         "view-buffer"],

Please remove this one. It's not used in the code anymore

>   ["toolbarContainer",   "toolbar-container"],
>-  ["browsers",           "browsers"]
>+  ["browsers",           "browsers"],
>+  ["contentViewPort",    "content-viewport"],

"contentViewport" please
Attachment #490581 - Flags: review?(mark.finkle) → review+
Nice!
sad, our mouse move event handling code is lot more expensive comparing to RenderLayer code.

without rendering and mouse handling we have      ~46% CPU free
with rendering and without mouse handling we have ~30% CPU free
without rendering and with mouse handling we have ~21% CPU free

so, our GL rendering code using ~15% and at the same time 21% just for mouse move event handling (Fennec mouse module just return immediately without any handling)
I've replaced MouseMove event with Pinch gesture event, in order to drop some unexpected expensive mousemove listeners, and got +2% CPU idle.
ok, things looks really bad here, because we do next:
- call getBoundingClientRect (call FlushLayout)
 http://mxr.mozilla.org/mobile-browser/source/chrome/content/browser.js#1150
- scroll change parts of layout
 http://mxr.mozilla.org/mobile-browser/source/chrome/content/browser.js#1170
- call again getBoundingClientRect which flushLayout e.t.c
http://mxr.mozilla.org/mobile-browser/source/chrome/content/browser.js#841

This makes scrolling FPS drop 60->47 :(
I  think we should rewrite scrolling code, and make URLbar/sidebars position calculation without calling into expensive getBoundingClientRect, and calculate ahead how and what will be changed after calling scrollTo...

Also we should check for URLBar state and if it is not visible and scrolling from top->bottom, then we should not calculate anything at all.
Summary: GTK-Fennec scrolling too expensive, and slower comparing to N900 default browser → Fennec scrolling code is too expensive.
Attachment #491515 - Flags: review?(mark.finkle)
Comment on attachment 491515 [details] [diff] [review]
Remove getBoundingClientRect from scrollBy

Did you test that the values coming from boxObject are the same as getBoundingClientRect?

Have you done a good amount of scroll testing on different web pages?

r+ if the answers are "yes"
Attachment #491515 - Flags: review?(mark.finkle) → review+
Argh, boxObject is evil and must die, and the fact it doesn't flush is really a bug.

Is there some other way to resolve this problem?
Cache bcr. Browser element only uses width/height, so as long as browser doesn't resize it is valid.

If only there was an element resize event we could listen for in browser...

But perhaps Fennec could just add its own custom browser resize event that the browser element could listen for.
> Is there some other way to resolve this problem?
we need to rewrite this code in such way that getBoundingClientRect never called in onmousemove event.

so we can even remember rect in dragStart function
This is seems better version which just remember boundRect on dragStart and reset on dragStop

Previous patch with boxObject seems breaking kinetic scrolling
Attachment #491515 - Attachment is obsolete: true
Attachment #492047 - Flags: review?(mark.finkle)
Also I found that we have 2 most expensive functions executed in layout which handling mouseMove:
1) nsIFrame::BuildDisplayListForChild  ~4% of libxul.so oprofile
is there are any existing perf patches of bugs related to BuildDisplayListForChild functionality?

2) nsBoxFrame::GetMouseThrough() const ~3% of libxul.so oprofile
roc: would it be possible to cache somehow GetMouseThrough value, and clear that cache when frame state changes?
Are the patches in r+'ed attachments still worth landing?

attachment 490568 [details] [diff] [review]
attachment 490581 [details] [diff] [review]

Both patches are low risk, so if they make a difference in scrolling performance, we should land them (with nits addressed) for b3.

I guess attachment 491351 [details] [diff] [review] is the combined patch with most nits addressed
We could try using a frame state bit instead of calling GetMouseThrough (which is virtual). That might help.

We don't have any optimizations for BuildDisplayListForChild, and it's probably hard to optimize well. Then again, 4% of the profile isn't a lot.
Comment on attachment 492047 [details] [diff] [review]
Another version of getBoundingClientRect remove from mousemove

Does not make sense to do this small fix... we need to rework scrolling code
Attachment #492047 - Flags: review?(mark.finkle)
I removed all boundClientRects and made some optimized scrolling test page.. and we still consume ~25% CPU in libxul/layout display list builders... comparing to simple QT/Gtk/Qt-QML embedding browser (only ~5% CPU) it is too much.

Can we do anything about buildDisplayListForContext functionality, cache it or something like that?
Also I see lot of nsDisplayList::HitTest calls on each motion event which is always returning the same nsIFrame target 

roc: cjones: do you have any ideas about what could be optimized in this profile, and what we can drop for fennec UI/remoteViewport scrolling case?
While we do viewport scrolling with simple mouse move then we have:
BuildDisplayListForStackingContext for ~6 frames coming from
http://mxr.mozilla.org/mozilla-central/source/layout/base/nsPresShell.cpp#6551

+ repeat of BuildDisplayListForStackingContext coming from 
http://mxr.mozilla.org/mozilla-central/source/layout/base/nsLayoutUtils.cpp#1309

and then we repeating for each mouse move
nsPresShell.cpp#6551 with different move coordinates, and always getting same frame target (RemoteViewportFrame)
and nsLayoutUtils.cpp#1309 with exactly the same dirtyRect area.
I think one option is Create some layout API and do targetFrame cache in
nsLayoutUtils::GetFrameForPoint (between mousePress/MouseRelease)

Also it make sense to cache main view displayList between mousePress/Release and don't waste CPU on rebuilding display list on each mouse move event (60 times per second.)
I don't want to start caching display lists. That will add even more overhead and complexity, making other things slower.

Instead we should try to make display list construction faster. For event processing we only need to traverse a small number of frames, it should be fast.
> Instead we should try to make display list construction faster. For event
> processing we only need to traverse a small number of frames, it should be
> fast.

I think number of frames is already small enough (only UI elements).
the problem is that we are processing these frames using full HTML/SVG/XUL build display path, which is not very cheap...
So "make display list construction faster." - is something not very clear for me here. could you clarify how we can make it faster for certain UI handling cases?
Profile it carefully and figure out how to make the code paths shorter.

For event handling, I'm guessing we spend quite a lot of time scanning the child lists of XUL frames in nsBoxFrame::BuildDisplayListForChildren, trying to find the child or children that overlap the dirty rect. Maybe we can reorder stuff in BuildDisplayListForChild so that the "does not intersect" case is reached faster. But that guess needs to be verified with profiling.
> find the child or children that overlap the dirty rect. Maybe we can reorder
> stuff in BuildDisplayListForChild so that the "does not intersect" case is

don't understand what are you suggesting here. is it something simple to try or require first full understanding how BuildDisplayListForChild works in all details? (don't know that code for now)
I tried to disable some functions and checked CPU usage:

  function commented          CPU %     CPU idle%
- no function               - 23.6176   0
- GetEventTarget            - 20.6176   3
- list.PaintRoot            - 17.2733   18 (high jump because rendering was gone, no GPU/sgx drivers involved)
- nsLayoutUtils::PaintFrame - 14.575    18

the rest of libxul.so CPU consumption is in attached opreport

~3.7% spent in functions "js::" ~2% spent in GetProp related functionality, seems JS code need some variable cache.
Attached image libxul opreport graph (obsolete) —
Ok with callgraph I found one problem with WidgetQt, which was calling setCursor repeatedly even if cursor all the way the same...
and I got +8% CPU free!
Depends on: 621931
(In reply to comment #56)
> Created attachment 500169 [details]
> no rendering, no BuildDisplayList calls
> 
> I tried to disable some functions and checked CPU usage:
> 
>   function commented          CPU %     CPU idle%
> - no function               - 23.6176   0
> - GetEventTarget            - 20.6176   3

This result is a bit uncertain because going from 0% idle to 3% idle might mean we're painting more frames.

The profile graph shows 17% of the profile in nsLayoutUtils::GetFrameForPoint, which seems like a lot. Unfortunately the call stacks for nsIFrame::BuildDisplayListForStackingContext for painting and event handling got merged, so it's not easy to see what's going on in the event handling path.

The thing to do here might be to write a microbenchmark using elementFromPoint that tests finding an element in a stack of XUL boxes, and profile/optimize that.
Attachment #500201 - Attachment is obsolete: true
Depends on: 621984
> This result is a bit uncertain because going from 0% idle to 3% idle might mean
> we're painting more frames.
in all cases we are painting 60 FPS, that is limited with vsync
Depends on: 622090
Depends on: 623485
Depends on: 623487
Comment on attachment 491515 [details] [diff] [review]
Remove getBoundingClientRect from scrollBy

I know roc doesn't like boxObject, but it works and we seem to use boxObject.width and boxObject.height all over mozilla-central. Let's move forward with this patch in the interim.
Attachment #491515 - Attachment is obsolete: false
Attached patch Patch for _panControlsAwayOffset (obsolete) — Splinter Review
Oleg asked me to get rid of getBoundingClientRect calls in _panControlsAwayOffset. 

Here is the first version of patch for feedback. The implementation is not bullet proof yet, there is some cases where calculations goes a bit wrong. 

But anyhow, to be able to get rid of those getBoundingClientRect calls, the only possible approach I found is to maintain few variables for current visibility of side and top bars and according to those try calculate out how much certain side bar pans away.

I compared how much time is spent in both implementations, in the new one and in old one. According to my measurements new implementation for vertical panning is ~70% faster and for horizontal panning ~35% faster.

I really appreciate for feedback and suggestions.
Attachment #503138 - Flags: feedback?(mark.finkle)
> Patch for _panControlsAwayOffset

err, we have separate bug for this problem already bug 622090
Comment on attachment 503138 [details] [diff] [review]
Patch for _panControlsAwayOffset

>err, we have separate bug for this problem already bug 622090
Moved the patch to correct bug, and set this as obsolete.
Attachment #503138 - Attachment is obsolete: true
Attachment #503138 - Flags: feedback?(mark.finkle)
Comment on attachment 524580 [details]
Scrolling callgraph for latest tip

The graph has the call sequence PresShell::DispatchSynthMouseMove, nsViewMananger::DispatchEvent, nsViewManager::Refresh, but I don't think that is possible. The DispatchEvent call from DispatchSynthMouseMove can't call Refresh because it only dispatches a mouse move event, not paint events.
Depends on: 657844
Depends on: 657859
Ok, one more measurement:
with fennec scrolling code disabled, only this._dragger._contentView.scrollBy(0, 1); in onMouseMove call we have:
MouseEventSendTime: CurreTime:1297843735.139312
ScrollByStart: CurreTime:1297843735.143066, prevDiff:0.003.754 - 3.7ms for dispatching mouse move event and call contentView.scrollBy
Invalidate: CurreTime:1297843735.143493, prevDiff:0.000427
ScrollByFinished: CurreTime:1297843735.143615, prevDiff:0.000122
PaintStart: CurreTime:1297843735.146362, prevDiff:0.002747
PaintFinished: CurreTime:1297843735.150909, prevDiff:0.004547 - 4.5 ms for painting
Full paint round ~12ms

With fennec JS scrolling code enabled:
MouseEventSendTime: CurreTime:1297850461.586303
ScrollByStart: CurreTime:1297850461.597076, prevDiff:0.010773 - ~11ms for processing mousemove event, practically ~7ms spent in expensive js code
Invalidate: CurreTime:1297850461.597412, prevDiff:0.000336
ScrollByFinished: CurreTime:1297850461.597503, prevDiff:0.000091
PaintStart: CurreTime:1297850461.603759, prevDiff:0.006256
PaintFinished: CurreTime:1297850461.607635, prevDiff:0.003876
Found that we sending TapMove event for each mouseMove... is that really need to be un-conditional? can we init that event only when it is time to send that event?
wes ^
Not quite sure what you mean by "init that event only when it is time to send that event". We use TapMove events for two things I know of. Moving things in the UI (right now that's just selection markers, but we'll probably eventually allow things like drag and drop reordering of tabs, moving sliders, etc.) and for sending touchevents to the child process.

Getting around the first shouldn't be hard. We may be able to user our dragging code a bit more efficiently, or implement something similar but more efficient.

Currently, we check if we think the child process might have touchevent listeners on touchdown and use that to determine whether we need to wait and watch for preventDefault before scrolling pages. We could make that better (bug 654129):

http://mxr.mozilla.org/mozilla-central/source/mobile/chrome/content/browser.js#1956

We could backport that... somehow... to input.js in order to prevent even sending the TapMove events at all. We'd need to be careful about how we do that since we've been trying to keep input.js isolated from the browser. I'm not sure from those numbers how much win there is there.
Yes, it would be nice to not send TapMove events when:
1) page does not have mouse/touch-move listeners.
2) first TapMove event was not handled during one press->move->release iteration. (we drag element  which does not handle mousemove events)

I'm checking MouseMove->Paint path, which is need to be <=16ms. and now we have that time equals ~20-25ms. Every such possible fix will not give use obvious significant win, but 10 such tweaks will make us close to target value.
Closing all opened bug in a graveyard component
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.