Open Bug 536910 Opened 10 years ago Updated 1 year ago

[meta] Too many ID-less form elements makes Firefox slow

Categories

(Firefox :: Session Restore, defect)

x86
All
defect
Not set

Tracking

()

People

(Reporter: mcepl, Unassigned)

References

(Depends on 2 open bugs, Blocks 1 open bug)

Details

(Keywords: meta, perf, Whiteboard: [fxperf:meta])

Attachments

(6 files)

(originally filed as https://bugzilla.redhat.com/show_bug.cgi?id=518515, whole story and more details are there)

After installing Fedora 11, the new version of Firefox (xulrunner-1.9.1.6-1.fc12.i686
 and firefox-3.5.6-1.fc12.i686) has serious performance problems compared to Firefox 3.0 in Fedora 10.

I am a web developer and some pages of an enterprise application I'm working on are showing serious performance degradation while being used.  The problem is not during the page loading or page rendering but while the user is interacting with the form elements in the web page.  Every once in a while, when the user tries to fill in a form input (eg. text box, radio button, select or check box) the browser freezes for about 3 or 5 seconds consuming 100% of the CPU, it doesn't happens all the time but often enough to be very annoying.

The problem is specially bad with big and complex pages, for example, I have a web page with well over 700 form elements (including hidden input elements) and some of the elements have JavaScript code used to validate input data or to perform calculations while the user is typing.

But the great news is that I managed to make a simple test case with 100 check boxes and with no JavaScript at all.  I'll attach the test case later.

The problem is a lot more obvious on a low spec computer.  My laptop is a Core Duo 1.8 GHz so the problem is not that notorious.

At work I have an old workstation, Intel(R) Pentium(R) 4 CPU 2.80GHz so the slowness is more evident.  The problem is that this is a Fedora 11 system but the Firefox version is the same as Fedora 12 so I guess it's ok.  Do you want me to do more testing here? maybe a sysprof log with debugging packages?

Please remember to cross reference the upstream bug report with this one.
Attached file reproducer
emphasizing ... this has absolutely nothing to do with Javascript ... reproducible on the low-end hardware with plain HTML.
see bug 148338 for huge tables (50,000 rows) and bug 148636 for lots of form
controls
Matej, does changing the "browser.sessionstore.privacy_level" preference to the integer 2 make the problem go away?  This is sounding a lot like bug https://bugzilla.mozilla.org/show_bug.cgi?id=477564 (certainly that's what the testcase looks like).  But that claims to be fixed in 1.9.1.3, by capping to 100 elements.  Since you only have 100 elements, it might be that in your case even doing 100 elements takes too long....

In any case, can you please check the effect of that preference?
Asked our reporter for reproducing
Hello, I'm the original reporter of this bug in the Red Hat database.

Changing the value of "browser.sessionstore.privacy_level" to 2 seems to fix the problem (I'm testing with Firefox 3.5.6 from Fedora 11).  I reproduced the problem under valgrind just to see the slowness very clearly.  When it's set to 1 it takes almost one minute for valgrind to stop using the whole CPU on a single click of the check box, when I change the value to 2 it just takes 1 second, basically the slowness is fixed.

So, how come this is still a problem if bug #477564 is fixed? my problem is reproducible even on fast computers, so the capping to 100 elements doesn't seem to work here.

The thing is that I'm not really convinced that the problem is merely related to check boxes, may be the problem has to do with so many items having the same name.  In the Red Hat bug I reported that I changed the last 99 check boxes of my test case to text boxes (with the same name) and leave the first one as a check box (with another name) and still the bug is reproducible.

I then made a quick test at my office with the web application and I can reproduce the same problem changing the focus between radio buttons with the keyboard (The classical Yes/No question).

I'll try to make a test case with a page from my application and try to reproduce the problem.  In this case my app has form controls of almost every kind: text boxes, combos, radio buttons, check boxes and a few buttons.  There is no Flash involved.  And the names are not the same, they follow this pattern: "answer[XXXXX][code]" or "answer[XXXXX][value]" where XXXXX is the id of the question in the database.  I use "[" and "]" in the name to process it as an array in PHP.
Mmmmm... it just occurred to me that this is why I don't use the new interface in Yahoo Mail because I have tons of mails there and every single mail has a check box and that's why I think the new interface is awfully slow (all the messages are shown on a scrollable pane).

Even the old interface sometimes is slow because I configured it to show 200 messages per page.

I'll test the new Yahoo Mail interface with the privacy_level set to 2 and report back any findings.
> Changing the value of "browser.sessionstore.privacy_level" to 2 seems to fix
> the problem 

OK.  Over to session restore, then.

> I reproduced the problem under valgrind just to see the slowness very
> clearly. 

The capping introduced in bug 477564 was designed based on normal execution speeds, not valgrind ones.  Sessionstore _does_ still take time; the idea was to make it not normally be user-perceivable as laggy.  Since valgrind introduces about a 100x slowdown into the system, typically, it can easily move things from imperceptible to quite noticeable.

That said, the cap there was an absolute cap on the number of controls we generate XPath expressions for.  If in your case each XPath expression takes a long time to generate, for some reason, then 100 might not be a low enough cap...

> So, how come this is still a problem if bug #477564 is fixed?

Presumably because it's a somewhat different problem; that's why I didn't mark it duplicate.  :)

> my problem is reproducible even on fast computers

On your original page, or on the one attached to this bug?  I'm not seeing the problem on the testcase attached to this bug...  The computer over here is an 18-month-old laptop (reasonably high-end laptop back then, granted).

> I'll try to make a test case with a page from my application and try to
> reproduce the problem.

That would be very very helpful!
Component: Layout: Form Controls → Session Restore
Product: Core → Firefox
QA Contact: layout.form-controls → session.restore
Just a quick question, what features do I lose if I leave the privacy_level set in 2?
Session restoration won't restore the values of form controls.  The default value of 1 restores form control values only on non-SSL pages.  Setting it to 0 restores form control values on all pages.

This mostly matters if the browser crashes while you're typing something in a textarea.  ;)
Against which version of Firefox was this filed?
I have been testing this with Firefox 3.5.6 in both Fedora 11 and Fedora 12.
Is this an issue on trunk and 3.6?

You can find builds for those here:

ftp://ftp.mozilla.org/pub/firefox/nightly/latest-mozilla-central/ <-- trunk

ftp://ftp.mozilla.org/pub/firefox/nightly/latest-mozilla-1.9.2/ <-- 3.6
Version: unspecified → 3.5 Branch
(In reply to comment #12)
> Is this an issue on trunk and 3.6?

It's likely still a problem on trunk & 3.6 since the code for that hasn't changed.

FWIW the max of 100 nodes from bug 477564 is only nodes that don't have an id and thus have to create the XPath expression for.

William - I'd be interested to know if the problem goes away if you gave all of the fields ids.

So perhaps the ideal (but slightly difficult) solution would be to update form data just for the node that fires the input event instead of trashing the form data we have and gathering it all again (when it's highly unlikely that each form field has changed). This might have been suggested in bug 477564 or somewhere else. This might also be overkill since the vast majority of forms don't have 100+ form fields.
Version: 3.5 Branch → unspecified
(In reply to comment #13)
> This might also be overkill since the vast majority of forms
> don't have 100+ form fields.

What about web mail clients with thousands of mails and each one with a check box right on the left?  The problem with this bug is that this seems to be a regression, 2.X series of Firefox were faster with huge forms and now that Firefox 3.X is supposed to perform better, it feels completely the opposite.  There are some oddities when you are scrolling slowly with the wheel of the mouse and all of a sudden it decides to block for a few seconds, it's not encouraging and other people is reporting this problem too:
http://www.gnome.org/~michael/blog/2009-12-03.html

I'll try the ids trick to see if it solves the problem, I did read something about it in the other bug report.
(In reply to comment #12)
> Is this an issue on trunk and 3.6?

I just tested both, 3.6 and trunk and the problem is still there.  Disabling the privacy_level fixed the problem in both.

Since I'm now at home running on a Intel Core 2 Quad Core I had to write a little C program with an infinite loop in it and run it 8 times in the background to slow things down a bit to notice the problem.
Attached file Reproducer with ids
(In reply to comment #13)
> William - I'd be interested to know if the problem goes away if you gave all 
> of the fields ids.

I just did it and in fact the problem goes away.  I couldn't reproduce the slowness with privacy_level set to 1.

Would you explain this to me a little bit more? how come the ids fixes the problem?


> So perhaps the ideal (but slightly difficult) solution would be to update form
> data just for the node that fires the input event instead of trashing the form
> data we have and gathering it all again (when it's highly unlikely that each
> form field has changed). This might have been suggested in bug 477564 or
> somewhere else. This might also be overkill since the vast majority of forms
> don't have 100+ form fields.

May be it's not overkill, a few reasons comes to my mind:

1) There is no reason to waste CPU.

2) This bug seems to me like an scalability problem that could have deeper repercussions in the future.

3) Nowadays web apps are very complex and it's completely valid to have huge forms partially hidden in a tabbed interface where just a subset of the form elements are visible to the user.

4) Embedded devices are, generally, under powered.  Besides, battery life is an issue here.
(In reply to comment #16)
> Would you explain this to me a little bit more? how come the ids fixes the
> problem?

It's actually the XPath generation that is the expensive part. IDs are by definition supposed to be unique, so we have an identifying attribute there. Without the ID we need to generate something to identify the field, and that something is the XPath expression.

The code that does the XPath generation is here: http://hg.mozilla.org/mozilla-central/file/ba8780007126/browser/components/sessionstore/src/nsSessionStore.js#l3005

Perhaps it would be enough to just make that code faster (at least as another short-term solution). We could get rid of the recursion so we have a chance of getting traced. In fact I'll try that tomorrow and see if it has any effect.

Boris - I'm sure Simon would have looked into this, but do we not have a native xpath generation method? That seems like something we would have considering our whole frontend is just a DOM.
Ok, if you need me to test nightly builds just tell me.  Do you know if a sysprof profile with debugging symbols might be useful?  I don't know if you consider sysprof a good tool to apply to this problem.
William, are you testing on the testcase in this bug, or your original web page in comment 15?

Generating 100 XPath expressions should really not take that long if we do it in a smart way...

> Perhaps it would be enough to just make that code faster (at least as another
> short-term solution

Maybe...  That code looks like an algorithmic problem more than just a pure performance one (e.g. generating an XPath expression is O(N) in number of previous siblings, giving O(N^2) behavior overall).  Hence the capping, but for a large form you might still lose with the 100*O(N) bit.

> but do we not have a native xpath generation method? 

Not a reusable one.  But see bug 478107 (referenced in bug 477564).
For what it's worth, a sysprof profile with symbols would be somewhat useful, but given steps to reproduce (which includes the page to reproduce on) I can also generate a Shark profile as needed...
(In reply to comment #19)
> William, are you testing on the testcase in this bug, or your original web page
> in comment 15?

I got the results in comment 15 with the test case attached to this bug report.  But I have been doing more testing with the web page from my application and I have been able to observe the problem using radio buttons and even typing on text boxes.  Given the features offered by session restore I guess it's understandable that this problem can be observed with any type of form input (not just check boxes as implied by the other bug report).

Boris, what is a shark profile? is there any link to it?

Final question: is it always a good idea to put and id on every form element? I mean not just as measure to workaround this bug but in general.  The problem with my web app is that it is a huge page (and have many pages) and it would be very difficult to put ids on every input element.
(In reply to comment #17)
> It's actually the XPath generation that is the expensive part. IDs are by
> definition supposed to be unique, so we have an identifying attribute there.
> Without the ID we need to generate something to identify the field, and that
> something is the XPath expression.

Thanks Paul, I now have a better understanding.  But, shouldn't this be a one time problem during page load? Maybe I'm talking non senses here but the result of the XPath expression should be saved as of some kind of ID when it's executed the first time, I guess that should solve the problem.

As I see it that XPath expression is being executed every time the user interacts with a form input (mouse clicks, key presses, etc) and to make it worse it seems to be executed on every input element even when the user is just interacting with one element.
> I got the results in comment 15 with the test case attached to this bug report.

OK.  I'll see what I can get out of it.

Paul, where in the sessionstore code should I put the start/stopProfiling calls to capture a single sessionstore save cycle?

> Boris, what is a shark profile? is there any link to it?

Shark is a profiler Apple ships with XCode.  It also happens to be one of the better profilers out there in terms of usability and ability to get the data one wants...
(In reply to comment #23)
> Paul, where in the sessionstore code should I put the start/stopProfiling calls
> to capture a single sessionstore save cycle?

saveState. Really the data gathering state is just the call to _getCurrentState, so I'd start profiling right before that and stop immediately after. The writing to disk is async now, so the call to _saveStateObject is probably not interesting.

We do dirty state tracking, so not all calls to saveState will gather all window data. Pass true to _getCurrentState to force all collection.

If you just want to look at the form saving, _collectFormDataForFrame is what you want.
OK.  On the attached testcase, time for _getCurrentState breaks down about like so:

30% getting localName, previousSibling, name, namespaceURI off of nodes.  This
    hits various XPConnect slow paths due to the cross-origin access; the actual
    time in the C++ getters for the above is closer to 5%.  The first patch in
    bug 477564 might help here, I bet.
20% js_ComputeThis when calling the XPC_WN_GetterSetter from JS_Invoke (right
    before making the above calls into various C++ getters).  This is mostly
    XPC_WN_JSOp_ThisObject calling GetWrapperForObject, GetCxSubjectPrincipal,
    and so forth.  Blake, do we have a bug on this?
20% xpc_CloneJSFunction (setting prototypes, cloning function objects, etc)
    called from XPCNativeMember::NewFunctionObject, called from
    XPCWrapper::GetOrSetNativeProperty.
4%  XPCCallContext creation stuff.
2%  GetCxSubjectPrincipalAndFrame called from XPC_NW_GetOrSetProperty
5%  Under js_GetPropertyHelper, which ultimately lands under stuff similar to
    the above.
4%  in js_Interpret.

and some minor stuff.

Blake, we really need to figure out a way to make calls from chrome into content suck less performance-wise.... :(  Paul, can we minimize the number of such calls we make here?
Attached file Text report
The actual shark file (with source text stripped out) is about 4.5MB, so can't exactly attach it here....  This text report has the highlights.
Using this patch, put shark in remote mode, then start the browser pointed to the attachment in this bug.  Make sure before you do all that that your sessionstore pref is set to 0 so that you actually save SSL site data.
(In reply to comment #25)
> 30% getting localName, previousSibling, name, namespaceURI off of nodes.  This
>     hits various XPConnect slow paths due to the cross-origin access; the
> actual
>     time in the C++ getters for the above is closer to 5%.  The first patch in
>     bug 477564 might help here, I bet.

It probably would - though it's a rather arbitrary optimization (I guess it handles the cases where the loops becomes particularly expensive).

What about something similar, except that we cache the xpath expression for each node? This has the same problem as that patch - that if the page is dynamically changed we lose (though perhaps we could listen for some event iif we get into xpath and invalidate the cached value for the node & it's children - but that's another event to listen for).

I'd be willing to bet that most of the time the form fields would have a rather close ancestor in common. So we're probably generating the same chunk of xpath expression thousands of times.

> Paul, can we minimize the number of such calls we make here?

Could be done in conjunction with the above or separate - we could at least get aNode.localName, namespaceURI once instead of accessing them each time in the loop.

We could also change previousSibling to previousElementSibling (and skip text nodes), right? That would immediately skip 1/3 of the n.localName, etc calls in that loop.

Other random idea - why not only save form values for fields that don't have the default value?
> (I guess it handles the cases where the loops becomes particularly expensive)

It handles the case when some parent has a _lot_ of kids by not making the algorithm take O(N^2) time in said kids...

> What about something similar, except that we cache the xpath expression for
> each node? 

That requires mutation listeners, right?  The other doesn't if the cached stuff is cleared at the end of the saving phase; since it just uses previous siblings to speed up xpath generation for later siblings, the information being there just during the saving phase is already useful.

I don't think you want to start adding mutation listeners; sessionstore has a bad enough effect on DOM performance already.

> So we're probably generating the same chunk of xpath expression thousands of
> times.

Pretty likely, yes.

> we could at least get aNode.localName, namespaceURI once instead of accessing
> them each time in the loop.

Sounds good to me!

> We could also change previousSibling to previousElementSibling (and skip text
> nodes), right?

This might or might not be faster depending; it involved tearoff creation...  Peterv was going to make that be a non-issue, though.  So long-term might be a win.

> why not only save form values for fields that don't have the default value?

_That_ would be a huge win.  Can we start with that.

mrbkap: we should still get bugs filed on the wrapper performance stuff here, possibly; not sure how it should look with e10s and whether we should worry about it before then.
> Can we start with that.

That was supposed to be a question.  ;)  Let's treat this as a tracking bug and file specific bug's blocking it for specific things we want to change?
(In reply to comment #30)
> > Can we start with that.
> 
> That was supposed to be a question.  ;)  Let's treat this as a tracking bug and
> file specific bug's blocking it for specific things we want to change?

Of course :) I filed that and the one to use localName, namespaceURI less. I'll knock those out ASAP and see if they help with the numbers. Does Shark give you the total time doing the above or just a percentage breakdown?
Summary: too many form elements makes firefox slow → [Tracking] Too many ID-less form elements makes Firefox slow
Shark gives me the number of samples taken, which translates to total time somewhat because the samples are taken every 20us (with the settings I used). There's a bit of noise due to the high sampling rate (so the sampling time is a noticeable fraction of total runtime), but I can give you an idea of speedup with some patches, yes.
Keywords: meta
OS: Linux → All
Only bug 613498 is still open
Keywords: perf
We should really look into the previousElementSibling thing, or moving more of this work into C++ in general.  Mike, do you know who might be able to own this?
Flags: needinfo?(mdeboer)
(In reply to Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) from comment #36)
> We should really look into the previousElementSibling thing, or moving more
> of this work into C++ in general.  Mike, do you know who might be able to
> own this?

Well, I'm marking this to be part of the sessionstore performance work, which is our GSoC project for this year. I expect our to-be-selected student to pick this up as one of the first bugs and I'll make sure to mentor towards a proper solution.
Blocks: ss-perf
Flags: needinfo?(mdeboer)
Summary: [Tracking] Too many ID-less form elements makes Firefox slow → Too many ID-less form elements makes Firefox slow
I changed this line of code [1] to using previousElementSibling instead. I'm not if I have to do anything else but that doesn't affect much.

Should we move the XPathGenerator to C++ as there is a bug about this: https://bugzilla.mozilla.org/show_bug.cgi?id=1362330?

[1]: http://searchfox.org/mozilla-central/source/toolkit/modules/sessionstore/XPathGenerator.jsm#43
Flags: needinfo?(mdeboer)
We discussed during our 1:1.
Flags: needinfo?(mdeboer)
Depends on: 1362330
I checked one checkbox at the end of the page and the profile result is https://perfht.ml/2uGq6wH.

The code http://searchfox.org/mozilla-central/source/toolkit/modules/sessionstore/FormData.jsm#171 is really expensive and from the profile result, it takes about 1.2 seconds to finish and the whole function FormData::collect makes the webpage unresponsive.

We could somehow make the doc.evaluate function less expensive and break FormData::collect into smaller pieces to let firefox process users input during the break.
As discussed during our 1:1: Quan will move forward here to simplify the XPath query and do the element filtering in JS instead _after_ an idleDispatch. This way we can be sure that we're not blocking as much as we do now.
Quan reported an improved Xpath evaluation time, using the test file attached to this bug, to 200ms vs. 1200ms. Combining this with a lazy collect in idle time should make the needle move into the acceptable realm.
Assignee: nobody → nnn_bikiu0707
Status: NEW → ASSIGNED
Depends on: 1379900
Another thing we could do is to collect all the input nodes that are available when all the frames are collected. That way, we don't have to spend time collecting all those nodes [1] again and again, which are going to be the same most of the time when users input something.

Mike, what do you think?
Flags: needinfo?(mdeboer)
Well, that would break dynamically generated forms in unexpected ways, unless we may have cheap mutation observers that can be scoped to certain types of elements.
Flags: needinfo?(mdeboer)
During a meeting with Mike early this week, I asked about breaking the loop [1] into chunks that can be executed inside requestIdleCallback. This will allow UI to process users event and prevent janking. However, there are still things I'm thinking about:
1. There need to be a callback passed in to return the result. I'm thinking of making all these listeners'[2] collect function return a promise. We'll wait until all these promises are resolved and then send to the parent process [3]. Not sure if the change are going to have bad side-effect?
2. What will happen when we're still collecting and users interact with another inputs (especially with inputs that are collected)?

Dão, I would love to hear your thoughts on this.

Tim, I think you have a lot of experience with session restore. Can you give me some feedback?

[1]: http://searchfox.org/mozilla-central/source/toolkit/modules/sessionstore/FormData.jsm#186
[2]: http://searchfox.org/mozilla-central/source/browser/components/sessionstore/content/content-sessionStore.js#802-807
[3]: http://searchfox.org/mozilla-central/source/browser/components/sessionstore/content/content-sessionStore.js#786
Flags: needinfo?(ttaubert)
Flags: needinfo?(dao+bmo)
(In reply to Bao Quan [:beekill] from comment #44)
> During a meeting with Mike early this week, I asked about breaking the loop
> [1] into chunks that can be executed inside requestIdleCallback. This will
> allow UI to process users event and prevent janking. However, there are
> still things I'm thinking about:
> 1. There need to be a callback passed in to return the result. I'm thinking
> of making all these listeners'[2] collect function return a promise. We'll
> wait until all these promises are resolved and then send to the parent
> process [3]. Not sure if the change are going to have bad side-effect?

Sounds good!

> 2. What will happen when we're still collecting and users interact with
> another inputs (especially with inputs that are collected)?

Maybe this is naive, but I don't expect issues with that. Is there some specific piece of code that you think could be problematic?
Flags: needinfo?(dao+bmo)
(In reply to Dão Gottwald [::dao] from comment #45)
> (In reply to Bao Quan [:beekill] from comment #44)ds good!
> 
> > 2. What will happen when we're still collecting and users interact with
> > another inputs (especially with inputs that are collected)?
> 
> Maybe this is naive, but I don't expect issues with that. Is there some
> specific piece of code that you think could be problematic?

I don't think this is problematic either. We already mark data as "dirty" and collect later, when the user might have interacted again.
Flags: needinfo?(ttaubert)
Depends on: 1386206
Thanks Dão and Tim for your responses. I filed the bug 1386206 to implement the idea in comment 44.
Quan hasn't been able to work on Firefox bugs for a while now. Unassigning to allow others to finish this up.
Assignee: nnn_bikiu0707 → nobody
Status: ASSIGNED → NEW
Whiteboard: [fxperf]
Whiteboard: [fxperf] → [fxperf:meta]
Summary: Too many ID-less form elements makes Firefox slow → [meta] Too many ID-less form elements makes Firefox slow
You need to log in before you can comment on or make changes to this bug.