Closed Bug 44514 Opened 25 years ago Closed 24 years ago

Submitting forms for 1st time too slow.

Categories

(Core :: DOM: Core & HTML, defect, P3)

PowerPC
All
defect

Tracking

()

VERIFIED FIXED

People

(Reporter: meiering, Assigned: morse)

References

()

Details

(Keywords: perf, relnote, Whiteboard: [nsbeta2-] [nsbeta3+])

Attachments

(4 files)

When you launch mozilla and submit a form for the first time, it sits there for 6-10 seconds with no activity, before the submission takes place. Subsequent form submissions are instantaneous like they should be. Steps to reproduce 1. Launch Mozilla 2. Go to bugzilla 3. Enter a bug # 4. Click 'show'or hit enter 5. Wait. 6. 6-10 seconds later the throbber will start and you will be taken to the bug # Expected results. When 'show' is clicked, the form shoud be submitted *immediately*, not 6-10 seconds later. Reproducibilty. Only immediately after launching Mozilla; all later form submissions work fine. It took so long to submit the first form on my G3/350 that I was force quitting thinking Mozilla had crashed. See bug #44390.
Confirmed in build 2000070311.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: perf
reassigning
Assignee: rods → pollmann
Is the delay before or after popping up the security warning dialog box? It seems like this activity sometimes takes quite some time. Also, is this only on Mac or other platforms as well? CC'ing Doug.
Try removing the psmglue.dll from components. Does the problem go away?
On mac, removing PSMGlue.shlb does not remedy the problem. Also, on Mac I do not get a security warning dialogue box.
The delay is before the dialog appears (unless, of course, the user has disabled the dialog).
Since removing PSMGlue.shlb does not remedy the problem, this must not be related to the security dialog. The only other processing done before the first form submit that I can think of may be wallet initializing. CC'ing Steve to see if he has any suggestions on how to eliminate/debug that possibility.
I'm guilty. The first time any wallet function is requested, the wallet tables are read in. This takes two seconds on my windows machine but maybe it takes longer on other machines/platforms. Whenever a form is submitted, wallet looks over the fields to determine if it contains significant information which the user might want to save. In order to do that, the wallet tables are needed. That's why they are being read in at this time. But things have been this way for many months now. The only thing that has changed is that the wallet tables have gotten bigger so perhaps the delay of reading them in is now longer. If you don't like this delay, you can turn off this testing from the wallet pref panel. You'll still be able to use wallet but wallet will not try to process each form as you submit them, so the wallet tables won't be needed until you do an explicit wallet operation.
I think I can speed up the reading of the wallet tables considerably. But I'm curious to know just how much of the 6-10 seconds are being taken up by this. Could you apply the attached patch to extensions/wallet/src/wallet.cpp and then let me know what number you see displayed on the console after you've performed the steps indicated above. When I do it I get time t = 0 time T = 2237 which indicates that it takes 2.237 seconds.
Giving this to you Steve. Let me know if I can help!
Assignee: pollmann → morse
My efforts at speeding this up failed. Here's where I thought I would benefit but was wrong. There are four files that wallet reads in. I thought that by having all the information in one big file and reading just the one file would save a lot of time. But I tried that and it made no difference. So now I'm out of ideas. If anyone has any brainstorms as to how to speed up this phase, I'd love to hear them. Copying Andreas on this in case he has any ideas. I'd still like to hear from the person who reported 6-10 second delay as to what number he is seeing with the timing patch indicated above.
Target Milestone: --- → M18
Status: NEW → ASSIGNED
*** Bug 45344 has been marked as a duplicate of this bug. ***
nominating for beta2, this is most painful on macOS, but happens on all platforms.
Keywords: nsbeta2
OS: Mac System 9.0 → All
Hardware: Macintosh → All
Here's some info on this bug. 1. From my tests, I have observed no speedup by using one file instead of four. At least not on a windows platform. That might not be true on a mac, in which case I can restructure things so that I use only a single file. It's probably about a days worth of work to do this. 2. I have discovered a big inefficiency in the way that I am putting the values being read in into the internal tables. With a little bit of effort I should be able to rectify that. Preliminary investigations indicate that this should cut the time in half. I'm attempting to make this fix right now.
Putting on [nsbeta2-] radar. Not critical to beta2.
Whiteboard: [nsbeta2-]
OK, I have a simple fix that speeds up my reading in of the wallet tables. With the fix, what used to take me 2 1/4 seconds now takes only 1 second. I'll check that in as soon as it gets code reviewed (waiting on dveditz). Since this bug didn't make the nsbeta2+ rating, I won't do anything further with it at this time. But there is another obvious thing that I can do when I return to working on this. Namely keep all the information in the one composite file and read that directly into my internal tables (currently I split the composite file into four separate files and then read in the four files). My experiments on winnt didn't indicate that there would be any significant speedup by doing so, but perhaps it will make a difference on the mac. Of course it will at least save the time that it takes to split up the files (about one second) but that price is paid only once -- future runs already see the split files and don't split it again.
The fix I mentioned above is now checked in. Could someone who previously saw the 6 to 10 second delay try the latest version and let me know how long the delay is now. Thanks.
I tested today on Mac 2000_07_13_08 and saw no significant improvement. Maybe 0.5 - 1 seconds. This was not a problem until about the time that themes were implemented. Probably completely unrelated, just an observation. Checking on M16.
Checked in Mac M16. Form submission is immediately after clicking the button. The security dialogue only takes about 1s to come up on my machine, compared to 7s on M17.
Can you try running M16 but using the current wallet tables. The current tables can be found in dist\WIN32_D.OBJ\bin\defaults\wallet on win32 and probably some similar directory on mac. They are the files that end in *.tbl. In M16 they may have been in the res directory instead of the defaults directory. The reason I'm asking you to try this is because the wallet tables have grown significantly larger since M16 and I suspect that that is what is causing the huge delay. Nothing has changed in the wallet initialization code that I can think of.
I wasn't able to do exactly what you described. Placing the M17 wallet tables in M16 res/ caused a crash once the browser appeared. However, placing the M16 wallet tables into the M17 wallet worked. Form submission in this case is just like in M16...quick. Switching back to the M17 wallet contents made it slow again. Looks like the tables are the issue, but 7 seconds to process ~130k of data?
OK, your test has allowed me to eliminate one variable -- namely the use of four files instead of one. You just showed that the opening and closing of the four files is not what is slowing us down but rather it is the contents of the files that is. That means I should not bother restructuring things so that I do all the processing directly from one combined file.
Keywords: nsbeta3
Whiteboard: [nsbeta2-] → [nsbeta2-] [nsbeta3+]
I have another fix that should speed things up considerably. On win32 it reduces the one-second read-in of the wallet tables to 254 milliseconds (a factor of 4). Hopefully it will give a similar speed-up on the mac. I acheived this speed-up by changing the code from reading in one character at a time to reading in a buffer of 1000 characters at a time. Will be checking it in soon.
Patrick, isn't 4k a more optimal buffersize.
Fix already checked in. If you think it's more optimal, I'll change it to 4K. I tried imperically using 1K and then 10K and got no measurable difference. Let me know if I should change it.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
If the "fix" is in the 2000-08-04-04 mac build then this bug is not fixed. Just tested on the same machine used to report the bug. There was about a 6-10 second delay after clicking a form for the first time. I'd like to remind that this is on a machine that is WAY faster than the target audience. On a slower machine, the user will undoubtedly think the application has crashed.
The 2000-08-04-04 build is from the M17 branch.
In 2000080315-M18 it takes 13 seconds after clicking on the submit before Mozilla displays any visual clue about loading. This is on PPC 604e @ 210 MHz. I have disabled Form Manager from prefs. Putting on the relnote radar for M17 relnotes. Could the necessary file io take place during idle time and not when the user submits the first form?
Keywords: relnote
I didn't realize that the fix didn't make it into the trunk. Retested on 08-04, M18 Mac and the time was about 3 seconds. Big improvement, but still a problem in my opinion. Could the wallet be loaded at startup? Mac takes so long to start up, 3 seconds wouldn't make a noticeable difference.
The fix is in the trunk. It is not in the m17 branch.
*** Bug 48701 has been marked as a duplicate of this bug. ***
This is a problem on the mac only. Am I right?
Hardware: All → Macintosh
on optimized bits from friday, g4 450, it still takes about two seconds, which is about two seconds too long. reopening.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Is that two seconds on the mac? If so, I find that to be good news. Henri Sivonen reported 13 seconds on August 4th. Does anyone think that using a 4K buffer will make an appreciable difference (as suggested by dougt's comment on August 3rd)? I am currently using a 1K buffer. Also, what do you think of meiering's comment of reading in the wallet tables at browser startup time instead -- for the mac only of course.
Status: REOPENED → ASSIGNED
oh yes, it is much better than it used to be, but it's still perceptable even on the fastest of macs. could tell me where to adjust the buffer size and i'll play with it? there is no reason why this should take so long, so i don't want to push it to another part of the launch cycle. let's just fix it.
Yes, a 4K buffer can reduce the time significantly. On the Mac file system, the cost of reading 1K vs. 4K is typically similar, especially on larger volumes where the file block size is larger. With small files between 1K and 4K, you'll then read the entire file in one read.
The buffer length is defined in the WalletUTF8Get routine in extensions/wallet/src/wallet.cpp. The defining lines are: const PRUint32 buflen = 1000; static char buf[buflen+1];
I just noticed that my code in Wallet_UTF8Get is complete nonesense for the case of double-byte characters. I'll work on a fix for that. But that shouldn't affect a pure-ASCII file which is what we currently have.
upping the buffer didn't seem to help. am i supposed to see these lines on the console: Opening file DistinguishedSchema.tbl failed Opening file FieldSchema.tbl failed Opening file URLFieldSchema.tbl failed Opening file SchemaConcat.tbl failed that doesn't sound good.
Those lines are normal. It means that the tables were not downloaded to the profile directory on any previous run, in which case they will be read from the defaults directory instead. BTW, the fix for the UTF8 problem I mentioned above is in bug 48949. But, as I said before, that shouldn't affect anything here.
add nsmac2 keyword ("should fix" for mac)
Keywords: nsmac2
triaging to post-beta3. If anybody wants to help out here, feel free to do so.
Whiteboard: [nsbeta2-] [nsbeta3+] → [nsbeta2-] [nsbeta3-]
Target Milestone: M18 → M20
There is no "post-beta3", by giving it nsbeta3- you are saying "post-Netscape 6.0". Is that what you meant?
Yes, that's what I meant.
Can we have a moz10 keyword or so? If Netscape wants to ship with such a bug, that's very sad. But Mozilla 1.0 shouldn't ship with that type of bugs IMO.
Keywords: helpwanted
adding self since this affects Form Manager testing.
I have to agree with Mike that 2 seconds on a G4-450 is about 2 seconds too long. What is the minimum target machine (processor and speed) that Netscape is targeting? If it is a slower PPC604 class, then it would be a travesty to have NN6 ship with this bug. Think about it. I reported this bug as #44390 because on my machine - which should be way faster than the target machine - I thought mozilla had crashed!!! In fact I force quit numerous times, but figured out it didn't crash after just leaving my machine in disgust and upon returning finding that the form had actually been submitted. Releasing with a bug like this (which *every* browser user is *undoubtedly* going to encounter) will just prove to many mac users what they already believe about NN6, namely, that it sucks. The pitiful impression left by PR1 and PR2 on mac users will be hard to overcome, especially with something as glaring as this. Sorry for the rant, but if Joe Blow has to wait 10sec for a form to submit on his 200Mhz PPC604, then you have lost him as a customer.
There have been many improvements made since you originally reported this bug. Your originally reported a delay of six to ten seconds on your machine. With the new improvements, what delay are you seeing?
i've profiled this and am looking into fixes
Independentally I also started doing some timing. Got some surprising results. Namely, all the time is being taken on the line in wallet_Get that reads: return (buf[next++] & 0xFF); That's not what I expected at all. I thought I was going to see the time taken up by the stream read but that was not the case. I just called Mike and he is getting different results. Could be that the mac and win32 are spending their time differently -- I'm doing timing on win32 and he's profiling on a mac. Stand by for more results.
The code that we are spending all the time in (on win32) seems very innocent as shown below. However we are executing this code 122,376 times. 026F5AB2 mov ecx,dword ptr [next (02716bd0)] 026F5AB8 movsx edx,byte ptr buf (027167a4)[ecx] 026F5ABF and edx,0FFh 026F5AC5 mov word ptr [rv],dx 026F5AC9 mov eax,[next (02716bd0)] 026F5ACE add eax,1 026F5AD1 mov [next (02716bd0)],eax
there are large amounts of time being spent in: - appending 1 char to a string - calling SetCapacity() on a string - allocating and re-allocating memory scc has a patch for the first one, which is a _big_ win and I'll check it in, but the SetCapacity() calls are still very hurtful. Basically, the code takes a string, calls SetCapacity(64) (which calls malloc) and then throws the string away when it's done. This is done, oh, about 10,000 times. steve, can you use/re-use a buffer here so that we're not spending all the time allocating memory only to throw it away and start-afresh?
Again, sorry for the rant in the last post. I now see about 3.5 second delay. A great improvement, but still about 3.5s too long. Things like this should be should be imperceptible on a fast machine.
i made a bunch of the nsAutoString's static and that helps, but not quite like i thought. Where the bulk of the time is now is in wallet_WriteToList allocating 3- 4000 map elements: wallet_MapElement * mapElement = new wallet_MapElement; we're losing lots of time in the memory allocators, mostly in nsLargeHeapAllocator::AllocatorMakeBlock() which is called from malloc(). We spend 57% of the total time processing these files in malloc's children.
here are my static autostring changes that speed things up a little more. yes, yes, i know this sucks, but short of overhauling the code...
Ignore my comments about where the time was being spent. After investigating further I have determined that my method of doing timing was flawed and I was getting accumulated roundoff error (accumulated by one hundred thousand iterations). But I have problems with Mike's analysis as well. If what he is saying is true, then the following hack which statically alocates the entire table should show a significant time improvement over doing dynamic memory allocation. But on my testing it does not change the timing. - add the following global variables: wallet_MapElement mapTable[5000]; PRInt32 mapIndex = 0; - replace wallet_MapElement * mapElement = new wallet_MapElement; with wallet_MapElement * mapElement = &mapTable[mapIndex++];
steve, i made this change and re-profiled and malloc went from 50% of the time taken to 20% -- a huge differencee!!. Our memory allocators need some work, yes, but using an arena or some way to pre- allocate these data structures (which are a good size, 2 nsAutoStrings aren't small) would be a big win. Can you try to do something like this? I will help _a lot_
I applied Mike's string patch and tried it on win32. It reduced the time for reading in the tables from 557 milliseconds to 488 milliseconds which is an improvement of 12 percent.
Mike, you said that malloc went from 50% to 20% of the time. That doesn't surprise me. But you failed to say what happened to the total time. How much of a reduction did you see. When I tried it on win32 I didn't see any.
FWIW, all these changes taken together (esp the "hack" morse suggsted above) have reduced the time to init the tables by an order of maginitude (yes 10x). I checked the original profile against the one with the hack and all the other changes and it was reduced by 10x. And who said the MW Profiler was useless!? ;)
What's the value in a |static nsAutoString|? The conflicting |static| and |auto| declaration modifiers are confusing (the latter implicit in the class name, but reflected in the on-the-stack intent of the class). Unless I'm really missing something, |static nsString| would seem superior, though maybe scc can suggest an even better alternative. (And are we seriously thinking about allocating five thousand wallet_map objects on the stack? Or, for that matter, statically allocating them _anywhere_?)
I just picked up the change in nsString2 that Mike checked in last night for scc. Reran my test and found that this change gave a 50% time reduction -- the readin in of the tables now went from 488 milliseconds to 244 milliseconds.
No, we are not seriously considering statically alocating 5000 elements. It was just to do a timing test. What I am seriously considering doing is dynamically alocating in say 500 element chunks as needed. The table that is being read in is fixed (about 3500 elements) and will not change unless the table on the server are changed.
I'm attaching a patch for the mapElement alocation problem if someone would like to review it. I'm still doing some testing on it to make sure that all the wallet functions are still working.
ok, i figured out the units of the numbers in the profiler, so here they are: before: 1.4 seconds after: 140 ms 10x improvement reading in tables.
With all the great progress being made on this bug in the past day, I'm putting it back on the nsbeta3+ list.
Whiteboard: [nsbeta2-] [nsbeta3-] → [nsbeta2-] [nsbeta3+]
Target Milestone: M20 → M18
It's great that progress is being made. Does Mozilla read the tables even if the Form Manager is turned off? Previously turning off the Form Manager didn't speed up the first submission.
You can't turn off the form manager but you can turn off the check that it makes every time a form is submitted. That is under control of a pref.
My previous patch for the mapElement alocation problem was not quite right. Corrected patch to follow.
looks good, check it all in and i'll play with it to double check timing.
Both my patch and Mike's patch are now checked in. If Mike's number of 140ms is really correct, then this bug is fixed.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago25 years ago
Keywords: helpwanted
Looking over my own timing numbers for win32, I have an interesting observation. On August 3 I reported that I had reduced the timing from one second down to 254 ms. When I started retiming it yesterday, the timing was up to 557 milliseconds. So something changed in the underlying infrastructure to slow this down by a factor of two. The changes that were made today (Mike's changes to the strings in wallet.cpp and Scott's changes to the string module) brought the number down to 488 and 244 milliseconds respectively. So this got us back to where we were on August 3. My change of buffering the alocations for map elements had no effect on win32. Apparently the win32 memory alocator already did its own buffering. But, as Mike reported, it did make a big difference on the mac.
WOW: This is much faster now, takes only about half a second to do on 2000082409. Marking verified
Status: RESOLVED → VERIFIED
mid-air collision ? / bugzilla cleanup Reopening (current State: verified and no resolution)
Status: VERIFIED → REOPENED
fixed
Status: REOPENED → RESOLVED
Closed: 25 years ago24 years ago
Resolution: --- → FIXED
verfied
Status: RESOLVED → VERIFIED
Component: HTML: Form Submission → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: