Closed Bug 91044 Opened 24 years ago Closed 7 years ago

global history datasource slow when responding to nsXULTemplateBuilder::CheckContainer() queries

Categories

(Core Graveyard :: History: Global, defect, P3)

defect

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: waterson, Assigned: timeless)

References

Details

(Keywords: perf)

Attachments

(3 files)

This leads to pretty poor performance when scrolling new rows onto the screen
for the first time (cf. bug 89872, which fixes subsequent scroll performance).
Need to profile this to see what's going on, and if we can make any improvements...
Status: NEW → ASSIGNED
Keywords: perf
Priority: -- → P3
Target Milestone: --- → mozilla0.9.4
Blocks: 66907
Target Milestone: mozilla0.9.4 → mozilla0.9.5
I'm swamped. Kicking to component owner.
Assignee: waterson → alecf
Status: ASSIGNED → NEW
history bugs to blake.
Assignee: alecf → blakeross
Er, except hist perf, ->bryner
Assignee: blakeross → bryner
-> 0.9.6.
Status: NEW → ASSIGNED
Target Milestone: mozilla0.9.5 → mozilla0.9.6
Blocks: 74634
Target Milestone: mozilla0.9.6 → mozilla1.0
Bugs targeted at mozilla1.0 without the mozilla1.0 keyword moved to mozilla1.0.1 
(you can query for this string to delete spam or retrieve the list of bugs I've 
moved)
Target Milestone: mozilla1.0 → mozilla1.0.1
don't move bugs that are in the 1.0 dependency tree. sorry.

Target Milestone: mozilla1.0.1 → mozilla1.0
Target Milestone: mozilla1.0 → Future
This is /unusably/ slow for me on Linux trunk of a few days ago.

Look at the attached profile: it seems Mozilla is doing something
horrible with libc calls to timezone / getenv stuff. This profile
was taken whilst scrolling through a large history for the first
time. Also (subjectively) the time taken for a page down via
the scrollbar seemed to be proportional to the number of items
previous to the visible page.

Anybody want me to test a patch ? :)

(Each profile line is VMA address, number of samples, percent of total, symbol
name, relevant binary)

000807e0 40917    3.10649     memmove                 /lib/libc-2.2.so
0000e27c 41668    3.16351     morkMap::find(morkEnv *, void const *, unsigned
int) const /home/moz/bin/mozill/bin/components/libmork.so
00027668 47022    3.56999     CharsToPRInt64(char const *, unsigned int, long
long *) /home/moz/bin/mozill/bin/components/libappcomps.so
0002e648 49149    3.73148     getenv                  /lib/libc-2.2.so
000c7a10 50190    3.81051     nsASingleFragmentCString::Length(void) const
/home/moz/bin/mozill/bin/libxpcom.so
000a6170 51394    3.90192     __tzstring              /lib/libc-2.2.so
000a50d0 58725    4.45851     tzset_internal          /lib/libc-2.2.so
000c7910 68171    5.17566    
nsASingleFragmentCString::GetReadableFragment(nsReadableFragment<char> &,
nsFragmentRequest, unsigned int) const /home/moz/bin/mozill/bin/libxpcom.so
000a6e30 72924    5.53652     __tzfile_compute        /lib/libc-2.2.so
000a1d14 153763   11.674      __offtime               /lib/libc-2.2.so

the getenv calls are strange, but I can't figure out who is calling it. is there
any way to get a stack here?
Not with oprofile. I can hack up an LD_PRELOAD + builtin_return_address()
trick if you like to get call site
counts ... (if that'll work with shared libraries ...)
*** Bug 142514 has been marked as a duplicate of this bug. ***
Well it seemed to work, but I had to calculate the stuff manually, and I
must have cocked up, because it blamed calls inside pthreads via PR_JoinThread
in ptthread.c.

However, somebody taught me gdb, so here's the top of the backtrace from one
of the thousands of getenv() calls :

#0  getenv (name=0x405c209e "TZ") at ../sysdeps/i386/bits/string.h:343
#1  0x40554113 in tzset_internal (always=1079779486) at tzset.c:144
#2  0x40555037 in __tz_convert (timer=0xbfffd37c, use_localtime=1,
tp=0x405cf480) at tzset.c:583
#3  0x405512be in localtime (t=0xbfffd37c) at localtime.c:43
#4  0x401de402 in MT_safe_localtime (clock=0xbfffd37c, result=0xbfffd380) at
prtime.c:604
#5  0x401de506 in PR_LocalTimeParameters (gmt=0xbfffd410) at prtime.c:701
#6  0x401ddd1f in PR_ExplodeTime (usecs=1022700704021436, params=0x401de444
<PR_LocalTimeParameters>, exploded=0xbfffd410) at prtime.c:230
#7  0x40d9173d in NormalizeTime (aTime=1022700704021436) at nsGlobalHistory.cpp:268
#8  0x40d918aa in matchAgeInDaysCallback (row=0x87eae28, aClosure=0xbfffd820) at
nsGlobalHistory.cpp:343
Checking in gdb, the backtraces from the LD_PRELOAD trick :

getenv called: backtrace:
0x40557113
 0x40558037
  0x405542be
   0x401e1402
    0x401e1473

(gdb) list *0x40557113
0x40557113 is in strptime_internal (strptime.c:445).
(gdb) list *0x40558037
0x40558037 is in strptime_internal (strptime.c:922).
(gdb) list *0x405542be
0x405542be is in tzset_internal (tzset.c:93).
(gdb) list *0x401e1402
0x401e1402 is in PRP_NakedWait (ptsynch.c:1091).
1086            rv = pthread_cond_wait(&cvar->cv, &ml->mutex);
1087        else
1088            rv = pt_TimedWait(&cvar->cv, &ml->mutex, timeout);
1089        if (rv != 0)
1090        {
1091            _PR_MD_MAP_DEFAULT_ERROR(rv);
1092            return PR_FAILURE;
1093        }
1094        return PR_SUCCESS;
1095    }  /* PRP_NakedWait */
(gdb) list *0x401e1473
0x401e1473 is in pt_poll_now (ptio.c:563).
558     }  /* pt_poll_now_with_select */
559
560     #endif  /* _PR_POLL_WITH_SELECT */
561
562     static void pt_poll_now(pt_Continuation *op)
563     {
564         PRInt32 msecs;
565             PRIntervalTime epoch, now, elapsed, remaining;
566             PRBool wait_for_remaining;
567         PRThread *self = PR_GetCurrentThread();

Which doesn't look very promising in terms of this __builtin_return_address
doing anything sensible :)
holy smokes! who knew PR_ExplodeTime time would fan out like THAT. I'll see if I
figure something out there.
Any progress here ?
I optimized it a big for bug 166681, but the general algorithm is still the
same.. I don't know much else to do right now though..
I am seeing this bug within Phoenix 0.5 on Window 2000 Professional.  My history
was set for 21 days.  I cleared it before I saw how big it got.  Before the
clear, typing in the location bar was painfully slow, as was deleting characters
in the location bar.  Loading the history always took a while when sorted by
date.  A While == 15-20 seconds, maybe a little less.  

Does this bug include Phoenix (and/or all Mozilla web browsers)?  I've added
myself to the CC list.
Peter, I don't think that's the same bug ... this was concerning the old
History dialog in seamonkey in terms of what the user can see. Firebird
uses a sidebar that doesn't seem to have the problem.

If somebody needs profiles (I can also do some limited call graph stuff)
of particular operations in Firebird related to this bug, please ask.
I don't know this code; I merely implemented a suggestion dictated to me in IRC
(and verified that it compiled/ran and didn't hork history).
Attachment #129075 - Flags: superreview?(alecf)
Attachment #129075 - Flags: review?(alecf)
.
Assignee: bryner → timeless
Status: ASSIGNED → NEW
Target Milestone: Future → ---
Comment on attachment 129075 [details] [diff] [review]
some modifications timeless suggested

generally looks ok, but now you have an nsAutoString created inside a loop -
was that the intent? I'd prefer something like PR_sscanf or something
equivalent.
Attachment #129075 - Flags: superreview?(alecf)
Attachment #129075 - Flags: superreview-
Attachment #129075 - Flags: review?(alecf)
Comment on attachment 129075 [details] [diff] [review]
some modifications timeless suggested

alecf: does pr_sscanf support wide characters?
soap does this:
/extensions/webservices/soap/src/nsDefaultSOAPEncoder.cpp, line 3031 -- PRInt32
r = PR_sscanf(NS_ConvertUCS2toUTF8(value).get(), " %lf %n", &f, &n);

Note that currently the same code is called deeper in the callstack the goal is
to simply skip calls to the time function, not fix the autostring stuff.

330 matchAgeInDaysCallback(nsIMdbRow *row, void *aClosure)
331 {
...   //  vvvvvvvvvvvvvvvvvvvvvvvvvvvvv is always false in the old code
339   if (!matchSearchTerm->haveClosure) {
/* because |matchSearchTerm_t matchSearchTerm = { mEnv, mStore, term ,
PR_FALSE};| is discarded with each iteration of the loop */
340	PRInt32 err;
341	// Need to create an nsAutoString to use ToInteger
/* so this code which i'm simply *copying* gets called with each iteration of
the loop */
342	matchSearchTerm->intValue =  nsAutoString(term->text).ToInteger(&err);
/* all i'm doing is copying lines 340,342,344 to skip the expensive 343 */
343	matchSearchTerm->now = NormalizeTime(PR_Now());
344	if (err != 0) return PR_FALSE;
345	matchSearchTerm->haveClosure = PR_TRUE;

As for those lines of code... well you own blame for them. (mkaply has tainting
blame, but your 1.73 introduced ToInteger)

I'm certainly willing to at some later time rewrite the code to at least
recycle the nsAutoString but I'd rather we get this in. (Don't ask me why I'm
poking this at 6am while i'm vacationing for a weekend.)
Attachment #129075 - Flags: superreview?(bzbarsky)
Attachment #129075 - Flags: superreview-
Attachment #129075 - Flags: review?(alecf)
*** Bug 194926 has been marked as a duplicate of this bug. ***
Comment on attachment 129075 [details] [diff] [review]
some modifications timeless suggested

well, I mean you're adding an EXTRA nsAutoString - clearly I did the wrong
thing before as well, but I see no need to make it worse by adding yet another
temporary! 

you get sr=alecf if you recycle the nsAutoString - you're already recycling the
"term" which we didn't do before..
Attachment #129075 - Flags: review?(alecf) → review-
Attachment #129075 - Flags: superreview?(bzbarsky)
Is anyone looking at resolving this?
This still occurs with Firefox 3 for the History container within the Library. Shall I file a new bug on that or will it be covered by this bug?
QA Contact: claudius → history.global
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INVALID
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: