Bugzilla

Updated

•

24 years ago

Priority: -- → P1

Whiteboard: [nsbeta1+]

Target Milestone: --- → mozilla0.9

Comment 1

•

24 years ago

moving to 0.9.1

Target Milestone: mozilla0.9 → mozilla0.9.1

Comment 2

•

24 years ago

moving to 0.9.2

Target Milestone: mozilla0.9.1 → mozilla0.9.2

Comment 3

•

24 years ago

moving to 0.9.3

Target Milestone: mozilla0.9.2 → mozilla0.9.3

(not reading, please use seth@sspitzer.org instead)

Comment 4

•

23 years ago

moving to 0.9.4

Target Milestone: mozilla0.9.3 → mozilla0.9.4

Comment 5

•

23 years ago

slide to 0.9.5

Target Milestone: mozilla0.9.4 → mozilla0.9.5

Randell Jesup [:jesup] (needinfo me)

Comment 6

•

23 years ago

Attached file jprof of loading the 1000-message folder (Linux, dual 450 PIII's) — Details

Randell Jesup [:jesup] (needinfo me)

Comment 7

•

23 years ago

Attached file loading the folder the 2nd time — Details

(not reading, please use seth@sspitzer.org instead)

Comment 8

•

23 years ago

moving out to 0.9.6

this should go to who (bienvenu, naving, me) who is going to work on folder
loading perf.

Target Milestone: mozilla0.9.5 → mozilla0.9.6

Updated

•

23 years ago

Depends on: 102973

Asa Dotzler [:asa]

Updated

•

23 years ago

Blocks: 104166

Assignee

Comment 9

•

23 years ago

Two places can be improved to increase performance.  One place is in 
nsMsgDatabase where we allocate/de-allocate hash tables (used to hold the cached 
message headers) too many times, and the second place is where we 
allocate/de-allocate memory space too often and make many unnecessary calls to 
memset and memcpy when using nsUInt32Array and nsUInt32Array to manipulate 
message keys, flags, etc.

Based on a manual measurement (using stopwatch and naked eyes), the improvement 
is about 3.75% and 8.75% respectively for a 1,000 and 4,000 message folder. To 
make the measurement more accurate I added some code in 
nsMsgLocalMailFolder::UpdateFolder() to actually record the time(in 
milliseconds) spent by the function and the test results are listed below. 
nsMsgLocalMailFolder::UpdateFolder() is the top level function where message 
header caching code is invoked:

              Improved Code     Original Code   % Improved

1000 msgs -      250/304           302/353       17.21/13.88     
2000 msgs -      321/363           429/469       25.17/22.60     
4000 msgs -      458/492           655/693       30.07/29.00

Two scenarios are measured and they are separated by a slash (like 'scenario#1/ 
scenario#2'):
1. Scenario #1 - After mail is launched I immediately click on the target local 
folder (without doing anything else).
2. Scenario #2 - After mail is launched I open a local folder 'test', click on 
the first message of this folder and then open the target folder.

On thing to note is that the changes made to nsUInt32Array and nsUInt32Array (to 
reduce memory allocation and de-allocation frequencies and calls to memset and 
memcpy) only makes very little impact on the performance number (in the range of 
2 to 5 milliseconds) so they are not listed as a separated column.

Another thing to note is that even though I ran the same scenario against the 
same folder many times, the computer calculated numbers fluctuated a bit 
sometimes.  I'm not sure why but in this case I would ran a few more times and 
took the best 3-4 numbers and averaged them.  So even the measurement itself is 
hard to do sometimes.

Here is the code I added to measure performance and I'll attach a patch for 
review soon.

    PRIntervalTime startTime = PR_IntervalNow();
    NotifyFolderEvent(mFolderLoadedAtom);
    rv = AutoCompact(aWindow);
    printf("It takes %d milliseconds for nsMsgLocalMailFolder::UpdateFolder() to 
finish\n", 
            (long)PR_IntervalToMilliseconds(PR_IntervalNow()-startTime));

Assignee

Comment 10

•

23 years ago

Attached patch 1st patch. — Details — Splinter Review

Assignee

Comment 11

•

23 years ago

David, can you review the code to make sure I don't break things I wasn't aware 
of? Thanks.

The chage made in nsMsgDatabase::AddHdrToCache() is mainly to figure out the 
best hash table size to allocate. With the old code, we always set the size to 
512 so for 1000 msg folder we would allocate/de-allocate the table (size of 
512*12=6.1 KB) 4 times (2 for loading and 2 for sorting).  Now we only allocate 
the table once.

'kLookAheadHdrs' is added to the size so that we don't immediately de-allocate 
the table and recreate it when we get a new mail in after the folder is opened. 
This was the idea.

In nsMsgDatabase::CopyHdrFromExistingHdr() (when we do msg copy), the reason why 
I set 'm_bCacheHeaders' to false before calling CreateNewHdr() is to avoid 
creating a new hash table for the destination folder if the table does not 
already exist.  See bug #105192 for more details about this issue.

Other than these the patch seems very straightforward.

Comment 12

•

23 years ago

No, this is not exactly what I had in mind. The db should not be basing its hash
table on the number of headers in the database, as I mentioned in my comments
for 105192. It should be up to the client of the db (in this case, the view
code), to tell the db how big to make it's hash table, i.e., how many headers
the client code thinks it's going to need. This way we wouldn't need to change
the copy code at all, or any code that's just going to open the db and get a
couple headers out of it.

Assignee

Comment 13

•

23 years ago

Attached patch Incorporated david's comment. — Details — Splinter Review

Assignee

Comment 14

•

23 years ago

The only change is that I added nsMsgDatabase::SetMsgHdrCacheSize() to allow 
callers to set message header cache table size.  Performance improvement stays 
the same.

Comment 15

•

23 years ago

I thought we were going to get rid of this: +    // Turn m_bCacheHeaders off
(and on later) so that we don't have to cache the msg hdr.
+    m_bCacheHeaders = PR_FALSE;
 
	CreateNewHdr(key, (nsIMsgDBHdr **) &destMsgHdr);
+    m_bCacheHeaders = PR_TRUE;

I don't think it's needed, and makes things more complicated than they need be.

Is this right?

-      PL_DHashTableInit(saveCachedHeaders, &gMsgDBHashTableOps, nsnull,
sizeof(struct MsgHdrHashElement), kMaxHdrsInCache);
+      m_cacheSize = kMaxHdrsInCache;
+      PL_DHashTableInit(saveCachedHeaders, &gMsgDBHashTableOps, nsnull,
sizeof(struct MsgHdrHashElement), m_cacheSize);

I would think it should leave m_cacheSize alone instead of setting it back to
kMaxHdrsInCache.

I don't see the need for all the changes to nsUintArray, and I'm not sure
they're an improvement. When you call PR_Calloc instead of PR_Malloc, you're
causing code to clear out the whole array, instead of just the part we're not
going to use, which was what the code did before.

Also, we don't need a method to get the total number of messages in the db - you
can get that from the db folder info.

And, when initializing the view max hdr size, if we're in unread messages only
view, you should get the number of unread messages from the db folder info, and
init the view to that size. Search for unreadOnly to see how to find out if it's
an unreadOnly view.

Assignee

Comment 16

•

23 years ago

> I thought we were going to get rid of this: +    // Turn m_bCacheHeaders off
> (and on later) so that we don't have to cache the msg hdr.
> +    m_bCacheHeaders = PR_FALSE;
>
>	CreateNewHdr(key, (nsIMsgDBHdr **) &destMsgHdr);
> +    m_bCacheHeaders = PR_TRUE;
>
> I don't think it's needed, and makes things more complicated than they need 
> be.
> 
> Is this right?
>
This patch is to save the memory we have to allocate for the hash table of the
destination folder during msg copy. So if I copy a msg to a folder with 1000 
msgs I'll have to allocate a table of size 1025*12 bytes even if I'll never open 
it again (ie, the table will never be referenced during the session). I was 
worry more about regression of the patch, but if we agree that this is not an 
issue then we can remove it.

> -      PL_DHashTableInit(saveCachedHeaders, &gMsgDBHashTableOps, nsnull,
> sizeof(struct MsgHdrHashElement), kMaxHdrsInCache);
> +      m_cacheSize = kMaxHdrsInCache;
> +      PL_DHashTableInit(saveCachedHeaders, &gMsgDBHashTableOps, nsnull,
> sizeof(struct MsgHdrHashElement), m_cacheSize);
>
> I would think it should leave m_cacheSize alone instead of setting it back to
> kMaxHdrsInCache.
>
For big folders (like a few K msgs) it's better to set the cache size to a 
smaller value (so if I use up all 2025 cache table entries a new 512 entry table 
is probably better than a 2025 one since I'll mostly likely not deal with that 
many msgs in this folder again). So maybe we can set kMaxHdrsInCache to a more 
reasonable value for a fresh cache table.

> I don't see the need for all the changes to nsUintArray, and I'm not sure
> they're an improvement. When you call PR_Calloc instead of PR_Malloc, you're
> causing code to clear out the whole array, instead of just the part we're not
> going to use, which was what the code did before.
>
No, they don't improve a lot as I indicated earlier (2-5 milliseconds).  I was 
mainly trying to remove code that's not really needed, especially the memset 
calls for every array element before a value is assigned.  I also tried to 
pre-allocate the array size to avoid de-allocation and reallocation calls (since 
we already know the size we need). For the PR_Malloc one, I was trying to use 
PR_Calloc call instead of PR_Malloc and memset calls (my assumption is that 
PR_Calloc would be faster), but I can go either way on this. Again, we can 
remove all the changes to nsUintArray if we all agree.

>
> Also, we don't need a method to get the total number of messages in the db - 
> you can get that from the db folder info.
>
OK, I'll do that.

> And, when initializing the view max hdr size, if we're in unread messages only
> view, you should get the number of unread messages from the db folder info, 
> and init the view to that size. Search for unreadOnly to see how to find out
> if it's an unreadOnly view.
>
Thanks for the info as I was not aware of the 'unread message' mode.

Comment 17

•

23 years ago

>This patch is to save the memory we have to allocate for the hash table of the
>destination folder during msg copy. So if I copy a msg to a folder with 1000 
>msgs I'll have to allocate a table of size 1025*12 bytes even if I'll never open 
>it again (ie, the table will never be referenced during the session).

Since you're backing out your change that made the db figure out how big to make
the cache, the folder size doesn't matter for this code. If you put it back the
way it was, we'll just allocate a 512 entry hash table, and it's no big deal. We
can make kMaxHdrsInCache smaller, but again, it's not a big deal. So I'd remove
the code.

>For big folders (like a few K msgs) it's better to set the cache size to a 
>smaller value (so if I use up all 2025 cache table entries a new 512 entry table 
>is probably better than a 2025 one since I'll mostly likely not deal with that 
>many msgs in this folder again). So maybe we can set kMaxHdrsInCache to a more 
>reasonable value for a fresh cache table.

I'm not sure I agree - if we did deal with that many messages, and we (the view
code) explicitly said we were going to deal with that many messages, then I
think the db should leave it the way the code that knew how many headers it was
going to need said we should have.


I doubt  PR_Calloc is cheaper than memset. I think this stuff should all be
reverted since it doesn't make an appreciable difference

Comment 18

•

23 years ago

I don't see why you neeeded AllocateSpace. Can't you just use the existing
SetSize method to pre-initialize the array size?

Assignee

Comment 19

•

23 years ago

> I don't see why you neeeded AllocateSpace. Can't you just use the existing
> SetSize method to pre-initialize the array size?
>
SetSize() does not resset 'm_nSize' to 0 so it's not working as I expected.

Comment 20

•

23 years ago

if it's a bug in setsize, why not fix it instead of inventing a new routine that
does what you want? And when do we ever set the size of the array to 0?

Assignee

Comment 21

•

23 years ago

'm_nSize' is never reset to 0 so I'm not sure if it's a bug. Maybe the 
name is misleading but SetSize() is called to make sure we have enough space to 
hold the next array element(s) and so 'm_nSize' is incremented everytime 
SetSize() is called. SetSize(0) is called to free the memory space and set 
'm_nSize' to 0.

Assignee

Comment 22

•

23 years ago

>>For big folders (like a few K msgs) it's better to set the cache size to a 
>>smaller value (so if I use up all 2025 cache table entries a new 512 entry 
>>table is probably better than a 2025 one since I'll mostly likely not deal 
>>with that many msgs in this folder again). So maybe we can set kMaxHdrsInCache
>> to a more reasonable value for a fresh cache table.
>
>I'm not sure I agree - if we did deal with that many messages, and we (the view
>code) explicitly said we were going to deal with that many messages, then I
>think the db should leave it the way the code that knew how many headers it was
>going to need said we should have.
>
The scenario I was trying to say is this:
A folder with 2000 msgs is opened and say the cache table size is 2000, and then 
a new mail comes in (ie, it grows). In this case we'll need to re-create the 
cache table and the question is 2000 vs. what (like 512)? That's what I was 
trying to decide and there does not seem to exist a single rule that can make 
all scenarios work to the best. Since average users tend to have smaller number 
of messages in the folders, using the original cache size seems to be a better 
solution.

Comment 23

•

23 years ago

I looked at SetSize for both nsUint32Array and nsUint8Array and both of them, if
you call SetSize(0) will make the size 0 and free the array. Also, when you call
SetSize(newSize, growBy), it will make the arraySize be newSize, not increment
the existing size by newSize, so I'm confused. There's some fudging going on
with nGrowBy so that the newSize will take into account the growBy stuff:

		if (nNewSize < m_nMaxSize + nGrowBy)
			nNewMax = m_nMaxSize + nGrowBy;  // granularity
		else
			nNewMax = nNewSize;  // no slush

but that seems correct to me. 

I can see your argument about resetting the cache size when growing the array,
except that in reality, I think it could backfire - imagine this scenario - I
open the folder, and I set the array size a little too small, so that it ends up
having to grow. Then, suppose I'm going to make two passes through all the
headers (e.g., syncing imap flags, and building up the view). If we shrink the
array size back down again, that second pass is going to be expensive because
we'll end up throwing away lots of headers.

Assignee

Comment 24

•

23 years ago

Yes, there are scenarios where backing down the cache size is actually worse. 
That's why I said there is no one good rule to follow.  Another case where this 
is bad is during folder opening.  Say we have a folder with 16K msgs and our max 
cache size is 8k, so when you reach 8001st msg you'll need to re-create the 
table. In this case, 8000 is much better than 512 for the same reason you 
mentioned in your scenario.

Let me look at SetSize() again but I remember the problem is 'm_nSize' not being 
reset. That's why I added the new API to simply call SetSize() and reset 
'm_nSize'.

Assignee

Comment 25

•

23 years ago

'm_nSize' is used to keep track of the most current element in the array. Even 
though SetSize() does allocate the right amount of space for us it also sets 
'm_nSize' to the last byte in the array, meaning that the next time you call 
Add() the array will be expanded again (it thinks it's running out of space).

Assignee

Comment 26

•

23 years ago

Attached patch Incorporated more comments. — Details — Splinter Review

Assignee

Comment 27

•

23 years ago

If we all think that the changes to nsUintArray is not really needed I'll remove 
them.

Comment 28

•

23 years ago

Cavin, you're right about the way Uint32Array works with m_nSize (add will put
the new element at m_nSize). But your AllocateSpace routine isn't really right -
it should leave m_nSize where it was before you called SetSize, but enlarge the
array. If I've put something in the array, and then call AllocateSpace, I could
lose my data. So something like PRInt32 saveSize = m_nSize; SetSize(foo);
m_nSize = saveSize). So it's up to you - we could leave out the changes to
nsUintArray, or just fix them.

Assignee

Comment 29

•

23 years ago

Good catch David as I did not make AllocateSpace() generic enough (it was meant
for the first time allocation only). I'll make the change.

Assignee

Comment 30

•

23 years ago

Attached patch Made AllocateSpace() more generic. — Details — Splinter Review

Assignee

Comment 31

•

23 years ago

Can David and Seth make a final r and sr on the patch? Thanks.

Comment 32

•

23 years ago

Why remove these lines?
-
	// The new size is within the current maximum size, make sure new
-
	// elements are to initialized to zero
-
	if (nSize > m_nSize)
-
		nsCRT::memset(&m_pData[m_nSize], 0, (nSize - m_nSize) * sizeof(PRUint32));
-

seems like we need to have elements initialized to a well known value like 0.

Assignee

Comment 33

•

23 years ago

From what I can tell we clear the array when it's initially allocated and then 
clear each and every array element again before a value is assigned to the 
element.  Seems unnecessary.

Comment 34

•

23 years ago

Cavin, I think you should put those lines back. Here's a scenario:

1. Create an array of size 10
2. Fill it up with 10 elements.
3. Remove the first 5 elements. Now, m_nSize is 5 but m_maxSize is 10
4. Call SetSize (8). We don't need to grow the array to do this, so we'll fall
into the code that you've changed.

With your change, the last three elements in the array will have arbitrary
values instead of being set to 0 as they would have before your change. I'd
rather not mess with this code unless we need to, and I'd like to maintain the
invariant that any elements not set will be 0. These classes are used all over,
and changing the way they work could have very hard to figure out side effects.

Does this make sense?

Assignee

Comment 35

•

23 years ago

>changing the way they work could have very hard to figure out side effects.
>
OK.

Other than this, are there any other comments about the patch? You can list them 
all so that I can change them all at the same time.

Comment 36

•

23 years ago

Nope, that's the only remaining problem I see. Put back the code you were going
to remove, and r=bienvenu

Assignee

Comment 37

•

23 years ago

Yes, I'll put back the memset() calls.

(not reading, please use seth@sspitzer.org instead)

Assignee

Comment 38

•

23 years ago

Attached patch Put back memcpy in SetSize(). — Details — Splinter Review

Comment 39

•

23 years ago

I think GetNumNewMessages() and GetNewMessages() to return -1 when we don't 
know the value.  in the folder datasource, we turn that into "???", which is 
why you see that in the folder pane.

I'm nervous that this code:

+      dbFolderInfo->GetNumNewMessages(&unreadMessages);
+      totalMessages = (PRUint32)unreadMessages+MSGHDR_CACHE_LOOK_AHEAD_SIZE;

will end up with a monsterous number for totalMessages.

I think removing all your summary files and your panacea.dat will get you into 
a state where you'll get the "???".

after calling GetNumNewMessages() and GetNumMessages(), do this:

+ dbFolderInfo->GetNumNewMessages(&unreadMessages);
+ if (unreadMessages > 0) 
+   totalMessages = (PRUint32)unreadMessages+MSGHDR_CACHE_LOOK_AHEAD_SIZE; 
+ else
+   totalMessages = MSGHDR_CACHE_LOOK_AHEAD_SIZE; 

and

+ if (totalMessages > MSGHDR_CACHE_MAX_SIZE) 
+   totalMessages = MSGHDR_CACHE_MAX_SIZE;
+ else if (totalMessage < 0) 
+   totalMessages = MSGHDR_CACHE_LOOK_AHEAD_SIZE;
+ else
+  totalMessages += MSGHDR_CACHE_LOOK_AHEAD_SIZE;

re-assign to cavin, who's working on folder loading performance.

Assignee: sspitzer → cavin

Assignee

Comment 40

•

23 years ago

Seth, thanks for the info. I'll see if I can make GetNumNewMessages() return -1 
by removing all the summary files and panacea.dat.

Assignee

Comment 41

•

23 years ago

OK, nsDBFolderInfo never returns negative values for total number of msgs, new 
msgs and visible msgs.  Adding comments to nsDBFolderInfo to make sure it 
doesn't break the patch in the future. A final patch coming soon.

(not reading, please use seth@sspitzer.org instead)

Assignee

Comment 42

•

23 years ago

Attached patch Added comments to nsDBFolderInfo. — Details — Splinter Review

Comment 43

•

23 years ago

Comment on attachment 55141 [details] [diff] [review]
Added comments to nsDBFolderInfo.

sr=sspitzer

a comment about numVisibleMessages, 
numNewMessages and numMessages always being >= 0

belongs in nsIDBFolderInfo.idl

as well.

Attachment #55141 - Flags: superreview+

Comment 44

•

23 years ago

Comment on attachment 55141 [details] [diff] [review]
Added comments to nsDBFolderInfo.

r=bienvenu

Attachment #55141 - Flags: review+

Assignee

Comment 45

•

23 years ago

Fix checked in.

Status: NEW → RESOLVED

Closed: 23 years ago

Resolution: --- → FIXED

Comment 46

•

23 years ago

I'm going to reopen this since we still need to work on folder loading
performance.  Perhaps in the future we should file separate bugs for the work we
do and make this bug dependant on them.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Comment 47

•

23 years ago

moving to 0.9.7 for continued perf work.

Target Milestone: mozilla0.9.6 → mozilla0.9.7

stephend@netscape.com (gone - use stephen.donner@gmail.com instead)

Updated

•

23 years ago

Depends on: 107481

Reporter

Comment 48

•

23 years ago

Using the 10-26 build on Win32, we went from:

19th
1.58 (1.43 1.63 1.67) 

26th
1.46 (1.40 1.51 1.47)

That's of course not conclusive, but it's my 'official' measurement, nonetheless ;-)

Assignee

Comment 49

•

23 years ago

So that's a gain of 7.59% for 1000 msgs and is much better than the measurement
I did on 10/19 (which is about 3.75%).  When bug #107481 is completely done we
should see even more improvement for various folder opening scenarios. Right now
#107481 is 50% complete (1st fix submitted 10/30/01) but we should see the
improvement as well.

Updated

•

23 years ago

Keywords: nsbeta1+

Whiteboard: [nsbeta1+]

stephend@netscape.com (gone - use stephen.donner@gmail.com instead)

Assignee

Comment 50

•

23 years ago

The official number from the 11-02 build on Win32 is:

1.18 (1.30 1.21 1.03)

So it's a 25.32% jump off the 10/19 build before any of the fixes went in and is
19.18% increase over the 10/26 build which has the fix to reduce hash table
allocation/de-allocation.

Reporter

Comment 51

•

23 years ago

As inaccurate as those numbers were, I could *pysically* tell that it was 
focusing/selecting the folders faster (I already told Seth this).  Great stuff!

Alex Vincent [:WeirdAl]

Comment 52

•

23 years ago

How much would this affect a newsgroup with 70,000+ headers?

Updated

•

23 years ago

Target Milestone: mozilla0.9.7 → mozilla0.9.8

Comment 53

•

23 years ago

moving out.

Status: REOPENED → ASSIGNED

Target Milestone: mozilla0.9.8 → mozilla1.2

Myk Melez [:myk] [@mykmelez]

Updated

•

23 years ago

Blocks: 122274

Keywords: nsbeta1+ → nsbeta1-

Updated

•

20 years ago

Product: MailNews → Core

Bruno 'Aqualon' Escherl

Comment 54

•

17 years ago

Is there anything left to do here?