Last Comment Bug 571419 - Tb3 issues read request for each one byte of msgFilterRules.dat
: Tb3 issues read request for each one byte of msgFilterRules.dat
: perf
Product: MailNews Core
Classification: Components
Component: Filters (show other bugs)
: Trunk
: All All
: -- normal (vote)
: Thunderbird 17.0
Assigned To: :aceman
Depends on:
Blocks: 444793
  Show dependency treegraph
Reported: 2010-06-10 18:32 PDT by WADA
Modified: 2012-07-31 04:25 PDT (History)
6 users (show)
ryanvm: in‑testsuite-
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---

Process Monitor log(CSV) for msgFilterRules.dat and msgFilterRules-1.dat (111.09 KB, text/plain)
2010-06-10 18:32 PDT, WADA
no flags Details
patch (2.00 KB, patch)
2012-04-26 12:04 PDT, :aceman
mozilla: review+
mozilla: feedback+
Details | Diff | Splinter Review

Description WADA 2010-06-10 18:32:13 PDT
Created attachment 450546 [details]
Process Monitor log(CSV) for msgFilterRules.dat and msgFilterRules-1.dat

[Build ID]
> Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.3a5pre) Gecko/20100604 Shredder/3.2a1pre

Tb3 issues read request for each one byte of msgFilterRules.dat.
By design?
Comment 1 WADA 2010-06-10 18:48:44 PDT
Same "read request for each one byte" is observed with Tb 3.0.4, Tb 3.1xpre, Sm 1.1.18, and Sm 2.0.2 too.
Comment 2 WADA 2010-06-11 03:00:10 PDT
Because a msgFilterRules.dat file looks read only once(upon first message filtering, or first message filter open via UI), and because physical read from HDD is done with a block size and/or with buffering by OS, or at least with a sector size of HDD, this bug won't produce critical performance problem of Tb while normally running.
I'm not sure perfomance impact on big remote msgFilterRules.dat file.
Comment 3 Wayne Mery (:wsmwk, NI for questions) 2011-06-11 06:44:51 PDT
WADA, is this a regression from version 2?
Comment 4 :aceman 2011-10-22 09:45:50 PDT
It is reading 1 char per function call (readchar) in functions like these, from nsIInputStream. Is nsIInputStream->Read buffered?

char nsMsgFilterList::LoadAttrib(nsMsgFilterFileAttribValue &attrib, nsIInputStream *aStream)
  char  attribStr[100];
  char  curChar;
  attrib = nsIMsgFilterList::attribNone;

  curChar = SkipWhitespace(aStream);
  int i;
  for (i = 0; i + 1 < (int)(sizeof(attribStr)); )
    if (curChar == (char) -1 || (!(curChar & 0x80) && isspace(curChar)) || curChar == '=')
    attribStr[i++] = curChar;
    curChar = ReadChar(aStream);
  attribStr[i] = '\0';
  for (int tableIndex = 0; tableIndex < (int)(sizeof(FilterFileAttribTable) / sizeof(FilterFileAttribTable[0])); tableIndex++)
    if (!PL_strcasecmp(attribStr, FilterFileAttribTable[tableIndex].attribName))
      attrib = FilterFileAttribTable[tableIndex].attrib;
  return curChar;

// If we want to buffer file IO, wrap it in here.
char nsMsgFilterList::ReadChar(nsIInputStream *aStream)
  char  newChar;
  PRUint32 bytesRead;
  nsresult rv = aStream->Read(&newChar, 1, &bytesRead);
  if (NS_FAILED(rv) || !bytesRead)
    return -1;
  PRUint32 bytesAvailable;
  rv = aStream->Available(&bytesAvailable);
  if (NS_FAILED(rv))
    return -1;
    if (m_startWritingToBuffer)
    return newChar;
Comment 5 :aceman 2012-04-26 06:10:40 PDT
David Bienvenu, any idea how to easily switch this to some buffered reading? Maybe just exchange the nsIInputStream type?
Comment 6 :aceman 2012-04-26 06:39:22 PDT
Seems there is a nsIBufferedInputStream that attaches to nsIInputStream. Maybe that could work.
Comment 7 :aceman 2012-04-26 12:04:11 PDT
Created attachment 618747 [details] [diff] [review]

OK, I have tested this with a 10000 filters (each with one rule), 1.2MB msgFilterRules.dat, on a SSD disk, but the file was cached by the OS already. Opening the filter list took only about 5 seconds so this is not a performance problem. As WADA said, the file is buffered by the OS.

But I attach a patch that converts the nsIInputStream to a buffered one. WADA can you somehow test if there are now less calls in your log?
Comment 8 David :Bienvenu 2012-04-30 10:00:24 PDT
Comment on attachment 618747 [details] [diff] [review]

I haven't tried the patch, but it looks like the right thing to do, thx.
Comment 9 WADA 2012-05-20 01:08:59 PDT
Comment on attachment 618747 [details] [diff] [review]
patch experiment

I think use of NS_NewBufferedInputStream is correct change, but I can say nothing on corectness of the code, although I believe there is no type in the code.
Comment 10 :aceman 2012-05-20 03:29:40 PDT
But you are the bug reporter and you are the only one who can run the process monitor you used so we need your test.

David Bienvenu, can you create a try build for WADA so he can run the patch?
Comment 11 WADA 2012-05-20 08:27:35 PDT
(In reply to :aceman from comment #10)
> you are the only one who can run the process monitor you used so we need your test.

Anyone can download charge free Process Monitor which is provided by MS, and anyone can legally run Process Monitor on MS Win if he is genuine MS Win user. And I believe similar charge free monitor tool is available on Linux and Mac OS X.
Comment 12 :aceman 2012-05-20 08:34:07 PDT
Exactly, if one has MS Windows.

But you know best what to look for in the monitor;)
Comment 13 :aceman 2012-07-20 12:18:00 PDT
Comment on attachment 618747 [details] [diff] [review]

OK, I have tested the patch with strace on Linux and I see that before the patch there were read() system calls for each byte, after the patch there was one call for each 10K bytes.
Comment 14 David :Bienvenu 2012-07-29 13:51:59 PDT
Comment on attachment 618747 [details] [diff] [review]

thx for the patch.
Comment 15 Ryan VanderMeulen [:RyanVM] 2012-07-30 16:58:06 PDT
Comment 16 2012-07-31 01:35:13 PDT
Maybe we need a MsgNewBufferedFileInputStream to match our existing MsgNewBufferedFileOutputStream?

Note You need to log in before you can comment on or make changes to this bug.