Closed Bug 571419 Opened 14 years ago Closed 12 years ago

Tb3 issues read request for each one byte of msgFilterRules.dat

Categories

(MailNews Core :: Filters, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 17.0

People

(Reporter: World, Assigned: aceman)

References

(Blocks 1 open bug)

Details

(Keywords: perf)

Attachments

(2 files)

[Build ID]
> Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.3a5pre) Gecko/20100604 Shredder/3.2a1pre

Tb3 issues read request for each one byte of msgFilterRules.dat.
By design?
Attachment #450546 - Attachment description: Process Monitor log for msgFilterRules.dat and msgFilterRules-1.dat → Process Monitor log(CSV) for msgFilterRules.dat and msgFilterRules-1.dat
Attachment #450546 - Attachment mime type: application/vnd.ms-excel → text/plain
Same "read request for each one byte" is observed with Tb 3.0.4, Tb 3.1xpre, Sm 1.1.18, and Sm 2.0.2 too.
Component: General → Filters
Product: Thunderbird → MailNews Core
QA Contact: general → filters
Version: 3.0 → Trunk
Keywords: perf
Note:
Because a msgFilterRules.dat file looks read only once(upon first message filtering, or first message filter open via UI), and because physical read from HDD is done with a block size and/or with buffering by OS, or at least with a sector size of HDD, this bug won't produce critical performance problem of Tb while normally running.
I'm not sure perfomance impact on big remote msgFilterRules.dat file.
Blocks: 444793
WADA, is this a regression from version 2?
It is reading 1 char per function call (readchar) in functions like these, from nsIInputStream. Is nsIInputStream->Read buffered?

char nsMsgFilterList::LoadAttrib(nsMsgFilterFileAttribValue &attrib, nsIInputStream *aStream)
{
  char  attribStr[100];
  char  curChar;
  attrib = nsIMsgFilterList::attribNone;

  curChar = SkipWhitespace(aStream);
  int i;
  for (i = 0; i + 1 < (int)(sizeof(attribStr)); )
  {
    if (curChar == (char) -1 || (!(curChar & 0x80) && isspace(curChar)) || curChar == '=')
      break;
    attribStr[i++] = curChar;
    curChar = ReadChar(aStream);
  }
  attribStr[i] = '\0';
  for (int tableIndex = 0; tableIndex < (int)(sizeof(FilterFileAttribTable) / sizeof(FilterFileAttribTable[0])); tableIndex++)
  {
    if (!PL_strcasecmp(attribStr, FilterFileAttribTable[tableIndex].attribName))
    {
      attrib = FilterFileAttribTable[tableIndex].attrib;
      break;
    }
  }  
  return curChar;
}

// If we want to buffer file IO, wrap it in here.
char nsMsgFilterList::ReadChar(nsIInputStream *aStream)
{
  char  newChar;
  PRUint32 bytesRead;
  nsresult rv = aStream->Read(&newChar, 1, &bytesRead);
  if (NS_FAILED(rv) || !bytesRead)
    return -1;
  PRUint32 bytesAvailable;
  rv = aStream->Available(&bytesAvailable);
  if (NS_FAILED(rv))
    return -1;
  else
  {   
    if (m_startWritingToBuffer)
      m_unparsedFilterBuffer.Append(newChar);
    return newChar;
  }
}
David Bienvenu, any idea how to easily switch this to some buffered reading? Maybe just exchange the nsIInputStream type?
Seems there is a nsIBufferedInputStream that attaches to nsIInputStream. Maybe that could work.
Assignee: nobody → acelists
Attached patch patchSplinter Review
OK, I have tested this with a 10000 filters (each with one rule), 1.2MB msgFilterRules.dat, on a SSD disk, but the file was cached by the OS already. Opening the filter list took only about 5 seconds so this is not a performance problem. As WADA said, the file is buffered by the OS.

But I attach a patch that converts the nsIInputStream to a buffered one. WADA can you somehow test if there are now less calls in your log?
Attachment #618747 - Flags: feedback?(dbienvenu)
Comment on attachment 618747 [details] [diff] [review]
patch

I haven't tried the patch, but it looks like the right thing to do, thx.
Attachment #618747 - Flags: feedback?(dbienvenu) → feedback+
Attachment #618747 - Flags: feedback?(m-wada)
Status: NEW → ASSIGNED
Attachment #618747 - Flags: feedback?(m-wada)
Comment on attachment 618747 [details] [diff] [review]
patch experiment

I think use of NS_NewBufferedInputStream is correct change, but I can say nothing on corectness of the code, although I believe there is no type in the code.
No longer blocks: 444793
But you are the bug reporter and you are the only one who can run the process monitor you used so we need your test.

David Bienvenu, can you create a try build for WADA so he can run the patch?
Blocks: 444793
(In reply to :aceman from comment #10)
> you are the only one who can run the process monitor you used so we need your test.

Anyone can download charge free Process Monitor which is provided by MS, and anyone can legally run Process Monitor on MS Win if he is genuine MS Win user. And I believe similar charge free monitor tool is available on Linux and Mac OS X.
Exactly, if one has MS Windows.

But you know best what to look for in the monitor;)
Comment on attachment 618747 [details] [diff] [review]
patch

OK, I have tested the patch with strace on Linux and I see that before the patch there were read() system calls for each byte, after the patch there was one call for each 10K bytes.
Attachment #618747 - Flags: review?(mozilla)
Attachment #618747 - Attachment description: patch experiment → patch
OS: Windows XP → All
Hardware: x86 → All
Attachment #618747 - Flags: review?(kent)
Comment on attachment 618747 [details] [diff] [review]
patch

thx for the patch.
Attachment #618747 - Flags: review?(mozilla) → review+
Keywords: checkin-needed
https://hg.mozilla.org/comm-central/rev/31e6a7c45484
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Flags: in-testsuite-
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 17.0
Maybe we need a MsgNewBufferedFileInputStream to match our existing MsgNewBufferedFileOutputStream?
Attachment #618747 - Flags: review?(kent)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: