Closed Bug 195010 Opened 20 years ago Closed 13 years ago

seeking too much in fastload file; blows out buffered stream


(Core :: XUL, defect, P2)






(Reporter: jrgmorrison, Assigned: brendan)



(Keywords: perf)


(3 files, 1 obsolete file)

as discussed, we wind up seeking back and forth from the master document 
location to the current out-of-line script location and back to master. This
blows out the buffered read of the fastload file.
Priority: -- → P2
Target Milestone: --- → mozilla1.4alpha
Attached patch proposed fix (obsolete) — Splinter Review
My laptop has a slow disk; I'm still rebuilding after updating my tree to
today's trunk.	Any testing feedback will be much appreciated, even if it's of
the form "it don't work, it crashed/asserted here...".

builds ok, and no crashes or asserts, or defects that I noticed. However, it
still seems to be seeking as before, or so the following printf at the top
of nsBufferedStream::Seek(PRInt32 whence, PRInt32 offset) seems to say:
  printf("xxx: nsBufferedStream::Seek, whence: %d, offset: %-8d\n",
         whence, offset);
Attachment #118370 - Flags: superreview?(ben)
Attachment #118370 - Flags: review?(jrgm)
Hmm, why didn't this help more?  jrgm, I thought you had a FastLoad file that
caused seeks from relatively low offsets back to a high one, then back to a
nearby lower one, etc.

I think this patch will help such a FastLoad file, but I'm not sure how that
file was generated.  Anyway, perhaps I can do even better for the case we see
now, that doesn't benefit from this patch.  Tomorrow.

diff -pu8, for applying via patch.  A cvs diff -pwu8 is coming up, that's the
one to code-review.

Attachment #118370 - Attachment is obsolete: true
The changes here are:

- Pass null scope objects into nsIScriptContext::Compile{EventHandler,Script}
when precompiling for brutal sharing.  The JS engine does not need a non-null
object parameter for static scope.  That parameter can be non-null only if it
helps the compiler "pre-bind" functions to the same scope object that they'll
be parented by when executing, but with brutal sharing, functions are
precompiled once and executed against many different scope objects.

I had hoped to go further and remove all the one-per-XUL-document
nsXULPDGlobalObject and nsJSContext/JSContext objects created by the
content/xul/document/src/nsXULPrototypeDocument.cpp code, but it's hard to
untangle the dependencies on per-XUL-prototype-doc global objects (for security
principals).  I'll save that for another patch.

- The previous patch in this bug didn't avoid any seeks, as jrgm pointed out. 
The problem was that XUL FastLoad would serialize master document out-of-line
scripts (those included via script src= from non-overlay, "master" .xul docs)
far from the place in the FastLoad file where XUL prototype script info was
serialized for the <script src=> tag itself.  I fixed that so that, unless the
out-of-line script was previously serialized (by a different src= ref from
another .xul file), the OOL script data immediately follows the proto-script
info in the FastLoad file.

This required adding a SerializeOutOfLine method to nsXULPrototypeScript, which
restores symmetry by matching the existing DeserializeOutOfLine(Script) -- but
I dropped the redundant "Script" from the end of the latter method's name.  We
need SerializeOutOfLine to handle overlay OOL scripts.	They are serialized by
nsXULDocument::OnStreamComplete, because that code knows the difference between
an overlay and a master doc.  This removes all trace of FastLoad writing from
nsXULPrototypeScript::Compile -- FastLoad stuff didn't belong there, not only
because we now want to write master OOL scripts later, when walking the master
XUL prototype doc's element tree, but also for modularity reasons.  The caller
knows about FastLoad, nsXULPrototypeScript::Compile does just what its name

This change to the order in which OOL scripts for master XUL docs are muxed
into the FastLoad file reduced seeks that dump the underlying stream buffer
quite a bit.  I'll comment in a little while with the savings.	We still seek
outside the buffer when reading the FastLoad file too much, on account of
overlay OOL scripts.  A follow-up bug to address those sounds good, because I
want to get this patch in for 1.4b this week.

- Nit-picked some comments to fit in 80 columns, and made other cosmetic fixes.

- Implicated the nsXULDocument::mIsWritingFastLoad flag from the useXULCache
"is the XUL cache enabled?" flag, so other places that test mIsWritingFastLoad
don't have to query whether the cache is enabled.

- Added METERING synchronous meter-dumping to /tmp/bufstats, only ifdef
DEBUG_brendan, in netwerk/base/src/nsBufferedStreams.cpp.

- Added the deferred seek optimization from the first patch in this bug to

- Added a new method to nsIFastLoadFileControl and its impls: hasMuxedDocument.
 This is needed when serializing master XUL doc OOL scripts, because we must
not serialize twice, and any OOL script that other XUL docs can include via
script src= could already be in the FastLoad mux.

Forgot to note in comment #5 that I also fixed xpcom/io/nsFastLoadFile.cpp to
track the interface extension made to nsIInputStream to add readSegments.  The
FastLoad file reader extends nsBinaryInputStream, a concrete class, and
overrides Read in order to snoop on *all* bytes read.  It does this in order to
demultiplex documents interleaved when the FastLoad file was written.  But now
that, e.g., nsBinaryInputStream::ReadCString uses ReadSegments to read the
string chars, the FastLoad file implementation never accounts for those bytes.

This didn't seem to cause harm until I changed the order in which XUL master doc
OOL scripts were muxed!

Anyway, nsFastLoadFile.cpp needs to be checked when nsIInputStream or similar
interfaces change.  Cc'ing darin and alecf so they know what to watch out for.

Comment on attachment 120151 [details] [diff] [review]
diff -w version of last patch

(jrgm, your r= and testing welcome too, I know yer busy -- thanks!)
Attachment #120151 - Flags: superreview?(bryner)
Attachment #120151 - Flags: review?(ben)
Patch appliers, don't forget to apply this, too.

Ok, without my patch, generating a FastLoad file for navigator startup results
in the next app session, which FastLoads that file, doing

/home/brendan/tmp/bufstats.old-reader:seeks within buffer:    46
/home/brendan/tmp/bufstats.old-reader:seeks outside buffer:   87

With my patch (er, patches -- sorry about forgetting nsFastLoadFile.h), I get:

/home/brendan/tmp/ within buffer:    83
/home/brendan/tmp/ outside buffer:   49

So, 132 seeks with the patch, only 49 of which dump the buffer, vs. 133 seeks
without the patch, 87 of which dump the buffer.

This should make for a Ts improvement, assuming we measure Ts with a profile
containing a valid FastLoad file -- ah, bryner just said we measure Ts after
clobbering the profile, running once, then taking best of 10 startup times, so
we should be cool there.

Keywords: perf
bryner asked some good questions, prompting me to record the answers here:

Q: Before the patch, where did XUL OOL scripts go with respect to their <script
src=> proto-script info?

A: All OOL scripts were serialized just after their loads completed and their
source was compiled by the JS engine (in the old nsXULPrototypeScript::Compile).

Q: Before the patch, where did <script src=> proto-script info go?

A: for master documents, near the end of the FastLoad file, as the proto-element
tree content model was walked.  For overlays, the nsXULDocument.cpp code does
not wait for an overlay's OOL scripts to complete loading before serializing the
overlay's content model.  But for the master, all OOL scripts, directly sourced
by the master and indirectly included via overlays, must complete before the
master content model is walked to serialize its data into the FastLoad file.

Also, some overlays are included for their scripts, but not attached by PIs in
master .xul docs to the master content model.  So you can't wait for everything
to get built up into one big content model that includes master and overlay
elements, then walk it.  Doing that might speed up FastLoad more, but since some
overlays are shared among different master docs, it would result in either (a)
redundant copies of overlay elements in the FastLoad file or (b) order of docs
that was good for some startup options (browser first) but not others (-mail).

We live with (b) today, and I think it's fine.  Most users start the same way
every time.

The FastLoad file format and code do not support mux'ing a doc more than once
into the file.  That's a waste of disk space, runtime writing and reading the
file, and memory buffering its contents.

For overlays, I'll file a followon bug that tries to put their OOL scripts right
next to their <script src=> proto-script content model nodes.

In case it wasn't obvious, the Q&A in the last comment mean that before this
patch, scripts tended to be low in the FastLoad file, while the referencing
<script src=> proto-nodes were at high offsets.  Hence more big seeks, as
measured by the METERING code in nsBufferedStreams.cpp and reported in comment
#9.  Moving serialization of master OOL scripts to right after their proto-nodes
made for the win shown in that comment.

Comment on attachment 120151 [details] [diff] [review]
diff -w version of last patch

Looks good. sr=bryner.
Attachment #120151 - Flags: superreview?(bryner) → superreview+
Comment on attachment 120151 [details] [diff] [review]
diff -w version of last patch
Attachment #120151 - Flags: review?(ben) → review+
Checked in, watching Ts.

Keeping the bug open for further improvements to reduce overlay-induced seeks.

Target Milestone: mozilla1.4alpha → mozilla1.5alpha
Brendan, I am confused.  You added a nsFastLoadFile::ReadSegments to
nsFastLoadFile.cpp and then you added the prototype member function
to nsFastLoadFile.h, HOWEVER, you added it to the header as
    NS_IMETHODIMP nsFastLoadFileReader::ReadSegments(nsWriteSegmentFun aWriter,
                                                     void* aClosure,
                                                     PRUint32 aCount,
                                                     PRUint32 *aResult);
and NOT as
    NS_IMETHODIMP ReadSegments(nsWriteSegmentFun aWriter, void* aClosure,
                               PRUint32 aCount, PRUint32 *aResult);

Was there a reason you prefaced (in the header) the ReadSegments function
with nsFastLoadFileReader::?  I actually think this is incorrect, or
I should say the HP aCC compiler considers this incorrect.
Comment on attachment 118370 [details] [diff] [review]
proposed fix

removing obsolete review request
Attachment #118370 - Flags: superreview?(ben)
Attachment #118370 - Flags: review?(jrgm)
Target Milestone: mozilla1.5alpha → mozilla1.6alpha
Target Milestone: mozilla1.6alpha → Future
Planning to address by switching to PR_FileMap/PR_MemMap veneer on top of mmap. Then the problem might be poor locality of reference instead of seeking, so this bug can hang around as blocked by bug 340148.

Depends on: 340148
Component: XP Toolkit/Widgets: XUL → XUL
QA Contact: shrir → xptoolkit.widgets
I think bug 412796 fixed this. Taras, please reopen if you disagree.

Closed: 13 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.