Closed Bug 119321 Opened 23 years ago Closed 22 years ago

[FIX]CSS loader uses large contiguous buffers

Categories

(Core :: DOM: HTML Parser, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla1.2alpha

People

(Reporter: darin.moz, Assigned: bzbarsky)

References

Details

(Keywords: memory-footprint, perf, Whiteboard: [adt3])

Attachments

(4 files, 4 obsolete files)

The CSS loader currently uses a nsStreamLoader to load CSS documents. 
nsStreamLoader delivers the CSS document to the CSS loader as a contiguous
memory buffer.  This buffer can be large and hence costly depending on the size
of the CSS document, and is most likely a significant cost at startup.  It is
even more costly because the CSS loader converts the char buffer to unicode,
again storing it in a contiguous buffer.

Instead, the CSS loader should at least read and convert the raw network data
chunk-by-chunk to unicode and perhaps store it in a segmented buffer.  It would
be even better if the CSS parser could be made to operate chunk-by-chunk. 
dbaron suggested that this shouldn't be too difficult, but after a quick glance
over the code, the solution is not so apparent to me.

Assigning this bug to dp per his request.
I would think it wouldn't be too difficult because I'd think the changes could
be limited to the CSS Loader and the CSS Scanner (and not the CSS Parser).
Let me get some allocation numbers on this one from spacetrace.
Status: NEW → ASSIGNED
Priority: -- → P3
Target Milestone: --- → mozilla0.9.9
Not only does the loader convert it into Unicode in a single buffer, it does
this only to get a unichar stream....

So the current flow is:

char* --> nsAutoString (to determine encoding)
char* --> nsString --> nsIUnicharInputStream

One possible change would be to do:

char* --> nsAutoString (to determine encoding. The patch in bug 80106 fixes
                        this to be smarter, but it did not seem to be a
                        significant hit.)
char* --> nsIInputStream --> nsIUnicharInputStream

This would be fairly simple to do.  The problem there would be that we could end
up with partially-parsed CSS content if a decoding error occured partway through
the CSS....

If the CSSLoader could just get a seekable stream from the streamloader that may
be best.  Then we could determine the encoding by reading data from the stream,
rewind it, and just wrap the stream to get a nsIUnicharInputStream....
it can almost get a seekable stream by calling ReadSegments on the first buffer
segment.  that allows it to look ahead at as much as 4k of data, which should be
more than enough to determine the charset, right?
4k is certainly plenty.  Is there a decent way to wrap up a segmented buffer
into a stream?  We need to end up passing a unichar stream to the CSS Parser...
well... i figure a multifragment nsAString implementation might be most
appropriate, but i'm not certain.
Could the CSS loader benefit from nsSlidingString? This is a multi-fragment
string that can adopt buffers, "appending" then to its end, and then one can
slide this as a string "view" over the multiple fragments, where it will free
those fragment it's done with. nsSlidingSubstring (nsSlidingDependentString
would be more appropriate a name) works the same except it doesn't free the
fragments.

Vidur, is this a correct description, and can this string type be of use here?
Keywords: footprint, nsbeta1
Target Milestone: mozilla0.9.9 → mozilla1.0
Darin says the shortest path to success would be to have the
nsIStreamLoader::OnStreamComplet() pass an nsIInputStream which has a segmented
buffer behind it.
Darin for the stream loader stuff...
Assignee: dp → darin
Status: ASSIGNED → NEW
Status: NEW → ASSIGNED
Keywords: mozilla1.0
Keywords: nsbeta1+
Keywords: nsbeta1
Keywords: perf
Priority: P3 → P4
Attached file nsIUnicharStreamLoader.idl (obsolete) —
here's an interface for a nsIStreamLoader-like thing that can be used to solve
this bug.
the first interface should be called nsIUnicharStreamLoaderObserver, not
nsIUnicharStreamLoader ;-)
punt, no time left to work on this before 1.0
Target Milestone: mozilla1.0 → mozilla1.1alpha
Whiteboard: [adt3]
Changing nsbeta1+ [adt3] bugs to nsbeta1- on behalf of the adt.  If you have any
questions about this, please email adt@netscape.com.  You can search for
"changing adt3 bugs" to quickly find and delete these bug mails.
Keywords: nsbeta1-
Changing nsbeta1+ [adt3] bugs to nsbeta1- on behalf of the adt.  If you have any
questions about this, please email adt@netscape.com.  You can search for
"changing adt3 bugs" to quickly find and delete these bug mails.
Keywords: nsbeta1+
Target Milestone: mozilla1.1alpha → ---
mass futuring of untargeted bugs
Target Milestone: --- → Future
Attached patch Changes to netwerk/ (obsolete) — Splinter Review
This is the implementation of nsIUnicharStreamLoader together with various
ancillaries (build system changes, nsNetCID changes, etc).  Does not include
the mac build changes that are needed to get this building on mac.
Attached patch Changes to intl/Splinter Review
This changes nsConverterInputStream to support a boolean aRecoverFromErrors
argument. We need this to avoid regressing some things.
Attached patch Changes to content/ (obsolete) — Splinter Review
Make the CSS Loader use all that nice infrastructure.  Also fixes bug 80106
while I'm in this code and rips out the rather bogus mCharset stuff in the CSS
parser.  Moves all the charset handling out of the loader and into the LoadData
(and the static helpers).

The GetCharsetFromData function is a little excessive, probably.  I doubt we
should bother with detecting data that's not in network order, since the
converter to Unicode will fail to deal with such data anyway.  It could even be
argued that we don't need to bother to detect data that uses encodings that are
not supersets of ASCII, but properly handling UCS-2, UTF-16, and the like is
not that hard and is good to do, in my opinion.
Blocks: 80106
Oh, and those patches are pretty much good for review.  There's still some
DEBUG_bzbarsky stuff in there, but I'll remove it (probably) before checking in....
the early returns inside nsUnicharStreamLoader::OnStopRequest should still null
out the member variables.  otherwise, you might end up leaking a bunch of
objects if say the observer owned the loader.  otherwise, your netwerk changes
look solid.

as for the intl and content changes, they seem good to me, but then i'm not an
expert in those areas.
Good catch, Darin.  This should be better.
Attachment #74526 - Attachment is obsolete: true
Attachment #93828 - Attachment is obsolete: true
Frank, Peter, could you take a look at the patches?
Comment on attachment 93830 [details] [diff] [review]
Changes to content/

Index: content/html/style/src/nsCSSLoader.cpp
===================================================================
RCS file: /cvsroot/mozilla/content/html/style/src/nsCSSLoader.cpp,v
retrieving revision 3.121
diff -u -8 -r3.121 nsCSSLoader.cpp
--- content/html/style/src/nsCSSLoader.cpp	2 Jul 2002 03:56:08 -0000      
3.121
+++ content/html/style/src/nsCSSLoader.cpp	3 Aug 2002 09:00:25 -0000

...

+  PRUint32 index = 0;
+  nsresult rv = NS_OK;
+  while (pos < aDataLength && index < sizeof(kCharsetSym) - 1) {
+    if (aStyleSheetData[pos] != kCharsetSym[index]) {
+      rv = NS_ERROR_NOT_AVAILABLE;
+      break;
+    }
+    ++index;
+    pos += step;
+  }
+
+  if (NS_FAILED(rv))
+    return rv;

Make this return NS_ERROR_NOT_AVAILABLE up after the if inside the while?

+CSSLoaderImpl::DidLoadStyle(nsIUnicharStreamLoader* aLoader,
+			     nsIUnicharInputStream* aStyleDataStream,
			     SheetLoadData* aLoadData,
			     nsresult aStatus)
...

+  if (NS_SUCCEEDED(aStatus) && (aStyleDataStream) && (mDocument)) {

Lose the parens around aStyleDataStream and mDocument.

r=peterv
Attachment #93830 - Flags: review+
Attachment #93830 - Attachment is obsolete: true
Attachment #96826 - Flags: review+
Taking.
Assignee: darin → bzbarsky
Status: ASSIGNED → NEW
nominating
Keywords: nsbeta1-nsbeta1
Priority: P4 → P1
Summary: CSS loader uses large contiguous buffers → [FIX]CSS loader uses large contiguous buffers
Target Milestone: Future → mozilla1.2alpha
Attachment #93829 - Flags: review+
Comment on attachment 93829 [details] [diff] [review]
Changes to intl/

r=smontagu
Comment on attachment 94506 [details] [diff] [review]
Updated changes to netwerk/ that don't leak

r=rpotts@netscape.com
Attachment #94506 - Flags: review+
Comment on attachment 93829 [details] [diff] [review]
Changes to intl/

sr=darin
Attachment #93829 - Flags: superreview+
Comment on attachment 94506 [details] [diff] [review]
Updated changes to netwerk/ that don't leak

whoops!  if the channel is redirected, then you have a problem.  if a redirect
occurs, then the request passed to OnStartRequest will not be the request
passed to Init.  i think you may want to update mChannel if this happens.
Attachment #94506 - Flags: needs-work+
Attachment #96826 - Flags: superreview+
Comment on attachment 96826 [details] [diff] [review]
Changes to content v1.1

sr=darin
Same, but with the channel property better commented to make it clear when it
can be used (and including mac build changes).
Attachment #94506 - Attachment is obsolete: true
Comment on attachment 96923 [details] [diff] [review]
netwerk patch v. 1.5

carrying the r=rpotts (since the changes are all pretty cosmetic)
Attachment #96923 - Flags: review+
Comment on attachment 96923 [details] [diff] [review]
netwerk patch v. 1.5

looks great!  sr=darin
Attachment #96923 - Flags: superreview+
Checked in.  Thanks for all the reviews!
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
the change of nsIStreamLoader to nsIUnicharstreamLoader in
SheetLoadData::OnStreamComplete in nsCSSParser.cpp breaks moz_timeline.
And therefor debug builds.
The nsIStreamLoader has a GetRequest, the unichar one a GetChannel. So the macro
NS_TIMELINE_MARK_LOADER does the wrong thing :-(.
nsCSSLoader.cpp:937
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment on attachment 97002 [details] [diff] [review]
fix MOZ_TIMELINE bustage in nsCSSLoader.cpp

code by peterv, r=me
Attachment #97002 - Flags: review+
checked in fix from attachement 97002
Status: REOPENED → RESOLVED
Closed: 22 years ago22 years ago
Resolution: --- → FIXED
Hmm... is MOZ_TIMELINE not enabled by default in debug builds?  Or is it only
enabled on Windows?  Because I certainly built this patch debug on Linux....
i think you have to explicitly enable it (--enable-timeline).
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: