Status

Core
XML
16 years ago
8 years ago

People

(Reporter: Frank Tang, Assigned: Heikki Toivonen (remove -bugzilla when emailing directly))

Tracking

Trunk
Future

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Reporter)

Description

16 years ago
I tried dp's startup performance analysis steps described in
http://www.mozilla.org/performance/measureStartup.html

It seems the XML DTD handler takes about 3% of startup time.

Z:\mozilla\tools\performance\startup>..\..\..\dist\win32_o.obj\bin\mozilla.exe -P "Default User" file:///z:/mozilla/tools/performance/startup/quit.html >out 2>&1

Z:\mozilla\tools\performance\startup>egrep Token out|wc -l
     36

Z:\mozilla\tools\performance\startup>egrep Token out
00001.152:   Tokenizer_HandleExternalEntityRef total: 0.010
00001.152:   Tokenizer_HandleExternalEntityRef total: 0.010
00001.162:   Tokenizer_HandleExternalEntityRef total: 0.020
00001.422:   Tokenizer_HandleExternalEntityRef total: 0.020
00001.442:   Tokenizer_HandleExternalEntityRef total: 0.020
00001.452:   Tokenizer_HandleExternalEntityRef total: 0.030
00001.452:   Tokenizer_HandleExternalEntityRef total: 0.030
00001.512:   Tokenizer_HandleExternalEntityRef total: 0.030
00001.542:   Tokenizer_HandleExternalEntityRef total: 0.030
00001.552:   Tokenizer_HandleExternalEntityRef total: 0.030
00001.582:   Tokenizer_HandleExternalEntityRef total: 0.040
00001.582:   Tokenizer_HandleExternalEntityRef total: 0.040
00001.582:   Tokenizer_HandleExternalEntityRef total: 0.040
00001.642:   Tokenizer_HandleExternalEntityRef total: 0.050
00001.652:   Tokenizer_HandleExternalEntityRef total: 0.060
00001.662:   Tokenizer_HandleExternalEntityRef total: 0.070
00001.672:   Tokenizer_HandleExternalEntityRef total: 0.070
00001.672:   Tokenizer_HandleExternalEntityRef total: 0.070
00001.692:   Tokenizer_HandleExternalEntityRef total: 0.070
00001.702:   Tokenizer_HandleExternalEntityRef total: 0.070
00001.712:   Tokenizer_HandleExternalEntityRef total: 0.070
00001.722:   Tokenizer_HandleExternalEntityRef total: 0.070
00001.743:   Tokenizer_HandleExternalEntityRef total: 0.070
00001.763:   Tokenizer_HandleExternalEntityRef total: 0.070
00001.773:   Tokenizer_HandleExternalEntityRef total: 0.070
00001.823:   Tokenizer_HandleExternalEntityRef total: 0.080
00001.863:   Tokenizer_HandleExternalEntityRef total: 0.080
00001.873:   Tokenizer_HandleExternalEntityRef total: 0.090
00001.913:   Tokenizer_HandleExternalEntityRef total: 0.100
00001.933:   Tokenizer_HandleExternalEntityRef total: 0.100
00001.943:   Tokenizer_HandleExternalEntityRef total: 0.100
00001.953:   Tokenizer_HandleExternalEntityRef total: 0.100
00001.973:   Tokenizer_HandleExternalEntityRef total: 0.110
00002.183:     Tokenizer_HandleExternalEntityRef total: 0.110
00002.734:    Tokenizer_HandleExternalEntityRef total: 0.110
00002.934:   Tokenizer_HandleExternalEntityRef total: 0.110

Z:\mozilla\tools\performance\startup>egrep main1 out
00000.100: main1...
00003.655: ...main1
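
The share of startup time can be estimated directly from the log. The following is a minimal sketch, not part of the instrumentation patch. It assumes that the `total:` figure is a cumulative counter (so the last line carries the overall cost) and that startup is the interval between the two `main1` markers. The function name `dtd_share` and the shortened sample log are made up for illustration.

```python
import re

# A few lines copied from the log above; a real run would read the `out` file.
LOG = """\
00000.100: main1...
00001.152:   Tokenizer_HandleExternalEntityRef total: 0.010
00002.934:   Tokenizer_HandleExternalEntityRef total: 0.110
00003.655: ...main1
"""

def dtd_share(log_text):
    """Return (DTD seconds, startup seconds, DTD percentage of startup)."""
    totals = [float(t) for t in re.findall(
        r"Tokenizer_HandleExternalEntityRef total: ([\d.]+)", log_text)]
    start = float(re.search(r"^(\d+\.\d+): main1\.\.\.", log_text, re.M).group(1))
    end = float(re.search(r"^(\d+\.\d+): \.\.\.main1", log_text, re.M).group(1))
    dtd_seconds = totals[-1]      # assumption: the counter is cumulative
    startup_seconds = end - start
    return dtd_seconds, startup_seconds, 100.0 * dtd_seconds / startup_seconds

dtd_seconds, startup_seconds, percent = dtd_share(LOG)
print(f"{dtd_seconds:.3f}s of {startup_seconds:.3f}s = {percent:.1f}%")
```

Measuring from the `main1...` marker rather than from time zero changes the denominator slightly, but the result still rounds to about 3%.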
(Reporter)

Comment 1

16 years ago
Created attachment 57171 [details] [diff] [review]
instrumentation code

instrumentation code
(Reporter)

Comment 2

16 years ago
110/3655 * 100 = 3%
We read in 36 DTD files at startup. I wonder what would happen if we merged
these 36 files into one file?
dp was investigating what would happen if we inlined the localization DTDs into
the XUL files. dp?
(Reporter)

Comment 4

16 years ago
>dp was investigating what would happen if we inlined the localization DTDs 
>into the XUL files.
I don't want to see that happen; it would make the locale package BIGGER. We
should find a WIN-WIN solution if we possibly can.


(Reporter)

Comment 5

16 years ago
At startup we try to load a DTD 36 times, but only 28 unique DTD files
get loaded. Here is the list:
Z:\mozilla\tools\performance\startup>egrep systemId out|sort -u
DTD systemId: chrome://chatzilla/locale/chatzillaOverlay.dtd
DTD systemId: chrome://communicator/locale/bookmarks/bookmarksOverlay.dtd
DTD systemId: chrome://communicator/locale/contentAreaCommands.dtd
DTD systemId: chrome://communicator/locale/securityOverlay.dtd
DTD systemId: chrome://communicator/locale/sidebar/sidebarOverlay.dtd
DTD systemId: chrome://communicator/locale/tasksOverlay.dtd
DTD systemId: chrome://communicator/locale/utilityOverlay.dtd
DTD systemId: chrome://communicator/locale/viewZoomOverlay.dtd
DTD systemId: chrome://cookie/locale/cookieContextOverlay.dtd
DTD systemId: chrome://cookie/locale/cookieTasksOverlay.dtd
DTD systemId: chrome://editor/locale/editorNavigatorOverlay.dtd
DTD systemId: chrome://global-platform/locale/platformGlobalOverlay.dtd
DTD systemId: chrome://global-region/locale/region.dtd
DTD systemId: chrome://global/content/build.dtd
DTD systemId: chrome://global/locale/brand.dtd
DTD systemId: chrome://global/locale/charsetOverlay.dtd
DTD systemId: chrome://global/locale/tabbrowser.dtd
DTD systemId: chrome://global/locale/textcontext.dtd
DTD systemId: chrome://help/locale/helpMenuOverlay.dtd
DTD systemId: chrome://messenger/locale/mailNavigatorOverlay.dtd
DTD systemId: chrome://messenger/locale/mailOverlay.dtd
DTD systemId: chrome://messenger/locale/mailTasksOverlay.dtd
DTD systemId: chrome://navigator/locale/linkToolbar.dtd
DTD systemId: chrome://navigator/locale/navigator.dtd
DTD systemId: chrome://venkman/locale/venkman-overlay.dtd
DTD systemId: chrome://wallet/locale/walletContextOverlay.dtd
DTD systemId: chrome://wallet/locale/walletNavigatorOverlay.dtd
DTD systemId: chrome://wallet/locale/walletTasksOverlay.dtd

Comment 6

16 years ago
>dp was investigating what would happen if we inlined the localization DTDs into
>the XUL files. dp?
Is there a bug number for this? A shipping product with entities inlined in 
XULs will break the locale switching feature.

Comment 7

16 years ago
I haven't tried the inlining yet. Will get on it. Feel free to beat me to it. I
think it is useful as an experiment. It will show whether the bigger cost is
parsing or reading.
Blocks: 7251
(Reporter)

Comment 8

16 years ago
Here is the list of XUL files that reference them, each followed by the DTD file it loads:

Z:\mozilla\tools\performance\startup>egrep "DTD base|DTD system" out
DTD base: chrome://navigator/content/navigator.xul
DTD systemId: chrome://global/locale/brand.dtd
DTD base: chrome://navigator/content/navigator.xul
DTD systemId: chrome://global/content/build.dtd
DTD base: chrome://navigator/content/navigator.xul
DTD systemId: chrome://navigator/locale/navigator.dtd
DTD base: chrome://wallet/content/walletNavigatorOverlay.xul
DTD systemId: chrome://wallet/locale/walletNavigatorOverlay.dtd
DTD base: chrome://navigator/content/navigatorOverlay.xul
DTD systemId: chrome://global/locale/brand.dtd
DTD base: chrome://navigator/content/navigatorOverlay.xul
DTD systemId: chrome://navigator/locale/navigator.dtd
DTD base: chrome://navigator/content/navigatorOverlay.xul
DTD systemId: chrome://communicator/locale/contentAreaCommands.dtd
DTD base: chrome://messenger/content/mailNavigatorOverlay.xul
DTD systemId: chrome://messenger/locale/mailNavigatorOverlay.dtd
DTD base: chrome://messenger/content/mailOverlay.xul
DTD systemId: chrome://messenger/locale/mailOverlay.dtd
DTD base: chrome://editor/content/editorNavigatorOverlay.xul
DTD systemId: chrome://editor/locale/editorNavigatorOverlay.dtd
DTD base: chrome://communicator/content/utilityOverlay.xul
DTD systemId: chrome://global/locale/brand.dtd
DTD base: chrome://communicator/content/utilityOverlay.xul
DTD systemId: chrome://global-region/locale/region.dtd
DTD base: chrome://communicator/content/utilityOverlay.xul
DTD systemId: chrome://communicator/locale/utilityOverlay.dtd
DTD base: chrome://help/content/helpMenuOverlay.xul
DTD systemId: chrome://help/locale/helpMenuOverlay.dtd
DTD base: chrome://global/content/platformGlobalOverlay.xul
DTD systemId: chrome://global-platform/locale/platformGlobalOverlay.dtd
DTD base: chrome://communicator/content/viewZoomOverlay.xul
DTD systemId: chrome://communicator/locale/viewZoomOverlay.dtd
DTD base: chrome://communicator/content/tasksOverlay.xul
DTD systemId: chrome://global/locale/brand.dtd
DTD base: chrome://communicator/content/tasksOverlay.xul
DTD systemId: chrome://communicator/locale/tasksOverlay.dtd
DTD base: chrome://messenger/content/mailTasksOverlay.xul
DTD systemId: chrome://global/locale/brand.dtd
DTD base: chrome://messenger/content/mailTasksOverlay.xul
DTD systemId: chrome://messenger/locale/mailTasksOverlay.dtd
DTD base: chrome://venkman/content/venkman-overlay.xul
DTD systemId: chrome://venkman/locale/venkman-overlay.dtd
DTD base: chrome://chatzilla/content/chatzillaOverlay.xul
DTD systemId: chrome://chatzilla/locale/chatzillaOverlay.dtd
DTD base: chrome://cookie/content/cookieTasksOverlay.xul
DTD systemId: chrome://cookie/locale/cookieTasksOverlay.dtd
DTD base: chrome://wallet/content/walletTasksOverlay.xul
DTD systemId: chrome://wallet/locale/walletTasksOverlay.dtd
DTD base: chrome://global/content/charsetOverlay.xul
DTD systemId: chrome://global/locale/charsetOverlay.dtd
DTD base: chrome://navigator/content/linkToolbarOverlay.xul
DTD systemId: chrome://navigator/locale/linkToolbar.dtd
DTD base: chrome://communicator/content/sidebar/sidebarOverlay.xul
DTD systemId: chrome://global/locale/brand.dtd
DTD base: chrome://communicator/content/sidebar/sidebarOverlay.xul
DTD systemId: chrome://communicator/locale/sidebar/sidebarOverlay.dtd
DTD base: chrome://communicator/content/contentAreaContextOverlay.xul
DTD systemId: chrome://communicator/locale/contentAreaCommands.dtd
DTD base: chrome://cookie/content/cookieContextOverlay.xul
DTD systemId: chrome://cookie/locale/cookieContextOverlay.dtd
DTD base: chrome://wallet/content/walletContextOverlay.xul
DTD systemId: chrome://wallet/locale/walletContextOverlay.dtd
DTD base: chrome://communicator/content/securityOverlay.xul
DTD systemId: chrome://communicator/locale/securityOverlay.dtd
DTD base: chrome://communicator/content/bookmarks/bookmarksOverlay.xul
DTD systemId: chrome://communicator/locale/bookmarks/bookmarksOverlay.dtd
DTD base: jar:resource:///chrome/toolkit.jar!/content/global/bindings/tabbrowser.xml
DTD systemId: chrome://global/locale/tabbrowser.dtd
DTD base: jar:resource:///chrome/toolkit.jar!/content/global/autocomplete.xml
DTD systemId: chrome://global/locale/textcontext.dtd
DTD base: jar:resource:///chrome/toolkit.jar!/content/global/bindings/textbox.xml
DTD systemId: chrome://global/locale/textcontext.dtd



I am thinking of the following:
1. Dump all the DTD files into one or two files (startup.dtd and brand.dtd).
2. Change all the XUL files to refer to startup.dtd and brand.dtd instead.

Then see whether that improves it.
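
Before merging anything, it may help to know which DTDs are requested more than once. The sketch below tallies `DTD systemId:` lines in the format shown above. The function name `count_dtd_requests` is hypothetical, and the sample is only a short excerpt of the full log.

```python
from collections import Counter

# Short excerpt in the same format as the log above.
SAMPLE = """\
DTD base: chrome://navigator/content/navigator.xul
DTD systemId: chrome://global/locale/brand.dtd
DTD base: chrome://navigator/content/navigator.xul
DTD systemId: chrome://navigator/locale/navigator.dtd
DTD base: chrome://navigator/content/navigatorOverlay.xul
DTD systemId: chrome://global/locale/brand.dtd
"""

def count_dtd_requests(log_text):
    """Count how many times each DTD system ID is requested."""
    marker = "DTD systemId: "
    counts = Counter()
    for line in log_text.splitlines():
        if line.startswith(marker):
            counts[line[len(marker):].strip()] += 1
    return counts

counts = count_dtd_requests(SAMPLE)
# DTDs requested more than once are the ones where repeated loads cost time.
repeated = {dtd: n for dtd, n in counts.items() if n > 1}
for dtd, n in sorted(repeated.items()):
    print(n, dtd)
```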

Comment 9

16 years ago
That's a wonderful experiment, Frank.
(Reporter)

Comment 10

16 years ago
I tried it, and it seems to make things worse. It seems that
Tokenizer_HandleExternalEntityRef
calls nsExpatTokenizer::OpenInputStream, LoadStream,
XML_ExternalEntityParserCreate, and XML_Parse every time without any caching,
even when the requested systemId is the same one.

So... merging the DTD files is the wrong direction for now.
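
To illustrate what "without any caching" costs, here is a minimal sketch of memoizing the open/load/parse sequence by system ID. It is not how expat or the tokenizer is structured. `EntityCache` and `fake_load_and_parse` are hypothetical names, and the loader is a stand-in that returns a dummy table.

```python
class EntityCache:
    """Memoize parsed external entities by system ID (hypothetical helper)."""

    def __init__(self, load_and_parse):
        self._load_and_parse = load_and_parse  # callable: system_id -> entity table
        self._cache = {}
        self.misses = 0

    def get(self, system_id):
        # Only the first request for a given system ID pays for load + parse.
        if system_id not in self._cache:
            self.misses += 1
            self._cache[system_id] = self._load_and_parse(system_id)
        return self._cache[system_id]


def fake_load_and_parse(system_id):
    # Stand-in for the real open/load/parse work; returns a dummy table.
    return {"source": system_id}


cache = EntityCache(fake_load_and_parse)
for _ in range(3):
    table = cache.get("chrome://global/locale/brand.dtd")
print(cache.misses)
```

With a cache like this, a merged DTD referenced from many XUL files would be parsed once instead of once per reference.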
(Reporter)

Comment 11

16 years ago
OK, here are the 36 XUL file loads that happen during startup. First of all,
some of the files are loaded more than once; can we reduce that?
Second, is there a way to delay loading some of them until later?

DTD base: chrome://navigator/content/navigator.xul
DTD base: chrome://navigator/content/navigator.xul
DTD base: chrome://navigator/content/navigator.xul
DTD base: chrome://wallet/content/walletNavigatorOverlay.xul
DTD base: chrome://navigator/content/navigatorOverlay.xul
DTD base: chrome://navigator/content/navigatorOverlay.xul
DTD base: chrome://navigator/content/navigatorOverlay.xul
DTD base: chrome://messenger/content/mailNavigatorOverlay.xul
DTD base: chrome://messenger/content/mailOverlay.xul
DTD base: chrome://editor/content/editorNavigatorOverlay.xul
DTD base: chrome://communicator/content/utilityOverlay.xul
DTD base: chrome://communicator/content/utilityOverlay.xul
DTD base: chrome://communicator/content/utilityOverlay.xul
DTD base: chrome://help/content/helpMenuOverlay.xul
DTD base: chrome://global/content/platformGlobalOverlay.xul
DTD base: chrome://communicator/content/viewZoomOverlay.xul
DTD base: chrome://communicator/content/tasksOverlay.xul
DTD base: chrome://communicator/content/tasksOverlay.xul
DTD base: chrome://messenger/content/mailTasksOverlay.xul
DTD base: chrome://messenger/content/mailTasksOverlay.xul
DTD base: chrome://venkman/content/venkman-overlay.xul
DTD base: chrome://chatzilla/content/chatzillaOverlay.xul
DTD base: chrome://cookie/content/cookieTasksOverlay.xul
DTD base: chrome://wallet/content/walletTasksOverlay.xul
DTD base: chrome://global/content/charsetOverlay.xul
DTD base: chrome://navigator/content/linkToolbarOverlay.xul
DTD base: chrome://communicator/content/sidebar/sidebarOverlay.xul
DTD base: chrome://communicator/content/sidebar/sidebarOverlay.xul
DTD base: chrome://communicator/content/contentAreaContextOverlay.xul
DTD base: chrome://cookie/content/cookieContextOverlay.xul
DTD base: chrome://wallet/content/walletContextOverlay.xul
DTD base: chrome://communicator/content/securityOverlay.xul
DTD base: chrome://communicator/content/bookmarks/bookmarksOverlay.xul
DTD base: jar:resource:///chrome/toolkit.jar!/content/global/bindings/tabbrowser.xml
DTD base: jar:resource:///chrome/toolkit.jar!/content/global/autocomplete.xml
DTD base: jar:resource:///chrome/toolkit.jar!/content/global/bindings/textbox.xml

Comment 12

16 years ago
Hi, Waterson:


Is it true that DTDs themselves are not cached; only the final documents are?


The problem we are seeing here is that DTDs are loaded via OpenInputStream, which
blocks the calling thread. I can see two solutions here:

1. Use a stream loader to load DTDs asynchronously. A while ago, James Clark
   commented that this is feasible, in theory.
2. Pre-compile and cache the DTDs. In fact, we can apply this to all chrome files,
   including XUL, RDF, DTD, and CSS. We would rebuild them only when we switch
   providers. Any idea what might be in the way?

Comment 13

16 years ago
DTDs are processed deep in the bowels of expat; by the time I see them in the
XUL content sink all the entities have been resolved. So no, XUL doesn't cache
any DTD information.

1. Load DTDs asynchronously. While this solution would probably be a good thing
   to do (e.g., for XML loaded via HTTP), I doubt that it will have any impact
   on performance here. (I guess it _might_, if the DTD ended up in the memory
   cache, and we're currently re-reading the file. You'd save a bit of OS over-
   head, maybe.)

2. Pre-compile and cache DTDs. Yes, this is probably a better solution. In fact,
   it's what brendan has been asking us (hyatt, me, etc.) to do: implement
   his fastload stuff for XUL, CSS, XBL, etc.

Comment 14

16 years ago
It might be worth thinking about inlining entities. The big disadvantage here is
that intl packages would no longer contain entities but the entire XUL, inlined
with the right language entities. But then I don't know if that matters: an intl
package going from, say, 1 MB to 5 MB? That would pretty much save all of the 3%
without any additional code.

Comment 15

16 years ago
Plus, how will precompiling and caching DTDs work if expat is the one reading in
the DTDs? Hopefully it goes through enough of our layers that we can short-circuit
expat.

Comment 16

16 years ago
Inlining the entities is a good idea. Although it wouldn't require any ``code
changes'', it seems like it would require some serious XP build system hackery
(which is never to be underestimated!)

The idea with brendan's fastload would be that the _first_ time we parsed
navigator.xul (and its overlays), we would suck, and read all the DTDs. On all
subsequent loads, we would simply deserialize the XUL prototype content model 
from the fastload file: expat would not be involved at all. I'll file a bug on
that, and take a crack at it.
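
The first-load/later-load split can be sketched as follows. This is only an illustration of the idea under simplifying assumptions: it caches a flat table of `<!ENTITY name "value">` declarations as JSON, whereas the proposal above is to serialize the XUL prototype content model. It also omits the invalidation that would be needed when providers are switched. All function names and the sample DTD text are made up for the example.

```python
import json
import os
import re
import tempfile

# Matches only simple declarations of the form <!ENTITY name "value">.
ENTITY_RE = re.compile(r'<!ENTITY\s+([\w.\-]+)\s+"([^"]*)"\s*>')

def parse_entities(dtd_text):
    """Slow path: extract entity declarations from DTD text."""
    return dict(ENTITY_RE.findall(dtd_text))

def load_entities(dtd_text, cache_path):
    """Return (entities, came_from_cache)."""
    if os.path.exists(cache_path):
        # Later loads: deserialize, no parsing involved.
        with open(cache_path, encoding="utf-8") as f:
            return json.load(f), True
    # First load: parse, then write the cache for next time.
    entities = parse_entities(dtd_text)
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump(entities, f)
    return entities, False

# Illustrative DTD content, not copied from a real locale file.
DTD = '<!ENTITY brandShortName "Mozilla">\n<!ENTITY vendorShortName "mozilla.org">'
cache_path = os.path.join(tempfile.mkdtemp(), "entities.json")
first, first_cached = load_entities(DTD, cache_path)
second, second_cached = load_entities(DTD, cache_path)
print(first_cached, second_cached)
```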

Comment 17

16 years ago
Wait, am I a moron? Did you already do this brendan?

Comment 18

16 years ago
If we fastload XUL, we could short-circuit expat.

Comment 19

16 years ago
>It might be worth thinking about inlining entities. The big disadvantage here is
>that intl packages now wont have entities but the entire XUL inlined with the
>right  language entity. But then I dont know if that matters - a intl package
>going from say 1mb to 5mb ? That would pretty much save all the 3% without any
>additional code.

IMO, what's causing the slowness is the blocking I/O in loading the DTDs. Inlining
entities in XUL could save the time spent loading them from external files, but not
the time spent parsing them (which a pre-compiled cache would). In addition, as
mentioned earlier, inlining will break language (and regional content)
switching, and the entities can't be overwritten. Please don't go there.

I am inclined toward the pre-compiled DTD solution.

Comment 20

16 years ago
>The idea with brendan's fastload would be that the _first_ time we parsed
>navigator.xul (and its overlays), we would suck, and read all the DTDs.

> On all
>subsequent loads, 

Probably a dumb question: why not just ship the fastload file in the first place?

Comment 21

16 years ago
Why is navigator.xul being loaded three times, assuming ftang's data is
accurate?  That's what's causing all those other overlays and dtds to be loaded
2-3 times; it appears that there are about 9 xul and 8 dtd files we would stop
unnecessarily reloading if we figure out the navigator.xul issue.

Comment 22

16 years ago
ftang, your XUL cache is on, right?  Check your prefs.  We should also ensure
that chrome URLs are still being properly canonified when checking the chrome
cache.  Without that canonification, we may have problems where:

chrome://navigator/content/
chrome://navigator/content/navigator.xul
chrome://navigator/content

all map to different entries in the cache.
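
A minimal sketch of the canonification being described, so that all three spellings produce one cache key. It assumes the default file for a package is `<package>.<ext>`, with the extension chosen by provider type; that rule and the name `canonify` are assumptions of this example, not the chrome registry's actual code.

```python
# Assumed default extension for each provider type.
DEFAULT_EXT = {"content": "xul", "skin": "css", "locale": "dtd"}

def canonify(url):
    """Expand a chrome URL without a file name to its default file."""
    prefix = "chrome://"
    if not url.startswith(prefix):
        raise ValueError("not a chrome URL: " + url)
    parts = url[len(prefix):].split("/", 2)
    if len(parts) < 2 or parts[1] not in DEFAULT_EXT:
        raise ValueError("missing or unknown provider: " + url)
    package, provider = parts[0], parts[1]
    path = parts[2] if len(parts) > 2 else ""
    if not path:
        # No file named: fall back to <package>.<ext>.
        path = package + "." + DEFAULT_EXT[provider]
    return prefix + package + "/" + provider + "/" + path

urls = [
    "chrome://navigator/content/",
    "chrome://navigator/content/navigator.xul",
    "chrome://navigator/content",
]
keys = {canonify(u) for u in urls}
print(keys)
```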

Comment 23

16 years ago
Created attachment 57299 [details] [diff] [review]
patch - prevent triple loading of brand.dtd on startup

Comment 24

16 years ago
Obviously precompiling the DTDs would be good, but is loading them more lazily
out of the question?  It pains me to see us loading about 15 unique DTDs
(according to ftang's data) that contain strings only necessary for various
menus and context menus.  We need very few strings to display a navigator window.

Comment 25

16 years ago
Comment on attachment 57299 [details] [diff] [review]
patch - prevent triple loading of brand.dtd on startup

checked this in.
Attachment #57299 - Attachment is obsolete: true

Updated

16 years ago
Depends on: 109450

Comment 26

16 years ago
Hyatt, if Frank is feeding off the timeline log, then it is possible for him to see
the URL being hit 3 times. But if we look for the file/jar URL, I see it only
once. The XUL cache is preventing navigator.xul from being loaded three times.

I see Brand.dtd being loaded 6 times. See news post to n.p.m.performance titled
"Number of files we load on startup"

Comment 27

16 years ago
As its name implies, the 'startup' is just the start :-) See bug 44458: multiple
reloads of the DTDs are actually a pain in real-life situations for XML dialects
that have large DTDs as well, since the problem is worse when navigating from
page to page and reloading/reparsing at each click (e.g., the official "XHTML
plus MathML" DTD is 384 KB).
Target Milestone: --- → Future

Updated

15 years ago
QA Contact: petersen → rakeshmishra

Comment 28

15 years ago
An additional comment:
During some testing of libjar, specifically the JARInputStream stuff,
I saw that DTDs are now read using JARInputStream. However,
the DTDs are read using 1024-byte buffers instead of the usual
4K buffers. Just increasing the buffer from 1024 to 4096 would
already save some pain.
(Having the same buffer sizes in buffered I/O is very important, as
mismatched sizes would destroy the buffering completely.)

Updated

15 years ago
QA Contact: rakeshmishra → ashishbhatt

Comment 29

13 years ago
3% is 3%.

Can this be brought back?
Do these still take 3% of startup?

/be

Comment 31

13 years ago
About the 1024 byte buffer issue:
Check out:
http://lxr.mozilla.org/mozilla/source/parser/htmlparser/src/nsExpatDriver.cpp#646
   rv = NS_NewUTF8ConverterStream(getter_AddRefs(uniIn), in, 1024);

This tells me that a 1024 byte buffer is used for DTD reading. All other
jar-reading and other file reading is using 4K buffers.
So, using the same buffersize will improve the performance.

Testing is required to find out how much, but it is a very simple patch,
with very minimal risk!
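
As a rough illustration of the difference, the sketch below counts how many read calls are needed to drain a stream with each buffer size. It only counts calls on an in-memory stream and measures no time, so it does not replace the testing mentioned above. The 384 KB payload size is borrowed from the DTD size quoted in comment 27, and `count_reads` is a hypothetical helper.

```python
import io

def count_reads(data, buffer_size):
    """Count the non-empty read() calls needed to drain the data."""
    stream = io.BytesIO(data)
    reads = 0
    while stream.read(buffer_size):
        reads += 1
    return reads

payload = bytes(384 * 1024)  # 384 KB of zero bytes standing in for a large DTD
reads_1k = count_reads(payload, 1024)
reads_4k = count_reads(payload, 4096)
print(reads_1k, reads_4k)
```

The 4K buffer needs a quarter of the calls; whether that translates into a measurable startup win depends on what each call costs in the real stream stack.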

Updated

12 years ago
Depends on: 328289
Comment 31 was fixed in bug 328289.
QA Contact: ashishbhatt → xml