Closed
Bug 97173
Opened 23 years ago
Closed 23 years ago
startup perf- remove the need of loading of unixcharset.properties files at startup time to speed up
Categories
(Core :: Internationalization, defect)
Tracking
()
RESOLVED
FIXED
mozilla0.9.8
People
(Reporter: ftang, Assigned: shanjian)
References
Details
(Keywords: perf)
Attachments
(1 file, 3 obsolete files)
5.34 KB,
patch
|
shanjian
:
review+
shanjian
:
superreview+
|
Details | Diff | Splinter Review |
we currently load unixcharset.properties at startup time, we should find a way
to
remove the need of loading it at startup time to speed up startup performance
we could
1. build in common used pair in the cpp code, and/or
2. cache the resolved result in registry.
Updated•23 years ago
|
Reporter | ||
Comment 1•23 years ago
|
||
mark m0.9.6 and give to jbetak
Assignee: ftang → jbetak
Target Milestone: --- → mozilla0.9.6
Comment 2•23 years ago
|
||
Juraj: would you kindly tell us how much time will be save by doing this?
Comment 3•23 years ago
|
||
accepting.
bstell: my measurements using the built-in NS_TIMELINE timer dp recommended to
us show that this would save ~16 ms, which ~0.3% of my local startup time.
Status: NEW → ASSIGNED
Comment 4•23 years ago
|
||
Comment 5•23 years ago
|
||
bstell:
could you kindly look over the proposed changes? It's a rather lame
optimization, but I couldn't think of anything better. It would save us an
unnecessary disk access on a platform Netscape cares most about. I've noticed
that you worked in this area and obsoleted the unixcharset.properties string
bundle, which covers most of the work needed for this bug. Since you have
domain knowledge in this particular area, I'm basically looking for a statement
indicating whether you find the proposed change acceptable?
Comment 6•23 years ago
|
||
Frank suggested we have a hard coded mapping list for standare charset names
(eg: iso8859-1, SJIS, ...) and only load the properties file we have a charset
that is not in that list.
Comment 7•23 years ago
|
||
bstell - thanks for looking at this. I agree and understand. However, I believe
you (or erik?) have obsoleted the .properties file on most platforms -
certainly on Linux - and it is in fact not being loaded.
There is some new logic in place where the default charset value is derived
from the results received from couple of system API calls. My assumption was
that this is performant enough (it's below the timer resolution) and does not
need to be circumvented by a hard-coded list. The only measurable performance
impact is caused by an attempt to load an non-existent .properties file. For
what I've understood the string bundle is not required and not packaged on most
*nix platforms. I could extend the existing code by a hard-coded list, but I'd
not like to. It's less than transparent as it is and I'm not sure whether that
would be the most intelligent way to address this.
Comment 8•23 years ago
|
||
it is being used
Comment 9•23 years ago
|
||
I did not see it being used in my debug Linux build. Instead Mozilla tried to
load unixcharset.Linux.properties. Also when
is "ConvertLocaleToCharsetUsingDeprecatedConfig" being used? Do we need to
worry about it at all?
Reporter | ||
Comment 10•23 years ago
|
||
>+ if (OSARCH!="Linux")
no, we should not have this kind of code.
Comment 11•23 years ago
|
||
I'll profile this on my home machine again - on my slow Suse 7.1 machine at
work the only file access was to a non-existent unixcharset.Linux.properties
which consumed 15 ms to return a failure and no further file access was
necessary.
Bstell pointed out that this might not be the case for Linux versions,
certainly not some older ones. However, if requesting a non-existed file takes
less than 0.1% of startup time on Netscape supported Redhat versions, we should
probably push this out and come up with a more comprehensive solution as
discussed offline with bstell.
Comment 12•23 years ago
|
||
moving to M097 - new profiling data on another machine has shown that an
attempt to load an non-existent string bundle carries an I/O penalty of ~15 ms
which is ~0.3% if our startup time. Since it's below the 0.5% threshold it's
not as critical, but we should still fix it.
Target Milestone: mozilla0.9.6 → mozilla0.9.7
Reporter | ||
Comment 13•23 years ago
|
||
jbetak's contract is up. Bulk move bugs to ftang
Assignee: jbetak → ftang
Status: ASSIGNED → NEW
Reporter | ||
Comment 14•23 years ago
|
||
Linux code. Give to shanjian. move to m0.9.8
Assignee: ftang → shanjian
Target Milestone: mozilla0.9.7 → mozilla0.9.8
Assignee | ||
Comment 15•23 years ago
|
||
brian,
On Linux (and hopefully most other unix system), could we try to use
the result returned by nl_langinfo(CODESET) before loading platform
specific property file? I believe file "unixcharset.OSARCH.properties"
is suppose to provide charset for unusual vendor specific charset
names. We don't really need to overload some names. So it should be
safe to change the trying order.
Status: NEW → ASSIGNED
Comment 16•23 years ago
|
||
The file is there to provide a point to correct wrong values *from*
nl_langinfo(CODESET).
Reversing the order of loading would defeat this.
However, I do not know of any case where this is needed
so the result of the change you propose is presently unknown.
No one can say either "safe" or "unsafe".
If I wanted to be safe I would add a pref to unix.js that disables the
loading of the file and wait to see if anyone complains.
If I was feeling brave I would just remove the code and wait to see if '
anyone complains.
Assignee | ||
Comment 17•23 years ago
|
||
Attachment #54738 -
Attachment is obsolete: true
Assignee | ||
Comment 18•23 years ago
|
||
I rearrange the calling sequence. Now we try to use charset directly before
anything else. I kept platform specific property file there to help resolve
unknow charset name.
Brian,
I was feeling very brave and made this patch. If there is any complain after
this change, we can always reconsider it in future. Could you review?
Reporter | ||
Comment 19•23 years ago
|
||
>The file is there to provide a point to correct wrong values *from*
>nl_langinfo(CODESET).
>
>Reversing the order of loading would defeat this.
>
>However, I do not know of any case where this is needed
ok. so... here is what I understand:
1. we add the loading of that file because we afraid that we may need to correct
the return value of nl_langinfo(CODESET)
2. we don't know any one do such wrong thing yet.
So... my suggestion is we remove such procedure for now untill we find out we
need it. We should not take performance hit for something we are not sure we
need. We should only consider take such hit when we are sure that we NEED it.
Assignee | ||
Comment 20•23 years ago
|
||
frank,
I believe that there is 2 purpose for the existence of that property file:
1) if the return value of nl_langinfo(CODESET) is incorrect, we can override it.
2) if the return value of nl_langinfo(CODESET) is not found in charset alias
file, we can put it here.
To achieve purpose 1, we have to load and check this property file first. For
above mentioned reason, we can call it after we failed to use it directly. That
way we still can achieve 2nd goal.
Reporter | ||
Comment 21•23 years ago
|
||
2) probably won't cause startup performance
1) will and we should consider remove 1)
Assignee | ||
Comment 22•23 years ago
|
||
frank, could you put r= there if you don't have objection to my approach?
Comment 23•23 years ago
|
||
Assignee | ||
Comment 24•23 years ago
|
||
Brendan, could you sr?
Comment 25•23 years ago
|
||
Comment on attachment 61784 [details] [diff] [review]
proposed patch
>@@ -60,6 +60,7 @@
> #include <langinfo.h>
> #endif
> #include "nsPlatformCharset.h"
>+#include "nsAutoLock.h"
>
> NS_IMPL_ISUPPORTS1(nsPlatformCharset, nsIPlatformCharset);
If this XPCOM object is referenced by more than one thread, as the following
suggests, you need NS_IMPL_THREADSAFE_... here.
>@@ -67,22 +68,30 @@
> static nsURLProperties *gInfo_deprecated = nsnull;
> static PRInt32 gCnt=0;
>
>+//this lock is for protecting above static variable operation
>+static PRLock *gLock = nsnull;
>+
> nsPlatformCharset::nsPlatformCharset()
> {
> NS_INIT_REFCNT();
> PR_AtomicIncrement(&gCnt);
>+ if (!gLock)
>+ gLock = PR_NewLock();
This is not threadsafe, and could result in two locks being created (a leak,
and possibly a further thread-safety hazard, if one thread runs for a while
with the first lock cached in an nsAutoLock, while the second thread creates,
stores, and uses the final or "race-winning" lock).
Consider using PR_CallOnce to initialize this lock, if you can't initialize the
lock in a function or method that you are certain runs on the main thread
before any other thread could construct an nsPlatformCharset.
I'll review a new patch that fixes these bugs.
/be
Attachment #61784 -
Flags: needs-work+
Reporter | ||
Comment 26•23 years ago
|
||
Will this object reference from different thread ?
Reporter | ||
Comment 27•23 years ago
|
||
>Will this object reference from different thread ?
if not, we can remove those locking code. How can we tell ?
Comment 28•23 years ago
|
||
Follow the uses of nsIPlatformCharset via lxr. The interface is not scriptable,
so the only code that could call into nsPlatformCharset's implementation is C++
code. A quick glance at the lxr made me worry that some Necko thread uses uconv
or something else that uses nsIPlatformCharset. Cc'ing darin.
A little thread-safety violation can be a big problem, so unless you're willing
to assert (NS_ASSERTION) that the implementation runs only on the main thread,
and you can show good arguments for keeping it that way, and doc-comments that
warn potential users about the limitation, I think the part of this bug's
patches that tries to make nsPlatformCharset thread-safe is moving in the right
direction -- but we're not safe yet.
/be
Assignee | ||
Comment 29•23 years ago
|
||
Attachment #61784 -
Attachment is obsolete: true
Assignee | ||
Comment 30•23 years ago
|
||
brendan, could you sr?
Assignee | ||
Comment 31•23 years ago
|
||
fyi, bug 99382 is for all thread safety issue in charset conversion. I just
intend to make this file thread safe while I was touching it. A complete
solution to the whole module will happen in 99382.
Comment 32•23 years ago
|
||
Comment on attachment 63425 [details] [diff] [review]
updated patch for thread safety issue
sr=brendan@mozilla.org
Nit: the if(nsnull == gInfo_deprecated) silly style of comparing a pointer to
nsnull, with nsnull on the left, is not observed in much of the code visible in
the patch, so why preserve it in your changes? See 'if (gNLInfo)' and
'if(gInfo_deprecated && locale.Length())' for alternatives.
Those remind me: how about consistent 'if(' or 'if (' style? And don't use
.Length(), use .IsEmpty().
/be
Attachment #63425 -
Flags: superreview+
Comment 33•23 years ago
|
||
loading file://some/directory can lead to charset conversion happening on one of
the background file transport threads. cc'ing bbaetz as he would know best what
requirements the directory index stream has on charset conversion.
Assignee | ||
Comment 34•23 years ago
|
||
fix some nits pointed out by brendan.
Attachment #63425 -
Attachment is obsolete: true
Assignee | ||
Updated•23 years ago
|
Attachment #64065 -
Flags: superreview+
Attachment #64065 -
Flags: review+
Assignee | ||
Comment 35•23 years ago
|
||
fix checked in.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Comment 36•23 years ago
|
||
darin: It could, on some platforms (unix, but not windows - not sure about mac).
That code isn't enabled, but it may be shortly - I need to rework that stuff
soonish.
Comment 37•23 years ago
|
||
Changing QA contact to ftang for now. Frank, can this be verified within
development or can you provide a test case for QA?
CCing teruko on this bug.
QA Contact: andreasb → ftang
You need to log in
before you can comment on or make changes to this bug.
Description
•