Closed Bug 97173 Opened 23 years ago Closed 23 years ago

startup perf- remove the need of loading of unixcharset.properties files at startup time to speed up

Categories

(Core :: Internationalization, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla0.9.8

People

(Reporter: ftang, Assigned: shanjian)

References

Details

(Keywords: perf)

Attachments

(1 file, 3 obsolete files)

we currently load unixcharset.properties at startup time, we should find a way to remove the need of loading it at startup time to speed up startup performance we could 1. build in common used pair in the cpp code, and/or 2. cache the resolved result in registry.
Blocks: 7251
Keywords: perf
Blocks: 71781
No longer blocks: 7251
mark m0.9.6 and give to jbetak
Assignee: ftang → jbetak
Target Milestone: --- → mozilla0.9.6
Blocks: 95823
Blocks: 103179
Juraj: would you kindly tell us how much time will be save by doing this?
accepting. bstell: my measurements using the built-in NS_TIMELINE timer dp recommended to us show that this would save ~16 ms, which ~0.3% of my local startup time.
Status: NEW → ASSIGNED
bstell: could you kindly look over the proposed changes? It's a rather lame optimization, but I couldn't think of anything better. It would save us an unnecessary disk access on a platform Netscape cares most about. I've noticed that you worked in this area and obsoleted the unixcharset.properties string bundle, which covers most of the work needed for this bug. Since you have domain knowledge in this particular area, I'm basically looking for a statement indicating whether you find the proposed change acceptable?
Frank suggested we have a hard coded mapping list for standare charset names (eg: iso8859-1, SJIS, ...) and only load the properties file we have a charset that is not in that list.
bstell - thanks for looking at this. I agree and understand. However, I believe you (or erik?) have obsoleted the .properties file on most platforms - certainly on Linux - and it is in fact not being loaded. There is some new logic in place where the default charset value is derived from the results received from couple of system API calls. My assumption was that this is performant enough (it's below the timer resolution) and does not need to be circumvented by a hard-coded list. The only measurable performance impact is caused by an attempt to load an non-existent .properties file. For what I've understood the string bundle is not required and not packaged on most *nix platforms. I could extend the existing code by a hard-coded list, but I'd not like to. It's less than transparent as it is and I'm not sure whether that would be the most intelligent way to address this.
it is being used
I did not see it being used in my debug Linux build. Instead Mozilla tried to load unixcharset.Linux.properties. Also when is "ConvertLocaleToCharsetUsingDeprecatedConfig" being used? Do we need to worry about it at all?
>+ if (OSARCH!="Linux") no, we should not have this kind of code.
I'll profile this on my home machine again - on my slow Suse 7.1 machine at work the only file access was to a non-existent unixcharset.Linux.properties which consumed 15 ms to return a failure and no further file access was necessary. Bstell pointed out that this might not be the case for Linux versions, certainly not some older ones. However, if requesting a non-existed file takes less than 0.1% of startup time on Netscape supported Redhat versions, we should probably push this out and come up with a more comprehensive solution as discussed offline with bstell.
moving to M097 - new profiling data on another machine has shown that an attempt to load an non-existent string bundle carries an I/O penalty of ~15 ms which is ~0.3% if our startup time. Since it's below the 0.5% threshold it's not as critical, but we should still fix it.
Target Milestone: mozilla0.9.6 → mozilla0.9.7
jbetak's contract is up. Bulk move bugs to ftang
Assignee: jbetak → ftang
Status: ASSIGNED → NEW
Linux code. Give to shanjian. move to m0.9.8
Assignee: ftang → shanjian
Target Milestone: mozilla0.9.7 → mozilla0.9.8
brian, On Linux (and hopefully most other unix system), could we try to use the result returned by nl_langinfo(CODESET) before loading platform specific property file? I believe file "unixcharset.OSARCH.properties" is suppose to provide charset for unusual vendor specific charset names. We don't really need to overload some names. So it should be safe to change the trying order.
Status: NEW → ASSIGNED
The file is there to provide a point to correct wrong values *from* nl_langinfo(CODESET). Reversing the order of loading would defeat this. However, I do not know of any case where this is needed so the result of the change you propose is presently unknown. No one can say either "safe" or "unsafe". If I wanted to be safe I would add a pref to unix.js that disables the loading of the file and wait to see if anyone complains. If I was feeling brave I would just remove the code and wait to see if ' anyone complains.
Attached patch proposed patch (obsolete) — Splinter Review
Attachment #54738 - Attachment is obsolete: true
I rearrange the calling sequence. Now we try to use charset directly before anything else. I kept platform specific property file there to help resolve unknow charset name. Brian, I was feeling very brave and made this patch. If there is any complain after this change, we can always reconsider it in future. Could you review?
>The file is there to provide a point to correct wrong values *from* >nl_langinfo(CODESET). > >Reversing the order of loading would defeat this. > >However, I do not know of any case where this is needed ok. so... here is what I understand: 1. we add the loading of that file because we afraid that we may need to correct the return value of nl_langinfo(CODESET) 2. we don't know any one do such wrong thing yet. So... my suggestion is we remove such procedure for now untill we find out we need it. We should not take performance hit for something we are not sure we need. We should only consider take such hit when we are sure that we NEED it.
frank, I believe that there is 2 purpose for the existence of that property file: 1) if the return value of nl_langinfo(CODESET) is incorrect, we can override it. 2) if the return value of nl_langinfo(CODESET) is not found in charset alias file, we can put it here. To achieve purpose 1, we have to load and check this property file first. For above mentioned reason, we can call it after we failed to use it directly. That way we still can achieve 2nd goal.
2) probably won't cause startup performance 1) will and we should consider remove 1)
frank, could you put r= there if you don't have objection to my approach?
Brendan, could you sr?
Comment on attachment 61784 [details] [diff] [review] proposed patch >@@ -60,6 +60,7 @@ > #include <langinfo.h> > #endif > #include "nsPlatformCharset.h" >+#include "nsAutoLock.h" > > NS_IMPL_ISUPPORTS1(nsPlatformCharset, nsIPlatformCharset); If this XPCOM object is referenced by more than one thread, as the following suggests, you need NS_IMPL_THREADSAFE_... here. >@@ -67,22 +68,30 @@ > static nsURLProperties *gInfo_deprecated = nsnull; > static PRInt32 gCnt=0; > >+//this lock is for protecting above static variable operation >+static PRLock *gLock = nsnull; >+ > nsPlatformCharset::nsPlatformCharset() > { > NS_INIT_REFCNT(); > PR_AtomicIncrement(&gCnt); >+ if (!gLock) >+ gLock = PR_NewLock(); This is not threadsafe, and could result in two locks being created (a leak, and possibly a further thread-safety hazard, if one thread runs for a while with the first lock cached in an nsAutoLock, while the second thread creates, stores, and uses the final or "race-winning" lock). Consider using PR_CallOnce to initialize this lock, if you can't initialize the lock in a function or method that you are certain runs on the main thread before any other thread could construct an nsPlatformCharset. I'll review a new patch that fixes these bugs. /be
Attachment #61784 - Flags: needs-work+
Will this object reference from different thread ?
>Will this object reference from different thread ? if not, we can remove those locking code. How can we tell ?
Follow the uses of nsIPlatformCharset via lxr. The interface is not scriptable, so the only code that could call into nsPlatformCharset's implementation is C++ code. A quick glance at the lxr made me worry that some Necko thread uses uconv or something else that uses nsIPlatformCharset. Cc'ing darin. A little thread-safety violation can be a big problem, so unless you're willing to assert (NS_ASSERTION) that the implementation runs only on the main thread, and you can show good arguments for keeping it that way, and doc-comments that warn potential users about the limitation, I think the part of this bug's patches that tries to make nsPlatformCharset thread-safe is moving in the right direction -- but we're not safe yet. /be
Attachment #61784 - Attachment is obsolete: true
brendan, could you sr?
fyi, bug 99382 is for all thread safety issue in charset conversion. I just intend to make this file thread safe while I was touching it. A complete solution to the whole module will happen in 99382.
Comment on attachment 63425 [details] [diff] [review] updated patch for thread safety issue sr=brendan@mozilla.org Nit: the if(nsnull == gInfo_deprecated) silly style of comparing a pointer to nsnull, with nsnull on the left, is not observed in much of the code visible in the patch, so why preserve it in your changes? See 'if (gNLInfo)' and 'if(gInfo_deprecated && locale.Length())' for alternatives. Those remind me: how about consistent 'if(' or 'if (' style? And don't use .Length(), use .IsEmpty(). /be
Attachment #63425 - Flags: superreview+
loading file://some/directory can lead to charset conversion happening on one of the background file transport threads. cc'ing bbaetz as he would know best what requirements the directory index stream has on charset conversion.
fix some nits pointed out by brendan.
Attachment #63425 - Attachment is obsolete: true
Attachment #64065 - Flags: superreview+
Attachment #64065 - Flags: review+
fix checked in.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
darin: It could, on some platforms (unix, but not windows - not sure about mac). That code isn't enabled, but it may be shortly - I need to rework that stuff soonish.
Changing QA contact to ftang for now. Frank, can this be verified within development or can you provide a test case for QA? CCing teruko on this bug.
QA Contact: andreasb → ftang
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: