Closed
Bug 655337
Opened 13 years ago
Closed 13 years ago
Use a JAR friendly method of loading hyphenation dictionaries
Categories
(Core :: Layout: Text and Fonts, defect)
Tracking
()
RESOLVED
FIXED
mozilla10
People
(Reporter: mfinkle, Assigned: jfkthame)
References
Details
(Keywords: helpwanted, mobile, perf, Whiteboard: mobilestartupshrink)
Attachments
(2 files)
Bug 253317 added support for hyphenation dictionaries loaded from the gre/hyphenation or app/hyphenation folders. On android this requires extracting the dictionaries out of the omnijar file. This hurts startup a little and it increases the size of our footprint on the device. It also makes it impossible to load the dictionaries from a locale JAR. Could we try loading the dictionaries using a resource:// approach? We could still load from resource://gre/hyphenation and resource://app/hyphenation, but we could also add support for locale JARs too. And we could stop manually extracting the dictionaries on Android.
Assignee | ||
Comment 1•13 years ago
|
||
The libhyphen code we're using expects to be handed a file path, and loads the dictionary from there using standard C library functions (fopen, fgets, fclose). I assume this won't work directly with JARs and resource:// paths. We could of course refactor the dictionary-loading code so as to read via some other API within Gecko, but this will mean maintaining some kind of patch on top of the upstream code. Probably worth doing, though. And maybe we can get upstream to accept a patch providing a more flexible API.
Comment 2•13 years ago
|
||
Would be nice to do the same refactoring for the spellcheck dictionary too.
Updated•13 years ago
|
Whiteboard: mobilestartupshrink
Comment 3•13 years ago
|
||
Jonathan, you can use the trick I used in bug 669116, so that we wouldn't need to patch libhyphen...
Updated•13 years ago
|
OS: Linux → Android
Hardware: x86 → All
Version: unspecified → Trunk
Comment 4•13 years ago
|
||
Benjamin was saying something about whether we should use mmap here instead of overriding fread and friends to read from JAR files. I didn't follow the rest of the conversation... Do we know how we want to tackle this problem yet?
Comment 5•13 years ago
|
||
Reading hnj_hyphen_load, it basically reads sequentially through the file and creates its data structures. AIUI, we can mmap things from the omnijar, in which case it should be a relatively simple code change to allow that code to simply read directly from a buffer rather than using fopen/fgets. I'll also file a bug about how the current loading pattern will be completely broken on Windows if the user installs to a unicode path :-(
Comment 6•13 years ago
|
||
So, here's what I think should happen (it would be great if someone can confirm that I'm not mistaken here): 1. Modify nsHyphenationManager::LoadPatternList to load the hyphenation dictionaries from resource:///gre and resource:///app. I'm not sure if we can enumerate hyphenation dictionaries using nsIFile interfaces, but it might be good to generate a list of these files at build time and just modify nsHyphenationManager::LoadPatternListFromDir to use that list. 2. Pass the URI of the hyphenation dictionary to hnj_hyphen_load. 3. Replace fopen/fgets/fclose with equivalent functions which read the data from a URI, possibly by using APIs such as NS_OpenURI/nsInputStream::Read, etc. Alternatively, we can mmap the jar content at this point, although I have no idea how to do that.
Assignee | ||
Comment 7•13 years ago
|
||
(In reply to Ehsan Akhgari [:ehsan] from comment #6) > 1. Modify nsHyphenationManager::LoadPatternList to load the hyphenation > dictionaries from resource:///gre and resource:///app. I'm not sure if we > can enumerate hyphenation dictionaries using nsIFile interfaces, but it > might be good to generate a list of these files at build time and just > modify nsHyphenationManager::LoadPatternListFromDir to use that list. If we generate a build-time list, we'd still need a way to update it at run time, as we're almost certainly going to want to support post-install or on-demand addition of dictionaries (for languages where we can't ship them as part of the product, and/or for other reasons).
Comment 8•13 years ago
|
||
I don't think comment 7 is relevant: if we were to support downloadable dictionaries, we would store them in the profile, not in the application directory, along with our other downloadable resources.
Comment 9•13 years ago
|
||
Linux distros will most likely provide the dictionary from the openoffice dictionary packages, not in the profile, but in the application directory (or yet another directory).
Comment 10•13 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #9) > Linux distros will most likely provide the dictionary from the openoffice > dictionary packages, not in the profile, but in the application directory > (or yet another directory). They can modify the list file, right? I'm not sure if expecting them to add the file paths to a manifest file is unreasonable.
Comment 11•13 years ago
|
||
(In reply to Ehsan Akhgari [:ehsan] from comment #10) > (In reply to Mike Hommey [:glandium] from comment #9) > > Linux distros will most likely provide the dictionary from the openoffice > > dictionary packages, not in the profile, but in the application directory > > (or yet another directory). > > They can modify the list file, right? I'm not sure if expecting them to add > the file paths to a manifest file is unreasonable. Anything that needs a manifest being modified when a third party package is installed is painful.
Assignee | ||
Comment 12•13 years ago
|
||
I've just posted a patch in bug 685214 that moves one step towards this - it modifies things so that we use a URL spec (as a utf8 string) to specify the hyphenation dictionary, rather than a pathname, and then uses nsIInputStream to read it, so it should work with resource:// as an alternative to file:// URLs. (Basically, this deals with point 3 of Ehsan's strategy in comment #6.)
Assignee | ||
Comment 13•13 years ago
|
||
(Applies on top of bug 685214, currently -inbound.) This modifies nsHyphenationManager and nsHyphenator to use nsIURI instead of nsIFile to specify the hyphenation resource to be loaded. With this, it should be possible for nsHyphenationManager to instantiate hyphenators using resource:// URIs as well (though this isn't actually implemented yet).
Assignee | ||
Comment 14•13 years ago
|
||
I have no idea if this is a sensible approach, but it seems to work nicely on my Android device. :) Basically, it just uses an nsZipFind to look for hyphenation/*.dic entries in omnijar, and includes these in the list of available patterns. This avoids the need to unpack the pattern dictionaries from the .apk (bug 686637), and so it includes (almost all of) the patch from there. Presumably it would be possible to extend this to look in a locale jar file as well, though I have no idea about any of this packaging stuff really.
Attachment #560388 -
Flags: review?(mark.finkle)
Assignee | ||
Updated•13 years ago
|
Attachment #560329 -
Flags: review?(mark.finkle)
Assignee | ||
Comment 15•13 years ago
|
||
review ping...?
Reporter | ||
Comment 16•13 years ago
|
||
Comment on attachment 560329 [details] [diff] [review] part 1 - use nsIURI rather than nsIFile to specify hyphenation resources I think this is a great change. Simple too. I didn'y spot any issues, but you'll need to get someone else to do the "official" review since I am not a toolkit peer.
Attachment #560329 -
Flags: review?(mark.finkle) → review+
Reporter | ||
Comment 17•13 years ago
|
||
Comment on attachment 560388 [details] [diff] [review] part 2 - don't unpack hyphenation patterns on android, look for them in omnijar >diff --git a/xpcom/build/Omnijar.h b/xpcom/build/Omnijar.h > #include "nsString.h" >+#include "nsIFile.h" > >-class nsIFile; Could this be avoided by including the file in nsHyphenationManager.h/cpp ? r+ for the Android part. You need someone else to "officially" review the nsHyphenationManager changes. smontagu looks like a good candidate. In the meantime, I'll make an Android build with these changes and do some timing tests.
Attachment #560388 -
Flags: review?(mark.finkle) → review+
Assignee | ||
Updated•13 years ago
|
Attachment #560329 -
Flags: review?(smontagu)
Assignee | ||
Updated•13 years ago
|
Attachment #560388 -
Flags: review?(smontagu)
Assignee | ||
Comment 18•13 years ago
|
||
(In reply to Mark Finkle (:mfinkle) from comment #17) > >diff --git a/xpcom/build/Omnijar.h b/xpcom/build/Omnijar.h > > > #include "nsString.h" > >+#include "nsIFile.h" > > > >-class nsIFile; > > Could this be avoided by including the file in nsHyphenationManager.h/cpp ? It could, but as Omnijar.h requires the inclusion of nsIFile.h (because of the use of NS_IF_ADDREF on an nsIFile* in the inline definition of the GetPath() function), I think it's correct to #include the header here, rather than relying on anyone who wants to use Omnijar.h to have already included nsIFile.h themselves first. I assume the fact that Omnijar.h requires nsIFile.h (and not just a forward-declaration of nsIFile) simply went unnoticed because all existing users happened to include nsIFile.h themselves already.
Assignee | ||
Comment 19•13 years ago
|
||
Comment on attachment 560388 [details] [diff] [review] part 2 - don't unpack hyphenation patterns on android, look for them in omnijar Also flagging bsmedberg for review, due to the Omnijar.h change (see above).
Attachment #560388 -
Flags: review?(benjamin)
Comment 20•13 years ago
|
||
(In reply to Jonathan Kew (:jfkthame) from comment #18) > I assume the fact that Omnijar.h requires nsIFile.h (and not just a > forward-declaration of nsIFile) simply went unnoticed because all existing > users happened to include nsIFile.h themselves already. indeed.
Comment 21•13 years ago
|
||
Comment on attachment 560388 [details] [diff] [review] part 2 - don't unpack hyphenation patterns on android, look for them in omnijar Oh how I hate this enumeration pattern and wish it would go away at the earliest opportunity.
Attachment #560388 -
Flags: review?(benjamin) → review+
Updated•13 years ago
|
Attachment #560329 -
Flags: review?(smontagu) → review+
Updated•13 years ago
|
Attachment #560388 -
Flags: review?(smontagu) → review+
Assignee | ||
Comment 22•13 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/bacc2c7eee29 https://hg.mozilla.org/integration/mozilla-inbound/rev/e1eac54b1ed3
Assignee | ||
Updated•13 years ago
|
Assignee: nobody → jfkthame
Target Milestone: --- → mozilla10
Assignee | ||
Comment 23•13 years ago
|
||
Backed out pt 2 for leaking an nsZipFind: https://hg.mozilla.org/integration/mozilla-inbound/rev/e6b75c79c618
Assignee | ||
Comment 24•13 years ago
|
||
And re-landed, with the leak fixed: https://hg.mozilla.org/integration/mozilla-inbound/rev/ef08991a3411
Comment 25•13 years ago
|
||
(In reply to Jonathan Kew (:jfkthame) from comment #24) > And re-landed, with the leak fixed: https://hg.mozilla.org/mozilla-central/rev/ef08991a3411
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Comment 26•13 years ago
|
||
Merge of part 1: https://hg.mozilla.org/mozilla-central/rev/bacc2c7eee29
You need to log in
before you can comment on or make changes to this bug.
Description
•