Closed
Bug 803377
Opened 12 years ago
Closed 12 years ago
Add function to compress string data
Categories
(Cloud Services :: Firefox: Common, defect)
Cloud Services
Firefox: Common
Tracking
(firefox18 fixed, firefox19 fixed, firefox20 fixed)
RESOLVED
FIXED
mozilla20
People
(Reporter: gps, Assigned: gps)
References
Details
Attachments
(1 file, 4 obsolete files)
5.03 KB,
patch
|
rnewman
:
review+
akeybl
:
approval-mozilla-aurora+
akeybl
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
AFAICT there is no generic JS module for performing string encodings. I constantly find myself having to muck about with nsIStringConverterService and friends. I think it would be really nice if all this XPCOM voodoo were hidden from view and we could easily perform common string encodings.
The attached patch adds a new "string" component to toolkit. In it, we have a single file, StringEncoding.jsm. In that, we have the StringEncoding object which exposes an API for performing gzip compression.
Other encodings can be added to this file in the future. Other features can be added to our string library as well. Now we have a place for them to call home.
Gavin gets review. Mossop gets f? for approving new toolkit additions (since he is toolkit module owner).
I'm not sure what to do about the test. I don't want to put in data I obtained by calling the API itself. And, I was unable to figure out a trusted input and output pair to verify with. I tried Python's zlib.compress and it returned something different. I /think/ our built-in gzip encoder is writing out extra header data (perhaps a file header?) but I really have no clue. I'm also not an expert with nsIStringConverterService and friends, so I might have done something horribly wrong in XPCOM land.
Attachment #673043 -
Flags: review?(gavin.sharp)
Attachment #673043 -
Flags: feedback?(dtownsend+bugmail)
Comment 1•12 years ago
|
||
Oh, you meant nsIStreamConverterService (not nsIStringConverterService) - that confused me for a bit. The idea of consolidating this code seems reasonable, but I only see one other potential caller of this specific method (in TelemetryPing.js). There are other test callers of asyncConvertData with string streams, but they mostly use different to/from parameters.
This seems like something that could live in a toolkit/modules, were we to create one (ala browser/modules). A whole new components directory to contain just a single module and its test seems like overkill.
The proliferation of small, very specific JS modules is a bit problematic because the fixed overhead of a JS module is rather high at the moment (see e.g. bug 773845 and bug 798491), but that's a problem we probably should be addressing separately either way, so I guess we shouldn't worry about it too much here. Worth being aware of, though.
Assignee | ||
Comment 2•12 years ago
|
||
(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #1)
> Oh, you meant nsIStreamConverterService (not nsIStringConverterService) -
> that confused me for a bit. The idea of consolidating this code seems
> reasonable, but I only see one other potential caller of this specific
> method (in TelemetryPing.js). There are other test callers of
> asyncConvertData with string streams, but they mostly use different to/from
> parameters.
I have lots of code in services/common/utils.js and services/sync/modules/util.js that could be moved to this module and related string modules.
> This seems like something that could live in a toolkit/modules, were we to
> create one (ala browser/modules). A whole new components directory to
> contain just a single module and its test seems like overkill.
I will create toolkit/modules then.
> The proliferation of small, very specific JS modules is a bit problematic
> because the fixed overhead of a JS module is rather high at the moment (see
> e.g. bug 773845 and bug 798491), but that's a problem we probably should be
> addressing separately either way, so I guess we shouldn't worry about it too
> much here. Worth being aware of, though.
I think I'm going to bury my head in the sand here as being discouraged from writing small, loosely coupled and reusable modules is so wrong on so many levels. That being said, concatenating files together into a single module is doable with a little build system magic and maybe some minor changes to how we do modules in Gecko. Hopefully that's never my problem ;)
Comment 3•12 years ago
|
||
I'm not discouraging you from doing it, I'm making you aware of some of our current platform limitations that you should consider, as part of the cost/benefit analysis exercise that is software development :)
Assignee | ||
Comment 4•12 years ago
|
||
Normally I'd say the benefits of a line of code written in 1 place instead of N outweighs a slight memory increase due to module proliferation. But B2G. Dammit.
![]() |
||
Comment 5•12 years ago
|
||
Comment on attachment 673043 [details] [diff] [review]
Implement string component with gzip encoding, v1
Review of attachment 673043 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/string/tests/unit/test_gzip_encoding.js
@@ +18,5 @@
> +
> + do_check_true(result.length > 0);
> +
> + // We should probably verify results against an expected value. However,
> + // how does one obtain an expected value without calling the API itself?
FWIW, for the tests I added when fixing gzip bugs in the platform code, I used the result of gzip itself. IIRC, there were a few fiddly details that required hex-editing the gzip output so our gzip implementation would be happy with it, all in non-required fields in the gzip format.
...but maybe you think doing that is scarier than not having any tests at all. ;)
Assignee | ||
Comment 6•12 years ago
|
||
Nathan: Can you just refer me to a known input and output that I can hard-code into the test? I'm basically looking for a smoke test.
![]() |
||
Comment 8•12 years ago
|
||
modules/libjar/zipwriter/test/unit/test_bug717061.js
modules/libjar/zipwriter/test/unit/data/test_bug717061.html
modules/libjar/zipwriter/test/unit/data/test_bug717061.gz
Are the bits I was talking about in comment 5. See:
http://mxr.mozilla.org/mozilla-central/source/modules/libjar/zipwriter/test/unit/test_bug717061.js#51
for some of the fiddly bits.
Flags: needinfo?(nfroyd)
Comment 9•12 years ago
|
||
Comment on attachment 673043 [details] [diff] [review]
Implement string component with gzip encoding, v1
Review of attachment 673043 [details] [diff] [review]:
-----------------------------------------------------------------
As said elsewhere, let's pull the trigger and create toolkit/modules and put this in there.
::: toolkit/components/string/StringEncoding.jsm
@@ +10,5 @@
> +
> +/**
> + * Provides utility functions for performing encoding operations on strings.
> + *
> + * Every find yourself encoding a string to another type or format? This
s/Every/Ever/
@@ +14,5 @@
> + * Every find yourself encoding a string to another type or format? This
> + * object is for you! Simply add your encoder and/or decoder to this and
> + * share it with the world.
> + */
> +let StringEncoding = {
Is this actually appropriate? To me string encoding is all about character encodings (UTF8, UTF16, etc.). Gzip seems different to that. Maybe StringConverter is better?
@@ +24,5 @@
> + * @param onComplete
> + * (function) Callback invoked with compressed output as a string
> + * parameter.
> + */
> + gzipEncodeToString: function gzipEncodeToString(s) {
This is a really confusing name. How about convertToGzip?
I'm concerned that this is a synchronous method and people might try to use it for large chunks of data. What are the performance characteristics of this?
Assignee | ||
Comment 10•12 years ago
|
||
Comment on attachment 673043 [details] [diff] [review]
Implement string component with gzip encoding, v1
I rewrote a lot of this patch locally. Clearing review flag.
Attachment #673043 -
Flags: review?(gavin.sharp)
Attachment #673043 -
Flags: feedback?(dtownsend+bugmail)
Assignee | ||
Comment 11•12 years ago
|
||
(In reply to Dave Townsend (:Mossop) from comment #9)
> Comment on attachment 673043 [details] [diff] [review]
> Implement string component with gzip encoding, v1
>
> Review of attachment 673043 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> As said elsewhere, let's pull the trigger and create toolkit/modules and put
> this in there.
Already done in my local patch (to be uploaded).
> @@ +14,5 @@
> > + * Every find yourself encoding a string to another type or format? This
> > + * object is for you! Simply add your encoder and/or decoder to this and
> > + * share it with the world.
> > + */
> > +let StringEncoding = {
>
> Is this actually appropriate? To me string encoding is all about character
> encodings (UTF8, UTF16, etc.). Gzip seems different to that. Maybe
> StringConverter is better?
Sure.
> @@ +24,5 @@
> > + * @param onComplete
> > + * (function) Callback invoked with compressed output as a string
> > + * parameter.
> > + */
> > + gzipEncodeToString: function gzipEncodeToString(s) {
>
> This is a really confusing name. How about convertToGzip?
If we move this to a StringConverter type, then that would definitely make sense.
> I'm concerned that this is a synchronous method and people might try to use
> it for large chunks of data. What are the performance characteristics of
> this?
If you start with a String, you have all the data available. So, making this async would require splitting data manually, event loop spinning, etc. I'm not convinced that's a good idea.
I think a better idea is to just not use these functions if you have large values: use streams instead.
Comment 12•12 years ago
|
||
(In reply to Gregory Szorc [:gps] from comment #11)
> (In reply to Dave Townsend (:Mossop) from comment #9)
> > Comment on attachment 673043 [details] [diff] [review]
> > Implement string component with gzip encoding, v1
> >
> > Review of attachment 673043 [details] [diff] [review]:
> > -----------------------------------------------------------------
> >
> > As said elsewhere, let's pull the trigger and create toolkit/modules and put
> > this in there.
>
> Already done in my local patch (to be uploaded).
>
>
> > @@ +14,5 @@
> > > + * Every find yourself encoding a string to another type or format? This
> > > + * object is for you! Simply add your encoder and/or decoder to this and
> > > + * share it with the world.
> > > + */
> > > +let StringEncoding = {
> >
> > Is this actually appropriate? To me string encoding is all about character
> > encodings (UTF8, UTF16, etc.). Gzip seems different to that. Maybe
> > StringConverter is better?
>
> Sure.
>
> > @@ +24,5 @@
> > > + * @param onComplete
> > > + * (function) Callback invoked with compressed output as a string
> > > + * parameter.
> > > + */
> > > + gzipEncodeToString: function gzipEncodeToString(s) {
> >
> > This is a really confusing name. How about convertToGzip?
>
> If we move this to a StringConverter type, then that would definitely make
> sense.
>
> > I'm concerned that this is a synchronous method and people might try to use
> > it for large chunks of data. What are the performance characteristics of
> > this?
>
> If you start with a String, you have all the data available. So, making this
> async would require splitting data manually, event loop spinning, etc. I'm
> not convinced that's a good idea.
It shouldn't need any event loop spinning, you'd just make a stream out of the string and async feed it to the stream converter and call a callback when done.
> I think a better idea is to just not use these functions if you have large
> values: use streams instead.
I think that's a great idea too, but it's difficult to enforce that on people so providing a way to do it seems like a potential footgun for the future.
Assignee | ||
Comment 13•12 years ago
|
||
Changing bug to reflect its original purpose and moving out of Toolkit. Hopefully I planted a seed and Toolkit will add something similar one day.
Component: General → Firefox: Common
Product: Toolkit → Mozilla Services
Summary: Add a component for string encodings → Add function to compress string data
Assignee | ||
Comment 14•12 years ago
|
||
I wish the test were more robust. Then again, I'm not really testing the core stream encoding code, am I?
Assignee: nobody → gps
Attachment #673043 -
Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #677883 -
Flags: review?(rnewman)
Assignee | ||
Comment 15•12 years ago
|
||
More generic solution to avoid XPCOM masturbation. I'm sure this function has already been implemented somewhere in the tree. But, I can't find it.
Attachment #677883 -
Attachment is obsolete: true
Attachment #677883 -
Flags: review?(rnewman)
Attachment #677889 -
Flags: review?(rnewman)
Assignee | ||
Comment 16•12 years ago
|
||
Forgot to qref.
Attachment #677889 -
Attachment is obsolete: true
Attachment #677889 -
Flags: review?(rnewman)
Attachment #677891 -
Flags: review?(rnewman)
Comment 17•12 years ago
|
||
Comment on attachment 677891 [details] [diff] [review]
Add CommonUtils.convertString, v2
Review of attachment 677891 [details] [diff] [review]:
-----------------------------------------------------------------
Let's get a UTF-8 test first. Even if the breakage is elsewhere in the codebase, I don't want to ship a handy-dandy utility that makes it easier to shoot oneself in the foot.
::: services/common/tests/unit/test_utils_convert_string.js
@@ +10,5 @@
> + run_next_test();
> +}
> +
> +add_test(function test_compress_string() {
> + const INPUT = "hello";
Please put non-ASCII characters into this test string. I opted for "pïgéons1" for my Android Sync text encoding test, but I encourage you to also pass the Turkey Test :)
E.g.,
"Árvíztűrő tükörfúrógép いろはにほへとちりぬるを Pijamalı hasta, yağız şoföre çabucak güvendi."
::: services/common/utils.js
@@ +611,5 @@
> + */
> + convertString: function convertString(s, source, dest) {
> + let is = Cc["@mozilla.org/io/string-input-stream;1"]
> + .createInstance(Ci.nsIStringInputStream);
> + is.setData(s, s.length);
Check for !s.
@@ +614,5 @@
> + .createInstance(Ci.nsIStringInputStream);
> + is.setData(s, s.length);
> +
> + let scs = Cc["@mozilla.org/streamConverters;1"]
> + .getService(Ci.nsIStreamConverterService);
Is this something we ought to be wrapping in a lazy getter? We wrap our scriptableunicodeconverter…
Attachment #677891 -
Flags: review?(rnewman) → feedback+
Assignee | ||
Comment 18•12 years ago
|
||
Additional Unicode string test did expose an error. Turns out JS String -> IDL's "string" throws away data in the high byte octet from JS strings. I documented this limitation in the API and made the test convert the input to an "octet string" before feeding into convertString(). That change made things work.
Attachment #677891 -
Attachment is obsolete: true
Attachment #678478 -
Flags: review?(rnewman)
Comment 19•12 years ago
|
||
Comment on attachment 678478 [details] [diff] [review]
Add CommonUtils.convertString, v3
Review of attachment 678478 [details] [diff] [review]:
-----------------------------------------------------------------
Hell yeah test coverage.
Attachment #678478 -
Flags: review?(rnewman) → review+
Assignee | ||
Comment 20•12 years ago
|
||
Whiteboard: [fixed-in-larch]
Comment 21•12 years ago
|
||
This is going to get superreview before landing in m-c right?
Assignee | ||
Comment 22•12 years ago
|
||
(In reply to Dave Townsend (:Mossop) from comment #21)
> This is going to get superreview before landing in m-c right?
Why would this need super review?
Flags: needinfo?(dtownsend+bugmail)
Comment 23•12 years ago
|
||
(In reply to Gregory Szorc [:gps] from comment #22)
> (In reply to Dave Townsend (:Mossop) from comment #21)
> > This is going to get superreview before landing in m-c right?
>
> Why would this need super review?
It adds an API
Flags: needinfo?(dtownsend+bugmail)
Comment 24•12 years ago
|
||
Because it's an API, under the SR policy: http://www.mozilla.org/hacking/reviewers.html
Feel free to tag me for SR.
Comment 25•12 years ago
|
||
Comment on attachment 678478 [details] [diff] [review]
Add CommonUtils.convertString, v3
SR flag has gone missing…?
Attachment #678478 -
Flags: review?(mconnor)
Updated•12 years ago
|
Attachment #678478 -
Flags: review?(mconnor) → superreview?(mconnor)
Comment 26•12 years ago
|
||
Comment on attachment 678478 [details] [diff] [review]
Add CommonUtils.convertString, v3
Bulk-setting approval flags for FHR landing for FxOS ADU ping (Bug 788894).
Attachment #678478 -
Flags: superreview?(mconnor)
Attachment #678478 -
Flags: approval-mozilla-beta?
Attachment #678478 -
Flags: approval-mozilla-aurora?
Comment 27•12 years ago
|
||
Whiteboard: [fixed-in-larch]
Updated•12 years ago
|
Target Milestone: --- → mozilla20
Comment 28•12 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Comment 29•12 years ago
|
||
Comment on attachment 678478 [details] [diff] [review]
Add CommonUtils.convertString, v3
FHR for B2G ADU ping, won't be build/enabled for Mobile/Desktop.
Attachment #678478 -
Flags: approval-mozilla-beta?
Attachment #678478 -
Flags: approval-mozilla-beta+
Attachment #678478 -
Flags: approval-mozilla-aurora?
Attachment #678478 -
Flags: approval-mozilla-aurora+
Comment 30•12 years ago
|
||
You need to log in
before you can comment on or make changes to this bug.
Description
•