Last Comment Bug 752756 - Extend nsINIParser/nsIINIParser/nsIINIParserWriter to read/write UTF-16LE INI files with BOM
: Extend nsINIParser/nsIINIParser/nsIINIParserWriter to read/write UTF-16LE INI...
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: XPCOM (show other bugs)
: Trunk
: x86 Windows 7
: -- normal (vote)
: mozilla15
Assigned To: Tim Abraldes [:TimAbraldes] [:tabraldes]
:
: Nathan Froyd [:froydnj]
Mentors:
Depends on:
Blocks: 751177
  Show dependency treegraph
 
Reported: 2012-05-07 17:25 PDT by Tim Abraldes [:TimAbraldes] [:tabraldes]
Modified: 2012-05-24 09:36 PDT (History)
2 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Patch v1 - Parse UTF-16LE BOM in nsINIParser on Windows. Add charset parameter to nsIINIParserWrite::writeFile() (10.41 KB, patch)
2012-05-16 11:52 PDT, Tim Abraldes [:TimAbraldes] [:tabraldes]
no flags Details | Diff | Splinter Review
Patch v2 - Only allow UTF-16LE and UTF-8. Fix previous patch so files actually get written. (11.14 KB, patch)
2012-05-16 18:53 PDT, Tim Abraldes [:TimAbraldes] [:tabraldes]
no flags Details | Diff | Splinter Review
Patch v3 - Use flag for UTF-16LE. Add unit tests. (27.81 KB, patch)
2012-05-21 17:33 PDT, Tim Abraldes [:TimAbraldes] [:tabraldes]
no flags Details | Diff | Splinter Review
Patch v4 - Hopefully fix Linux and Android bustage (28.77 KB, patch)
2012-05-22 00:21 PDT, Tim Abraldes [:TimAbraldes] [:tabraldes]
benjamin: review+
Details | Diff | Splinter Review
Patch v5 - Only write UTF16LE on Windows (28.81 KB, patch)
2012-05-22 12:39 PDT, Tim Abraldes [:TimAbraldes] [:tabraldes]
timabraldes: review+
Details | Diff | Splinter Review
Patch v6 - Cleaning up bitrot (27.00 KB, patch)
2012-05-22 23:52 PDT, Tim Abraldes [:TimAbraldes] [:tabraldes]
timabraldes: review+
timabraldes: checkin+
Details | Diff | Splinter Review

Description Tim Abraldes [:TimAbraldes] [:tabraldes] 2012-05-07 17:25:14 PDT
UTF-16LE is the preferred flavor of Unicode file encoding on Windows.  Other tools like NSIS cannot read/write Unicode INI files unless they are UTF-16LE with a BOM.  This causes problems for us when we attempt to natively install/uninstall webapps (see bug 751177).

At least on Windows, we should support reading/writing INI files that are encoded as UTF-16LE.  Additionally, we should probably support BOMs for UTF-8 and for UTF-16LE.
Comment 1 Tim Abraldes [:TimAbraldes] [:tabraldes] 2012-05-16 11:52:54 PDT
Created attachment 624473 [details] [diff] [review]
Patch v1 - Parse UTF-16LE BOM in nsINIParser on Windows. Add charset parameter to nsIINIParserWrite::writeFile()

I've tested this manually, but it seems likely that automated tests are desired for this patch.  I'll look into writing those now.

A couple of things are worth noting:
  Although nsIINIParserWriter can now write INI files in any unicode encoding that nsIConverterOutputStream supports, nsINIParser can only read UTF-8 and, on Windows, UTF-16LE encoded INI files.
  nsINIParser is used from nsBrowserApp and from the webapprt stub, which means that it cannot use XPCOM because XPCOM might not have been initialized by the time nsINIParser is used.

Benjamin (or whoever Benjamin delegates to) - how do you feel about this approach?
Comment 2 Benjamin Smedberg [:bsmedberg] 2012-05-16 11:59:11 PDT
Comment on attachment 624473 [details] [diff] [review]
Patch v1 - Parse UTF-16LE BOM in nsINIParser on Windows. Add charset parameter to nsIINIParserWrite::writeFile()

I don't think there's any reason to support arbitrary encodings in the writer, just UTF8 and UTF16. And document that UTF16 is one of those windows-only things that shouldn't be used in cross-platform code.
Comment 3 Tim Abraldes [:TimAbraldes] [:tabraldes] 2012-05-16 18:53:36 PDT
Created attachment 624622 [details] [diff] [review]
Patch v2 - Only allow UTF-16LE and UTF-8.  Fix previous patch so files actually get written.

Implemented feedback; this patch allows only UTF-16LE and UTF-8 INI files to be written by nsIINIParserWriter::writeFile().  Also, the previous patch was broken due to a finish()/close() mismatch.  Files now get written correctly.

Still to do:
  Track down memory leak caused by this patch
  Add unit tests
  Do we need a class that implements both nsIConverterOutputStream and nsISafeOutputStream?
Comment 4 Benjamin Smedberg [:bsmedberg] 2012-05-17 06:20:03 PDT
In terms of the API, I don't like that you are accepting an arbitrary charset string (especially undocumented!). I think it would be better to just use a flag, i.e.

interface nsIINIParserWriter ... {
  /**
   * Windows and the NSIS installer code sometimes expect INI files to be in
   * UTF-16 encoding. On Windows only, this flag to writeFile can be used to
   * change the encoding from its default UTF-8.
   */
  const unsigned long WRITE_UTF16 = 0x1;

  void writeFile([optional] in nsILocalFile aINIFile,
                 [optional] in unsigned long aFlags);

And then in the impl, instead of getting a converter each time, just have __utf8Converter and __utf16Converter.
Comment 5 Tim Abraldes [:TimAbraldes] [:tabraldes] 2012-05-21 17:33:18 PDT
Created attachment 625830 [details] [diff] [review]
Patch v3 - Use flag for UTF-16LE.  Add unit tests.

Implemented bsmedberg's suggestion (`writeFile` takes a flag specifying whether to write UTF-16LE or not, rather than taking an arbitrary string).

Added unit tests.

Benjamin: Please let me know what you think! (or delegate)
Comment 6 Tim Abraldes [:TimAbraldes] [:tabraldes] 2012-05-21 17:37:29 PDT
Try run:
https://tbpl.mozilla.org/?tree=Try&rev=4c6387ce091a
Comment 7 Tim Abraldes [:TimAbraldes] [:tabraldes] 2012-05-22 00:21:42 PDT
Created attachment 625909 [details] [diff] [review]
Patch v4 - Hopefully fix Linux and Android bustage

Fixing bustage from previous patch.  I looked through mxr and couldn't find any references to `nsINIParser::writeFile()` other than the one causing the Linux and Android bustage.  If there are more they'll get flushed out in this try run:
  https://tbpl.mozilla.org/?tree=Try&rev=82d2598a82cc
Comment 8 Benjamin Smedberg [:bsmedberg] 2012-05-22 06:24:36 PDT
Comment on attachment 625909 [details] [diff] [review]
Patch v4 - Hopefully fix Linux and Android bustage

It would probably be good to enforce that we only write UTF16 on Windows, either with a build-time ifdef or a runtime `'nsIWindowsRegKey' in Components.interfaces` check
Comment 9 Tim Abraldes [:TimAbraldes] [:tabraldes] 2012-05-22 12:39:36 PDT
Created attachment 626132 [details] [diff] [review]
Patch v5 - Only write UTF16LE on Windows

Carrying forward r+

Pushed to try along with the patch for bug 751177:
  https://tbpl.mozilla.org/?tree=Try&rev=102cb00af075

jsmith will do a quick verification that this patch works and doesn't break INI processing:
  1) Install Ed's app on Windows and Mac
  2) Verify that Ed's app launches on Windows and Mac
  3) Uninstall Ed's app

If all goes well I'll land this tonight
Comment 10 Tim Abraldes [:TimAbraldes] [:tabraldes] 2012-05-22 23:52:16 PDT
Created attachment 626346 [details] [diff] [review]
Patch v6 - Cleaning up bitrot
Comment 11 Tim Abraldes [:TimAbraldes] [:tabraldes] 2012-05-23 10:12:08 PDT
Comment on attachment 626346 [details] [diff] [review]
Patch v6 - Cleaning up bitrot

https://hg.mozilla.org/integration/mozilla-inbound/rev/16e329207fa2
Comment 12 Ed Morley [:emorley] 2012-05-24 09:36:29 PDT
https://hg.mozilla.org/mozilla-central/rev/16e329207fa2

Note You need to log in before you can comment on or make changes to this bug.