Open Bug 1454938 Opened 2 years ago Updated 2 years ago

Support Unicode paths in certutil

Categories

(NSS :: Tools, enhancement, P3)

Unspecified
Windows
enhancement

Tracking

(Not tracked)

People

(Reporter: intermittent-bug-filer, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

From the log:

https://treeherder.mozilla.org/logviewer.html#?job_id=174313967&repo=try&lineNumber=1764-1765

> 13:05:01     INFO -  Z:\task_1524055816\build\tests\bin\certutil.exe: function failed: SEC_ERROR_BAD_DATABASE: security library: bad database.
> 13:05:01    ERROR -  0 ERROR TEST-UNEXPECTED-FAIL | runtests.py | Certificate integration failed
Summary: Intermittent runtests.py | Certificate integration failed → Intermittent runtests.py | Certificate integration failed when Unicode is in profile path
I don't think NSS supports Unicode paths. :-(
Version: Version 3 → Trunk
That doesn't seem intermittent.
Summary: Intermittent runtests.py | Certificate integration failed when Unicode is in profile path → runtests.py | Certificate integration failed when Unicode is in profile path
I pushed another try build which exactly logs the certutil command as called via mozprocess. Given that I cannot reproduce locally on MacOS I think this might be Windows only.
Indeed, this is Windows only and happens when the -d and -f arguments point to a Unicode path:

https://treeherder.mozilla.org/logviewer.html#?job_id=174473671&repo=try&lineNumber=1706-1708

> Running command: Z:\task_1524122373\build\tests\bin\certutil.exe -N -d c:\users\task_1524122373\appdata\local\temp\tmpwpzymu.mozrunner%Unicode% -f c:\users\task_1524122373\appdata\local\temp\tmpwpzymu.mozrunner%Unicode%\.crtdbpw

Note that Bugzilla fails to display Unicode, so I replaced the cookie with %Unicode%.
Assignee: nobody → nobody
Component: General → Tools
Product: Firefox Build System → NSS
Summary: runtests.py | Certificate integration failed when Unicode is in profile path → Certutils fails with Unicode path specified for -d and -f arguments
Version: Trunk → other
If this isn't working, there's a bug. NSS should use UTF-8 for profile paths since bug 1428538 (NSS changes are in bug 1427276).
See Also: → 1427276
Does certutil support logging? If yes, how can it be enabled so that I can provide more details?
Summary: Certutils fails with Unicode path specified for -d and -f arguments → Certutils fails with Unicode path specified for -d and -f arguments: "function failed: SEC_ERROR_BAD_DATABASE: security library: bad database"
We want to enable profiles with a Unicode character by default for all harnesses which make use of mozprofile. So this problem is blocking us now. If someone could help us, we would appreciate it. Thanks.
OS: Unspecified → Windows
https://hg.mozilla.org/mozilla-central/annotate/0e45c13b34e815cb42a9f08bb44142d1a81e186e/security/nss/cmd/certutil/certutil.c#l3878
Certutil should take Unicode strings from CommandLineToArgvW or wmain. Otherwise non-ASCII command-line parameters will not work with sdb.
(In reply to Henrik Skupin (:whimboo) from comment #11)
> I assume the necessary change here is similar to what we have here?
> 
> https://dxr.mozilla.org/mozilla-central/rev/
> 0e45c13b34e815cb42a9f08bb44142d1a81e186e/toolkit/xre/test/win/
> TestXREMakeCommandLineWin.cpp#253

Well, something like that, but presumably the Unicode characters are already lost by the time you get into main()? So the right fix is really something like certutil using wmain on Windows, or possibly encoding the command-line arguments as UTF8 from within the harness(es)?
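To illustrate the wmain idea in a platform-neutral way, here is a minimal Python 3 sketch of the conversion such an entry point would need: arguments arrive as UTF-16 (modelled here as UTF-16-LE bytes) and are re-encoded as UTF-8 before being handed on. The helper name `utf16_args_to_utf8` and the sample path are hypothetical, and this is not how certutil is implemented.

```python
def utf16_args_to_utf8(wide_args):
    """Re-encode UTF-16-LE argument bytes as UTF-8 bytes.

    Models what a wmain-style entry point could do before passing
    paths to code that expects UTF-8 char* strings.
    """
    return [arg.decode("utf-16-le").encode("utf-8") for arg in wide_args]


# Hypothetical profile path containing U+2764, as the OS would hold it (UTF-16).
sample_path = "C:\\tmp\\profile\u2764"
wide_argv = [sample_path.encode("utf-16-le")]
utf8_argv = utf16_args_to_utf8(wide_argv)
```

No information is lost in this direction, which is the point of starting from the wide-character arguments rather than from the char* ones.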
(In reply to Nathan Froyd [:froydnj] from comment #12)
> or possibly encoding the command-line
> arguments as UTF8 from within the harness(es)?

That does not work with non-English locales, because some UTF-8 byte sequences are not valid sequences in certain legacy encodings.
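A minimal Python 3 sketch of that failure mode: the UTF-8 bytes of a path are reinterpreted as if they were in a legacy codepage. The helper name `reinterpret_utf8_as` is hypothetical, and cp1252 and cp932 are only example codepages, not necessarily the ones in use on the affected machines.

```python
def reinterpret_utf8_as(text, codepage):
    """Decode the UTF-8 bytes of `text` as if they were in `codepage`.

    Returns the (possibly garbled) string, or None when the bytes are
    not a valid sequence in that codepage.
    """
    raw = text.encode("utf-8")
    try:
        return raw.decode(codepage)
    except UnicodeDecodeError:
        return None


heart = "\u2764"  # UTF-8 bytes: E2 9D A4
as_cp1252 = reinterpret_utf8_as(heart, "cp1252")  # 0x9D is undefined in cp1252
as_cp932 = reinterpret_utf8_as(heart, "cp932")    # does not yield the original character
```

Pure ASCII survives either way, which is why the problem only shows up once a non-ASCII character is in the path.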
I can look into this but mozilla-build doesn't seem to support UTF-8. So I don't really have a way to test it.
Priority: -- → P3
Summary: Certutils fails with Unicode path specified for -d and -f arguments: "function failed: SEC_ERROR_BAD_DATABASE: security library: bad database" → Support Unicode paths in certutil
(In reply to Franziskus Kiefer [:fkiefer or :franziskus] from comment #16)
> I can look into this but mozilla-build doesn't seem to support UTF-8. So I
> don't really have a way to test it.

What makes you believe so? I thought it is supported. Ryan, can you give feedback too?
Flags: needinfo?(ryanvm)
IIUC, Windows cmd.exe and UTF8 don't always play nicely. 302 gps since I think he better understands it.
Flags: needinfo?(ryanvm) → needinfo?(gps)
I'm not sure why this requires UTF-8? Windows has the concept of the "local codepage", which is the encoding used for char* strings (and the versions of Windows APIs ending with 'A'). There is a UTF-8 codepage, but I don't know if it's possible to actually use that as the system default. Windows deals with Unicode strings as UTF-16 in wchar_t* (for APIs ending with 'W', with the caveat that like most filesystems it will let you write bytes that are not valid UTF-16 as filenames).

The crux of the issue here is that filenames containing characters that are not encodable in the local codepage *cannot* be accessed via 'A' APIs that take char* strings, and you *must* use wchar_t* and the 'W' APIs.

You can easily create a testcase in Python:
>>> import os
>>> open(u'\u2764', 'wb').write('hello')
>>> os.listdir('.')
['?']
>>> open(u'\u2764', 'rb').read()
'hello'
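The `['?']` in the listing above is what a lossy conversion to the local codepage looks like. A minimal Python 3 sketch, assuming cp1252 as the local codepage (the real one depends on the system locale) and using a hypothetical helper name; Python's `errors="replace"` only stands in for whatever substitution Windows itself performs:

```python
def encode_for_ansi_api(name, codepage="cp1252"):
    """Return (strict_ok, lossy_bytes) for a filename in the given codepage.

    strict_ok says whether the name is representable at all;
    lossy_bytes is what remains after unencodable characters are replaced.
    """
    try:
        name.encode(codepage)
        strict_ok = True
    except UnicodeEncodeError:
        strict_ok = False
    lossy = name.encode(codepage, errors="replace")
    return strict_ok, lossy


heart = "\u2764"
ok, lossy = encode_for_ansi_api(heart)
```

Once the name has been reduced to a replacement character, nothing on the char* side can recover the original, so the file cannot be opened through that path.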
What Ted said. In addition, https://www.mercurial-scm.org/wiki/EncodingStrategy contains a lot of random knowledge. https://msdn.microsoft.com/en-us/library/windows/desktop/dd317748(v=vs.85).aspx is the beginning of the rabbit hole on the Microsoft side. And this rabbit hole was dug by the Rabbit of Caerbannog.
Flags: needinfo?(gps)
Looking at the patch a little more, this is further complicated by the presence of Python.

IIRC Python 2.7 uses the POSIX C APIs (e.g. open()) for file I/O. Python 3 rewrote the I/O layer and it uses CreateFileW() so you can do nice things with Unicode.

Since process invocation is in play, something somewhere is calling CreateProcessA() or CreateProcessW(). IIRC mozprocess completely side-steps what Python does behind the scenes and calls CreateProcessW(). (I'd have to dig at source code to see what subprocess.Popen() is doing in the standard library.)

And since you appear to have a Unicode literal in the patch, the source encoding of the .py file comes into play. An "#encoding" in the file header will control that. If not defined, the default for Python will be used (either ascii or utf-8 most likely - depending on Python version). I don't like having the source file encoding come into play and I almost always use inline raw bytes (e.g. b'\xXX\xXX') of the code point in a specific encoding or u'\uXXXX' to represent the Unicode code point as a native Python unicode type.
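As a minimal Python 3 sketch of the escape-based approach described above, the same code point can be written with pure-ASCII source text in several ways, so the source file encoding never matters. The constant names are hypothetical, and U+2764 is used only because it appears in the earlier testcase.

```python
# All three spellings are pure ASCII in the source file.
HEART_ESCAPE = "\u2764"        # Unicode escape for the code point
HEART_UTF8 = b"\xe2\x9d\xa4"   # raw bytes of the code point in UTF-8
HEART_UTF16LE = b"\x64\x27"    # raw bytes of the code point in UTF-16-LE

# Decoding the raw bytes with the matching codec yields the same string.
same_code_point = (
    HEART_UTF8.decode("utf-8")
    == HEART_ESCAPE
    == HEART_UTF16LE.decode("utf-16-le")
)
```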
NSS currently uses NSPR to parse command-line arguments, and NSPR doesn't support Unicode. So this won't be an easy change. Either NSS command-line utilities like certutil have to move away from NSPR for parsing arguments, or NSPR needs to add support for this.
(In reply to Gregory Szorc [:gps] from comment #21)
> Since process invocation is in play, something somewhere is calling
> CreateProcessA() or CreateProcessW(). IIRC mozprocess completely side-steps
> what Python does behind the scenes and calls CreateProcessW(). (I'd have to
> dig at source code to see what subprocess.Popen() is doing in the standard
> library.)

Yes, mozprocess has its own ctypes implementation for various Windows API methods including creation of a process. And those are using the wide-char methods. For details see https://dxr.mozilla.org/mozilla-central/source/testing/mozbase/mozprocess/mozprocess/winprocess.py

(In reply to Franziskus Kiefer [:fkiefer or :franziskus] from comment #22)
> NSS currently uses NSPR to parse command line arguments, which doesn't
> support Unicode. So this won't be an easy change. Either NSS command line
> utils like certutil have to move away from NSPR to parse arguments or NSPR
> needs to add support for this.

Franziskus, is there a bug on file for getting NSPR to support Unicode characters when parsing command-line arguments? If not, maybe you can file one with the appropriate information? Thanks!
Flags: needinfo?(franziskuskiefer)
See bug 1466521. There's not much extra information though.
Severity: normal → enhancement
Flags: needinfo?(franziskuskiefer)