Minidump analyzer assuming utf-8 command-line arguments on Windows

RESOLVED FIXED in Firefox 60

Status

()

P1
normal
RESOLVED FIXED
a year ago
a year ago

People

(Reporter: aklotz, Assigned: aklotz)

Tracking

(Depends on: 1 bug)

unspecified
mozilla60
Unspecified
Windows
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox60 fixed)

Details

(Whiteboard: inj+)

Attachments

(2 attachments, 4 obsolete attachments)

(Assignee)

Description

a year ago
We're probably missing some crash reports (particularly with users with non-ascii usernames) because we're assuming UTF-8.

We either need to use wmain on Windows or we need to convert arguments from CP_ACP.
(Assignee)

Updated

a year ago
Assignee: nobody → aklotz
Status: NEW → ASSIGNED
(Assignee)

Comment 2

a year ago
This patch compiles on MSVC, gcc+mingw, and on non-Windows platforms. There are three main components to the patch:

(1) Cleaning up of the conversion functions in MinidumpAnalyzerUtils.h;
(2) Universally using wmain on Windows so that argv is UTF-16;
(3) Rather than going around and modifying the entire program, I just templatized ParseArguments so that it converts UTF-16 to UTF-8 when presented with a wchar_t based argv.
Attachment #8949892 - Flags: review?(ted)
(Assignee)

Updated

a year ago
Blocks: 1436845
(Assignee)

Comment 4

a year ago
Oops, needed to clean up something else.
Attachment #8949892 - Attachment is obsolete: true
Attachment #8949892 - Flags: review?(ted)
Attachment #8949899 - Flags: review?(ted)
(Assignee)

Comment 6

a year ago
Last update, I promise!
Attachment #8949899 - Attachment is obsolete: true
Attachment #8949899 - Flags: review?(ted)
Attachment #8949918 - Flags: review?(ted)
(Assignee)

Updated

a year ago
Priority: -- → P1
Whiteboard: inj+
(Assignee)

Comment 7

a year ago
Attachment #8949918 - Attachment is obsolete: true
Attachment #8949918 - Flags: review?(ted)
Attachment #8949926 - Flags: review?(ted)
No longer blocks: 1423897
Depends on: 1423897
Comment on attachment 8949926 [details] [diff] [review]
Clean up minidump-analyzer's unicode handling on Windows (r3)

Review of attachment 8949926 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/crashreporter/minidump-analyzer/Win64ModuleUnwindMetadata.cpp
@@ +71,5 @@
>    : mPath(aPath)
>  {
>    // Convert wchar to native charset because ImageLoad only takes
>    // a PSTR as input.
> +  std::string code_file = UTF8ToMBCS(aPath);

Please do never convert paths to MBCS. It is lossy. It will fail if the path contains a Unicode character that is not in the ANSI code page.

::: toolkit/crashreporter/minidump-analyzer/minidump-analyzer.cpp
@@ +344,5 @@
>    ofstream* file = new ofstream();
>    file->open(UTF8ToWide(aFilename).c_str(), mode);
>  #else   // GCC
>    ofstream* file =
> +    new ofstream(UTF8ToMBCS(aFilename).c_str(), mode);

Please use mozilla::OFStream that have just landed (bug 1428614).
Attachment #8949926 - Flags: feedback-
(Assignee)

Comment 9

a year ago
(In reply to Masatoshi Kimura [:emk] from comment #8)
> Comment on attachment 8949926 [details] [diff] [review]
> Clean up minidump-analyzer's unicode handling on Windows (r3)
> 
> Review of attachment 8949926 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/crashreporter/minidump-analyzer/Win64ModuleUnwindMetadata.cpp
> @@ +71,5 @@
> >    : mPath(aPath)
> >  {
> >    // Convert wchar to native charset because ImageLoad only takes
> >    // a PSTR as input.
> > +  std::string code_file = UTF8ToMBCS(aPath);
> 
> Please do never convert paths to MBCS. It is lossy. It will fail if the path
> contains a Unicode character that is not in the ANSI code page.

I know that. My patch didn't actually materially change anything there, but the existing code calls ImageLoad which is ANSI only.

I can file a follow-up bug where we replace ImageLoad with our own PE loading, but I don't consider that in scope for this bug (where MBCS was being misused as UTF-8).

> 
> ::: toolkit/crashreporter/minidump-analyzer/minidump-analyzer.cpp
> @@ +344,5 @@
> >    ofstream* file = new ofstream();
> >    file->open(UTF8ToWide(aFilename).c_str(), mode);
> >  #else   // GCC
> >    ofstream* file =
> > +    new ofstream(UTF8ToMBCS(aFilename).c_str(), mode);
> 
> Please use mozilla::OFStream that have just landed (bug 1428614).

I can do that.
(Assignee)

Comment 10

a year ago
This revision uses mozilla::OFstream.
Attachment #8949926 - Attachment is obsolete: true
Attachment #8949926 - Flags: review?(ted)
Attachment #8951026 - Flags: review?(ted)
Posted file marionette test log
I was able to replicate the problem with a Marionette test. Aaron, can you please have a look and tell me if that would be the expected failure in case of minidump stackwalk cannot correctly process the generated minidump files?
Flags: needinfo?(aklotz)
Comment on attachment 8951026 [details] [diff] [review]
Clean up minidump-analyzer's unicode handling on Windows (r4)

Review of attachment 8951026 [details] [diff] [review]:
-----------------------------------------------------------------

Every time I look at this code I think we should spend the time to rewrite it in Rust because fundamental stuff like dealing with strings actually works in Rust.

::: toolkit/crashreporter/minidump-analyzer/MinidumpAnalyzerUtils.h
@@ +103,5 @@
>  
> +static inline std::wstring
> +MBCSToWide(const std::string& aMbStr, bool* aSuccess = nullptr)
> +{
> +  return MBCPToWide(aMbStr, CP_ACP, aSuccess);

I'm just now realizing that it's pretty confusing to have functions where the only difference is `MBCS` or `MBCP` in the name, but I guess this is already how it is.

::: toolkit/crashreporter/minidump-analyzer/minidump-analyzer.cpp
@@ +341,5 @@
>  {
>    ios_base::openmode mode = ios::out | ios::app;
>  
>  #if defined(XP_WIN)
> +  OFStream* file = new OFStream(UTF8ToWide(aFilename).c_str(), mode);

Now that this is using `OFStream` we could probably get away with just using the constructor and not manual `new` / `delete`, but I've written so much Rust lately that my C++ is fuzzy here. Maybe we'd have to get rid of the `OpenAppend` helper entirely to make that work?

(Related: TIL that calling `c_str` on a temporary is actually a defined thing in the C++ standard: https://stackoverflow.com/a/10006989/69326)

@@ +447,5 @@
>  
> +#if defined(XP_WIN)
> +// WARNING: Windows does *NOT* use UTF8 for char strings off the command line!
> +// Using wmain here so that the CRT doesn't need to perform a wasteful
> +// UTF-16 to MBCS conversion; ParseArguments will convert to UTF8 directly.

You might change the wording here: it's not just wasteful, it's lossy.
Attachment #8951026 - Flags: review?(ted) → review+
(Assignee)

Comment 13

a year ago
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #12)
> Comment on attachment 8951026 [details] [diff] [review]
> Clean up minidump-analyzer's unicode handling on Windows (r4)
> 
> Review of attachment 8951026 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/crashreporter/minidump-analyzer/minidump-analyzer.cpp
> @@ +341,5 @@
> >  {
> >    ios_base::openmode mode = ios::out | ios::app;
> >  
> >  #if defined(XP_WIN)
> > +  OFStream* file = new OFStream(UTF8ToWide(aFilename).c_str(), mode);
> 
> Now that this is using `OFStream` we could probably get away with just using
> the constructor and not manual `new` / `delete`, but I've written so much
> Rust lately that my C++ is fuzzy here. Maybe we'd have to get rid of the
> `OpenAppend` helper entirely to make that work?

I've removed OpenAppend and just replaced that with plain-old RAII.

> 
> @@ +447,5 @@
> >  
> > +#if defined(XP_WIN)
> > +// WARNING: Windows does *NOT* use UTF8 for char strings off the command line!
> > +// Using wmain here so that the CRT doesn't need to perform a wasteful
> > +// UTF-16 to MBCS conversion; ParseArguments will convert to UTF8 directly.
> 
> You might change the wording here: it's not just wasteful, it's lossy.

Will do.
(Assignee)

Comment 14

a year ago
(In reply to Henrik Skupin (:whimboo) from comment #11)
> Created attachment 8954145 [details]
> marionette test log
> 
> I was able to replicate the problem with a Marionette test. Aaron, can you
> please have a look and tell me if that would be the expected failure in case
> of minidump stackwalk cannot correctly process the generated minidump files?

That's half of it, yes.
The other half of it is that the .extra file associated with the crash will not be touched. In proper operation, the .extra file is updated with an extra line beginning with the string "StackTraces=".
Flags: needinfo?(aklotz)
(Assignee)

Comment 16

a year ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/5f73a7ca725dc8444763401a7aea8d968916c84b
Bug 1437156: Fix moz.build corruption from bad auto conflict resolution; r=bustage
https://hg.mozilla.org/mozilla-central/rev/6eefe0f3ed6f
https://hg.mozilla.org/mozilla-central/rev/5f73a7ca725d
Status: ASSIGNED → RESOLVED
Last Resolved: a year ago
status-firefox60: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
You need to log in before you can comment on or make changes to this bug.