Closed Bug 1437156 Opened 6 years ago Closed 6 years ago

Minidump analyzer assuming utf-8 command-line arguments on Windows

Categories

(Toolkit :: Crash Reporting, enhancement, P1)

Unspecified
Windows
enhancement

Tracking

()

RESOLVED FIXED
mozilla60
Tracking Status
firefox60 --- fixed

People

(Reporter: bugzilla, Assigned: bugzilla)

References

(Depends on 1 open bug)

Details

(Whiteboard: inj+)

Attachments

(2 files, 4 obsolete files)

We're probably missing some crash reports (particularly with users with non-ascii usernames) because we're assuming UTF-8.

We either need to use wmain on Windows or we need to convert arguments from CP_ACP.
Assignee: nobody → aklotz
Status: NEW → ASSIGNED
This patch compiles on MSVC, gcc+mingw, and on non-Windows platforms. There are three main components to the patch:

(1) Cleaning up of the conversion functions in MinidumpAnalyzerUtils.h;
(2) Universally using wmain on Windows so that argv is UTF-16;
(3) Rather than going around and modifying the entire program, I just templatized ParseArguments so that it converts UTF-16 to UTF-8 when presented with a wchar_t based argv.
Attachment #8949892 - Flags: review?(ted)
Oops, needed to clean up something else.
Attachment #8949892 - Attachment is obsolete: true
Attachment #8949892 - Flags: review?(ted)
Attachment #8949899 - Flags: review?(ted)
Last update, I promise!
Attachment #8949899 - Attachment is obsolete: true
Attachment #8949899 - Flags: review?(ted)
Attachment #8949918 - Flags: review?(ted)
Priority: -- → P1
Whiteboard: inj+
Attachment #8949918 - Attachment is obsolete: true
Attachment #8949918 - Flags: review?(ted)
Attachment #8949926 - Flags: review?(ted)
No longer blocks: 1423897
Depends on: 1423897
Comment on attachment 8949926 [details] [diff] [review]
Clean up minidump-analyzer's unicode handling on Windows (r3)

Review of attachment 8949926 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/crashreporter/minidump-analyzer/Win64ModuleUnwindMetadata.cpp
@@ +71,5 @@
>    : mPath(aPath)
>  {
>    // Convert wchar to native charset because ImageLoad only takes
>    // a PSTR as input.
> +  std::string code_file = UTF8ToMBCS(aPath);

Please do never convert paths to MBCS. It is lossy. It will fail if the path contains a Unicode character that is not in the ANSI code page.

::: toolkit/crashreporter/minidump-analyzer/minidump-analyzer.cpp
@@ +344,5 @@
>    ofstream* file = new ofstream();
>    file->open(UTF8ToWide(aFilename).c_str(), mode);
>  #else   // GCC
>    ofstream* file =
> +    new ofstream(UTF8ToMBCS(aFilename).c_str(), mode);

Please use mozilla::OFStream that have just landed (bug 1428614).
Attachment #8949926 - Flags: feedback-
(In reply to Masatoshi Kimura [:emk] from comment #8)
> Comment on attachment 8949926 [details] [diff] [review]
> Clean up minidump-analyzer's unicode handling on Windows (r3)
> 
> Review of attachment 8949926 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/crashreporter/minidump-analyzer/Win64ModuleUnwindMetadata.cpp
> @@ +71,5 @@
> >    : mPath(aPath)
> >  {
> >    // Convert wchar to native charset because ImageLoad only takes
> >    // a PSTR as input.
> > +  std::string code_file = UTF8ToMBCS(aPath);
> 
> Please do never convert paths to MBCS. It is lossy. It will fail if the path
> contains a Unicode character that is not in the ANSI code page.

I know that. My patch didn't actually materially change anything there, but the existing code calls ImageLoad which is ANSI only.

I can file a follow-up bug where we replace ImageLoad with our own PE loading, but I don't consider that in scope for this bug (where MBCS was being misused as UTF-8).

> 
> ::: toolkit/crashreporter/minidump-analyzer/minidump-analyzer.cpp
> @@ +344,5 @@
> >    ofstream* file = new ofstream();
> >    file->open(UTF8ToWide(aFilename).c_str(), mode);
> >  #else   // GCC
> >    ofstream* file =
> > +    new ofstream(UTF8ToMBCS(aFilename).c_str(), mode);
> 
> Please use mozilla::OFStream that have just landed (bug 1428614).

I can do that.
This revision uses mozilla::OFstream.
Attachment #8949926 - Attachment is obsolete: true
Attachment #8949926 - Flags: review?(ted)
Attachment #8951026 - Flags: review?(ted)
Attached file marionette test log
I was able to replicate the problem with a Marionette test. Aaron, can you please have a look and tell me if that would be the expected failure in case of minidump stackwalk cannot correctly process the generated minidump files?
Flags: needinfo?(aklotz)
Comment on attachment 8951026 [details] [diff] [review]
Clean up minidump-analyzer's unicode handling on Windows (r4)

Review of attachment 8951026 [details] [diff] [review]:
-----------------------------------------------------------------

Every time I look at this code I think we should spend the time to rewrite it in Rust because fundamental stuff like dealing with strings actually works in Rust.

::: toolkit/crashreporter/minidump-analyzer/MinidumpAnalyzerUtils.h
@@ +103,5 @@
>  
> +static inline std::wstring
> +MBCSToWide(const std::string& aMbStr, bool* aSuccess = nullptr)
> +{
> +  return MBCPToWide(aMbStr, CP_ACP, aSuccess);

I'm just now realizing that it's pretty confusing to have functions where the only difference is `MBCS` or `MBCP` in the name, but I guess this is already how it is.

::: toolkit/crashreporter/minidump-analyzer/minidump-analyzer.cpp
@@ +341,5 @@
>  {
>    ios_base::openmode mode = ios::out | ios::app;
>  
>  #if defined(XP_WIN)
> +  OFStream* file = new OFStream(UTF8ToWide(aFilename).c_str(), mode);

Now that this is using `OFStream` we could probably get away with just using the constructor and not manual `new` / `delete`, but I've written so much Rust lately that my C++ is fuzzy here. Maybe we'd have to get rid of the `OpenAppend` helper entirely to make that work?

(Related: TIL that calling `c_str` on a temporary is actually a defined thing in the C++ standard: https://stackoverflow.com/a/10006989/69326)

@@ +447,5 @@
>  
> +#if defined(XP_WIN)
> +// WARNING: Windows does *NOT* use UTF8 for char strings off the command line!
> +// Using wmain here so that the CRT doesn't need to perform a wasteful
> +// UTF-16 to MBCS conversion; ParseArguments will convert to UTF8 directly.

You might change the wording here: it's not just wasteful, it's lossy.
Attachment #8951026 - Flags: review?(ted) → review+
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #12)
> Comment on attachment 8951026 [details] [diff] [review]
> Clean up minidump-analyzer's unicode handling on Windows (r4)
> 
> Review of attachment 8951026 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/crashreporter/minidump-analyzer/minidump-analyzer.cpp
> @@ +341,5 @@
> >  {
> >    ios_base::openmode mode = ios::out | ios::app;
> >  
> >  #if defined(XP_WIN)
> > +  OFStream* file = new OFStream(UTF8ToWide(aFilename).c_str(), mode);
> 
> Now that this is using `OFStream` we could probably get away with just using
> the constructor and not manual `new` / `delete`, but I've written so much
> Rust lately that my C++ is fuzzy here. Maybe we'd have to get rid of the
> `OpenAppend` helper entirely to make that work?

I've removed OpenAppend and just replaced that with plain-old RAII.

> 
> @@ +447,5 @@
> >  
> > +#if defined(XP_WIN)
> > +// WARNING: Windows does *NOT* use UTF8 for char strings off the command line!
> > +// Using wmain here so that the CRT doesn't need to perform a wasteful
> > +// UTF-16 to MBCS conversion; ParseArguments will convert to UTF8 directly.
> 
> You might change the wording here: it's not just wasteful, it's lossy.

Will do.
(In reply to Henrik Skupin (:whimboo) from comment #11)
> Created attachment 8954145 [details]
> marionette test log
> 
> I was able to replicate the problem with a Marionette test. Aaron, can you
> please have a look and tell me if that would be the expected failure in case
> of minidump stackwalk cannot correctly process the generated minidump files?

That's half of it, yes.
The other half of it is that the .extra file associated with the crash will not be touched. In proper operation, the .extra file is updated with an extra line beginning with the string "StackTraces=".
Flags: needinfo?(aklotz)
https://hg.mozilla.org/mozilla-central/rev/6eefe0f3ed6f
https://hg.mozilla.org/mozilla-central/rev/5f73a7ca725d
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: