Closed Bug 1650637 Opened 2 years ago Closed 2 years ago

Cyrillic URL parameter is split when passed from the command line

Categories

(Toolkit :: Startup and Profile System, defect)

80 Branch
Unspecified
Windows 8.1
defect

Tracking

()

RESOLVED FIXED
81 Branch
Tracking Status
firefox81 --- fixed

People

(Reporter: haswellready, Assigned: toshi)

References

Details

Attachments

(4 files)

User Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0

Steps to reproduce:

Actual results:

  • two tabs are opened and both are with wrong URLs

Expected results:

  • one tab is opened and the URL is the same as in the command line (in spite of upper/lower case of the text passed as argument)

Could someone take a look at this bug?
I think it is important.
Can't open many web-pages correctly due to that.

Flags: needinfo?(ehumphries)

Hello,
I tried reproducing this issue on the latest version of Firefox Nightly 80.0a1 (2020-07-12), beta 79.0b7 and release 78.0.1 using Run with both urls : https://www.google.com/search?q=Эр-рияд and https://www.google.com/search?q=Эр-Pияд but the result was the same : a google search page for the city Riad. This has the same behaviour when I open them with Chrome.
Switching to a "p" or "P" on an English keyboard provides with different search results.

Setting a component for this issue in order to get the dev team involved.
If you feel it's an incorrect one please feel free to change it to a more appropriate one.

Component: Untriaged → Layout: Text and Fonts
Product: Firefox → Core

Still reproducible on my side on Firefox Nightly 80a.
Here are my steps on a video (a little bit changed):
https://youtu.be/6mHXBmMZ8Ts

Looks like it depends on multiple Firefox windows and doesn't depend on the letter passed as a parameter.

I believe this is a Launcher Process bug, but confirming.

Component: Layout: Text and Fonts → Launcher Process
Flags: needinfo?(ehumphries)
Product: Core → Firefox

Moving to Firefox::File Handling per :mossop.

:haswellready, I can't replicate this with Nightly on Windows 10, however my Windows 10 system is en-US locale.

Component: Launcher Process → File Handling
Flags: needinfo?(haswellready)

The video shows that for the case with the Cyrillic character Р Firefox displays a unicode unavailable glyph. Then splits the rest of the URL to another tab with the address xn--d1ah3e which is ияд.

OS: Unspecified → Windows 8.1
Summary: Wrong URL is opened when passed as parameter to firefox.exe → Cyrillic URL parameter is split when passed from the command line

I was able to reproduce the issue on Win8.1 + ru locale, while no repro on Win10.

The Cyrillic character "Р" is represented as "D0 A0" in utf-8. Because the function isspace considers 0xA0 (= NBSP) as a space, HandleCommandLine incorrectly split a URL into two parts as Kevin mentioned. I think the problem is the launcher process drops quotes when launching the browser process. Adding 0xA0 to here would fix this problem.

The behavior of isspace is mysterious. It's causing the difference of Win10 vs Win8.1. And also, it returns false for 0xA0 n my private version of Firefox somehow and no repro, while the official Nightly returns true, causing this problem.

The problem is nsWinRemoteClient converts a command string from utf16 to utf8 before putting it into a window message, and then nsWinRemoteServer parses that utf8 string byte by byte, as ascii. This is causing a white-space problem. We need a utf16 version of HandleCommandLine, convert a handled string to utf-8, and then pass it to nsCommandLine::Init.

Given that, changing the component to toolkit.

Assignee: nobody → tkikuchi
Component: File Handling → Startup and Profile System
Product: Firefox → Toolkit
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(haswellready)

The bug is Windows-specific so I've cleared the flag "needinfo".

Thank you, the bug will now show up in the Toolkit team's triage queue.

I have an idea to implement comment 10, refactoring HandleCommandLine in nsWinRemoteServer.cpp. Let me prepare a patch.

This patch takes our the logic to parse a command line string as a new template
class CommandLineParserWin from HandleCommandLine to prepare to convert it
into the utf-16 version.

This patch just moves the CommandLineParserWin class to toolkit/xre/
without any change and adds a unittest for it.

Depends on D84226

This patch introduces two classes WinRemoteMessageSender/Receiver to
generate/parse a payload of WM_COPYDATA message which is sent from
nsWinRemoteClient to nsWinRemoteServer.

And, we change the encoding of the payload from utf-8 to utf-16,
incrementing the message version from 1 to 2, because our parsing
logic on nsWinRemoteServer does not support a variable-width character
encoding like utf-8.

Depends on D84227

We should not use isspace() according to bug 1485588.

Depends on D84228

The severity field is not set for this bug.
:mossop, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(dtownsend)
Pushed by ncsoregi@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f2de31a4a8c4
Part1: Split HandleCommandLine into CommandLineParserWin and nsCommandLine.  r=mossop
https://hg.mozilla.org/integration/autoland/rev/34307f33b722
Part2: Move CommandLineParserWin to toolkit/xre/.  r=mossop
https://hg.mozilla.org/integration/autoland/rev/2fb1abdecbc7
Part3: Introduce WinRemoteMessageSender/Receiver.  r=mossop
https://hg.mozilla.org/integration/autoland/rev/3f85eff2ecbe
Part4: Replace isspace() with wcschr().  r=mossop
Regressions: 1656185
Severity: -- → S3
Flags: needinfo?(dtownsend)
Regressions: CVE-2021-29964
No longer regressions: CVE-2021-29964
You need to log in before you can comment on or make changes to this bug.