Closed Bug 228270 Opened 21 years ago Closed 21 years ago

huge signons.txt file when the � character is used

Categories

(Toolkit :: Password Manager, defect)

defect
Not set
critical

Tracking

()

RESOLVED FIXED

People

(Reporter: michael.arnauts+mozilla, Assigned: bryner)

References

Details

(Keywords: intl)

Attachments

(2 files, 1 obsolete file)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5) Gecko/20031007 Firebird/0.7
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5) Gecko/20031007 Firebird/0.7

i noticed that my signons.txt file was 34 megabytes, and I found out that it was
because the use of a é character in a login-dialog of a page.

example:

MDIEEPgAAAAAAAAAAAAAAAAAAAEwFAYIKoZIhvcNAwcECB03JwZfb6hfakeYIGOexz2Ehw==
*p
MDIEEPgAAAAAAAAAAAAAAAAAAAEwFAYIKoZIhvcNAwcECKFn48F3OflfakewCcnUt4rdMw==
.
www.site.com:80 (Gebruikersnaam om deze priv
C3 83 C2 83 C3 82 C2 83 C3 83 C2 82 C3 82 C2 83 C3 83 C2 83 C3 82 C2 82 C3 ...
C2 82 C3 82 C2 A9 20
directory in te komen

that whole hex thing should be a "é" so it says (in dutch) "Gebruikersnaam om
deze privé directory in te komen" what means: "Username to enter this private
directory"

the numbers in that file were in hex (looks like a random C3,C2,83 and 82)

now, the bigger problem is when i go to the options and open the change saved
password dialog, firebird starts sucking all my memory up, and at last, you have
to close it to not let it crash windows.

Reproducible: Didn't try

Steps to Reproduce:
1. go to a page with a "é" in the login-dialog
2. use save password
3. go to the options and take vieu savzed password
(haven't reproduced it, but i think it went that way)

Actual Results:  
a 30 meg large file is created

Expected Results:  
it should save the file without that garbage data :)
the same story (i also have that press a key and wait a long time thing)
http://forums.mozillazine.org/viewtopic.php?t=40300
Can you attach a HTTP log? see:

http://www.mozilla.org/projects/netlib/http/http-debugging.html

for instructions on creating one.
hmm, i tried but i can't reproduce it.  the site where it happened first doesn't
have that directory protection anymore, and when i try to use a .htacess myself
with a é in, it works as it should.
My guess is that is has something to do with UTF-8 because the é character is C3
A9, and A9 was the last character in the garbage (and C3 was also used in the
garbage).
However, maybe it is already fixed and it was a bug in a previous nighty build
or release or so, and the garbage-signon file was just taken over from the
previous version.
Very odd.  Still, if we can't reproduce the bug, there's not much to do.  If you
see this again, please get an http log while you're able to (I also wasn't able
to reproduce this with my own htaccess file).
Status: UNCONFIRMED → RESOLVED
Closed: 21 years ago
Resolution: --- → WORKSFORME
v.

If you can reproduce this at some point in the future and are able to attach an
http log, please reopen this bug (don't file a new one). Thanks.
Status: RESOLVED → VERIFIED
There must be a repeating process somewhere where the e-acute character is
converted from ISO-8859-1 to UTF-8, and then the UTF-8 is converted as if from
ISO-8859-1 to UTF-8, and so on, like this:

e9                        (correct ISO-8859-1)
c3 a9                     (correct UTF-8)
c3 83 c2 a9  
c3 83 c2 83 c3 82 c2 a9
etc.
reopening, I have figured this out.
Status: VERIFIED → UNCONFIRMED
Resolution: WORKSFORME → ---
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Ok, so the bug here has to do with NS_ReadLine.  We write out the password file
in UTF-8.  When we read it back in, NS_ReadLine reads data into a char* buffer,
and then uses |AssignWithConversion| to put it into an nsAutoString.  This
method does not do any conversion of character encoding, it only converts 1 byte
to 2 byte characters.  So now we have an nsAutoString containing UTF-8 encoded
data.  If we then call NS_ConvertUTF16toUTF8 on that nsAutoString, then each of
those UTF-8 bytes gets re-encoded as UTF-8 because the string is presumed to be
in UTF-16 encoding.

I wrote up a patch that uses AppendUTF8toUTF16 to transfer the data into the
nsAutoString in NS_ReadLine and that solves the problem for me.  However, there
are two things I'm not sure about:

- Is there some guarantee that the data read from the input stream should be UTF-8?
- What happens if the buffer ends in the middle of a UTF-8 sequence?
Attached patch guess at a fix (obsolete) — Splinter Review
Attachment #139323 - Flags: review?(darin)
> Is there some guarantee that the data read from the input stream should be UTF-8?

 I don't think there's any guarantee. It could be anything as far as I know. 
  
> What happens if the buffer ends in the middle of a UTF-8 sequence?

  AppendUTF8toUTF16 would complain loudly about an invalid UTF-8 sequence. 
See bug 230275 comment #6.

Keywords: intl
OS: Windows XP → All
Hardware: PC → All
NS_ReadLine is really designed to be used for ascii-only files (.mailcap and
.mime.types, when it was created).  Anyone using it for anything else is asking
for trouble.  ;)

Ideally, what we need to do is to change this thing to take a charset and use
that charset to decode (typically the native filesystem charset, but utf-8 in
your case).
mailcap file can contain non-ASCII characters (file paths can be non-ASCII) ;-).
Summary: huge signons.txt file when the é character is used → huge signons.txt file when the � character is used
Comment on attachment 139323 [details] [diff] [review]
guess at a fix

not the right solution for the reasons bz outlined.  we need to either stop
using NS_ReadLine in password manager or fix NS_ReadLine to handle charset
conversions.

i'm not sure i understand why a charset conversion needs to be performed at
this level.  NS_ReadLine could just look for EOL and return a byte-buffer
instead (stored in a nsACString).
Attachment #139323 - Flags: review?(darin) → review-
to clarify my last point: perhaps we should break "reading a line of text" and
"converting a line of text to unicode" into two separate tasks / functions
instead of trying to lump them into one.
The problem is that what EOL looks like (in bytes) depends on encoding (compare
UTf-16 and ascii).
Attached patch alternate fixSplinter Review
Since password manager knows it wrote out UTF-8, this patch just takes the wide
string containing utf-8 characters and puts it back into a narrow string.
Attachment #139323 - Attachment is obsolete: true
Attachment #139369 - Flags: superreview?(bz-vacation)
Attachment #139369 - Flags: review?(darin)
Comment on attachment 139369 [details] [diff] [review]
alternate fix

>Index: nsPasswordManager.cpp
>+      entry->userValue.Assign(NS_ConvertUTF8toUTF16(utf8Buffer));

NS_CopyUTF8toUTF16(utf8Buffer, entry->userValue);

>+      entry->passField.Assign(NS_ConvertUTF8toUTF16(Substring(utf8Buffer, 1,
>+                                                    utf8Buffer.Length() - 1)));

Same.

>+      entry->passValue.Assign(NS_ConvertUTF8toUTF16(utf8Buffer));

Same.

Add a nice XXX comment on the place where you call ReadLine that complains
about the bogus charset stuff going on and points to this code?  That way, this
code can be unmunged when ReadLine is made to suck less.

sr=bzbarsky.
Attachment #139369 - Flags: superreview?(bz-vacation) → superreview+
Comment on attachment 139369 [details] [diff] [review]
alternate fix

>Index: nsPasswordManager.cpp

>+        entry->userField.Assign(NS_ConvertUTF8toUTF16(utf8Buffer));

nit: use CopyUTF8toUTF16 to avoid intermediate/second buffer copy


>+      entry->userValue.Assign(NS_ConvertUTF8toUTF16(utf8Buffer));

same nit applies here.


>+      entry->passField.Assign(NS_ConvertUTF8toUTF16(Substring(utf8Buffer, 1,
>+                                                    utf8Buffer.Length() - 1)));

same nit


>+      entry->passValue.Assign(NS_ConvertUTF8toUTF16(utf8Buffer));

same nit


r=darin with that change
Attachment #139369 - Flags: superreview?(bz-vacation)
Attachment #139369 - Flags: superreview+
Attachment #139369 - Flags: review?(darin)
Attachment #139369 - Flags: review+
Attachment #139369 - Flags: superreview?(bz-vacation) → superreview+
*** Bug 223848 has been marked as a duplicate of this bug. ***
*** Bug 230911 has been marked as a duplicate of this bug. ***
checked in.
Status: ASSIGNED → RESOLVED
Closed: 21 years ago21 years ago
Resolution: --- → FIXED
*** Bug 233917 has been marked as a duplicate of this bug. ***
*** Bug 218405 has been marked as a duplicate of this bug. ***
*** Bug 233651 has been marked as a duplicate of this bug. ***
This regressed but was fixed again in bug 218405.
Product: Firefox → Toolkit
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: