Closed Bug 18377 Opened 25 years ago Closed 25 years ago

Latin2 character E8 is not displayed in Input Value field

Status:

VERIFIED FIXED

People

(Reporter: teruko, Assigned: teruko)

References

(
URL
)

Attachments

(1 file)

reduced case, 25 years ago, rickg, 254 bytes, text/html

Teruko Kobayashi

Assignee

Description

•

25 years ago

Only the E8 character "è" is not displayed in the INPUT Value field.

Steps to reproduce
1. Go to the above URL.

Look at the fields after State, Country, and Zip.
The same 4 characters that appear after test1 and test3 should be displayed there.
However, the last one, "è", is missing.

Tested with the 110909 Win32, Mac, and Linux builds.

Frank Tang

Updated

•

25 years ago

Assignee: ftang → erik

Frank Tang

Comment 1

•

25 years ago

Windows display problem. Assigning to erik.

Erik van der Poel

Updated

•

25 years ago

Assignee: erik → teruko

Erik van der Poel

Comment 2

•

25 years ago

Frank, Teruko said that she tested Win32, Mac and Linux. So this is not a
Windows display problem.

Teruko, if this problem appears on Win32, Mac and Linux, please change the OS
field to All.

Teruko, I had a look at the URL above, and found that the document is in little
endian Unicode, even though the META charset says iso-8859-2. Either the
document shouldn't be in Unicode, or the META charset shouldn't say iso-8859-2,
right?
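
A minimal sketch of the mismatch described above, in Python. It checks whether a page's raw bytes look like little-endian UTF-16 while the META tag declares some other charset. The function name, the 0.9 threshold, and the sample markup are hypothetical and made up for illustration; the actual test page is not reproduced here.

```python
import codecs
import re

def declared_vs_actual(raw: bytes):
    """Return (declared_charset, looks_like_utf16le) for a page's raw bytes."""
    # Assumption: a UTF-16LE page either starts with the BOM FF FE or, for
    # mostly-ASCII markup, has a NUL in nearly every second byte.
    has_bom = raw.startswith(codecs.BOM_UTF16_LE)
    body = raw[2:] if has_bom else raw
    odd_bytes = body[1::2]
    mostly_nul = len(odd_bytes) > 0 and odd_bytes.count(0) / len(odd_bytes) > 0.9
    looks_utf16le = has_bom or mostly_nul

    # Drop NULs so the ASCII-only META tag can be searched in either encoding.
    ascii_view = raw.replace(b"\x00", b"")
    match = re.search(rb"charset=([A-Za-z0-9_-]+)", ascii_view)
    declared = match.group(1).decode("ascii").lower() if match else None
    return declared, looks_utf16le

# Hypothetical page: stored as UTF-16LE but claiming iso-8859-2.
markup = '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-2">'
page = codecs.BOM_UTF16_LE + markup.encode("utf-16-le")
declared, is_utf16le = declared_vs_actual(page)
print(declared, is_utf16le)
```

If the second value is true while the declared charset is a single-byte one such as iso-8859-2, the page and its META tag disagree, which is the situation described for the original test page.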

Re-assigning to Teruko so that she can fix the test page first.

Teruko Kobayashi

Assignee

Updated

•

25 years ago

OS: Windows NT → All

Teruko Kobayashi

Assignee

Updated

•

25 years ago

Assignee: teruko → erik

Teruko Kobayashi

Assignee

Comment 3

•

25 years ago

Ok, I fixed the test cases.

Erik van der Poel

Updated

•

25 years ago

Assignee: erik → rickg

Erik van der Poel

Comment 4

•

25 years ago

The 4th letter (0xE8, which is small c with caron in iso-8859-2) is indeed
missing in the State, Country and Zip fields. When I did a View Source, that
letter was missing even in the source, but not in Nav4's View Source. So this
may be a parser bug. Re-assigning to RickG.
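
For reference, a small Python sketch of how the single byte 0xE8 decodes under the two charsets involved. It uses only the standard codecs and says nothing about how Mozilla's own converter is implemented.

```python
import unicodedata

raw = bytes([0xE8])

# The same byte means different letters depending on the charset.
as_latin2 = raw.decode("iso-8859-2")
as_latin1 = raw.decode("iso-8859-1")

for label, ch in (("iso-8859-2", as_latin2), ("iso-8859-1", as_latin1)):
    print(f"{label}: U+{ord(ch):04X} {unicodedata.name(ch)}")
```

This is why the byte can show up as "è" when text is viewed as Latin1, although the test page intends the Latin2 letter.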

rickg

Comment 5

•

25 years ago

Attached file: reduced case

rickg

Updated

•

25 years ago

Assignee: rickg → erik

rickg

Comment 6

•

25 years ago

The 4th character is truly missing in the display, but it is correctly handled
in the parser (a breakpoint in nsHTMLTokenizer::ConsumeAttributes proves it). I
suspect a font rendering problem.

Another interesting problem: viewsource doesn't display on this page (for me)
because the charset system is not correctly handling the meta tag. Returning to
erik for his opinion on the charset/font issue.

I've attached a min. test case.

Erik van der Poel

Updated

•

25 years ago

Assignee: erik → rickg

Erik van der Poel

Comment 7

•

25 years ago

I did some checking in the font engine on Windows, and it turns out that I do
see the 4th character 0xE8 the first time, but then the next time I only see
3 characters with different codes. The different codes are due to the META
charset causing a re-parse with the iso-8859-2 characters converted to Unicodes.
However, the loss of the 4th char is due to a different problem.

However, the font engine *is* receiving all 4 of the Unicodes later on in the
document (i.e. next to "test1"). This means that the font engine is working
properly (since it displays all 4 chars), and the charset converter is working
properly (since the final 0xE8 in iso-8859-2 becomes 0x010D in Unicode).

So we have a bug, and it is not in the charset converter, and not in the font
engine. It could be in the parser, or somewhere downstream between the parser
and font engine (e.g. content sink, style/frame system, etc).

This is just a wild guess, but the code 0x010D happens to have 0x0D in the least
significant byte, which is Carriage Return. Perhaps the HTML attribute parser
is looking for CR (0x0D) and LF (0x0A) to terminate the attribute value, and it
is masking the most significant byte in the Unicode so that it only sees the
least significant byte (i.e. 0x010D looks like 0x0D and fools the parser).
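
To make the guess concrete, here is a Python sketch of the hypothesized failure mode only. It is not the Mozilla parser code: the function names and the sample value "abc" plus U+010D are hypothetical, and the guess itself is unconfirmed.

```python
CR, LF = 0x0D, 0x0A

def terminates_value_masked(code_unit: int) -> bool:
    # Hypothesized bug: only the least significant byte is examined,
    # so 0x010D is mistaken for CR (0x0D).
    return (code_unit & 0xFF) in (CR, LF)

def terminates_value(code_unit: int) -> bool:
    # Comparing the full 16-bit code unit avoids the false match.
    return code_unit in (CR, LF)

def read_value(text: str, is_terminator) -> str:
    """Collect characters until the terminator test fires."""
    out = []
    for ch in text:
        if is_terminator(ord(ch)):
            break
        out.append(ch)
    return "".join(out)

sample = "abc\u010d"  # hypothetical 4-character attribute value
print(len(read_value(sample, terminates_value_masked)))
print(len(read_value(sample, terminates_value)))
```

With the masked test the value is cut off before its 4th character, which would match the symptom; with the full comparison all 4 characters survive.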

Returning to RickG for his opinion on my wild guess.

Erik van der Poel

Comment 8

•

25 years ago

By the way, View Source is working for me, even with the META charset. (Tree
pulled and built today.)

rickg

Updated

•

25 years ago

Status: NEW → ASSIGNED

rickg

Comment 9

•

25 years ago

Ok -- silly me. The real problem was that my tree (in San Diego) had gone stale.
I've corrected the problem and will land it with my next update.

rickg

Updated

•

25 years ago

Status: ASSIGNED → RESOLVED

Closed: 25 years ago

Resolution: --- → FIXED

rickg

Comment 10

•

25 years ago

Fixed by a change to nsStr, where chars were being promoted with sign extension.
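
A rough Python illustration of what sign extension does to a byte such as 0xE8 when it is widened to 16 bits. This only demonstrates the general effect; it is not the nsStr code, and the helper names are hypothetical.

```python
import struct

def widen_sign_extended(byte: int) -> int:
    """Widen an 8-bit value to 16 bits as if it were a signed char."""
    (signed,) = struct.unpack("b", bytes([byte]))  # 0xE8 is read as -24
    return signed & 0xFFFF                         # high bits filled with 1s

def widen_zero_extended(byte: int) -> int:
    """Widen an 8-bit value to 16 bits as if it were an unsigned char."""
    return byte & 0xFF

for b in (0x41, 0xE8):
    print(f"0x{b:02X}: signed 0x{widen_sign_extended(b):04X}, "
          f"unsigned 0x{widen_zero_extended(b):04X}")
```

Bytes below 0x80 widen to the same value either way, so a problem of this kind only affects characters in the upper half of an 8-bit charset, such as 0xE8.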

Teruko Kobayashi

Assignee

Updated

•

25 years ago

Status: RESOLVED → REOPENED

Teruko Kobayashi

Assignee

Comment 11

•

25 years ago

I tested this in the 111708 Win32 and 111709 builds, and it works fine.
However, in the 111612 build (which I downloaded from the 111708-m12 directory),
the character 'E8' does not show. I need to reopen this. I will test this in the
next Mac build.

Teruko Kobayashi

Assignee

Updated

•

25 years ago

Resolution: FIXED → ---

rickg

Updated

•

25 years ago

Assignee: rickg → teruko

Status: REOPENED → NEW

rickg

Comment 12

•

25 years ago

Can you please verify this before reopening? Also -- the build number you cite
with the problem is on the mac, I presume?

Teruko Kobayashi

Assignee

Updated

•

25 years ago

Status: NEW → RESOLVED

Closed: 25 years ago → 25 years ago

Resolution: --- → FIXED

Teruko Kobayashi

Assignee

Updated

•

25 years ago

Status: RESOLVED → VERIFIED

Teruko Kobayashi

Assignee

Comment 13

•

25 years ago

I tested this in the 111708 Mac build, and it works fine. I think the fix was
not in the Mac build I tested before. I see that some other characters are not
displayed on Mac; that is covered in bug 18095.
