Closed Bug 167434 Opened 20 years ago Closed 11 years ago

zero-width no-break space rendered as a dot

Categories

(Core :: Internationalization, defect)

defect
Not set
normal

RESOLVED WORKSFORME

People

(Reporter: psheerin, Assigned: smontagu)

Details

(Keywords: intl)

Attachments

(2 files)

I'm not sure how far back this bug goes, but once upon a time (though I could be
wrong), Netscape 6.x rendered the character U+FEFF (zero width no-break space)
correctly, keeping any two characters on either side of it glued together
without any space between them.

In the latest builds of Mozilla, however, it is rendered as a space with a
hyphen-like middle dot. This is frustrating because this is one of the best ways
to keep some things (like piece fractions or words on either side of an em dash)
from breaking across lines.

Out of all the Unicode space characters, this is the only one Mozilla is getting
wrong. For comparison, IE only gets this, the regular space, and the
non-breaking space correct in most fonts (Arial Unicode being the one exception).

This should be an easy one to fix, I hope...
I discovered something interesting that may help pinpoint the bug, and explain
why I thought this bug was not present in earlier releases.

It exists if you use the numeric entity reference, but not if you insert the
character as UTF-8.

This appears to be the only space character that has that problem, but there is
a chance the bug goes deeper than this, and that other characters may work when
inserted as UTF-8 but not by entity reference.
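For readers comparing the two input forms, here is a minimal Python sketch (an illustration only, not Mozilla code). It assumes the numeric entity in question is `&#65279;` (hex `&#xFEFF;`) and uses the standard `html.unescape` as a stand-in for an HTML parser's entity expansion. Both forms should yield the same code point, U+FEFF, so any rendering difference has to come from how each path is processed afterwards.

```python
import html

ZWNBSP = "\ufeff"  # U+FEFF ZERO WIDTH NO-BREAK SPACE

# Form 1: numeric character references (decimal and hexadecimal).
from_decimal_ncr = html.unescape("1&#65279;/&#65279;2")
from_hex_ncr = html.unescape("1&#xFEFF;/&#xFEFF;2")

# Form 2: the raw character, serialized as the UTF-8 bytes EF BB BF.
raw_utf8 = f"1{ZWNBSP}/{ZWNBSP}2".encode("utf-8")
from_raw = raw_utf8.decode("utf-8")  # plain "utf-8" keeps U+FEFF in Python

print(ZWNBSP.encode("utf-8").hex())                  # byte sequence of U+FEFF
print(from_decimal_ncr == from_hex_ncr == from_raw)  # same code points
```

Note that Python's plain `utf-8` codec does not drop U+FEFF, so this only shows that the two inputs are equivalent in principle, not how any particular browser treats them.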
Peter, could you possibly attach a small file that shows both methods?  (we're
talking about 20 chars here).  That would help greatly in pinpointing the bug...
Assignee: attinasi → yokoyama
Component: Layout → Internationalization
QA Contact: petersen → ruixu
This renders correctly on IE (it looks like a complex fraction) but has these
weird dot spaces in Mozilla
Confirmed bug, see attachment #1
Status: UNCONFIRMED → NEW
Ever confirmed: true
Update to last comment. See attachment #98428 in comment #3
I see someone beat me to the punch with the attachment. I can add, though, that
this bug doesn't occur on the latest Mac build.
OK.. In linux I get a box that says

Z W
NBS

in it for the non-breaking spaces...  Could someone also attach a testcase in
which the non-breaking space is encoded in UTF8 for comparison purposes?
Tests the display of piece fractions built using various non-breaking
characters to keep the fractions from breaking across lines. Meant as a way of
testing the display of numeric entity and UTF-8 rendering of these characters.
I'm attaching a testcase with both numeric entity and UTF-8 references. Also,
after reading http://www.w3.org/TR/unicode-xml/#BOM (which says don't use
ZWNBS; use Word Joiner instead), I've revised the test to include several other
characters that could be used in its place. Most of them don't render correctly
in Mozilla, even as UTF-8.
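As a rough illustration of the substitution discussed here, the following Python sketch lists a few non-breaking characters and glues two pieces of text together with one of them. The candidate list and the helper names `describe` and `glue` are assumptions for this example; the thread does not say exactly which characters the attached testcase uses.

```python
import unicodedata

# Candidate "glue" characters (an assumed list, not taken from the testcase).
CANDIDATES = ["\ufeff", "\u2060", "\u00a0", "\u202f"]


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def glue(left: str, right: str, joiner: str = "\u2060") -> str:
    """Join two strings with a non-breaking joiner (WORD JOINER by default)."""
    return f"{left}{joiner}{right}"


for ch in CANDIDATES:
    print(describe(ch))

# A piece fraction whose parts should stay on one line.
fraction = glue(glue("1", "/"), "2")
print([f"U+{ord(c):04X}" for c in fraction])
```

Whether the joiner is actually invisible and non-breaking on screen depends on the renderer and the fonts involved, which is the subject of this bug.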
OK. I don't get that weird ZWNBS box in the UTF8 case.  Sounds like glyph
substitution is kicking in there, but not for numeric entities (and the fonts
involved have bogus chars at that codepoint).
OS: Windows XP → All
Hardware: PC → All
Keywords: intl
QA Contact: ruixu → ylong
rendering issue. over to shanjian
Assignee: yokoyama → shanjian
I didn't see any problem on Windows, except that in the 2nd testcase the zero
width word joiner is displayed as '?'. In my understanding, we don't do any
special processing for those characters. If the font contains an incorrect glyph,
it will be displayed incorrectly. But I don't understand the inconsistency
between UTF-8 and NCR. I need to try this on Linux.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
and you marked this fixed to remind you to try it?  ;)

I can post a screenshot of Linux display if you care.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Our UTF-8 converter strips U+FEFF from the input because it assumes it is
intended as a BOM. This is *exactly* why using ZERO WIDTH NO-BREAK SPACE is not
recommended, as in the reference cited in comment 9.
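The effect described in this comment can be sketched in Python as follows. The function `decode_stripping_all_feff` is a hypothetical stand-in for a converter that treats every U+FEFF as a BOM; it is not the actual Mozilla converter, whose code is not shown in this thread.

```python
def decode_stripping_all_feff(data: bytes) -> str:
    """Hypothetical converter: decode UTF-8, then drop every U+FEFF."""
    return data.decode("utf-8").replace("\ufeff", "")


# A leading BOM followed by two in-text zero width no-break spaces.
source = "\ufeff1\ufeff/\ufeff2".encode("utf-8")

decoded = decode_stripping_all_feff(source)
print(repr(decoded))  # the in-text joiners are gone along with the BOM
```

With such a converter, raw UTF-8 input loses its U+FEFF characters before layout ever sees them, while a numeric entity is expanded later and survives.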
So this is done on purpose. Marking as WONTFIX
Status: REOPENED → RESOLVED
Closed: 19 years ago → 19 years ago
Resolution: --- → WONTFIX
Um.  No one has said this is done on purpose.  If the U+FEFF is stripped, why is
anything at all showing in that spot?

Please do not wontfix bugs unless you are the owner or a peer for the module or
the maintainer of the code in question, ok?
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Stripping the U+FEFF is indeed done on purpose, but that doesn't mean it's right
:-) My first thought was that we should perhaps only do the stripping when it's
the first character in a stream, assuming that it wouldn't make sense as a BOM
anywhere else. I'm not sure how safe that assumption is in the first place, and
the converter doesn't have any knowledge about position in streams, which makes
the idea impractical. All that really belongs in another bug, and the real
problem here should be fixed by bug 205387.
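To make the "only strip it at the start of the stream" idea concrete, here is a small Python sketch. It relies on the standard library's incremental `utf-8-sig` decoder, which does keep track of stream position; that is an assumption of this example and exactly the state the comment says the Mozilla converter lacks. The function name `decode_stream` is illustrative.

```python
import codecs


def decode_stream(chunks):
    """Decode UTF-8 chunks, dropping U+FEFF only at the very start."""
    decoder = codecs.getincrementaldecoder("utf-8-sig")()
    parts = [decoder.decode(chunk) for chunk in chunks]
    parts.append(decoder.decode(b"", final=True))  # flush buffered bytes
    return "".join(parts)


# The BOM is split across the first two chunks; later U+FEFFs are in-text.
chunks = [b"\xef\xbb", b"\xbf1\xef\xbb\xbf/", b"\xef\xbb\xbf2"]

text = decode_stream(chunks)
print([f"U+{ord(c):04X}" for c in text])
```

The decoder object carries the "have I seen the start yet" state between calls, so the leading BOM is removed even when it straddles a chunk boundary, and later occurrences are preserved.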

Boris, I can't reconcile your question "If the U+FEFF is stripped, why is
anything at all showing in that spot?" with your earlier comment "I don't get
that weird ZWNBS box in the UTF8 case."
Depends on: 205387
Oh, sorry.  I'd misread comment 0 -- the dot shows precisely when the U+FEFF is
not present in the input in raw form, hence not stripped...
On Windows, word joiner is rendered as invisible thanks to fixes for bug 221024
and bug 205387. On other platforms, it's not yet fixed. BTW, is this also about
line breaking around word joiner, ZWNBS, etc? That should be a separate bug if
one hasn't been filed yet.
shanjian has not been working on Mozilla for 2 years and these bugs are still
here. Marking them WONTFIX. If you want to reopen it, find a good owner first.
Status: REOPENED → RESOLVED
Closed: 19 years ago → 17 years ago
Resolution: --- → WONTFIX
Mass re-open of Frank Tang's WONTFIX debacle. Spam is his responsibility, not my own.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Mass re-assigning Frank Tang's old bugs that he closed as WONTFIX and that had to be
reopened. Spam is his fault, not my own.
Assignee: shanjian → nobody
Status: REOPENED → NEW
Assignee: nobody → smontagu
QA Contact: amyy → i18n
I believe this should be closed as fixed. The attachments look fine to me with Firefox 3.6, 5 and 8 on Windows and 3.6 and 8 on Ubuntu.

(IE 9 has some issues, but that's not our problem.)
WORKSFORME per comment 23
Status: NEW → RESOLVED
Closed: 17 years ago → 11 years ago
Resolution: --- → WORKSFORME