Closed Bug 192636 Opened 22 years ago Closed 3 years ago

Map *-Latn languages to Latin script (ISO 15924 script codes)

Status:

RESOLVED WORKSFORME

People

(Reporter: choess, Assigned: smontagu)

References

(Depends on 1 open bug, URL)

Details

(Keywords: testcase, Whiteboard: [bcp47])

Attachments

(3 files, 1 obsolete file)

Testcase to trigger the behavior 22 years ago Christopher Hoess (gone) 203 bytes, text/html		Details
Testcase with ja-Latn 21 years ago Christopher Hoess (gone) 208 bytes, text/html		Details
A screenshot showing -Latn text 9 years ago Derk-Jan Hartman 124.03 KB, image/png		Details
same list of -Latn in 2021 3 years ago Derk-Jan Hartman 217.90 KB, image/png		Details

Christopher Hoess (gone)

Reporter

Description

•

22 years ago

Pages entirely in us-ascii with, e.g., <span lang="ja">manga</span> will result
in a font download dialog for Japanese fonts if they are not installed. See the
post referenced in the URL field and related material linked therefrom. We
should not try to download fonts unless the page's character encoding actually
exceeds the available font repertoire, right?

Christopher Hoess (gone)

Reporter

Comment 1

•

22 years ago

Attached file Testcase to trigger the behavior (obsolete) — Details

Christopher Hoess (gone)

Reporter

Updated

•

22 years ago

Keywords: testcase

Yuying Long

Comment 2

•

22 years ago

reassign to Frank.

Assignee: smontagu → ftang

Frank Tang

Comment 3

•

22 years ago

>We should not try to download fonts unless the page's character encoding actually
>exceeds the available font repertoire, right?

Why not? Invalid bug. It does not really cause a problem. It is by design, to
improve text layout performance so that we don't need to do per-character checking.

Status: NEW → RESOLVED

Closed: 22 years ago

Resolution: --- → INVALID

Yuying Long

Comment 4

•

22 years ago

Mark as verified per previous comment.

Status: RESOLVED → VERIFIED

Dan Tobias

Comment 5

•

22 years ago

Fonts really should be associated with character repertoires, not languages...
these are two very different things.  As the original reporter noted, lang="ja"
denotes Japanese language, not any particular writing system, and can be used on
a Romanized representation of something in Japanese (e.g., "manga").  Assuming
that a Japanese font is needed in the absence of encountering a specific
character that requires such a font is not very reasonable.
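To illustrate the repertoire-based check argued for above, here is a minimal sketch in Python. It is not how the layout code works; the function name and the (deliberately incomplete) list of Unicode blocks are assumptions made for this example only.

```python
# Hypothetical sketch: decide whether a run of text contains any character
# that actually needs a Japanese font, instead of trusting lang="ja".
# The block list is an illustrative subset, not a complete repertoire.
JAPANESE_RANGES = (
    (0x3040, 0x309F),  # Hiragana
    (0x30A0, 0x30FF),  # Katakana
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
)


def needs_japanese_font(text: str) -> bool:
    """Return True if any character falls inside one of the listed blocks."""
    return any(lo <= ord(ch) <= hi for ch in text for lo, hi in JAPANESE_RANGES)
```

With a check like this, the romanized "manga" from the testcase would not trigger a font download, while the same word written in kana or kanji would. The cost is the per-character scan that comment 3 wanted to avoid.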

Ben Meadowcroft

Comment 6

•

22 years ago

Just a quick note to second comment #5.
Typefaces relate to a character repertoire, not a specific language. This bug
also specifically affects the accessibility of pages that use the lang attribute
to aid aural browsers (such as IBM's Home Page Reader).

By prompting the typical user to download large font sets that they don't
need, you are actively discouraging authors from using the lang attribute. This
in turn hurts accessibility, because language information is no longer available
to aural browsers. These browsers need to know the language in order to
pronounce a word correctly. For example, a word marked up as "anglicised"
Japanese, e.g. konnichiwa, will then be pronounced according to Japanese
pronunciation rules rather than English rules. This can make a huge
difference to the comprehension of the spoken text.

I think this bug should be reopened; retaining the status quo has a strong
negative effect on the usefulness of the lang attribute.

Christopher Hoess (gone)

Reporter

Comment 7

•

21 years ago

smontagu noticed that language codes like ja-Latn are registered to indicate
Japanese written in Latin script. Page authors should mark up such text as <span
lang="ja-Latn">manga</span> and we should display it in the font selected for
Western, rather than Japanese. Reopening.

URL: http://groups.google.com/groups?selm=... → http://www.evertype.com/standards/iso...

Status: VERIFIED → REOPENED

Resolution: INVALID → ---

Summary: Font download dialog appears unnecessarily → Map *-Latn languages to Western script

Christopher Hoess (gone)

Reporter

Comment 8

•

21 years ago

handing to smontagu

Assignee: ftang → smontagu

Status: REOPENED → NEW

Christopher Hoess (gone)

Reporter

Comment 9

•

21 years ago

Attached file Testcase with ja-Latn — Details

Attachment #114061 - Attachment is obsolete: true

Simon Montagu :smontagu

Assignee

Comment 10

•

21 years ago

It seems the *-Latn convention is still controversial. I am still reading through
the archives at http://eikenes.alvestrand.no/pipermail/ietf-languages/ to find out
whether there is a consensus and, if so, what it is.

Henri Sivonen (:hsivonen)

Comment 11

•

21 years ago

ja-Latn is not listed at http://www.iana.org/assignments/lang-tags/

Simon Montagu :smontagu

Assignee

Comment 12

•

21 years ago

No, it isn't, but it could be :-). Currently registered are:

az-Latn Azerbaijani in Latin script
sr-Latn Serbian in Latin script
uz-Latn Uzbek in Latin script
yi-latn Yiddish, in Latin script

It seems possible that a future revision of RFC 3066 will formalize the use of
ISO 15924 script tags as part of language identifiers, making any other
combination, e.g. ja-Latn, valid without special registration at IANA.
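As a rough illustration of what treating the script as a generic part of the language identifier would involve, here is a minimal Python sketch. It assumes the subtag layout later formalized for language tags (primary language, optional 3-letter extended language subtags, then an optional 4-letter script subtag). The function name is hypothetical, and grandfathered and private-use tags are not handled.

```python
from typing import Optional


def script_subtag(tag: str) -> Optional[str]:
    """Return the script subtag of a language tag in title case, or None.

    Simplified sketch: only the language, extended-language and script
    positions are examined; everything after that is ignored.
    """
    subtags = tag.split("-")
    primary = subtags[0]
    # A primary language subtag is 2 to 8 ASCII letters; "x-..." is rejected.
    if not (2 <= len(primary) <= 8 and primary.isascii() and primary.isalpha()):
        return None
    i = 1
    # Skip up to three 3-letter extended language subtags, e.g. "zh-yue".
    while (i < len(subtags) and i <= 3 and len(subtags[i]) == 3
           and subtags[i].isascii() and subtags[i].isalpha()):
        i += 1
    # A script subtag is exactly four ASCII letters, matched case-insensitively.
    if (i < len(subtags) and len(subtags[i]) == 4
            and subtags[i].isascii() and subtags[i].isalpha()):
        return subtags[i].title()
    return None
```

Because the comparison is case-insensitive, the registered "yi-latn" and an unregistered "ja-Latn" both yield "Latn", while plain "ja" or "en-US" yield no script at all.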

Simon Montagu :smontagu

Assignee

Comment 13

•

18 years ago

(In reply to comment #12)
> It seems possible that a future revision of RFC 3066 will formalize the use of
> ISO 15924 script tags as part of language identifiers, making any other
> combination, e.g. ja-Latn, valid without special registration at IANA.

This is now RFCs 4646 and 4647

Simon Montagu :smontagu

Assignee

Updated

•

16 years ago

Summary: Map *-Latn languages to Western script → Map *-Latn languages to Western script (ISO 15924 script codes)

Shreevatsa R

Comment 15

•

16 years ago

It's marked as "Platform: x86 Windows 2000", but I see this bug on Mac OS X as well -- I suspect it's present on all platforms.

Here's a testcase: <span>normal</span> <span lang="en-Latn">en-Latn</span> <span lang="sa-Latn">sa-Latn</span> <span lang="ar-Latn">ar-Latn</span> <span lang="el-Latn">el-Latn</span> <span lang="ru-Latn">ru-Latn</span>. I see at least three fonts there.

Karl Tomlinson (:karlt)

Updated

•

16 years ago

OS: Windows 2000 → All

Phil Ringnalda (:philor)

Updated

•

15 years ago

QA Contact: amyy → i18n

Derk-Jan Hartman

Comment 16

•

15 years ago

This should really be fixed. This issue is a problem on Wikipedia at the moment, where we are left with the choice of either forcing a Latin-compatible font on the user or removing lang tags from transliterated text. This is rather suboptimal.

Derk-Jan Hartman

Comment 17

•

14 years ago

Due to the apparent lack of progress on this issue in the past 7 years, and an increasing number of complaints from readers, I have disabled the generation of lang= attributes for transliterated text in Wikipedia.

http://en.wikipedia.org/w/index.php?title=Template%3ATransl&action=historysubmit&diff=377349407&oldid=242769116

Gordon P. Hemsley [:GPHemsley]

Comment 18

•

13 years ago

(In reply to comment #17)
> Due to apparent lack of progress on this issue in the past 7 years, and an
> increasing amount of complaints from readers, I have disabled the generation
> of lang= attributes for transliterated text in Wikipedia.
> 
> http://en.wikipedia.org/w/index.php?title=Template%3ATransl&action=historysubmit&diff=377349407&oldid=242769116

This is now on our radar in our plans to implement BCP 47.

Depends on: 556237

Whiteboard: [bcp47]

Gordon P. Hemsley [:GPHemsley]

Updated

•

11 years ago

Status: NEW → RESOLVED

Closed: 22 years ago → 11 years ago

Resolution: --- → DUPLICATE

Gordon P. Hemsley [:GPHemsley]

Updated

•

11 years ago

No longer depends on: 556237

Simon Montagu :smontagu

Assignee

Comment 20

•

10 years ago

Unduping from bug 756022: the fix for that bug was narrower in scope than this one and didn't address the issue of script subtags in language tags in content.

Status: RESOLVED → REOPENED

Resolution: DUPLICATE → ---

Simon Montagu :smontagu

Assignee

Updated

•

10 years ago

Depends on: 556237

Gordon P. Hemsley [:GPHemsley]

Updated

•

10 years ago

Status: REOPENED → NEW

Summary: Map *-Latn languages to Western script (ISO 15924 script codes) → Map *-Latn languages to Latin script (ISO 15924 script codes)

Derk-Jan Hartman

Comment 21

•

9 years ago

Attached image A screenshot showing -Latn text — Details

I've attached a screenshot of how -Latn language text currently renders on FF 40.0.3, Mac OS X 10.10.5.

The line height and differing font usage are clearly still problematic.

S. McCandlish

Comment 22

•

9 years ago

I'm skeptical that Bug 556237 actually blocks this; it's about a whole new system for treating language and font negotiation.  Fixing the problem reported here in 192636 does not require such a system.  For anything tagged *-Latn, just a) use the current font, if it is a Latin font; or b) use the default Latin font, if the current font is something else (Chinese, etc.). The end.

If and when 556237's bigger-better-faster idea is implemented ("don't think it should be a priority", they say, and depends in turn on at least three other bugs), in 10 years or whatever, it can supersede what we're resolving here.  But this problem should be resolved now, not later.
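The two-step rule proposed above can be sketched in a few lines of Python. This is only an illustration of the suggestion, not the fix that eventually landed; the function name, its parameters and the font names in the usage note are all hypothetical, and the *-Latn test is a simple subtag match rather than a full language-tag parser.

```python
from typing import Set


def font_for_latn_text(lang_tag: str, current_font: str,
                       latin_fonts: Set[str], default_latin_font: str) -> str:
    """Pick a font for an element according to the *-Latn rule above."""
    subtags = lang_tag.lower().split("-")
    # Simplification: any "latn" subtag after the primary language counts.
    if "latn" not in subtags[1:]:
        return current_font  # not tagged *-Latn, leave selection unchanged
    # a) keep the current font if it is already a Latin font
    if current_font in latin_fonts:
        return current_font
    # b) otherwise fall back to the default Latin font
    return default_latin_font
```

For example, with `latin_fonts={"Times"}` and `default_latin_font="Times"`, text tagged ja-Latn whose current font is a Japanese face would switch to "Times", while text tagged plain ja would keep its current font.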

Andrei Purice

Comment 23

•

3 years ago

Hey Derk,
Can you still reproduce this issue or should we close it?

Flags: needinfo?(hartman.wiki)

Derk-Jan Hartman

Comment 24

•

3 years ago

Attached image same list of -Latn in 2021 — Details

This seems fixed, I have no idea how many years ago ;)
https://people.well.com/user/mech/temp/WP/xx-Latn_test.html

Flags: needinfo?(hartman.wiki)

Andrei Purice

Comment 25

•

3 years ago

Marking this as RESOLVED > WORKSFORME based on the last comment.

Status: NEW → RESOLVED

Closed: 11 years ago → 3 years ago

Resolution: --- → WORKSFORME
