Closed Bug 1751940 Opened 2 years ago Closed 2 years ago

q element produces incorrect quotation marks when language changes

Categories

(Core :: Layout: Text and Fonts, defect)

Firefox 96
defect

RESOLVED FIXED
98 Branch
Tracking Status
firefox98 --- fixed

People

(Reporter: ishida, Assigned: jfkthame)


User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:96.0) Gecko/20100101 Firefox/96.0

Steps to reproduce:

This issue is common across all languages that use the q element.

When an English page contains a quotation in another language, the quotation marks used around that quotation (and inside it for embedded quotes) should be the English ones – not those of the language of the quotation. The same applies for other languages.

Currently, if the language of the quotation is declared with a lang attribute on the q tag, browsers instead set the quotation marks based on the language of the quote.

Quotations work fine in a sentence that is all in the same language. For example, the markup for this Georgian text:

<span lang="ka">ერთი <q>ორი <q>სამი</q></q></span>

will result in:

ერთი „ორი «სამი»“

However, if the quote is in English and lang="en" is added to the first q tag, the result becomes:

ერთი “two ‘three’”

whereas it should be:

ერთი „two «three»“

Specs:
This incorrect behaviour was initially dictated by the HTML specification. Issue 3636 (https://github.com/whatwg/html/issues/3636#issuecomment-384340053) was raised to change the spec. In the end, the entire section was removed from the HTML spec, and HTML now relies on CSS for this behaviour.

css-content (https://drafts.csswg.org/css-content/#quotes-property) says: "If a quotation is in a different language than the surrounding text, it is customary to quote the text with the quote marks of the language of the surrounding text, not the language of the quotation itself." However, that text is non-normative.

Issue 5478 (https://github.com/w3c/csswg-drafts/issues/5478) requests that this be made normative, and has been agreed by the CSS WG.

Actual results:

Tests & results:
Interactive test: when an embedded quote is in a different language, the quotation marks should be those of the main body, even if the language of the quote is declared using a lang attribute.
https://github.com/w3c/character_phrase_tests/issues/4
Gecko: ❌ If a lang attribute is used, the quotation marks are those associated with the quotation rather than those associated with the surrounding text. Browser: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:94.0) Gecko/20100101 Firefox/94.0
Blink: ❌ Same. Browser: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36
Webkit: ❌ Same. Browser: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.1 Safari/605.1.15

i18n test suite, Multilingual nesting.
https://w3c.github.io/i18n-tests/results/the-q-element.html#ml_nesting

Expected results:

Please use the quote marks of the main paragraph text, rather than change the quote marks to match the embedded quotation.

(This issue is being tracked at https://www.w3.org/TR/geor-gap/#issue22_quotations)

It should be a one/two-line change here, plus tests.

Status: UNCONFIRMED → NEW
Ever confirmed: true

(In reply to Richard Ishida from comment #0)

Issue 5478 (https://github.com/w3c/csswg-drafts/issues/5478) requests that this be made normative, and has been agreed by the CSS WG.

Reviewing that issue, there are actually two specific resolutions:

RESOLVED: auto value of quote to be based on parent language
RESOLVED: Add match-parent keyword

The first of these seems fairly straightforward: instead of checking the language of the actual quote element, we should check the language of its parent.

This will address an example such as:

<p lang=en><q>Hello</q> in Japanese is <q lang=ja>こんにちは</q>.</p>

which Firefox currently renders:

“Hello” in Japanese is 「こんにちは」.

but with that change, should become:

“Hello” in Japanese is “こんにちは”.
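
To make the difference between the two rules concrete, here is a minimal Python model of the example above. It is only a sketch, not Gecko's code: the Node class, the QUOTES table, and the render and effective_lang helpers are all hypothetical names invented for this illustration, and the quote-mark table is hand-written rather than taken from Gecko's data.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union

# Illustrative table: (open, close, nested open, nested close).
# Hand-written for this sketch; it is not Gecko's actual data.
QUOTES = {
    "en": ("“", "”", "‘", "’"),
    "ja": ("「", "」", "『", "』"),
}


@dataclass
class Node:
    tag: str
    lang: Optional[str] = None  # explicit lang attribute, if any
    children: List[Union["Node", str]] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)

    def __post_init__(self) -> None:
        # Link element children back to this node.
        for child in self.children:
            if isinstance(child, Node):
                child.parent = self


def effective_lang(node: Optional[Node]) -> str:
    """Return the nearest lang attribute on node or one of its ancestors."""
    while node is not None:
        if node.lang:
            return node.lang
        node = node.parent
    return "en"  # fallback chosen arbitrarily for this sketch


def render(node: Node, use_parent_lang: bool, depth: int = 0) -> str:
    """Serialize node, wrapping each q in marks picked by the chosen rule."""
    is_q = node.tag == "q"
    inner_depth = depth + 1 if is_q else depth
    body = "".join(
        child if isinstance(child, str)
        else render(child, use_parent_lang, inner_depth)
        for child in node.children
    )
    if not is_q:
        return body
    # Old rule: language of the q itself. New rule: language of its parent.
    source = node.parent if use_parent_lang else node
    marks = QUOTES[effective_lang(source)]
    first = 0 if depth == 0 else 2  # outermost pair vs nested pair
    return marks[first] + body + marks[first + 1]


paragraph = Node("p", "en", [
    Node("q", None, ["Hello"]),
    " in Japanese is ",
    Node("q", "ja", ["こんにちは"]),
    ".",
])
print(render(paragraph, use_parent_lang=False))  # rule before the change
print(render(paragraph, use_parent_lang=True))   # rule after the change
```

The only difference between the two calls is which node's language is looked up, which is why the change is expected to be small.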

However, this by itself will not necessarily cause nested quotes to follow the quotation-mark convention of the paragraph; they will respect the language of their parent, which may itself differ from the top-level language. So if we have:

  <p lang=en>He said,
    <q lang=fr>Je crois que
      <q>bonjour</q>
      en japonais c’est
      <q lang=ja>こんにちは</q>
    </q>.
  </p>

we'll get English quotation marks around the top-level <q>, based on the paragraph language, but the nested quotes (both with and without their own lang attribute) will get French quotation marks. (Currently, we get French and Japanese, but that's what the resolution says should change.)
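
The outcome described above can be checked with a small Python sketch. It assumes each q is described by its chain of ancestors, and every name in it (Element, QUOTES, lang_in_effect, auto_marks) is hypothetical; the quote marks are illustrative values, not Gecko's actual table.

```python
from typing import List, Optional, Tuple

Element = Tuple[str, Optional[str]]  # (tag name, lang attribute or None)

# Illustrative marks only: (open, close, nested open, nested close).
QUOTES = {
    "en": ("“", "”", "‘", "’"),
    "fr": ("«", "»", "«", "»"),
    "ja": ("「", "」", "『", "』"),
}


def lang_in_effect(chain: List[Element]) -> str:
    """Language at the last element of chain (nearest lang attribute wins)."""
    for _tag, lang in reversed(chain):
        if lang:
            return lang
    return "en"  # fallback chosen arbitrarily for this sketch


def auto_marks(chain: List[Element]) -> Tuple[str, str]:
    """Marks for the q at the end of chain under the parent-language rule.

    chain lists the elements from the root down to the q, inclusive.
    """
    ancestors = chain[:-1]
    pair = QUOTES[lang_in_effect(ancestors)]
    depth = sum(1 for tag, _lang in ancestors if tag == "q")
    return pair[:2] if depth == 0 else pair[2:]


outer_q = [("p", "en"), ("q", "fr")]
bonjour_q = outer_q + [("q", None)]
konnichiwa_q = outer_q + [("q", "ja")]

for name, chain in [("outer", outer_q),
                    ("bonjour", bonjour_q),
                    ("こんにちは", konnichiwa_q)]:
    print(name, *auto_marks(chain))
```

In this model the outer q takes the English pair from the paragraph, while both nested q elements take French marks from their French parent, regardless of the lang=ja on the innermost one.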

Some of the discussion in the issue seems to be suggesting that all the quotes in this example should get English-style marks, based on the paragraph language, but that's not my understanding of what the CSS WG resolved, and I'm not sure it's either desirable or realistically implementable for more general cases; it could become quite unclear what should be considered the "main" language, if it is something other than the language of the immediate parent of the quote.

The WG's second resolution, to add a match-parent value, is intended to let authors address this by explicitly saying that nested quotation marks should follow the same convention as outer ones, but it is not (AIUI) expected as the auto behavior.
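
As a rough illustration of how the two values might differ, the following Python sketch assumes that match-parent means "reuse whatever quote convention the parent element ended up with", while auto means "use the language of the parent". That reading is an assumption based on the resolution text, not on any implementation, and the names Element, lang_in_effect, quote_lang, and example are hypothetical.

```python
from typing import List, Optional, Tuple

# (tag name, lang attribute or None, quotes value: "auto" or "match-parent")
Element = Tuple[str, Optional[str], str]


def lang_in_effect(chain: List[Element]) -> str:
    """Language at the last element of chain (nearest lang attribute wins)."""
    for _tag, lang, _mode in reversed(chain):
        if lang:
            return lang
    return "en"  # fallback chosen arbitrarily for this sketch


def quote_lang(chain: List[Element]) -> str:
    """Language whose quote convention the last element of chain uses."""
    *ancestors, (_tag, _lang, mode) = chain
    if not ancestors:
        return lang_in_effect(chain)  # root: nothing to inherit from
    if mode == "match-parent":
        return quote_lang(ancestors)  # assumed: reuse the parent's convention
    return lang_in_effect(ancestors)  # auto: language of the parent


def example(mode: str) -> Tuple[str, str]:
    """Quote languages for the outer and nested q with the given quotes value."""
    doc = [("html", "en", "auto"), ("p", "en", "auto")]
    outer = doc + [("q", "fr", mode)]
    nested = outer + [("q", "ja", mode)]
    return quote_lang(outer), quote_lang(nested)


print(example("auto"))          # parent-language rule at every level
print(example("match-parent"))  # convention propagates from the paragraph
```

Under these assumptions, auto gives English marks on the outer quote and French marks on the nested one, whereas match-parent on both q elements carries the English convention down to every level.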

(In reply to Emilio Cobos Álvarez (:emilio) from comment #1)

It should be a one/two-line change here, plus tests.

I pushed a patch for

RESOLVED: auto value of quote to be based on parent language

to tryserver, to see how it looks -- I suspect there may be existing tests that will be affected and will need an update.

I did some research a while ago about what to do for nested quotations, and the conclusion was that all the quotes in your three-language example should indeed be in English. The exception is when the author wants to retain the quote marks of the original quotation, which is unusual; in that case they would need to add the marks manually, and such a use case is probably not a good candidate for the q element anyway.

I think the main language is whatever is outside the first q element, and I'm not sure that would be too hard to detect. I don't know whether it's helpful, but since this was originally in the HTML5 spec, which used CSS rules to indicate what should happen, Fantasai tried to come up with some (rather convoluted) CSS rule to make this work. See https://github.com/whatwg/html/issues/3636#issuecomment-460843172. Maybe match-parent would make that simpler.

It's unfortunate to have to do a full tree-walk up potentially to the root for this.
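
For reference, the kind of walk this would involve might look like the Python sketch below. It assumes the "main language" is the language in effect just outside the outermost q ancestor; the Element class and the lang_of and main_lang helpers are hypothetical names for this illustration, not Gecko APIs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    tag: str
    lang: Optional[str] = None  # explicit lang attribute, if any
    parent: Optional["Element"] = None


def lang_of(el: Optional[Element]) -> str:
    """Nearest lang attribute on el or one of its ancestors."""
    while el is not None:
        if el.lang:
            return el.lang
        el = el.parent
    return "en"  # fallback chosen arbitrarily for this sketch


def main_lang(q: Element) -> str:
    """Language just outside the outermost q that contains q (or q itself)."""
    outermost = q
    node = q.parent
    while node is not None:  # has to continue all the way to the root
        if node.tag == "q":
            outermost = node
        node = node.parent
    return lang_of(outermost.parent)


p = Element("p", "en")
outer = Element("q", "fr", p)
inner = Element("q", "ja", outer)
print(main_lang(outer), main_lang(inner))
```

The loop cannot stop at the first q it meets, because a further q ancestor may sit higher up, so the cost grows with the depth of the element in the tree.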

I wonder about other cases, too. What about a quote in an embedded iframe, or an HTML document embedded in an SVG image that is used inside a quote? I guess quote-mark behavior should probably reset at such boundaries; but not sure whether the spec currently makes that clear.

Then how about out-of-flow elements that are still within the same document; should the quote marks in an abs-pos or float block depend on the language of a "surrounding" quote in the main flow? It's not at all clear to me that would be desirable. (For that matter, should the nesting level be reset in such cases? I'm not entirely sure...)

This implements the first resolution from https://github.com/w3c/csswg-drafts/issues/5478,
and makes simple cases of quoting a foreign-language snippet work as desired.

Still to do: the match-parent value (second resolution in that issue), required for full
support of nested mixed-language quotes if the author wants the conventions of the
outermost language to propagate down to all nested levels.

Assignee: nobody → jfkthame
Status: NEW → ASSIGNED

I filed a separate bug 1752382 for adding the match-parent keyword.

See Also: → 1752382
Pushed by jkew@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e4a77ca754ff
Language-dependent quote marks generated with quotes:auto should be based on the lang of the parent. r=layout-reviewers,emilio
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/32577 for changes under testing/web-platform/tests
Upstream PR was closed without merging
Pushed by jkew@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/71b988291cce
Language-dependent quote marks generated with quotes:auto should be based on the lang of the parent. r=layout-reviewers,emilio
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 98 Branch
Upstream PR merged by moz-wptsync-bot
Regressions: 1752649
Flags: needinfo?(jfkthame)