Last Comment Bug 234485 - xml:lang is not used for font selection or hyphenation
: xml:lang is not used for font selection or hyphenation
Status: RESOLVED FIXED
: intl
Product: Core
Classification: Components
Component: CSS Parsing and Computation (show other bugs)
: Trunk
: All All
: -- normal with 9 votes (vote)
: mozilla24
Assigned To: David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch)
:
Mentors:
: 41978 (view as bug list)
Depends on:
Blocks: 293511 validate-math-pages
  Show dependency treegraph
 
Reported: 2004-02-15 22:45 PST by Mike Hommey [:glandium]
Modified: 2013-06-11 22:17 PDT (History)
24 users (show)
ryanvm: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
xhtml showing everything (602 bytes, application/xhtml+xml)
2004-02-15 22:50 PST, Mike Hommey [:glandium]
no flags Details
result of attachment #141512. (13.18 KB, image/png)
2004-02-15 22:53 PST, Mike Hommey [:glandium]
no flags Details
Map xml:lang attribute into style so that it's used for font selection and hyphenation. (25.18 KB, patch)
2013-05-29 10:22 PDT, David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch)
bzbarsky: review+
Details | Diff | Splinter Review
testcase used to verify rule uniqueness is working (440 bytes, application/xhtml+xml)
2013-05-29 10:22 PDT, David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch)
no flags Details
patch (on top of above patch) used to verify rule uniqueness is working correctly (756 bytes, patch)
2013-05-29 10:25 PDT, David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch)
no flags Details | Diff | Splinter Review

Description Mike Hommey [:glandium] 2004-02-15 22:45:49 PST
User-Agent:       
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040210 Firefox/0.8

In xhtml 1.1, the lang attribute has been deprecated. Therefore, using xml:lang
is the normal way of telling which language is used in an element.
Even if xml:lang is recognized on the styling part (:lang() selector works
properly), the fonts used for rendering are still those for the default lang.

Reproducible: Always
Steps to Reproduce:
1.
2.
3.
Comment 1 Mike Hommey [:glandium] 2004-02-15 22:50:17 PST
Created attachment 141512 [details]
xhtml showing everything

Note that if the content-type given is text/html, xml:lang is not "recognized"
(:lang() stops working for xml:lang'ed elements)
Also note that as the lang attribute is deprecated, it should not be there,
it's only there for demonstration purpose.
I'm also attaching an image showing the rendering result.
Comment 2 Mike Hommey [:glandium] 2004-02-15 22:53:29 PST
Created attachment 141513 [details]
result of attachment #141512 [details].

The fonts strangeness comes from the fact that my Western font contains
hiragana and katakana, but not kanjis, which means that when rendering kanjis,
the font rendering engines fallbacks to an other font, which doesn't render in
the same way, therefore creating crappy display.
In the correctly rendered paragraph, all is rendered using the Japanese font.
Comment 3 Boris Zbarsky [:bz] 2004-02-15 23:47:34 PST
No relationship to view rendering (that would be painting).

The lang attr is mapped into the pseudo-style "mLang" thingie at
http://lxr.mozilla.org/seamonkey/source/content/html/content/src/nsGenericHTMLElement.cpp#2915

We could modify this to also map xml:lang, but we should really be doing that
for all elements, not just HTML ones....  And we don't really want to duplicate
the code in nsCSSStyleSheet too much.  Maybe we could push up GetLang to
nsIContent (it's already on nsIDOMHTMLElement, but....) or something?
Comment 4 Jonas Sicking (:sicking) No longer reading bugmail consistently 2004-02-16 10:51:54 PST
Putting it on nsIContent sounds like a good solution to me (though maybe we
should call it something like GetContentLang to avoid 'hides' warnings). At
least if the stylesystem would be able to use that without a performance hit.
Comment 5 Jungshik Shin 2004-02-16 11:15:23 PST
I though we fixed this in bug 35768(bug 91190 comment 13 and
bug 35768 - attachment 95593 [details]), but apparently we didn't as far as the font
selection is concerned.
  
Comment 6 Boris Zbarsky [:bz] 2004-02-16 11:22:16 PST
The real problem is that bug 115121 was marked duplicate incorrectly.

sicking, I don't really see how putting it on nsIContent will slow things down
much (I guess it's an extra virtual function call, but compared to the actual
work that needs doing -- getting attrs and all -- that seems minor).
Comment 7 Jonas Sicking (:sicking) No longer reading bugmail consistently 2004-02-16 11:29:08 PST
bz, I suspected so, but i thought i'd ask. I look forward for a patch ;-)
Comment 8 Boris Zbarsky [:bz] 2004-02-16 11:43:20 PST
Not likely to come from here any time soon, unfortunately. Should be a matter of
implementing GetContentLanguage by copying the CSS code, then making CSS use it
and making attr mapping use it too.

Would need to do attr mapping on nsGenericElement for that, of course...
Comment 9 rbs 2004-02-16 16:05:55 PST
This bug is confusing at first read. I had to go back to the earlier bugs to
clear things up in my mind. Some extra comments to clarify it a bit from the
font selection pespective:

The font sub-system gets the language from nsFrame.cpp's SetFontFromStyle().
http://lxr.mozilla.org/seamonkey/source/layout/html/base/src/nsFrame.cpp#401

Thus the bug also means that nsStyleVisibility::mLanguage is not reliable. It is
only half of the equation. It does not walk up the tree to find the language.
|SetFontFromStyle| seemed to assume that the language would have been fully
resolved already in nsRuleNode, but that's not the case.

The style system's nsCSSStyleSheet has its own way of getting/processing the
lang. It is quite comprehensive. It also has the advantage that it is lazy (no
extra work unless the :lang() pseudo is encountered).

Apart for adding a nsGenericElement::GetContentLang [along the same lines as
nsGenericElement::GetBaseURI which is already there for xml:base], another way
to fix the bug might be in |SetFontFromStyle|.

But probably the fix is somewhere between the two. That is
1) add nsGenericElement::GetContentLang 
2) do the genuflexions for the attr mapping of xml:lang and the HTML's lang in
nsGenericElement, to honor the precedence.
3) make |SetFontFromStyle| do the right thing.
Comment 10 Boris Zbarsky [:bz] 2004-02-16 16:19:18 PST
There's more involved than just SetFontFromStyle -- see
http://lxr.mozilla.org/seamonkey/source/layout/html/base/src/nsHTMLReflowState.cpp#2153
if nothing else.

I'd say the first thing we need here is a clear description of what exactly this
code is trying to achieve... First off, if we add this GetLanguageGroup method,
why store this stuff in the style system?  SetFontFromStyle and the reflow state
could just call the method on the frame's mContent directly, no?
Comment 11 rbs 2004-02-16 16:46:23 PST
I think the original reason why |SetFontFromStyle| was made is because people
kept ommitting to set the language when passing a font to GFX.

>First off, if we add this GetLanguageGroup method, why store this stuff in the
>style system?

It doesn't have to be saved in the style system, if that is deemed unnecessary.

>SetFontFromStyle and the reflow state could just call the method on the frame's
>mContent directly, no?

Sure. That's what I meant by 'do the right thing' -- whatever is appropriate
after getting the basics.
Comment 12 Jungshik Shin 2004-02-16 18:59:35 PST
Probably, it also has to be taken into account that 'Content-Language' (http
header and meta declaration) can set the 'overall' language(s) of a document.
Besides, there's been a talk of adding 'View | Set Language' (for the manual
overriding of the language of a document currently in view) although we may not
do that in the near future. 
Comment 13 rbs 2004-02-16 19:14:39 PST
Yep. The style system's nsCSSStyleSheet is quite comprehensive and includes the
case of the 'Content-Language' HTTP header.
Comment 14 Boris Zbarsky [:bz] 2004-02-16 19:25:11 PST
(In reply to comment #9)
> Thus the bug also means that nsStyleVisibility::mLanguage is not reliable. It
> is only half of the equation. It does not walk up the tree to find the
> language.

Actually, it does.  The values in the visibility struct are all inherited by
default, so if a node doesn't have an explicit language set it automatically
inherits the parent's.  The root node's language is inited from
nsIPresContext::GetLanguage() if it's not set explicitly.  So we should just
keep this in the style system -- it's simpler that way...  The method on
nsIContent should just return the language on that node, if any, not walk up
parents.  The tree-walking code should stay in the style resolution code.

(In reply to comment #12)
> Probably, it also has to be taken into account that 'Content-Language' (http
> header and meta declaration) can set the 'overall' language(s) of a document.

Which just means that we should init the prescontext's mLanguage based on the
document's GetContentLanguage() returns, I would think... (except for the mess
with getting the latter out of prefs).  Right now, the context's mLanguage is
set based on the document's character set (see nsPresContext::UpdateCharSet).
Comment 15 rbs 2004-02-16 19:44:37 PST
>The method on nsIContent should just return the language on that node, if any,
>not walk up parents.  The tree-walking code should stay in the style resolution
>code.

I wonder, then, if it is necessary to add such a dummy GetLang to nsIContent.
Comment 16 rbs 2004-02-16 19:49:23 PST
I had a look and nsDocument::RetrieveRelevantHeaders() [which feeds the
document's GetContentLanguage()] looks up the prefs too.
Comment 17 Jonas Sicking (:sicking) No longer reading bugmail consistently 2004-02-16 19:52:09 PST
Oh, I assumed that nsIContent::GetContentLang would walk up the parentchain. If
that's not the case then we should make that clear in the function-name, or add
an |PRBool aCheckParentChain| argument.
Comment 18 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2004-02-16 19:55:43 PST
Comment 15 seems sensible, although it might be good to consolidate this and the
code we use for :lang()
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/content/html/style/src/nsCSSStyleSheet.cpp&rev=3.295&mark=3210-3211,3217-3218#3198
Comment 19 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2004-02-16 19:58:00 PST
Oh, except attribute mapping is entirely the wrong mechanism.  The language
should come from somewhere in content rather than the style data so that you
*can* map xml:lang easily.
Comment 20 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2004-02-16 20:00:05 PST
*** Bug 41978 has been marked as a duplicate of this bug. ***
Comment 21 Boris Zbarsky [:bz] 2004-02-16 20:23:10 PST
Yeah, I guess we could cache the language on the content node instead of caching
it in the style.. the one nice thing with caching in style is that it's cheaper
because of style context sharing, maybe (maybe).
Comment 22 rbs 2004-02-16 21:28:29 PST
I kinda also like the saving being achieved via the style system. Why is/was
that important to have nsIContent::GetContentLang()? Is is going to supersede
the one in nsIDOMHTMLElement (or, has is the ambiguity going to be resolved)?
Wouldn't it be possible to simply map the xml:lang in nsGenericElement?
Comment 23 rbs 2004-02-16 21:29:27 PST
s/has is/how is/
Comment 24 Boris Zbarsky [:bz] 2004-02-16 21:39:47 PST
> Why is/was that important to have nsIContent::GetContentLang()?

To avoid the duplication of the "what's the lang set on this content node" logic
(the fact that you have to check xml:lang and then if and only if the node is
HTML check the lang attribute in the null namespace).  That way the next time
the (X)HTML WG decides to make some half-assed decisions we will only have to
change one place in the code...

It's not _that_ important.  Just nice-to-have.
Comment 25 rbs 2004-02-16 21:51:59 PST
What about the ambiguity with nsIDOMHTMLElement? 

Someone does nsIDOMHTMLElement.setlang(), but gets something different with
nsIDOMHTMLElement.getlang() because there is an old xml:lang that takes precedence?

or, the deal is that there will be two different results with GetLang() and
GetContentLang(), albeit funny? 
Comment 26 Boris Zbarsky [:bz] 2004-02-16 22:07:33 PST
> Someone does nsIDOMHTMLElement.setlang(), but gets something different with
> nsIDOMHTMLElement.getlang() because there is an old xml:lang that takes
> precedence?

Didn't I mention "half-baked" and "(X)HTML WG" in the same sentence already?
Comment 27 rbs 2004-02-23 01:28:57 PST
I looked at what it will take to fix this, and here are the results of my
investigation:

1) It is not possible presently to leverage on the mapping code to map xml:lang
to style. This is because the |MapAttributesInto| business is sooo
HTML-specific... How did I forget that?!? It turned out that this is precisely
the same issue that has blocked bug 69409 for MathML attributes.

2) While it is possible to fetch the lang attribute everytime, it is much nicer
to just map it into the style data due to these primary reasons:
  a) the lang string is actually converted to a langGroupAtom that groups many
languages/scripts. Given a top-element with lang, e.g., <html lang>, there is
only one such atom in one visibility struct of the style system. The atom is
fetched rapidly whenever/wherever it is neeeded. Plus, the
HasAttributeDependentStyle stuff is kept in sync. (This BTW has to be expanded
to include the namespace as righly pointed out in an XXX comment there). 
  b) Take the lang out of the style system, and for every element in the doc,
one will have to walk back to <html>, convert it to an atom, deal with errors in
children (e.g., an unknown lang) rather than just inheriting the most recent
known lang as can be neatly done with the style system at style resolution
(albeit this is not done yet).
  c) Trying to map xml:lang to the style data provides a nice examplar for you
guys to really appreciate why bug 69409 has been blocked all this time :-)

3) Probably this is the most startling observation: despite the recent rework
that Jonas did (which was mostly for space-efficiency) the attribute mapping
code remains over-engineered... What is going on with this |nsMappedAttributes|
class which is cloned, etc? Actually, the mapping functions are _constant_ for
each tag and I have this feeling that a table-driven approach could do the job
pretty well, while providing a way to finally cater for MathML attributes. This
is not the best place to discuss it, but since things are so fresh in mind, I
may as well put them down for the record. Consider the first-level table:

/*static*/
{
  {noneNS_ID, &MapHTMLAttributesInto}.
  {xmlNS_ID, &MapXMLAttributesInto},
#ifdef MOZ_MATHML
  {mathNS_ID, &MapMathMLAttributesInto},
#endif
#ifdef MOZ_SVG
  {svgNS_ID, &MapSVGAttributesInto}
#endif
}

and then the second-level tables

static void 
MapHTMLAttributesInto(...)
{
  /*static*/
  {nsHTMLAtoms:div, &MapDivAttributesInto}
  {nsHTMLAtoms:img, &MapImgAttributesInto}
  etc..
}

static void 
MapXMLAttributesInto(...)
{
  /*static*/
  {nsXMLAtoms:lang, &MapLangAttributesInto}
}
etc

Then, propagate the change so that the Style System calls the mapping function
based on the namespace directly, reflecting into the attribute style sheet of
the current document, rather than going to the element in a rather complex and
convoluted manner at present (to get the same _function_ for each tag, clone, etc).

Doing so will suppress the middle-man and the over-engineering that is
happening. It is somewhat reminiscent of what alecf did to convert
HasAttributeDependentStyle into a table-driven code.
Comment 28 Boris Zbarsky [:bz] 2004-02-23 01:41:29 PST
> This is because the |MapAttributesInto| business is sooo HTML-specific...

With sicking's changes, this should be pretty simple to fix.

> What is going on with this |nsMappedAttributes| class which is cloned, etc?

If nothing else, it's an nsIStyleRule implementation and the thing the nodes in
the ruletree point to.

> Actually, the mapping functions are _constant_

I'm not sure what this means, exactly...

> based on the namespace directly

Wouldn't that break for XML nodes in the null namespace vs non-XML HTML nodes?

More to the point, what will be the nsIStyleRule objects involved?  The current
cloning, etc is because we actually share nsMappedAttributes objects.  So if you
have 10 <font size="3"> tags we only have one nsMappedAttributes object for
them.  The speed and footprint savings in the ruletree from this sharing are
pretty noticeable.....
Comment 29 rbs 2004-02-23 02:40:54 PST
> > This is because the |MapAttributesInto| business is sooo HTML-specific...
>
> With sicking's changes, this should be pretty simple to fix.

I hope it is in the plan. I have been waiting for that to happen.

> > Actually, the mapping functions are _constant_
>
> I'm not sure what this means, exactly...

C.f. |GetAttributeMappingFunction|, it returns a function based on the tag. It
is currently designed so that different instances of the object could return
different things. But in practice, different instances return the same thing and
this suggests that a static tag-based table-driven approach is appropriate.

> >based on the namespace directly
>
>Wouldn't that break for XML nodes in the null namespace vs non-XML HTML nodes?

What do you mean? The behavior won't be any different from what is there at the
moment. Attributes are either mappable or they aren't, based on the
(attrNamespaceID, attrName).

Take <p xml:lang>, it will boil down to MapHTMLAttributesInto, which should
include a call to MapXmlLangAttr(p_content, xmlNS_ID, nsHTMLAtoms::lang,
to_ruleData) if the doc is XHTML.

> have 10 <font size="3"> tags

The management is pretty involved at the moment. With the table-driven approach,
there would just be the hard-coded tables as illustrated earlier. The question
is whether that won't work and why.
Comment 30 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2004-02-23 09:36:07 PST
The style system requires that the style data come from implementations of
nsIStyleRule that meet the requirements in 

http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/content/base/public/nsIStyleRule.h&rev=3.22&mark=54-82#54
Comment 31 Jonas Sicking (:sicking) No longer reading bugmail consistently 2004-02-23 14:29:05 PST
Filed bug 235342 on discussion started in comment 27
Comment 32 Jonas Sicking (:sicking) No longer reading bugmail consistently 2004-02-24 10:31:14 PST
So back to the original issue: How do we want to map xml:lang into style? Do we
want to use the attribute-mapping code and do the same thing as we do for
"lang", or do we want to add GetContentLang to nsIContent?

One problem with using attribute-mapping code is that parts of it currently
can't deal with namespaced mapped attributes. Mostly the functions on
nsIStyledContent only takes an localname-atom argument rather then
localname-atom and namespaceid.

We might need this for other attributes too, but so far neither svg, xul or html
needs this. Don't know about mathml, but i don't think it has any namespaced
attributes at all.
Comment 33 Boris Zbarsky [:bz] 2004-02-24 11:15:33 PST
> Mostly the functions on nsIStyledContent only takes an localname-atom 

That (and the HasAttributeDependentStyle code) have needed fixing for a while. 
We should just do it, imo.

Doing attribute-mapping from nsGenericElement here seems like the way to go.
Comment 34 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2004-02-24 11:31:10 PST
HasAttributeDependentStyle is just an optimization.  It wouldn't surprise me if
the cost of passing an extra 32-bit parameter to that function and checking it
were higher than the benefits gained.
Comment 35 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2004-02-24 11:37:21 PST
(Also, if this code is to be rewritten, the rules on presentational attributes
in CSS 2.1 should be considered.)
Comment 36 Boris Zbarsky [:bz] 2004-02-24 11:50:44 PST
Fair enough.  Not much call to change HasAttributeDependentStyle, I guess.  ;)
Comment 37 Vincent Lönngren 2005-01-28 01:43:57 PST
In fact, xml:lang does not seem to be recognised as an alternative to lang at
all. Even with xml:lang set to something, firefox does not recognise the text to
be in that language.
Comment 38 Vincent Lönngren 2005-01-28 01:55:16 PST
Sorry about that stupid comment, I'm new to this interface. Won't happen again.
Please just ignore it.
Comment 39 Shlomi Fish 2008-05-26 08:23:08 PDT
Hi all!

I've been bitten by this bug on this page:

http://www.shlomifish.org/humour/TheEnemy/The-Enemy-rev4.html

This is after fixing it according to the input of the good people on #mozilla.il. Given enough will, I'd also like to work on a patch for that.

Regards,

      Shlomi Fish

See: http://www.shlomifish.org/meta/FAQ/#xhtml_11 for why I'm using XHTML 1.1 with a content-type of "text/html".
Comment 40 Boris Zbarsky [:bz] 2008-05-26 08:38:41 PDT
xml:lang is ignored in text/html content, as it should be.  That is, you're not seeing this bug.
Comment 41 Mikko Rantalainen 2012-04-19 03:13:08 PDT
Any progress with this issue? I'm writing a web app that outputs application/xhtml+xml for Firefox and I would like to use automatic hyphenation for the content. The goal is to do public release within a year. Should I put a workaround in the code for Firefox (for example, repeat the value of xml:lang in the attribute "lang") or will this get fixed soon enough?
Comment 42 Boris Zbarsky [:bz] 2012-04-19 07:59:20 PDT
I would recommend not using application/xhtml+xml, period.  A number of new HTML5 features work much worse with that MIME type than with text/html.  And yes, you should probably just put both xml:lang= and lang= in your markup if you do want to keep using XHTML.  I certainly don't see us rewriting our attribute code around this edge case (which is what would have to happen) anytime soon.
Comment 43 John Daggett (:jtd) 2012-04-19 17:32:10 PDT
If so, would it make sense to mark this "WONTFIX"?
Comment 44 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2013-05-29 03:51:15 PDT
This actually doesn't even need attribute mapping.  We can make a really simple rule class that stores a string for the language attribute, and have a hashtable of them indexed by the string (so that we only have one per string, so the rule tree works well).  (We just need it not to apply when content->IsInHTMLDocument().)
Comment 45 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2013-05-29 04:09:57 PDT
(In reply to David Baron [:dbaron] (don't cc:, use needinfo? instead) from comment #44)
> (We just need it not to apply when
> content->IsInHTMLDocument().)

Actually that's not the case:


[2013-05-29 19:08:30] <annevk> that's actually also a bug for HTML
[2013-05-29 19:08:44] <dbaron> what is?
[2013-05-29 19:08:45] <annevk> HTML requires xml:lang to work (when properly namespaced in the DOM)
[2013-05-29 19:08:58] <dbaron> even in HTML documents?
[2013-05-29 19:09:00] <annevk> (only possible through scripting)
[2013-05-29 19:09:04] <dbaron> ah, ok
[2013-05-29 19:09:09] <dbaron> then we shouldn't condition it
[2013-05-29 19:09:20] <dbaron> (I'm going to just paste this IRC into the bug if that's ok with you)
[2013-05-29 19:09:30] <annevk> yes
Comment 46 Simon Sapin (:SimonSapin) 2013-05-29 04:15:45 PDT
annevk also pointed out the relevant part of the spec:
http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#language

"To determine the language of a node, user agents must look at the nearest ancestor element (including the element itself if the node is an element) that has a lang attribute in the XML namespace set or is an HTML element and has a lang in no namespace attribute set. That attribute specifies the language of the node (regardless of its value).

If both the lang attribute in no namespace and the lang attribute in the XML namespace are set on an element, user agents must use the lang attribute in the XML namespace, and the lang attribute in no namespace must be ignored for the purposes of determining the element's language."

IIRC, it’s the HTML parser that doesn’t recognize the namespace prefix syntax and doesn’t put an xml:lang attribute in the right namespace, but it’s still possible to set namespaced attributes through the DOM.
Comment 47 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2013-05-29 10:22:07 PDT
Created attachment 755476 [details] [diff] [review]
Map xml:lang attribute into style so that it's used for font selection and hyphenation.

The code in nsHTMLStyleSheet implements LangRule to map xml:lang into
style and the code to manage its uniqueness.

The change to nsGenericHTMLElement fixes the mapping of the HTML lang
attribute to do cascading the way all other rule mapping does so that
the cascading works correctly.

The tests test that the correct style language is used for hyphenation
by copying over a set of hyphenation reftests that check its basic
response to languages.  There are no specific tests for font selection,
but font selection is known to use the same language data from style.

I verified manually (see other attachments to bug) that the rule
uniqueness is being managed correctly.
Comment 48 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2013-05-29 10:22:37 PDT
Created attachment 755478 [details]
testcase used to verify rule uniqueness is working
Comment 49 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2013-05-29 10:25:37 PDT
Created attachment 755482 [details] [diff] [review]
patch (on top of above patch) used to verify rule uniqueness is working correctly

With this patch, loading attachment 755478 [details] gave the output:
Making new lang rule 0x2944840 for language en
Making new lang rule 0x28ba810 for language fr
Making new lang rule 0x3ea2dc0 for language en-US
which is the expected result (only 3 lines, no duplicate languages).
Comment 50 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2013-05-29 10:28:49 PDT
https://tbpl.mozilla.org/?tree=Try&rev=11d4370674d8
Comment 51 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2013-05-29 10:41:25 PDT
Though, actually, I think I should ditch all the reftests except for -1, -4, -11{a,b}. and -12{a,b}.

Boris, does that make sense to you?
Comment 52 Boris Zbarsky [:bz] 2013-05-29 22:25:58 PDT
Comment on attachment 755476 [details] [diff] [review]
Map xml:lang attribute into style so that it's used for font selection and hyphenation.

r=me, though it seems like this table can grow without bound for a given document as long as people keep adding new xml:lang values and then removing them...

Also, why do you want to ditch the reftests you want to ditch?
Comment 53 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2013-05-29 22:56:56 PDT
Because some of them were really testing other hyphenation things rather than testing that xml:lang works, which I think only really needs the tests I listed.  It feels kinda like a waste of machine time to run more than that.  Though I suppose some of them test other interesting characteristics of xml:lang, so it's vaguely useful to have the extras.

As far as the memory use thing... I'm at least hoping that people will use xml:lang for its intended use, and the number of languages used in a document will be reasonably limited.  If people really want to drive up memory usage, they can always create lots of content nodes and *not* get rid of them.
Comment 54 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2013-05-29 23:12:24 PDT
Eh, I guess I'll keep the tests.  (I'm going to add "-xmllang" to the filenames, though, to make the difference a little clearer rather than just .html vs. .xhtml.)
Comment 55 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2013-05-30 01:00:56 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/0a1944211b61
Comment 56 Ryan VanderMeulen [:RyanVM] 2013-05-30 09:04:58 PDT
https://hg.mozilla.org/mozilla-central/rev/0a1944211b61

Note You need to log in before you can comment on or make changes to this bug.