Unify pseudo-inversion of langGrouping

RESOLVED FIXED in mozilla1.9.1b1

Status

Core
Graphics
RESOLVED FIXED
9 years ago
9 years ago

People

(Reporter: karlt, Assigned: karlt)

Tracking

Trunk
mozilla1.9.1b1
x86
Linux

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Assignee)

Description

9 years ago
NS_FindFCLangGroup in gfxFontconfigUtils.cpp and GetPangoLanguage both use
(different) mapping arrays in a half-hearted attempt to recover the language
information disposed of with nsLanguageAtomService::LookupLanguage and
langGroups.properties.

The MozPangoLangGroups array is missing some elements that MozGtkLangGroups
has and MozPangoLangGroups has some entries that don't add any information.

We can at least unify these two mappings into one.

Even when we eventually retain the lang information from documents that have
it, we'll still need to make a guess at a language for the cases where the
langGroup is inferred from the document.
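
A minimal sketch of what a single unified mapping could look like, assuming a plain table keyed by langGroup name. The struct, the function name and most entries are illustrative assumptions, not the contents of MozGtkLangGroups or MozPangoLangGroups; only the x-ethi entry reflects a choice discussed in this bug.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical unified table: one sample language per langGroup.
struct LangGroupSample {
    const char* langGroup;
    const char* sampleLang;
};

static const LangGroupSample kLangGroupSamples[] = {
    { "x-western",  "en" },  // illustrative entry
    { "x-cyrillic", "ru" },  // illustrative entry
    { "x-ethi",     "am" },  // Amharic, as proposed in this bug
};

// Returns a sample language for the group, or nullptr when the group
// has no entry (the caller then has no language hint to pass on).
const char* SampleLangForGroup(const char* langGroup) {
    assert(langGroup != nullptr);
    for (const LangGroupSample& entry : kLangGroupSamples) {
        if (std::strcmp(entry.langGroup, langGroup) == 0) {
            return entry.sampleLang;
        }
    }
    return nullptr;
}
```

Both the fontconfig and the Pango code paths could then consult the same table instead of keeping separate arrays.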
(Assignee)

Comment 1

9 years ago
Created attachment 340072 [details] [diff] [review]
gfxFontconfigUtils::GetSampleLangForGroup

"am" (Amharic) for x-ethi seems more consistent with langGroups.properties than
"et" (Estonian), which was in MozPangoLangGroups.

pango_language_from_string does the necessary conversion to lowercase.

The patch also chooses the user's preferred language, when one matches the langGroup, rather than always using the language expected to get the biggest vote.
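
The preference for the user's language can be sketched roughly as follows. GroupForLang is a hypothetical stand-in for the real lang-to-langGroup lookup (nsLanguageAtomService::LookupLanguage in the tree), with a handful of illustrative entries, and the user's languages are assumed to be already split into a list in preference order.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the lang -> langGroup lookup.
// The entries are illustrative only.
std::string GroupForLang(const std::string& lang) {
    if (lang == "en" || lang == "fr" || lang == "de") return "x-western";
    if (lang == "ru" || lang == "uk") return "x-cyrillic";
    if (lang == "am" || lang == "ti") return "x-ethi";
    return "";
}

// Returns the first user language that belongs to langGroup, or the
// group's default sample language when none of them matches.
std::string PickLangForGroup(const std::string& langGroup,
                             const std::vector<std::string>& userLangs,
                             const std::string& fallback) {
    assert(!langGroup.empty());
    for (const std::string& lang : userLangs) {
        if (GroupForLang(lang) == langGroup) {
            return lang;
        }
    }
    return fallback;
}
```

With user languages ru, fr, en and langGroup x-western, this picks fr rather than a fixed default such as en.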
Attachment #340072 - Flags: review?(roc)

Comment 2

In GetSampleLangForGroup, don't we have some kind of string tokenizer object you can use instead of tokenizing the environment variable yourself?

+        FcPatternAddString(aPattern, FC_LANG, (FcChar8 *)lang.get());

Use const_cast

Other than that, looks fine.
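
A small sketch of replacing the C-style cast with a C++ cast. FcChar8 is redeclared here as unsigned char, mirroring fontconfig's typedef, only so the snippet builds without fontconfig headers; ToFcChar8 is a hypothetical helper name. Because char and unsigned char are distinct types, reinterpret_cast is the cast that converts between the pointer types, while const_cast only adds or removes constness. Which casts the final patch used is not shown in this bug.

```cpp
#include <cassert>
#include <cstring>

// Mirrors fontconfig's typedef; declared locally for this sketch.
typedef unsigned char FcChar8;

// Hypothetical helper: view a C string as fontconfig's string type
// without dropping constness.
inline const FcChar8* ToFcChar8(const char* s) {
    assert(s != nullptr);
    return reinterpret_cast<const FcChar8*>(s);
}
```

A call site would then pass ToFcChar8(lang.get()) as the last argument of FcPatternAddString, which takes a const FcChar8*.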
(Assignee)

Comment 3

9 years ago
(In reply to comment #2)
> In GetSampleLangForGroup, don't we have some kind of string tokenizer object
> you can use instead of tokenizing the environment variable yourself?

I considered these options:

nsCWhitespaceTokenizer
  Perfect, except that it only tokenizes on whitespace.

htmlparser/src/nsScanner.h
htmlparser/public/nsScannerString.h
  Only support PRUnichar strings.

nsCRT::strtok
  Provides NUL-terminated tokens and so writes to the source.
  Environment variables are writable AFAIK, but it feels evil, and we don't
  need NUL-terminated tokens.

strchr and PRInt32 nsACString::FindChar(char_type, index_type offset = 0)
  Neither returns the end of the string when failing to find a separator,
  and so both require special-casing the last token.

PRBool FindCharInReadable( PRUnichar aChar,
                           nsAString::const_iterator& aSearchStart,
                           const nsAString::const_iterator& aSearchEnd );
  Moves aSearchStart nicely, but uses deprecated iterators.
  http://hg.mozilla.org/mozilla-central/annotate/ad2eb162ecfc/xpcom/string/public/nsTSubstring.h#l125

#define _GNU_SOURCE
char *strchrnul(const char *s, int c);
  Good, but I'm not sure whether GNU is a standard we require.
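
For comparison, a minimal sketch of tokenizing with plain strchr, assuming a colon-separated list such as the LANGUAGE environment variable (the separator and variable are assumptions; this bug does not name them). It never writes to the source string, passes over it once, and handles the last token by falling back to the end of the string when no separator is found. SplitColonList is a hypothetical name, and this is not the code from the attached patch.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Splits a colon-separated list into tokens without modifying it.
// Empty tokens (e.g. from "a::b") are skipped.
std::vector<std::string> SplitColonList(const char* list) {
    assert(list != nullptr);
    std::vector<std::string> tokens;
    const char* pos = list;
    for (;;) {
        const char* sep = std::strchr(pos, ':');
        // strchr returns null when no separator remains, so the last
        // token ends at the terminating NUL instead.
        const char* end = sep ? sep : pos + std::strlen(pos);
        if (end != pos) {
            tokens.emplace_back(pos, end);
        }
        if (!sep) {
            break;
        }
        pos = sep + 1;
    }
    return tokens;
}
```

Where strchrnul is available, the two lines computing sep and end collapse into a single call, which is the convenience noted above.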
(Assignee)

Comment 4

9 years ago
Created attachment 340264 [details] [diff] [review]
 gfxFontconfigUtils::GetSampleLangForGroup v1.1
Attachment #340072 - Attachment is obsolete: true
Attachment #340264 - Flags: review?(roc)
Attachment #340072 - Flags: review?(roc)
(Assignee)

Comment 5

9 years ago
Comment on attachment 340264 [details] [diff] [review]
 gfxFontconfigUtils::GetSampleLangForGroup v1.1

C++ casts and changes to GetSampleLangForGroup to iterate through the
environment variable only once.
Attachment #340264 - Flags: review?(roc) → review+
(Assignee)

Comment 6

9 years ago
http://hg.mozilla.org/mozilla-central/rev/9dd14b803ee8
http://hg.mozilla.org/mozilla-central/rev/3c5dc0b3b6ae
Status: ASSIGNED → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.1b1
(Assignee)

Updated

9 years ago
Blocks: 461155