Closed Bug 456545 Opened 16 years ago Closed 16 years ago

Unify pseudo-inversion of langGrouping

Categories

(Core :: Graphics, defect)

x86
Linux
defect
Not set
normal

RESOLVED FIXED
mozilla1.9.1b1

People

(Reporter: karlt, Assigned: karlt)

Attachments

(1 file, 1 obsolete file)

NS_FindFCLangGroup in gfxFontconfigUtils.cpp and GetPangoLanguage both use
(different) mapping arrays in a half-hearted attempt to recover the language
information disposed of with nsLanguageAtomService::LookupLanguage and
langGroups.properties.

The MozPangoLangGroups array is missing some elements that MozGtkLangGroups
has, and it also has some entries that don't add any information.

We can at least unify these two mappings into one.

Even when we eventually retain the lang information from documents that have
it, we'll still need to make a guess at a language for the cases where the
langGroup is inferred from the document.
am (Amharic) for x-ethi seems more consistent with langGroups.properties than
et (Estonian), which was in MozPangoLangGroups.

pango_language_from_string does the necessary conversion to lowercase.

The patch also chooses the user's preferred language rather than always using the language expected to get the biggest vote.
Attachment #340072 - Flags: review?(roc)
In GetSampleLangForGroup, don't we have some kind of string tokenizer object you can use instead of tokenizing the environment variable yourself?

+        FcPatternAddString(aPattern, FC_LANG, (FcChar8 *)lang.get());

Use const_cast

Other than that, looks fine.
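
For illustration, a minimal sketch of replacing the C-style cast with named C++ casts. It does not link against fontconfig: FcChar8 is redeclared here as unsigned char (which is how fontconfig.h declares it), and TakesFcString and ToFcString are hypothetical stand-ins, not functions from the patch. Which named cast is needed depends on the declared parameter type. Assuming the parameter is const FcChar8*, going from const char* takes a reinterpret_cast, and a const_cast would only be needed if constness had to be dropped.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for fontconfig's FcChar8 (an unsigned char typedef).
typedef unsigned char FcChar8;

// Hypothetical stand-in for a fontconfig call that takes const FcChar8*.
// It only measures the string so the sketch stays self-contained.
std::size_t TakesFcString(const FcChar8* s)
{
    return std::strlen(reinterpret_cast<const char*>(s));
}

// Hypothetical helper: keeps the cast in one place and keeps it visible.
// Constness is preserved, so no const_cast is involved here.
inline const FcChar8* ToFcString(const char* s)
{
    assert(s);
    return reinterpret_cast<const FcChar8*>(s);
}
```

A call site would then read TakesFcString(ToFcString(lang.get())) instead of using (FcChar8 *)lang.get().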
(In reply to comment #2)
> In GetSampleLangForGroup, don't we have some kind of string tokenizer object
> you can use instead of tokenizing the environment variable yourself?

I considered these options:

nsCWhitespaceTokenizer
  Perfect, except that it only tokenizes on whitespace.

htmlparser/src/nsScanner.h
htmlparser/public/nsScannerString.h
  Only support PRUnichar strings.

nsCRT::strtok
  Provides NUL-terminated tokens and so writes to the source.
  Environment variables are writable AFAIK, but it feels evil, and we don't
  need NUL-terminated tokens.

strchr and PRInt32 nsACString::FindChar(char_type, index_type offset = 0)
  Neither returns the end of the string when failing to find a separator,
  and so both require special casing the last token.

PRBool FindCharInReadable( PRUnichar aChar,
                           nsAString::const_iterator& aSearchStart,
                           const nsAString::const_iterator& aSearchEnd );
  Moves aSearchStart nicely, but uses deprecated iterators.
  http://hg.mozilla.org/mozilla-central/annotate/ad2eb162ecfc/xpcom/string/public/nsTSubstring.h#l125

#define _GNU_SOURCE
char *strchrnul(const char *s, int c);
  Good, but it is a GNU extension, and I'm not sure whether GNU is a
  standard we require.
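
To make the strchr trade-off above concrete, here is a minimal sketch of tokenizing a colon-separated list without writing to the source string. It is not the code from the patch. SplitLanguageList is a hypothetical name, and the colon separator assumes a list in the style of the LANGUAGE environment variable (the thread only says "the environment variable"). The fallback to the end of the string is the special case for the last token.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: split a colon-separated list into tokens.
// The input is only read, never modified (unlike strtok-style tokenizers).
std::vector<std::string> SplitLanguageList(const char* list)
{
    assert(list);
    std::vector<std::string> tokens;
    const char* pos = list;
    const char* end = list + std::strlen(list);
    while (pos < end) {
        // strchr returns a null pointer when no separator remains, so the
        // last token needs an explicit fallback to the end of the string.
        const char* sep = std::strchr(pos, ':');
        const char* tokenEnd = sep ? sep : end;
        if (tokenEnd != pos) {  // skip empty tokens, e.g. in "a::b"
            tokens.emplace_back(pos, tokenEnd);
        }
        pos = sep ? sep + 1 : end;
    }
    return tokens;
}
```

With strchrnul the fallback disappears, because it returns a pointer to the terminating NUL instead of a null pointer, at the cost of depending on a GNU extension.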
Attachment #340072 - Attachment is obsolete: true
Attachment #340264 - Flags: review?(roc)
Attachment #340072 - Flags: review?(roc)
Comment on attachment 340264 [details] [diff] [review]
 gfxFontconfigUtils::GetSampleLangForGroup v1.1

C++ casts and changes to GetSampleLangForGroup to iterate through the
environment variable only once.
http://hg.mozilla.org/mozilla-central/rev/9dd14b803ee8
http://hg.mozilla.org/mozilla-central/rev/3c5dc0b3b6ae
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.1b1
Blocks: 461155