Closed Bug 226623 Opened 21 years ago Closed 21 years ago

XFT crashes on PLHashRawAdd

Categories

(Core Graveyard :: GFX: Gtk, defect)

x86
Linux
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: charles.fenwick, Assigned: bryner)

References

Details

(Keywords: crash, regression)

Attachments

(7 files, 2 obsolete files)

After updating tonight (after being away from Linux for a few days,) my GTK2+XFT
build now crashes at various sites. Haven't found a 100% way to reproduce the
crash, but the sequence of logging into Yahoo Mail, going to the Inbox, and
clicking on a message can trigger the crash (sometimes).
Attached file backtrace of crash
backtrace
Well, I did  
     cvs update -j1.35 -j1.34 gfx/src/gtk/nsFontMetricsXft.cpp
and still got the crash.

But, after doing
     cvs update -j1.34 -j1.33 gfx/src/gtk/nsFontMetricsXft.cpp
     cvs update -j1.10 -j1.9 gfx/src/gtk/nsFontMetricsXft.h

it *seems* that the crash no longer occurs (and the lizard runs happier in general).

It looks like this is a regression from bug 223813.
Assignee: blizzard → bryner
Keywords: regression
Does this happen on a particular URL?
Sorrry, should have read closer.  Investigating.
What version of fontconfig and Xft do you have on your system?
Attached patch possible fixSplinter Review
This is my best guess at a fix from looking at the stack.  It ensures that if
we loaded a best-match font, and then later load all fonts, we don't add the
best-match font to the hashtable again (which I think is what NSPR is
complaining about).
This patch improves the situation significantly.

I've done about 45 minutes of testing and it appears that the browser doesn't
crash during routine browsing.

However, I can still make it crash if I try hard enough.  Opening a few
different websites in tabs simultaneously can do the trick as can going to my
spam folder on Yahoo Mail and going through spam mails in rapid sequence.  I
don't consider either case to be routine browsing (for me anyhow).

If you could apply this patch and attach the log output (if it's not too large)
that would really be helpful.  Just redirect stdout to a file, i.e.

./mozilla > xft.log
Attached file hashtable log
by request...
Something is certainly amuck given how many times we add Arial to the hashtable
(in theory we should have already checked to see if it was there).
I'm not able to reproduce any behavior like that with a reduced testcase using
PLHashTable.  I add Times New Roman, Verdana, and Arial, and it subsequently
finds Arial on a lookup.

What compiler version and configure options are you using?

I guess another possibility is heap corruption causing the hashtable to become
confused...
about:buildconfig

Build platform
target
i686-pc-linux-gnu

Build tools
Compiler 	Version 	Compiler flags
gcc 	gcc version 3.3.1 (SuSE Linux) 	-Wall -W -Wno-unused -Wpointer-arith
-Wcast-align -Wno-long-long -pedantic -pthread -pipe
c++ 	gcc version 3.3.1 (SuSE Linux) 	-fno-rtti -fno-exceptions -Wall
-Wconversion -Wpointer-arith -Wcast-align -Woverloaded-virtual -Wsynth
-Wno-ctor-dtor-privacy -Wno-long-long -pedantic -fshort-wchar -pthread -pipe
-I/usr/X11R6/include

Configure arguments
--enable-default-toolkit=gtk2 --enable-crypto --enable-xft
*** Bug 226969 has been marked as a duplicate of this bug. ***
I just experienced a crash. My build was compiled with g++ 3.2.2 (20030222) on 
Redhat 9.0 (with gtk1 instead of gtk2)  It happended only a few seconds into 
my session, but I didn't apply attachment 136353 [details] [diff] [review], yet.  I'll try to gather more 
information.  
 
Attached file anohter hash log
I applied attachment 136342 [details] [diff] [review] and 136353 and got the following hash log. It
didn't take me long to get Mozilla to crash. It crashed just a couple of
minutes into the session.
It also happens on Gtk1 + Xft. So, it's not gtk2 specific. I'm changing the 
summary line. 
 
Summary: GTK2+XFT crashes → XFT crashes on PLHashRawAdd
First a insigth ;-) *I* find the plhash.c unsecure at best. There are no (or no way) to 
ASSERT( ) incoming data in any of the functions. So If we are feeding those functions 
with garbage we are sure to 1) to never detect where it happens, 2) Damage the 
heap (as all but the const functions actually does write). 
 
Secondly a question about nsFontMetricsXft.cpp, on line 2998 you happily assume 
that PL_Hash...( ) never returns null (or garbage). But on line 3031 you check for 
null. So which is it, can PL_HashTableLookup...( ) return null or not? 
 
Attached patch patch with nsClassHashTable (obsolete) — Splinter Review
When I fixed bug 176290, I wanted to use nsClassHashTable (then, brand-new),
but somehow I couldn't manage to compile it so that I fell back to PLHash. I
just tried once more and this time it worked and seems to have fixed this bug.
With or without attachment 136342 [details] [diff] [review], Mozilla doesn't crash while without any
patch (or attachment 136342 [details] [diff] [review] alone), it crashes only after a couple of site
visits.
Comment on attachment 136946 [details] [diff] [review]
patch with nsClassHashTable

asking for r/sr.
Attachment #136946 - Flags: superreview?(blizzard)
Attachment #136946 - Flags: review?(bryner)
Comment on attachment 136946 [details] [diff] [review]
patch with nsClassHashTable

>+typedef nsClassHashtable<nsDepCharHashKey, nsFontXftInfo> nsFontXftInfoHash; 

I'm not convinced that it's safe to use nsDepCharHashKey here.	By doing so,
you assert that the string lifetime is at least as long as the hashtable
entry's.  The string is owned by the pattern.  In DoMatch(), we release our
reference to the font set (and therefore the patterns in it) after calling
GetFontXftInfo().  I think it's possible that this will cause the keys to point
to freed storage.  So we probably need to copy the strings.

I don't think there's a template hashtable key type that manages freeing a raw
char* for you.	I'd say your options are to either use nsCStrings in your hash
table (and take the extra bloat that would come from doing so), or switch to
using PLDHashTable directly, which would allow you to use raw char* pointers
and free them when the entry goes away.  I guess another option would be to add
a new key type in nsHashKeys.h which frees its string in the dtor.

Now I'm wondering if this is the actual cause of the problem with PLHashTable
as well...
bryner, thanks for bringing that up. I'm too preoccupied with the fact that I 
should not free it to realize that it could be freed behind our back. Using 
CStringKey would be easiest (I've already made a patch and tested it) but it 
doesn't come free as you wrote. After removing PLHash-related lines, I'm a bit 
reluctant to go back to PLDHash (which I guess needs similar auxiliary 
functions, etc). I can try adding a new keytype to nsHashKeys.h. Anyway, which 
way do you prefer?
PLDHashTable already contains stub structs and functions that I think do
everything we need here.  Something like this:

class nsFontXftInfo : public PLHashEntryStub
{
...
};

static const PLDHashTableOps fontHashOps = {
  PL_DHashAllocTable,
  PL_DHashFreeTable,
  PL_DHashGetKeyStub,
  PL_DHashStringKey,
  PL_DHashMatchStringKey,
  PL_DHashMoveEntryStub,
  PL_DHashFreeStringKey,
  PL_DHashFinalizeStub,
  NULL
};

Then where you want to add an entry, just do:

entry->key = strdup(family);

after determining that the entry was not already in the table.

I think I'd prefer doing it this way because we don't really gain anything from
using nsCString here.  Using a new key class in nsHashKeys.h should work exactly
the same though, so if you like the C++ interface I'm fine with doing it that way.
bryner, thanks for help. 
I added a new KeyType to nsHashKey.h Can you take a look?
When calling 'Get', I have to cast away 'const' because I use 'char *' as a key
in a new hashkey type. An alternative is to use 'const char *' as a key and
cast away const when calling free() in dtor (of a new HashKey class)

A third way might be to change the function signature of Get in
nsClassHashTable to

nsClassHashtable<KeyClass,T>::Get(const KeyType aKey, T** retVal) const

from

nsClassHashtable<KeyClass,T>::Get(KeyType aKey, T** retVal) const

But, I'm not sure that's allowed. What's the effect of 'double const' modifiers
(like const const sometype)?
Attachment #136946 - Attachment is obsolete: true
Attachment #136946 - Flags: superreview?(blizzard)
Attachment #136946 - Flags: review?(bryner)
bsmedberg should review the hashtable changes
Comment on attachment 136968 [details] [diff] [review]
a new patch with a new nsHashKey type

This needs a little cleanup.

>+        if (!gFontXftMaps.IsInitialized())
>+            gFontXftMaps.Init(INITIAL_FONT_MAP_SIZE);            
>+        if (!gFontXftMaps.IsInitialized()) { // error checking
>             FreeGlobals();
>             return NS_ERROR_OUT_OF_MEMORY;
>         }

This is cleaner thus:

if (!gFontXftMaps.IsInitialized() || gFontXftMaps.Init(INITIAL_FONT_MAP_SIZE))
{ // etc

>+    // cached entry found. 
>+    if (gFontXftMaps.Get(NS_CONST_CAST(char *, family), &info))

See below, you should make the KeyType a const char* and avoid the cast here.

>+    gFontXftMaps.Put(nsCRT::strdup(family), info); 

The "transfer ownership on construction" model is not good. Let's just
construct the class with a const char* and do a strdup in the constructor?

>+class NS_COM nsCharPtrHashKey : public PLDHashEntryHdr
>+{
>+public:
>+  typedef char* KeyType;

Use const char*? You don't ever manipulate the string.

>+  typedef const char* KeyTypePointer;
>+
>+  nsCharPtrHashKey(const char* aKey) : mKey(NS_CONST_CAST(char *, aKey)) { }
>+  nsCharPtrHashKey(const nsCharPtrHashKey& toCopy) :
>+    mKey(nsCRT::strdup(toCopy.mKey)) { }

nsCRT::strdup is deprecated, use plain strdup

>+  ~nsCharPtrHashKey() { CRTFREEIF(mKey); }

and use if (mKey) free(mKey) here.

>+
>+  const char* GetKey() const { return mKey; }
>+  const char* GetKeyPointer() const { return mKey; }
>+  PRBool KeyEquals(KeyTypePointer aKey) const
>+  {
>+    return !strcmp(mKey, aKey);
>+  }
>+
>+  static KeyTypePointer KeyToPointer(KeyType aKey) { return aKey; }
>+  static PLDHashNumber HashKey(KeyTypePointer aKey) { return nsCRT::HashCode(aKey); }
>+
>+  enum { ALLOW_MEMMOVE = PR_TRUE };
>+
>+private:
>+  char* mKey;

const char* here also (you can delete a const *)
>+};
>+
> #endif // nsTHashKeys_h__
Attachment #136968 - Flags: review-
I switched KeyType to const char *. I still have to cast away const when
calling free() in dtor because the buffer is allocated by strdup so that I
can't use |delete|.
Attachment #136968 - Attachment is obsolete: true
Comment on attachment 137061 [details] [diff] [review]
an updated patch per bsmedberg's comment

Oh yeah, free() doesn't take const pointers like delete does... grr. r=me
Attachment #137061 - Flags: review+
Comment on attachment 137061 [details] [diff] [review]
an updated patch per bsmedberg's comment

thanks for r.
asking for sr.
Attachment #137061 - Flags: superreview?(bryner)
Attachment #137061 - Flags: superreview?(bryner) → superreview+
Attachment #137061 - Flags: approval1.6?
Comment on attachment 137061 [details] [diff] [review]
an updated patch per bsmedberg's comment

a=tor for 1.6 checkin.
Attachment #137061 - Flags: approval1.6? → approval1.6+
Fix got landed. Thank you all.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: