Closed Bug 92793 Opened 23 years ago Closed 22 years ago

BeOS classes are not threadsafe

Categories

(Core :: XPCOM, defect)

x86
BeOS
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla1.2final

People

(Reporter: cls, Assigned: rpotts)

References

Details

Attachments

(2 files, 1 obsolete file)

In a debug BeOS build, there are a ton of thread safety warnings.  So many that
all other debug messages are obscured.  At a glance, the main culprits seem to
be nsTimerBeOS, nsToolkit & nsBaseWidget.  The timer I can understand as it
explicitly calls spawn_thread() to create a native BeOS thread to actually
manage the timers but I'm not sure about the other two.
Factoid I forgot to mention: In BeOS, every window (BWindow) has it's own
thread.  I don't know of any way to force each BWindow to use a particular
thread nor am I sure that's what we want to do.  How hard would it be to
restructure Mozilla to understand a thread per window scheme rather than a
central thread used by everything?
i'm not sure what to say...  Running each top-level window in its own thread
violates mozilla's threading model - of a *single* thread for all UI/layout
activity.

if BeOS *requires* a thread per top-level window, then some type of proxying
layer  is required between the BeOS native widget implementation and the rest of
the mozilla APIs...  And all of the mozilla components will need to run in a
*single* worker thread.

At this point it is not feasible to restructure mozilla to support a
multithreaded UI model.
In my understanding, 
Bezilla's event handling is processed in just one thread 
which runs nsAppShell::Run().

BeOS has two thread per each one BWindow. One for app_server and 
one for event handling. BeOS events are received 
on these BWindow threads and it calls hook functions like 
BWindow::MouseDown() and BWindow::KeyDown().

In current Bezilla, These hook functions call
nsToolkit::CallMathodAsync(), to send messages to the 
Bezilla's event-processing thread.
I think this is the "proxying layer" of Bezilla, and
thus, Bezilla mostly conforms to the mozilla one thread model.

Then, what's bad?

I just looked into the source code and search for where 
the "thread safe" message occurs. And the results are:

1)NS_IF_ADDREF(mToolkit) in nsBaseWidget::GetToolkit()
2)NS_RELEASE(t) in many hook functions of nsViewBeOS and nsWindowBeOS
3)NS_RELEASE(tobjbeos) in nsTimerBeOS::TimerThreadFunc

4)NS_ADDREF(ref) in MethodInfo::MethodInfo() (nsSwitchToUIThread.h)

So the "thread safe" message says that we shouldn't addref or 
release one object from more than one thread. (Is it true?)

If so, we can do some modification to avoid that.
a)do not use nsToolkit to send messages, do that directly,
  like the timer is doing.
b)do not send nsWindow pointers on messages and 
  use ID-like thing to lookup nsWindows.
c)process timers on the thread that runs nsAppShell::Run().

Though, it seems that not so severe problem is happening 
by this bug and we can just supress the annoying messages by
export XPCOM_CHECK_THREADSAFE=0


so, it sounds as if you are not in too bad shape :-)  since the proxy layer is
in place all that you really have to worry about is which thread you are doing
some addref/release calls on...

the danger of ignoring the threadsafe assertions is that if one of your
Release() calls ends up destroying the object, then the destructor (and anything
that it calls) will be called on the wrong thread.  this is generally very bad :-)

the other issue is one of contention...  if your objects are manipulated by
multiple threads unless the increment/decrement of mRefCnt is atomic you can end
up corrupting the count - and possibly deleting the object twice

in general, ignoring the threadsafe assertions will lead to sporatic, random
crashes :-(
rpotts,
Thanks for the explanation.
Then we sould make some patch later...
I had mentioned on the mozilla-beos newsgroup a while ago that I didn't think
our threads were implemented very well.  My initial reasoning was the same
reason this bug was posted.  Today, I tried to run the beos profile command
against mozilla-bin, and all it did was complain about bad thread ids.  Checking
the thread ids is what the threadsafe assertions do as well, pretty much.  This
could potentially be a big problem.  I would have no idea how to resolve it, but
I figured that I would at least mention this fact as well. 
Target Milestone: --- → mozilla1.0
Target Milestone: mozilla1.0 → mozilla1.1
Blocks: 154267
Discussion had been floating around to make ALL ref counting atomic.  I don't
think that this is needed, as it really only affects the BeOS widget library. 
This is because, as we should all know by now, BeOS Windows run in their own
thread.  The one problem, is that the nsBaseWidget does not call the atomic
versions of addref/release.  Changing that, and all other BeOS components,
should GREATLY help the situation.  And, it gets rid of the annoying "not
threadsafe" messages :)
Taking assignment

The "proxy" layer does exist.  Methods "should" be called on the main "UI"
thread, by using the CallMethod and CallMethodAsync methods in nsToolkit.  A
port is used to send information between calling thread and the UI thread.

Most of the problems under BeOS seem to stem from reference counting not being
atomic.  At one point, the idea was to make all ADDREF calls atomic, which was
shot down :(  Well, instead, the attached patch just makes the widget library
ADDREF calls atomic.  I had to modify the nsBaseWidget src to #ifdef XP_BEOS
since the BeOS window extends from it.  If there is a way around that, so that
we can keep the xpwidgets "pure", please, let me know.

See also Bug#154267
Status: NEW → ASSIGNED
please remember that the reference count is *not* the only data that must be
protected.

Each class that is made threadsafe must also protect ALL its member data, to
prevent corruption !!

fixing up the reference counts will get rid of the assertions, but will not fix
the underlying crashes...  it will only move the crashes to places where other
member variables are manipulated in non-threadsafe ways...

-- rick
Blocks: 160605
oops... never mind :-)
i think i've *finally* remembered the issues relating to this bug !!

since these classes are being made threadsafe *only* for proxying purposes,
let's put some comments in the header files saying so -- so people like me won't
freak out later when they forget why :-)

it's important to make it clear that the actual implementations are not
threadsafe -- to avoid invalid assumptions being made in the future...

-- rick
Paul, do you need review for this patch already, or is there more work planned
on it?
Sergei, I need to comment the code, especially in the BaseWidget class.
Then, I'll need an sr=
Unfortunately i'm incompetent in this topic, because i don't understand this
part of Mozilla sufficiently, but i have feeling that BeZilla really has better
stability with those patches here. If this experience is sufficient for r=, i
may put it.
But i think that better person to ask for r= is Daniel. Inspite he don't
participate in development actively nowadays, he said that he still can review
patches, if there is such need.
I took a quick look at the patch... here are some comments:

I think it would be cleaner to put the code for the threadsafe addref/release in
beos/nsWindow rather than #ifdef it in xpwidgets/nsBaseWidget.  Since both of
these methods are virtual... all you need to do is override them in nsWindow

You can use the IMPL_THREADSAFE_ADDREF(nsWindow) and
IMPL_THREADSAFE_RELEASE(nsWindow) macros for the actual implementation.

Once you do that, you should also remove the redundant (non-threadsafe)
implementations of addref/release from the following subclasses:
  - nsButton
  - nsCheckButton
  - nsLabel
  - nsScrollbar
  - nsTextWidget
For all of the above classes, only QueryInterface needs to be overridden. 
Overriding addref/release just looses the threadsafety that nsBaseWindow
provided :-(

Also, it's unclear why nsSound, nsFilePicker and nsBidiKeyboard require
threadsafe ISupports...  Are these classes *actually* proxied to?

And finally, I think there should be a 'big' comment somewhere (probably
nsWindow.h) explaining why the threadsafe ISupports is necessary.

hope thses comments help,
-- rick


The following files are no longer included in the BeOS build, but I modified
them anyway.  They may not compile, but, at this point, it really does not
matter.  They are only here for historical purposes.
  - nsButton
  - nsCheckButton
  - nsLabel
  - nsTextWidget

The only change I am not sure of is for nsScrollBar.  Instead of explicitly
making it threadsafe, I've used the *_INHERITED macros.

As for nsFilePicker, it is currently implemented in the CVS tree as being
THREADSAFE.  This is probably not needed, since it locks mozilla while openning
its native FilePicker window.
Attachment #96622 - Attachment is obsolete: true
Updated the target and severity.
Severity: normal → critical
Target Milestone: mozilla1.1alpha → mozilla1.2final
Since nsBeOSCharSet is called from within the Window's thread, outside of the
widget library, it's reference counting needs to be atomic as well.
Seeing as these are all now BeOS specific changes, should in no way affect other
platforms, and add GREATLY to the stability of Mozilla under BeOS, I would like
to have the above patches applied to the tree in time for the 1.2 release.

Rick, since the tree is supposed to branch for 1.2final within the next couple
of days, could you, if you have time, review my updated patch?  If it is OK, I
will contact drivers for approval.  Thanks.
Comment on attachment 104211 [details] [diff] [review]
Updated patch as per comments by Rick (thanx)

r=rpotts@netscape.com
Attachment #104211 - Flags: review+
Comment on attachment 104708 [details] [diff] [review]
nsBeOSCharSet needs THREADSAFE ref counting as well

r=rpotts@netscape.com
Attachment #104708 - Flags: review+
these patches look great!

-- rick
Comment on attachment 104211 [details] [diff] [review]
Updated patch as per comments by Rick (thanx)

sr=blizzard
Attachment #104211 - Flags: superreview+
Comment on attachment 104708 [details] [diff] [review]
nsBeOSCharSet needs THREADSAFE ref counting as well

sr=blizzard
Attachment #104708 - Flags: superreview+
Comment on attachment 104708 [details] [diff] [review]
nsBeOSCharSet needs THREADSAFE ref counting as well

a=blizzard on behalf of drivers for 1.2final
Attachment #104708 - Flags: approval+
Comment on attachment 104708 [details] [diff] [review]
nsBeOSCharSet needs THREADSAFE ref counting as well

a=asa for checkin to 1.2 (on behalf of drivers)
Patches checked into tree.
Marking FIXED
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Moving all threading bugs to XPCOM. See bug 160356.
Component: Threading → XPCOM
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: