Closed Bug 338454 Opened 15 years ago Closed 15 years ago

[BeOS]To implement correct restart in AppRunner for BeOS-platforms

Categories

(Toolkit :: Startup and Profile System, defect)

x86
BeOS
defect
Not set
major

Tracking

()

RESOLVED FIXED

People

(Reporter: sergei_d, Assigned: sergei_d)

References

()

Details

(Keywords: fixed1.8.1)

Attachments

(1 file, 8 obsolete files)

Attached patch patch (obsolete) — Splinter Review
fiefox.in
addding
  export NO_EM_RESTART=1
to BeOS section
Assignee: nobody → sergei_d
Status: NEW → ASSIGNED
Summary: Workaround for restart issue → [BeOS]Workaround for restart issue
so, what are the downsides of this?
Cannot say, as I don't use FF myself.
But for people who tested it today, looks like it is improving FF behaviour.

So waiting for tqh's opinion, he is BeOS FF guru.
This will of course make it a common occurence for users to see the "Firefox is currently running but not responding" dialog when they launch a second copy of Firefox, unless there is BeOS magic that I don't know about which ensures only one process.
There is such magic in BeOS, if i understand problem correctly - each app has launch flag- Single;Multiple;Exclusive.

For proper working we need SingleLaunch, but that execve/XRE issue prevented us from uing it with FF.

SeaMonkey in BeOS uses SingleLaunch flag atm.
Also, there is more interesting version of that:
MultipleLaunch-ArgvOnly
Comment #6 - pls ignore it:)
Comment on attachment 222517 [details] [diff] [review]
patch

r?

(better than nothing)
Attachment #222517 - Flags: first-review?(thesuckiestemail)
wondering if it compiles and works better than current edition.
(In reply to comment #4)
> This will of course make it a common occurence for users to see the "Firefox is
> currently running but not responding" dialog when they launch a second copy of
> Firefox, unless there is BeOS magic that I don't know about which ensures only
> one process.

Isn't that the effect of MOZ_NO_REMOTE? Or can NO_EM_RESTART also cause that?
Maybe for second attachment some code required to avoid running second instance.
Wondering if SingleLaunch flag is sufficient for that.
ok, for roster version maybe B_ALREADY_RUNNING is as good as B_OK if we try to use SingleLaunch and don't take special measures to kill current instance
Comment on attachment 222517 [details] [diff] [review]
patch

r=thesuckiestemail@yahoo.se
Attachment #222517 - Flags: first-review?(thesuckiestemail) → first-review+
I don't think so. I think our best bet is to make it B_SINGLE_LAUNCH and and force no restarts at all.
I misread the original problem, I thought we were talking about MOZ_NO_REMOTE. NO_EM_RESTART is going to cause massive problems for people using extensions. Besides which, if we show the profile manager we restart anyway (there's no way to avoid that).

It will cause problems such as extensions being partially installed or partially uninstalled.
That is what I've understood earlier and why I never did that.
But having 4 running firefox-bin instances 3 of those are zombies is also questionable "feature".
(In reply to comment #15)
> I misread the original problem, I thought we were talking about MOZ_NO_REMOTE.
> NO_EM_RESTART is going to cause massive problems for people using extensions.
> Besides which, if we show the profile manager we restart anyway (there's no way
> to avoid that).
> 
> It will cause problems such as extensions being partially installed or
> partially uninstalled.
> 

extensions partially installed/uninstalled is happening now, on the 1st successful build of Firefox after getting THREADSAFE=1 sqlite in place.  So this problem already exists.  And is probably related to those "zombie" Firefox instances.
(In reply to comment #18)
To clarify, this is happening before applying any of Sergei's proposed changes/patches.
Attached patch patch nsAppRunner (obsolete) — Splinter Review
wrote long-long comment, but bugzilla failed:(
Will try again
Attachment #222517 - Attachment is obsolete: true
Attachment #222545 - Attachment is obsolete: true
next attempt of attachment comment.
1)roster->Launch() works sometimes, probably when components are still in memory/cached. Probably unusable until we get rid of start script
2)startscript patch considered by gurus as troublesome

3)New patch. I have hard dejavu feeling that I proposed it already centuries ago to tqh or even Paul. Calls be_app->Quit() before execv().
Works better than any previous version. No zombies, starts with empty profile etc.

Some thoughts:
a) What for is aNative->Quit() in beginning of LaunchChild()? What it does for BeOS and what it should actually do? Maybe proper place for be_app->Quite is there, inside that Quit()
b)Half-failing execv(). Wondering if BeOS impl relies on posix kill signal there, and if 
#if defined(XP_UNIX) || defined(XP_BEOS)
  InstallUnixSignalHandlers(argv[0]);
#endif
has as result some effect on that, negative or positive. If it really does something for BeOS.
c)Wondering if there is bug already filed - for fresh-build firefox-bin make don't create null/nihil of BeOS-required attributs - signature, mime-type, app launch flags, not to say about icon.
(In reply to comment #20)
> Created an attachment (id=222703) [edit]
> patch nsAppRunner
> 
> wrote long-long comment, but bugzilla failed:(
> Will try again
> 

Hmm, but that code doesn't relaunch, just quits?
(In reply to comment #21)
> next attempt of attachment comment.
> 1)roster->Launch() works sometimes, probably when components are still in
> memory/cached. Probably unusable until we get rid of start script
> 2)startscript patch considered by gurus as troublesome
> 
> 3)New patch. I have hard dejavu feeling that I proposed it already centuries
> ago to tqh or even Paul. Calls be_app->Quit() before execv().
> Works better than any previous version. No zombies, starts with empty profile
> etc.
> 
> Some thoughts:
> a) What for is aNative->Quit() in beginning of LaunchChild()? What it does for
> BeOS and what it should actually do? Maybe proper place for be_app->Quite is
> there, inside that Quit()
> b)Half-failing execv(). Wondering if BeOS impl relies on posix kill signal
> there, and if 
> #if defined(XP_UNIX) || defined(XP_BEOS)
>   InstallUnixSignalHandlers(argv[0]);
> #endif
> has as result some effect on that, negative or positive. If it really does
> something for BeOS.
> c)Wondering if there is bug already filed - for fresh-build firefox-bin make
> don't create null/nihil of BeOS-required attributs - signature, mime-type, app
> launch flags, not to say about icon.
> 

The problem with execve is with the command itself. It attaches one of the Firefox instances to wait for it's parent to die (usually Deskbar or Terminal) and doesn't die until it dies. That's very bad, and you should never need to terminate the Deskbar normally.
Internal restarting will always loose argv-messages as they will be passed into the first instance which then shuts down.
Interesting, why it (Quit()/execve) don't start for you.
 Faster machine? Older sources? Or maybe it works only form dist/bin ?
I will try to pack it and try at faster machine.
If it also "works sometimes", will try 2 another restart versions I "invented".

What about loosing arguments - if I understood things correctly, Mac port performs some trick with writing info to disk.
Attached patch patch with BMessenger (obsolete) — Splinter Review
same as previous, but uses be_app_messenger.SendMessage(B_QUIT_REQUESTED); instead quit.

Btw, story about arguments is interesting, partially depends on whether we use or not ArgsReceived. If we do, we can widely use BMessenging system, including app-independent roster to target messages to proper instance of application.

Also, if both last patches don't work for you, I can try to create version woth load_image() and more fine instance/thread control.
(In reply to comment #25)
> Interesting, why it (Quit()/execve) don't start for you.
>  Faster machine? Older sources? Or maybe it works only form dist/bin ?
> I will try to pack it and try at faster machine.
> If it also "works sometimes", will try 2 another restart versions I "invented".
> 
> What about loosing arguments - if I understood things correctly, Mac port
> performs some trick with writing info to disk.
> 
I think you misunderstood me, I just looked at the patch, and probably should have diffed it, but it only looked as the patch quitted.
Ah, misread the patch.

If you remove your Firefox without existing profile, and then quit. Do you still have Firefox processes. If you do, quit terminal or Deskbar (whichever way you launched it) and see if they go away.
Although this bug will improve things for bug 129411, it will not solve it. Only bug 271613 will.
"If you remove your Firefox without existing profile..." - sorry, i don't understand this sentense.

Btw, i tried now both versions (Quit and Sendmessage) with SingleLaunch and with and without profile - all works, no zombies, good exit.
Attached patch patch with printout (obsolete) — Splinter Review
Version which printouts arguments received by XRE_main and arguments in LaunchChild. Unfortunately i don't know how to force FF to use any other argument besides program path at restart, but maybe usable anyway.

And yeah, it all terminates properly under any circumstances I could invent, with any app launch flag
(In reply to comment #30)
> "If you remove your Firefox without existing profile..." - sorry, i don't
> understand this sentense.
> 
> Btw, i tried now both versions (Quit and Sendmessage) with SingleLaunch and
> with and without profile - all works, no zombies, good exit.
> 

If you launch your Firefox without existing profile...

That should force launchchild to run, which is the case we want to investigate.
So maybe execve wasn't that bad, just that be_app's looper was still running? But that wouldn't explain why the process disappeared when the parent is killed.

(Rephrasing sentences while writing them are bad :)
Tried now bit differen idea i mentioned in Comment 21: a)
Moved
be_app_messenger.SendMessage(B_QUIT_REQUESTED); 
from nsAppRunner::LaunchChild
to nsNativeSupportBeOS::Quit()

Works as expected. Dunno if it has some side effects, if that nsNativeSupport::Quit() is called in other places.
But it puts me to think that we should look also at other yet stub methods in nsNativeSupportBeOS
like ReOpen() 
(and maybe even OnLastWindowClosing())
for curiosity tried really-native version in nsAppRunner:
    extern char **environ;
    thread_id id;
      if ((id = load_image(gRestartArgc, (const char **)gRestartArgv, (const char **)environ)) < 0)

    return NS_ERROR_FAILURE;
    resume_thread(id);

It works too.
Attached patch patch, checking candidate (obsolete) — Splinter Review
r?

variant with SendMessage in aNative::Quit() and execv
Attachment #222703 - Attachment is obsolete: true
Attachment #222714 - Attachment is obsolete: true
Attachment #222716 - Attachment is obsolete: true
Attachment #222725 - Flags: first-review?(thesuckiestemail)
Looks like it works not for all testers:
http://community.livejournal.com/bezilla/194807.html?thread=1075447#t1075447
While for me and grave it worked,
for tigerdog every start with empty profile required two attempts,
and for thaflo all worked, until folder config/settings/Mozilla exists in system,
 disregarding existance there of Firefox profile.

Thus maybe solution with load_image->wait_for_thread->kill_thread  is worth to try instead existing unixish.

Interestingly, that all testers besides me have machine with almost equal performance - intels at 2 GHz or AMD at 1.5
checkin candidate patch does not work if no profile exists.  Starts, creates /boot/home/config/settings/mozilla but never completes component registration.  goes back to prompt before DOMWINDOW=1 is displayed.  If restarted, it simply goes back to prompt and never opens main window.
Comment on attachment 222725 [details] [diff] [review]
patch, checking candidate

cancelling.
Looks like every such thing is too dependent on situation, so needs more testing.

Will create version with 
wait_for_thread soon for testers
Attachment #222725 - Flags: first-review?(thesuckiestemail)
Looks like things are bit more clear now - difference between R5 where it works and Zeta, where it doesn't.
Wondering what is the reason - better execv in Zeta or fact that lot of app-messages are local in Dano/Zeta
http://community.livejournal.com/bezilla/194807.html?replyto=1077751
There is difference between execv() and load_image cases with SingleLaunch and empty profile (so restarts involved).
If you run it from terminal, with execv(), "initial" firefox keep running in Terminal - execv mimics like it is same process, and it wasn't gone. IIRC that how it is intended to work by design.

With load_image() in code, inspite seems working OK, proper restart, single instance, forefox looks "exited" in terminal, said "bye-bye" to parent process.

Wondering which behaviour is more convinient for our case.
Looks like version with be_app->Quit() in nsNativeAppSupport
and load_image() in nsAppRunner() is working even in Zeta
Attached patch patch with load_image() (obsolete) — Splinter Review
This version works both in R5-BONE and Zeta.
Kind of.
To be sure, we need well-working base build ready - with proper "placeless" profile import, working Options and Places.
Hope tigerdog or tqh will create such thing in nearest future.
So review request postponed, to test this patch with "perfect" build.
Attachment #222725 - Attachment is obsolete: true
Attached patch patch with wait_for (obsolete) — Splinter Review
same as previous but uses wait_for_thread instead of resume_thread.

For tigerdog attention
(In reply to comment #43)
> Created an attachment (id=222779) [edit]
> patch with wait_for
> 
> same as previous but uses wait_for_thread instead of resume_thread.
> 
> For tigerdog attention
> 

Finally, some valid test results:
Version 7 works correctly under all circumstances (optimized build, debug builds, no profile when starting, starting with profile from older version, single-launch flag set.)  Version 7 leaves no zombies.  Earlier problems with version 7 before were due to incorrect build procedure on my part.  

Version 8 leaves zombies in all circumstances where restarts are made; that is, first process keeps running after restart, leaving multiple instances running.

Version 7 works well under Zeta.  Will now build under R5 and BONE to complete validation.


Well, problem seems closer to solving.
When it does, we can open another bug - "Implement Updater for BeOS". As kind of very-futuristic TODO.
Because updater logic also uses execv for non-Mac platforms:
http://lxr.mozilla.org/seamonkey/source/toolkit/xre/nsUpdateDriver.cpp#70
but i'm afraid for execv wouldn't work for us as expected also in Updater case.
So experience may be really worth all those 8 attempts:)
Comment on attachment 222779 [details] [diff] [review]
patch with wait_for

Obsoleting, as it was created mostly to find reasons for weird behaviour with tigerdog's builds.And that reason was wrong building sequence.
Attachment #222779 - Attachment is obsolete: true
Additional observations:
Patch 7 works as desired under R5, BONE and Zeta.  Starts with no profile, old profile and current profile are all as expected.  Restarts do not leave zombie copies of Firefox running.

One note: while zombies are not present, it looks like additional shell processes still show in BeOS Process Controller.
Blocks: 314196
For tqh

It seems there was method to preven zombie-mess and always to have single instance running without all that work - be-app>Quit() etc even WITH MultipleLaucnh flag.

For that following code in nsNativeAppSupport::Start() is needed:
nsNativeAppSupportBeOS::Start(PRBool *aResult) 
{
    NS_ENSURE_ARG(aResult);
    if (be_roster->IsRunning("application/x-vnd.Mozilla-Firefox"))
    {
    	//printf("Already Running");
    	*aResult = PR_FALSE;
    	return NS_OK;
    }

--------
Win32 uses something alike for that purpose and it works for us too.
http://lxr.mozilla.org/seamonkey/source/xpfe/components/startup/public/nsINativeAppSupport.idl#79
As you see, it is intended for command-line argument processing
Yes, but it's not pretty, and you'd have to add your own argv message-passing I think. I think windows has some kind of instance just for message-passing.
Summary: [BeOS]Workaround for restart issue → [BeOS]To implement correct restart in AppRunner for BeOS-platforms
Attached patch final patchSplinter Review
same as attachment 22276 [details] [diff] [review]: but with outcommented statements removed
first review? from port-developer.

Adds BeOS case in nsAppRunner::LaunchChild,
adds Quit code in nsNativeAppSupportBeOS
Attachment #222776 - Attachment is obsolete: true
Attachment #223325 - Flags: first-review?(thesuckiestemail)
Comment on attachment 223325 [details] [diff] [review]
final patch

r=thesuckiestemail@yahoo.se

You might want to clean up the whitespace after endif though.
Attachment #223325 - Flags: first-review?(thesuckiestemail) → first-review+
Comment on attachment 223325 [details] [diff] [review]
final patch

BeOS-changes only, but additional look if dodn't affect any other platfrom welcomed.
So asking for
second r?
Attachment #223325 - Flags: second-review?(benjamin)
Now it is important for all mozilla apps, SeaMonkey-trunk included.
Severity: normal → major
Blocks: 129411
Adding dependency for "suiterunner" SeaMonkey.
Blocks: suiterunner
Comment on attachment 223325 [details] [diff] [review]
final patch

I'll rubber-stamp this since I don't pretend to understand BeOS programming.
Attachment #223325 - Flags: second-review?(benjamin) → second-review+
Checking in mozilla/toolkit/xre/nsAppRunner.cpp;
/cvsroot/mozilla/toolkit/xre/nsAppRunner.cpp,v  <--  nsAppRunner.cpp
new revision: 1.138; previous revision: 1.137
done
Checking in mozilla/toolkit/xre/nsNativeAppSupportBeOS.cpp;
/cvsroot/mozilla/toolkit/xre/nsNativeAppSupportBeOS.cpp,v  <--  nsNativeAppSupportBeOS.cpp
new revision: 1.2; previous revision: 1.1
done 
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Re-opening; this definitely should be committed to 1.8 branch (for firefox 2.0) if possible.  Neils, please add this to the "beos 2.0 tracker".
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Added dependency on tracker. 

Tigerdog, as a sidenote: bugs don't generally get reopened if they require backporting (as a Mozilla convention). If you've got other bugs that you think I should prioritize, just add me to the CC list and leave a comment. 
Blocks: 311032
Comment on attachment 223325 [details] [diff] [review]
final patch

Requesting approval for the MOZILLA_1_8_BRANCH. 

This is a BeOS-only change and will not affect any other platforms.
Attachment #223325 - Flags: approval1.8.1?
Comment on attachment 223325 [details] [diff] [review]
final patch

>-#endif
>+#endif 

Please don't add extra whitespace, though.
Attachment #223325 - Flags: approval1.8.1? → approval1.8.1+
Checking in mozilla/toolkit/xre//nsAppRunner.cpp;
/cvsroot/mozilla/toolkit/xre/nsAppRunner.cpp,v  <--  nsAppRunner.cpp
new revision: 1.113.2.15; previous revision: 1.113.2.14
done
Checking in mozilla/toolkit/xre//nsNativeAppSupportBeOS.cpp;
/cvsroot/mozilla/toolkit/xre/nsNativeAppSupportBeOS.cpp,v  <--  nsNativeAppSupportBeOS.cpp
new revision: 1.1.8.1; previous revision: 1.1
done 
Keywords: fixed1.8.1
Closing bug
Status: REOPENED → RESOLVED
Closed: 15 years ago15 years ago
Resolution: --- → FIXED
Component: XRE Startup → Startup and Profile System
QA Contact: xre.startup → startup
You need to log in before you can comment on or make changes to this bug.