Closed
Bug 433005
Opened 17 years ago
Closed 17 years ago
Running a process synchronously within a thread freezes Firefox 3 during process execution
Categories
(Core :: XPConnect, defect)
Core
XPConnect
Tracking
()
VERIFIED
FIXED
People
(Reporter: michel.gutierrez, Assigned: jst)
References
Details
(Keywords: regression, Whiteboard: [RC2+])
Attachments
(2 files, 3 obsolete files)
11.72 KB,
application/x-xpinstall
|
Details | |
2.70 KB,
patch
|
jst
:
review+
brendan
:
superreview+
mtschrep
:
approval1.9+
|
Details | Diff | Splinter Review |
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9b4) Gecko/2008030318 Firefox/3.0b4
From an extension, you want to run an external application and get notified when this application ends.
The way to do so is:
- to create a thread (using an instance of @mozilla.org/thread;1 on Firefox 2, or the thread manager on Firefox 3)
- from within that thread, create an nsIProcess and run it in synchronous mode (myprocess.run(true,...))
- when myprocess.run() returns, the application is finished and you access the exitValue of the process
This works just fine on Firefox 2.
On Firefox 3, the browser UI gets frozen during the execution of the application.
This is a shame as this problem prevents having extensions doing the glue between Firefox and great third-party applications.
Reproducible: Always
Steps to Reproduce:
1. Install the TestProcess extension (attached separately)
2. Go to menu Tools, choose 'Test process'
3. Click OK
4. (an external application runs for 20 seconds)
4.1 try to open Firefox menus to verify browser responsiveness
5. (after 20 seconds, an alert box appears notifying the end of the external application)
Actual Results:
On Firefox 2, no problem.
On Firefox 3, it is not possible to do anything with the browser at step 4.1
Expected Results:
Firefox 3 should behave like Firefox 2 and not freeze the browser during external application running
The problem has been seen on Linux and Windows XP.
This bug is a show stopper for running the Video DownloadHelper extension v3.1 (that integrates video conversion through ffmpeg).
Reporter | ||
Comment 1•17 years ago
|
||
Once installed, the extension add an entry 'Test Process' into the 'Tools' menu.
Choosing this menu entry, a dialog window appears to confirm the execution of an embedded binary.
The extension embeds binaries for windows and linux/i686. The source code of this binary is:
#ifdef WIN32
#include "windows.h"
#endif
main() {
#ifdef WIN32
Sleep(20000);
#else
sleep(20);
#endif
return 123;
}
Reporter | ||
Comment 2•17 years ago
|
||
This is the source code (all contained in the extension overlay) for extension TestProcess.
Reporter | ||
Comment 3•17 years ago
|
||
It happens also on Firefox 3.0rc1
Version: unspecified → 3.0 Branch
Comment 4•17 years ago
|
||
From mozilla.dev.platform, the stack of the two relevant deadlocking threads are:
ntdll!KiFastSystemCallRet
ntdll!ZwWaitForSingleObject+0xc
kernel32!WaitForSingleObjectEx+0xa8
kernel32!WaitForSingleObject+0x12
nspr4!_PR_MD_WAIT_CV+0xc9
nspr4!_PR_WaitCondVar+0x58
nspr4!PR_WaitCondVar+0x3b
js3250!ClaimTitle+0x1afde
js3250!js_LookupPropertyWithFlags+0x898
...
XRE_main
ntdll!KiFastSystemCallRet
ntdll!ZwWaitForSingleObject+0xc
kernel32!WaitForSingleObjectEx+0xa8
kernel32!WaitForSingleObject+0x12
xul!nsProcess::Run+0xe2
xul!NS_InvokeByIndex_P+0x27
xul!XPCWrappedNative::CallMethod+0x4cb
js3250!js_NewGCThing+0x205
js3250!js_DefineNativeProperty+0x2ff
js3250!js_DefineProperty+0x31
xul!DefinePropertyIfFound+0x22c
js3250!js_Invoke+0x2bb
js3250!js_Interpret+0x344
js3250!js_Invoke+0x37e
xul!nsXPCWrappedJSClass::CallMethod+0x5c0
xul!nsXPCWrappedJS::CallMethod+0x38
xul!PrepareAndDispatch+0xe7
xul!SharedStub+0x16
xul!nsThread::ProcessNextEvent+0x218
xul!nsThread::ThreadFunc+0x71
nspr4!_PR_NativeRunThread+0x169
nspr4!pr_root+0xd
MOZCRT19!_callthreadstartex+0x48
MOZCRT19!_threadstartex+0x66
kernel32!BaseThreadStart+0x37
What's happening is that we're making a blocking JS->XPCOM call off the main thread. In gecko 1.8 we would suspend JS requests on all such calls. We removed the suspend-request code in bug 413774, so we're blocking on a GC safepoint in the main thread which won't happen until the synchronous XPCOM call is finished.
Not sure whether this is worth blocking on... the simple way I can think to do this is to suspend requests around XPCOM calls only if they happen off the main thread (since the main thread isn't supposed to block, we don't need to suspend on the main thread). Brendan, thoughts?
Status: UNCONFIRMED → NEW
Component: General → XPConnect
Ever confirmed: true
Flags: wanted1.9.0.x?
Keywords: regression
Product: Firefox → Core
QA Contact: general → xpconnect
Version: 3.0 Branch → Trunk
Comment 5•17 years ago
|
||
Mook notes that you aren't allowed to use multithreaded code from a DOM script: only from JS components. So as written, this testcase is invalid.
If it were reproducable with a JS component, we should renominate somehow.
Flags: wanted1.9.0.x?
Flags: blocking1.9?
Comment 6•17 years ago
|
||
nsProcess::Run could (with some callbacks to avoid out-of-build-order compile-time dependencies) suspend and resume in the blocking case. That would solve the problem too, but bsmedberg's suggestion of non-main XPConnect calls auto-suspending seems like the safe fix for 1.9.
This is a bad incompatibility, sorry I didn't see it when reviewing jst's patch for bug 413774.
No workaround, so critical not major.
Pretty sure this will repro nicely from a JS component -- why wouldn't it?
/be
Severity: major → critical
Whiteboard: [RC2?]
Updated•17 years ago
|
Flags: wanted1.9.0.x?
accessing objects from the dom global off thread should trigger their own deadlocks. but yes, we should just turn autosuspend back on for non main thread instances.
Comment 8•17 years ago
|
||
Brendan, are you suggesting we spin an RC2 for this?
Reporter | ||
Comment 9•17 years ago
|
||
To make sure the testcase is valid and the issue is not due to doing multithreading at the DOM side, it is now implemented as a component.
The implemented service provides 3 non-blocking functions:
void startProcess(in string exePath);
boolean isProcessFinished();
PRInt32 getProcessExitValue();
The DOM side calls startProcess and polls the service on isProcessFinished.
Attachment #320194 -
Attachment is obsolete: true
Attachment #320195 -
Attachment is obsolete: true
Comment 10•17 years ago
|
||
This restores the request-suspension off the main thread. Not going to request review until this is built and tested, and I need to write a unit test.
Assignee: nobody → benjamin
Status: NEW → ASSIGNED
Comment 11•17 years ago
|
||
I may have the wrong status notation -- I'm suggesting this could cause RC2 (it's a bad platform incompatibility), and at least should ride along into any RC2 that is caused by other bugs.
/be
Reporter | ||
Comment 12•17 years ago
|
||
I confirm Benjamin's patch solves the issue on the testcase extension. Many thanks :)
Comment 13•17 years ago
|
||
Benjamin is there a way to add test coverage for this? Can you give us a sense of the regression risk?
Assignee | ||
Updated•17 years ago
|
Attachment #321831 -
Flags: superreview?(brendan)
Attachment #321831 -
Flags: review+
Comment 14•17 years ago
|
||
Comment on attachment 321831 [details] [diff] [review]
Restore request suspension off the main thread
Don't want to ding DOM perf -- or really, want to add as few cycles as possible. So instead of NS_IsMainThread(), compare cx->thread to the JSThread* stashed for the main thread in a static of XPCCallContext. We're all close friends in xpcprivate.h...
/be
Comment 15•17 years ago
|
||
Could just use C++ class friendship, I mean.
/be
Assignee | ||
Comment 16•17 years ago
|
||
I'm unable to test this atm, so others please test before landing...
Attachment #322064 -
Flags: superreview?(brendan)
Attachment #322064 -
Flags: review+
Assignee | ||
Updated•17 years ago
|
Attachment #321831 -
Attachment is obsolete: true
Attachment #321831 -
Flags: superreview?(brendan)
Comment 17•17 years ago
|
||
Comment on attachment 322064 [details] [diff] [review]
bsmedbergs fix with brendan's suggestion.
[Checkin: Comment 27]
Sure, if this tests well and does not regress perf.
/be
Attachment #322064 -
Flags: superreview?(brendan) → superreview+
Comment 18•17 years ago
|
||
Hi,
while trying to get my "External Editor" thunderbird extension ready for TB3 with the new thread API, I fall on a bug which might be this one. But not sure:
Situation:
When I launch a blocking task in a background thread, the GUI hangs until the background thread exits (TB3 alpha1)
How to reproduce
I've simplified and finally came to this simple example which I launch
as a one-liner directly in the console. This hangs the GUI once for
all after a few seconds.
function myEvent(func) {
this.run = func;
}
myEvent.prototype = {
QueryInterface : function(iid) {
if (iid.equals(Components.interfaces.nsIRunnable) ||
iid.equals(Components.interfaces.nsISupports)) {
return this;
}
throw Components.results.NS_ERROR_NO_INTERFACE;
}
};
tm = Components.classes["@mozilla.org/thread-manager;1"].getService();
thread = tm.newThread(0);
runable = new myEvent(function() {for (var i = 0; i<= 100; i++)
{ Math.cos(i); }}); // consume some time...
thread.dispatch(runable, this.thread.DISPATCH_NORMAL);
I'll let some mozilla experts open a new bug if this is something different.
Thanks
Comment 19•17 years ago
|
||
Let's get a try server build with this, so that _somebody_ can test it?
Updated•17 years ago
|
Whiteboard: [RC2?] → [RC2?][RC2+]
Comment 20•17 years ago
|
||
Comment on attachment 322064 [details] [diff] [review]
bsmedbergs fix with brendan's suggestion.
[Checkin: Comment 27]
a+ schrep for 3.0.1 or RC2. Please land on CVS trunk.
Attachment #322064 -
Flags: approval1.9+
Assignee | ||
Comment 21•17 years ago
|
||
FWIW, try server builds available here (mac and linux only, windows failed due to tryserver issues):
https://build.mozilla.org/tryserver-builds/2008-05-22_15:57-jst@mozilla.com-sync-process/
Comment 22•17 years ago
|
||
I can't seem to write an xpcshell testcase that won't assert due to threadsafety issues... and since I can't find a way to add custom components for just one xpcshell test, I'm going to punt on that for the time being.
Assignee: benjamin → jst
Status: ASSIGNED → NEW
Comment 23•17 years ago
|
||
After conversion from .FLV to .MPEG4 I got a message that the file couldn't be read by Quicktime... It always happens. Any help on this issue?
Comment 24•17 years ago
|
||
I am running Firefox 2.0.0.14 and the converter works well.
Reporter | ||
Comment 25•17 years ago
|
||
Oscar, Rob, this is not the right place for DH related stuff. I reply to you directly by mail.
Comment 26•17 years ago
|
||
me too-please help
Updated•17 years ago
|
Status: NEW → ASSIGNED
Hardware: PC → All
Updated•17 years ago
|
Keywords: checkin-needed
Updated•17 years ago
|
Whiteboard: [RC2?][RC2+] → [RC2+]
Updated•17 years ago
|
Flags: blocking1.9+
Assignee | ||
Comment 27•17 years ago
|
||
Fix checked in on the CVS trunk.
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Updated•17 years ago
|
Keywords: checkin-needed
Updated•17 years ago
|
Attachment #322064 -
Attachment description: bsmedbergs fix with brendan's suggestion. → bsmedbergs fix with brendan's suggestion.
[Checkin: Comment 27]
Comment 28•17 years ago
|
||
Verified with RC2 on Linux (Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9) Gecko/2008052912 Firefox/3.0). I get a "Process exited with value 123" alert after the extension runs for a while and no issues with responsiveness in menus and the browser in general while it is running.
Status: RESOLVED → VERIFIED
Comment 29•17 years ago
|
||
Hi All !
I have problem like described in this bug
I was made extension for FF 1.5 / 2.x and it works fine. But in FF 3, i got the next trouble :
When my thread is running, the FF UI is frozen, here is part of code which creates thread and do the some job within thread
void MyExtension::Start()
{
...
if (NS_FAILED(threadManager->NewThread(0, getter_AddRefs(m_WorkingThread))))
{
throw;
}
m_WorkingThread->Dispatch(this, nsIThread::DISPATCH_NORMAL);
...
}
void MyExtension::OnDataReceived(void* p)
{
SetEvent(((MyExtension*)p)->m_ThreadEvent);
}
NS_IMETHODIMP MyExtension::Run()
{
WCHAR szName[256];
memset(szName, 0, sizeof(szName));
_ltow((long)this, szName, 10);
m_ThreadEvent = CreateEvent(NULL, FALSE, FALSE, szName);
ResetEvent(m_ThreadEvent);
m_pProcessor->AttachCallback(this, &Handler::OnDataReceived);
m_sMime.Assign(m_pProcessor->Start());
UINT32 nTimeoutCount = 0;
while (WaitForSingleObject(m_ThreadEvent, 1000) == WAIT_TIMEOUT)
{
nTimeoutCount++;
if (nTimeoutCount == 120)
return NS_ERROR_FAILURE;
}
m_pProcessor->Finish();
m_pProcessor->GetDataAndClear((void**)&m_pData, &m_DataLength);
delete m_pProcessor;
m_pProcessor = NULL;
return NS_OK;
}
what's wrong here ?
Updated•16 years ago
|
Flags: wanted1.9.0.x?
Comment 30•14 years ago
|
||
(In reply to comment #29)
> مرحبا جميعا! لدي مشكلة وصفها في مثل هذا الخطأ الأول تم التمديد ل1،5 فرنك فرنسي / 2.x وأنه يعمل بشكل جيد. ولكن في 3 فرنك فرنسي ، وحصلت المشكلة التالية : عند موضوع بلدي قيد التشغيل ، واجهة المستخدم المجمدة هو فرنك فرنسي ، وهنا جزء من التعليمات البرمجية التي تخلق الموضوع والقيام بهذه المهمة بعض الفراغ في موضوع MyExtension : : ابدأ () {... وإذا (NS_FAILED (threadManager -> NewThread (0 ، getter_AddRefs (m_WorkingThread)))) {رمي ؛} m_WorkingThread -> الشحن (هذا ، nsIThread : DISPATCH_NORMAL) ؛}... باطل MyExtension : OnDataReceived (الفراغ * ع) {SetEvent (((MyExtension *) ع) --> m_ThreadEvent) ؛} NS_IMETHODIMP MyExtension : : تشغيل () {WCHAR szName [256] ؛ والعكس شريط ادوات (szName ، 0 ، sizeof (szName)) ؛ _ltow ((طويلة) هذا szName ، ، 10 ) ؛ m_ThreadEvent = CreateEvent (فارغة ، كاذبة ، كاذبة ، szName) ؛ ResetEvent (m_ThreadEvent) ؛ m_pProcessor -> AttachCallback (وهذا ، ومعالج : OnDataReceived) ؛ m_sMime.Assign (m_pProcessor -> ابدأ ()) ؛ nTimeoutCount UInt32 = 0 ؛ في حين أن (WaitForSingleObject (m_ThreadEvent ، 1000) == WAIT_TIMEOUT) {nTimeoutCount + + ، وإذا كان (nTimeoutCount == 120) NS_ERROR_FAILURE العودة ؛} m_pProcessor -> إنهاء () ؛ m_pProcessor -> GetDataAndClear ((الفراغ **) وm_pData ، وm_DataLength) ؛ حذف m_pProcessor ؛ العودة ؛ NS_OK ؛ = m_pProcessor فارغة} ما هو الخطأ هنا؟
You need to log in
before you can comment on or make changes to this bug.
Description
•