Closed Bug 433005 Opened 17 years ago Closed 17 years ago

Running a process synchronously within a thread freezes Firefox 3 during process execution

Categories

(Core :: XPConnect, defect)

defect
Not set
critical

Tracking

()

VERIFIED FIXED

People

(Reporter: michel.gutierrez, Assigned: jst)

References

Details

(Keywords: regression, Whiteboard: [RC2+])

Attachments

(2 files, 3 obsolete files)

User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14 Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9b4) Gecko/2008030318 Firefox/3.0b4 From an extension, you want to run an external application and get notified when this application ends. The way to do so is: - to create a thread (using an instance of @mozilla.org/thread;1 on Firefox 2, or the thread manager on Firefox 3) - from within that thread, create an nsIProcess and run it in synchronous mode (myprocess.run(true,...)) - when myprocess.run() returns, the application is finished and you access the exitValue of the process This works just fine on Firefox 2. On Firefox 3, the browser UI gets frozen during the execution of the application. This is a shame as this problem prevents having extensions doing the glue between Firefox and great third-party applications. Reproducible: Always Steps to Reproduce: 1. Install the TestProcess extension (attached separately) 2. Go to menu Tools, choose 'Test process' 3. Click OK 4. (an external application runs for 20 seconds) 4.1 try to open Firefox menus to verify browser responsiveness 5. (after 20 seconds, an alert box appears notifying the end of the external application) Actual Results: On Firefox 2, no problem. On Firefox 3, it is not possible to do anything with the browser at step 4.1 Expected Results: Firefox 3 should behave like Firefox 2 and not freeze the browser during external application running The problem has been seen on Linux and Windows XP. This bug is a show stopper for running the Video DownloadHelper extension v3.1 (that integrates video conversion through ffmpeg).
Once installed, the extension add an entry 'Test Process' into the 'Tools' menu. Choosing this menu entry, a dialog window appears to confirm the execution of an embedded binary. The extension embeds binaries for windows and linux/i686. The source code of this binary is: #ifdef WIN32 #include "windows.h" #endif main() { #ifdef WIN32 Sleep(20000); #else sleep(20); #endif return 123; }
This is the source code (all contained in the extension overlay) for extension TestProcess.
It happens also on Firefox 3.0rc1
Version: unspecified → 3.0 Branch
From mozilla.dev.platform, the stack of the two relevant deadlocking threads are: ntdll!KiFastSystemCallRet ntdll!ZwWaitForSingleObject+0xc kernel32!WaitForSingleObjectEx+0xa8 kernel32!WaitForSingleObject+0x12 nspr4!_PR_MD_WAIT_CV+0xc9 nspr4!_PR_WaitCondVar+0x58 nspr4!PR_WaitCondVar+0x3b js3250!ClaimTitle+0x1afde js3250!js_LookupPropertyWithFlags+0x898 ... XRE_main ntdll!KiFastSystemCallRet ntdll!ZwWaitForSingleObject+0xc kernel32!WaitForSingleObjectEx+0xa8 kernel32!WaitForSingleObject+0x12 xul!nsProcess::Run+0xe2 xul!NS_InvokeByIndex_P+0x27 xul!XPCWrappedNative::CallMethod+0x4cb js3250!js_NewGCThing+0x205 js3250!js_DefineNativeProperty+0x2ff js3250!js_DefineProperty+0x31 xul!DefinePropertyIfFound+0x22c js3250!js_Invoke+0x2bb js3250!js_Interpret+0x344 js3250!js_Invoke+0x37e xul!nsXPCWrappedJSClass::CallMethod+0x5c0 xul!nsXPCWrappedJS::CallMethod+0x38 xul!PrepareAndDispatch+0xe7 xul!SharedStub+0x16 xul!nsThread::ProcessNextEvent+0x218 xul!nsThread::ThreadFunc+0x71 nspr4!_PR_NativeRunThread+0x169 nspr4!pr_root+0xd MOZCRT19!_callthreadstartex+0x48 MOZCRT19!_threadstartex+0x66 kernel32!BaseThreadStart+0x37 What's happening is that we're making a blocking JS->XPCOM call off the main thread. In gecko 1.8 we would suspend JS requests on all such calls. We removed the suspend-request code in bug 413774, so we're blocking on a GC safepoint in the main thread which won't happen until the synchronous XPCOM call is finished. Not sure whether this is worth blocking on... the simple way I can think to do this is to suspend requests around XPCOM calls only if they happen off the main thread (since the main thread isn't supposed to block, we don't need to suspend on the main thread). Brendan, thoughts?
Status: UNCONFIRMED → NEW
Component: General → XPConnect
Ever confirmed: true
Flags: wanted1.9.0.x?
Keywords: regression
Product: Firefox → Core
QA Contact: general → xpconnect
Version: 3.0 Branch → Trunk
Blocks: 413774
Flags: blocking1.9?
Mook notes that you aren't allowed to use multithreaded code from a DOM script: only from JS components. So as written, this testcase is invalid. If it were reproducable with a JS component, we should renominate somehow.
Flags: wanted1.9.0.x?
Flags: blocking1.9?
nsProcess::Run could (with some callbacks to avoid out-of-build-order compile-time dependencies) suspend and resume in the blocking case. That would solve the problem too, but bsmedberg's suggestion of non-main XPConnect calls auto-suspending seems like the safe fix for 1.9. This is a bad incompatibility, sorry I didn't see it when reviewing jst's patch for bug 413774. No workaround, so critical not major. Pretty sure this will repro nicely from a JS component -- why wouldn't it? /be
Severity: major → critical
Whiteboard: [RC2?]
Flags: wanted1.9.0.x?
accessing objects from the dom global off thread should trigger their own deadlocks. but yes, we should just turn autosuspend back on for non main thread instances.
Brendan, are you suggesting we spin an RC2 for this?
To make sure the testcase is valid and the issue is not due to doing multithreading at the DOM side, it is now implemented as a component. The implemented service provides 3 non-blocking functions: void startProcess(in string exePath); boolean isProcessFinished(); PRInt32 getProcessExitValue(); The DOM side calls startProcess and polls the service on isProcessFinished.
Attachment #320194 - Attachment is obsolete: true
Attachment #320195 - Attachment is obsolete: true
This restores the request-suspension off the main thread. Not going to request review until this is built and tested, and I need to write a unit test.
Assignee: nobody → benjamin
Status: NEW → ASSIGNED
I may have the wrong status notation -- I'm suggesting this could cause RC2 (it's a bad platform incompatibility), and at least should ride along into any RC2 that is caused by other bugs. /be
I confirm Benjamin's patch solves the issue on the testcase extension. Many thanks :)
Benjamin is there a way to add test coverage for this? Can you give us a sense of the regression risk?
Attachment #321831 - Flags: superreview?(brendan)
Attachment #321831 - Flags: review+
Depends on: 435090
Comment on attachment 321831 [details] [diff] [review] Restore request suspension off the main thread Don't want to ding DOM perf -- or really, want to add as few cycles as possible. So instead of NS_IsMainThread(), compare cx->thread to the JSThread* stashed for the main thread in a static of XPCCallContext. We're all close friends in xpcprivate.h... /be
Could just use C++ class friendship, I mean. /be
I'm unable to test this atm, so others please test before landing...
Attachment #322064 - Flags: superreview?(brendan)
Attachment #322064 - Flags: review+
Attachment #321831 - Attachment is obsolete: true
Attachment #321831 - Flags: superreview?(brendan)
Comment on attachment 322064 [details] [diff] [review] bsmedbergs fix with brendan's suggestion. [Checkin: Comment 27] Sure, if this tests well and does not regress perf. /be
Attachment #322064 - Flags: superreview?(brendan) → superreview+
Hi, while trying to get my "External Editor" thunderbird extension ready for TB3 with the new thread API, I fall on a bug which might be this one. But not sure: Situation: When I launch a blocking task in a background thread, the GUI hangs until the background thread exits (TB3 alpha1) How to reproduce I've simplified and finally came to this simple example which I launch as a one-liner directly in the console. This hangs the GUI once for all after a few seconds. function myEvent(func) { this.run = func; } myEvent.prototype = { QueryInterface : function(iid) { if (iid.equals(Components.interfaces.nsIRunnable) || iid.equals(Components.interfaces.nsISupports)) { return this; } throw Components.results.NS_ERROR_NO_INTERFACE; } }; tm = Components.classes["@mozilla.org/thread-manager;1"].getService(); thread = tm.newThread(0); runable = new myEvent(function() {for (var i = 0; i<= 100; i++) { Math.cos(i); }}); // consume some time... thread.dispatch(runable, this.thread.DISPATCH_NORMAL); I'll let some mozilla experts open a new bug if this is something different. Thanks
Let's get a try server build with this, so that _somebody_ can test it?
Whiteboard: [RC2?] → [RC2?][RC2+]
Comment on attachment 322064 [details] [diff] [review] bsmedbergs fix with brendan's suggestion. [Checkin: Comment 27] a+ schrep for 3.0.1 or RC2. Please land on CVS trunk.
Attachment #322064 - Flags: approval1.9+
FWIW, try server builds available here (mac and linux only, windows failed due to tryserver issues): https://build.mozilla.org/tryserver-builds/2008-05-22_15:57-jst@mozilla.com-sync-process/
I can't seem to write an xpcshell testcase that won't assert due to threadsafety issues... and since I can't find a way to add custom components for just one xpcshell test, I'm going to punt on that for the time being.
Assignee: benjamin → jst
Status: ASSIGNED → NEW
After conversion from .FLV to .MPEG4 I got a message that the file couldn't be read by Quicktime... It always happens. Any help on this issue?
I am running Firefox 2.0.0.14 and the converter works well.
Oscar, Rob, this is not the right place for DH related stuff. I reply to you directly by mail.
me too-please help
Status: NEW → ASSIGNED
Hardware: PC → All
Whiteboard: [RC2?][RC2+] → [RC2+]
Flags: blocking1.9+
Fix checked in on the CVS trunk.
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Attachment #322064 - Attachment description: bsmedbergs fix with brendan's suggestion. → bsmedbergs fix with brendan's suggestion. [Checkin: Comment 27]
Verified with RC2 on Linux (Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9) Gecko/2008052912 Firefox/3.0). I get a "Process exited with value 123" alert after the extension runs for a while and no issues with responsiveness in menus and the browser in general while it is running.
Status: RESOLVED → VERIFIED
Hi All ! I have problem like described in this bug I was made extension for FF 1.5 / 2.x and it works fine. But in FF 3, i got the next trouble : When my thread is running, the FF UI is frozen, here is part of code which creates thread and do the some job within thread void MyExtension::Start() { ... if (NS_FAILED(threadManager->NewThread(0, getter_AddRefs(m_WorkingThread)))) { throw; } m_WorkingThread->Dispatch(this, nsIThread::DISPATCH_NORMAL); ... } void MyExtension::OnDataReceived(void* p) { SetEvent(((MyExtension*)p)->m_ThreadEvent); } NS_IMETHODIMP MyExtension::Run() { WCHAR szName[256]; memset(szName, 0, sizeof(szName)); _ltow((long)this, szName, 10); m_ThreadEvent = CreateEvent(NULL, FALSE, FALSE, szName); ResetEvent(m_ThreadEvent); m_pProcessor->AttachCallback(this, &Handler::OnDataReceived); m_sMime.Assign(m_pProcessor->Start()); UINT32 nTimeoutCount = 0; while (WaitForSingleObject(m_ThreadEvent, 1000) == WAIT_TIMEOUT) { nTimeoutCount++; if (nTimeoutCount == 120) return NS_ERROR_FAILURE; } m_pProcessor->Finish(); m_pProcessor->GetDataAndClear((void**)&m_pData, &m_DataLength); delete m_pProcessor; m_pProcessor = NULL; return NS_OK; } what's wrong here ?
Flags: wanted1.9.0.x?
(In reply to comment #29) > مرحبا جميعا! لدي مشكلة وصفها في مثل هذا الخطأ الأول تم التمديد ل1،5 فرنك فرنسي / 2.x وأنه يعمل بشكل جيد. ولكن في 3 فرنك فرنسي ، وحصلت المشكلة التالية : عند موضوع بلدي قيد التشغيل ، واجهة المستخدم المجمدة هو فرنك فرنسي ، وهنا جزء من التعليمات البرمجية التي تخلق الموضوع والقيام بهذه المهمة بعض الفراغ في موضوع MyExtension : : ابدأ () {... وإذا (NS_FAILED (threadManager -> NewThread (0 ، getter_AddRefs (m_WorkingThread)))) {رمي ؛} m_WorkingThread -> الشحن (هذا ، nsIThread : DISPATCH_NORMAL) ؛}... باطل MyExtension : OnDataReceived (الفراغ * ع) {SetEvent (((MyExtension *) ع) --> m_ThreadEvent) ؛} NS_IMETHODIMP MyExtension : : تشغيل () {WCHAR szName [256] ؛ والعكس شريط ادوات (szName ، 0 ، sizeof (szName)) ؛ _ltow ((طويلة) هذا szName ، ، 10 ) ؛ m_ThreadEvent = CreateEvent (فارغة ، كاذبة ، كاذبة ، szName) ؛ ResetEvent (m_ThreadEvent) ؛ m_pProcessor -> AttachCallback (وهذا ، ومعالج : OnDataReceived) ؛ m_sMime.Assign (m_pProcessor -> ابدأ ()) ؛ nTimeoutCount UInt32 = 0 ؛ في حين أن (WaitForSingleObject (m_ThreadEvent ، 1000) == WAIT_TIMEOUT) {nTimeoutCount + + ، وإذا كان (nTimeoutCount == 120) NS_ERROR_FAILURE العودة ؛} m_pProcessor -> إنهاء () ؛ m_pProcessor -> GetDataAndClear ((الفراغ **) وm_pData ، وm_DataLength) ؛ حذف m_pProcessor ؛ العودة ؛ NS_OK ؛ = m_pProcessor فارغة} ما هو الخطأ هنا؟
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: