Closed Bug 36750 Opened 25 years ago Closed 24 years ago

Fix nsThreadPool to minimize threads

Tracking

()

Status:

RESOLVED FIXED

Milestone:

M18

People

(Reporter: warrensomebody, Assigned: dougt)

Details

(Keywords: embed, memory-footprint, Whiteboard: [nsbeta3-] fixed on trunk)

Attachments

(6 files)

dougt's patches for this (for reference - don't completely work) 25 years ago Warren Harris 5.27 KB, patch		Details \| Diff \| Splinter Review
my patches -- refcount right, but threads hang around waiting for final join 25 years ago Warren Harris 4.67 KB, patch		Details \| Diff \| Splinter Review
Proposed fix - 1 24 years ago Doug Turner (:dougt) 16.37 KB, patch		Details \| Diff \| Splinter Review
cleaned up diffs -- still have sync problem 24 years ago Warren Harris 18.18 KB, patch		Details \| Diff \| Splinter Review
Reviewed Patch 24 years ago Doug Turner (:dougt) 14.47 KB, patch		Details \| Diff \| Splinter Review
my final approved set of changes 24 years ago Warren Harris 18.17 KB, patch		Details \| Diff \| Splinter Review

Warren Harris

Reporter

Description

•

25 years ago

At the last minute, I had to comment out the code to allow threads to terminate from the thread pool code because they still hung around waiting to be joined with. This effectively leaked them. Here's a message from Wan-Teh about how we could possibly fix this: Subject: Re: clearing PR_JOINABLE_THREAD Date: Fri, 21 Apr 2000 09:05:53 -0700 From: wtc@netscape.com (Wan-Teh Chang) To: Warren Harris <warren@netscape.com> CC: Doug Turner <dougt@netscape.com> References: 1 Warren Harris wrote: > > Hi Wan-Teh, > > I've got a problem with nsThread.cpp that I'm hoping you can help me > solve. Doug and I have changed the implementation of nsThreadPool to > do a little better job managing the threads in the pool, letting the > count of threads grow up to a max, and shrink back down to a minimum > number when not needed. That's good! > However, the problem comes in because I've > made the threads be joinable so that we can synchronize with them when > the thread pool gets destroyed. However, for the threads that > terminate themselves (because they're no longer needed), there's no > way that I can tell to clear the joinable bit. This causes them to not > actually go away, but hang out waiting to be joined with even though > the thread pool has now forgotten about them. Is there some way to > clear the joinable bit? No. Once an NSPR thread is create joinable or unjoinable, there is no way to change that attribute. I'm not sure whether that's an oversight or a conscious design decision. Pthreads has the operation you want: pthread_detach. It's not hard to work around this limitation. You'd create all your threads PR_UNJOINABLE_THREAD. Then you use a thread count and a condition variable to synchronize with them when the thread pool gets destroyed, like this: void WorkerThread(void *arg) { .... // need to exit, either voluntarily or because the // thread pool is being destroyed. PR_Lock(thread_pool_lock); if (--thread_count == 0) { PR_NotifyCondVar(all_threads_exited_cv); } PR_Unlock(thread_pool_lock); } void TheThreadThatDestroysTheThreadPool(void *arg) { ... // signals all the threads in the pool to exit ... // synchronize with them PR_Lock(thread_pool_lock); // you might already be holding this lock when you signal the threads to exit while (thread_count) { PR_WaitCondVar(all_threads_exited_cv, PR_INTERVAL_NO_TIMEOUT); } PR_Unlock(thread_pool_lock); .... } I will take a look at your new thread pool code tonight. Wan-Teh Maybe Doug would be good enough to take this task on again and help me solve this problem. (BTW Doug, If you look for the #if 0 in nsThread.cpp you'll see how I was attempting to let threads go away. I don't think there was any need for the initial Wait with timeout that you had before -- if there are too many already, just let them go.)

Warren Harris

Reporter

Updated

•

25 years ago

Target Milestone: --- → M18

Warren Harris

Reporter

Comment 1

•

25 years ago

Doug: Here's the thread-pool bug.

Doug Turner (:dougt)

Assignee

Comment 2

•

25 years ago

looking at this.

Assignee: warren → dougt

Warren Harris

Reporter

Comment 3

•

25 years ago

Attached patch dougt's patches for this (for reference - don't completely work) — Details — Splinter Review

Warren Harris

Reporter

Comment 4

•

25 years ago

Attached patch my patches -- refcount right, but threads hang around waiting for final join — Details — Splinter Review

Judson Valeski

Updated

•

25 years ago

Keywords: embed, footprint, nsbeta3

Whiteboard: nsbeta3+

Warren Harris

Reporter

Updated

•

25 years ago

Whiteboard: nsbeta3+ → [nsbeta3+]

Doug Turner (:dougt)

Assignee

Comment 5

•

24 years ago

wtc, what are the chances that we add a detach function to nspr?

Wan-Teh Chang

Comment 6

•

24 years ago

I don't have time to do that. Sorry.

Judson Valeski

Comment 7

•

24 years ago

minusing per PDT rules.

Whiteboard: [nsbeta3+] → [nsbeta3-]

Doug Turner (:dougt)

Assignee

Comment 8

•

24 years ago

I have almost fixed this in my tree. It will reduce running footprint by reducing thread count by 75 percent (eg there are usually 4 running file transport thread all the time. I will reduce this to one) If you startup FTP, you will get another four. This should be fixed, if not for seamonkey, for mozilla and other clients.

Whiteboard: [nsbeta3-] → [nsbeta3+]

Doug Turner (:dougt)

Assignee

Comment 9

•

24 years ago

Attached patch Proposed fix - 1 — Details — Splinter Review

Warren Harris

Reporter

Comment 10

•

24 years ago

Doug: Brendan's comments not withstanding, we're going to have to get together to talk through this change like we did before. I want us both to convince ourselves that this fix really works (and I'm going to need your help). How does Thurs look?

Brendan Eich [:brendan]

Comment 11

•

24 years ago

Warren, I haven't given a= yet, and I'd like to see a new patch attached based on my comments, but I think dougt is heading in the right direction, if not already there. /be

Warren Harris

Reporter

Comment 12

•

24 years ago

Doug: I'll try your patch tonight. However, GetThreadPrivate/SetThreadPrivate still bug me. I don't think they're methods that should go on the interface class since there's no way for 2 clients to coordinate who owns the thread private data. Nspr has a concept that TLS has a key that's used for this. I think these methods should go on the implementation class.

Warren Harris

Reporter

Comment 13

•

24 years ago

Sorry, I meant GetPrivateData/SetPrivateData. And I don't even see where these get used anywhere.

Warren Harris

Reporter

Comment 14

•

24 years ago

Subject: Re: thread pool diff Date: Tue, 26 Sep 2000 00:53:11 -0700 From: Warren Harris <warren@netscape.com> To: dougt@netscape.com CC: valeski@netscape.com References: 1 Ok, I reviewed this tonight some more, and I don't like it. A lot of the synchronization looks wrong to me (or removed). For instance, there's no synchronization when the thread starts up with the thread putting the event in the queue. I found this to be important in the past. Also, there are numerous places where you let go of the lock to do something, but don't check invariants when you re-enter. I'm not convinced this is safe. Let's try to get together again to talk this through. Tomorrow afternoon? (BTW, I'll post a slightly cleaned-up diff). Warren

Warren Harris

Reporter

Comment 15

•

24 years ago

Attached patch cleaned up diffs -- still have sync problem — Details — Splinter Review

lchiang

Comment 16

•

24 years ago

per email with Jud, changing nsbeta3+ to nsbeta3- on all "embed" keyword bugs since embedding changes will not be made in the mn6 branch. If you feel this bug fix needs to go into mn6 branch, please list the reasons/user impact/risk and nominate for rtm. Thanks.

QA Contact: leger → lchiang

Whiteboard: [nsbeta3+] → [nsbeta-]

Warren Harris

Reporter

Comment 17

•

24 years ago

I really think this should go into b3. Restoring +

Whiteboard: [nsbeta-] → [nsbeta3+]

lchiang

Comment 18

•

24 years ago

Ok, I'll send this to pdt mailing list for review.. PDT hasn't been looking at bugs with embed in the keyword since we've been leaving that up to embed team/Judson.

Peter Trudelle

Comment 19

•

24 years ago

Warren, why do you think this should go into nsbeta3 rather than rtm? How are users affected either way, and also, what is the risk involved?

Warren Harris

Reporter

Comment 20

•

24 years ago

Here's why I want to fix this: As it stands, we have a parameter that controls the min and max number of threads that deal with file i/o. But this is broken, and we always hang out with the max number of threads (currently 4). This means that usually there are 4 threads sitting idle, taking up memory and holding references to objects (e.g. file transport objects and their associated urls, etc.). What's worse is that if the user goes to download a file, a file fetcher thread gets employed, and if the user downloads more than 4 things, we stall because we don't want to allow more than 4 threads. If we can fix this bug, we'd be able to set the min threads down to 1, so that usually there was only one file fetcher hanging around, and set the max to a large number so that the user effectively never runs into the threshold while downloading files.

Peter Trudelle

Comment 21

•

24 years ago

So the worst scenario is stalling when trying to download more than 4 files concurrently? That doesn't sound like a beta-stopper to me.

Warren Harris

Reporter

Comment 22

•

24 years ago

I think it's pretty important to fix.

Peter Trudelle

Comment 23

•

24 years ago

Yes, you've said that. PDT is trying to understand why you it should go into the branch rather than just rtm, which at this point is happening to very few bugs.

Jim Roskind

Comment 24

•

24 years ago

Reading the discussion about the fix, I think it is too late and too risky for PR3 (hopefully finalized on Thursday of this week). We need a clear unedrstanding of what the restriction is (no more than 4 downloads at once would be a big restriction IMO), or the size of a leak that might be induced by the thread never being destroyed. That info would help us weigh the risk (when you think you really have this nailed) against the benefit, to try to make the RTM call. For now, I've nominated for rtm, but taken it off the PR3 radar (minus for nsbeta3).

Keywords: rtm

Whiteboard: [nsbeta3+] → [nsbeta3-]

Doug Turner (:dougt)

Assignee

Comment 25

•

24 years ago

Attached patch Reviewed Patch — Details — Splinter Review

Doug Turner (:dougt)

Assignee

Comment 26

•

24 years ago

the last attachment is unified diff that wtc warren and I reviewed.

Warren Harris

Reporter

Comment 27

•

24 years ago

Attached patch my final approved set of changes — Details — Splinter Review

Warren Harris

Reporter

Comment 28

•

24 years ago

I attached some slightly cleaned up diffs. I've run with them and they work great. I even upped the file transport count to 16 (not included in the diff) and found that on startup, we can have up to 14 threads running in parallel. They promptly drop back down, grow to 7, drop again -- pretty good. I don't know if we want to allow 14 threads at once -- I'll have to do some timing analysis to see if it's a net win. For file transfers (and ftp downloads), I had other problems that prevented me from determining whether this helped that or not. See bug 54648 for the ftp problems.

Doug Turner (:dougt)

Assignee

Comment 29

•

24 years ago

Okay. I will check these into the TRUNK today given that warren has review and brendan as approved. Warren, you are going to have to fight pdt to get this one into the branch.

Doug Turner (:dougt)

Assignee

Comment 30

•

24 years ago

Fix checked in: Checking in nsThread.cpp; /cvsroot/mozilla/xpcom/threads/nsThread.cpp,v <-- nsThread.cpp new revision: 1.27; previous revision: 1.26 done Checking in nsThread.h; /cvsroot/mozilla/xpcom/threads/nsThread.h,v <-- nsThread.h new revision: 1.16; previous revision: 1.15 done

Status: NEW → RESOLVED

Closed: 24 years ago

Resolution: --- → FIXED

lchiang

Comment 31

•

24 years ago

If you want this on PDT's radar to review for the branch, I'm going to reopen this and the module/area owner needs to mark rtm+. This way, PDT will review and this bug doesn't get lost. I'll add FIXED IN TRUNK to the status whiteboard.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

lchiang

Updated

•

24 years ago

Whiteboard: [nsbeta3-] → [nsbeta3-] fixed on trunk

Scott MacGregor

Comment 32

•

24 years ago

Just out of curiousity, are we going to be introducing any negative performance ramifications for the Mac with these changes? I know the simulated threads on the Mac are pretty expensive right? It sounds like we used to have a max of 4 running threads (at least on startup) and after reading Warren's comments I thought he said we could exceed that number. Or do the diffs still place a limit on the total # of threads and Warren is just experimenting with increasing that number? Just curious...

Doug Turner (:dougt)

Assignee

Comment 33

•

24 years ago

Marking as fixed. this is not going to be in n6. mscott, the threads in the file thread pool will allocate up to four threads at startup. http://lxr.mozilla.org/seamonkey/source/netwerk/base/src/nsFileTransportService.h#31

Status: REOPENED → RESOLVED

Closed: 24 years ago → 24 years ago

Resolution: --- → FIXED

Scott MacGregor

Comment 34

•

24 years ago

Cool! I've been running with this patch for a week or two now and it's looked good so far. Nice job guys.

lchiang

Comment 35

•

24 years ago

If this is not going to be in NS6, please mark this rtm-.

QA Contact: lchiang → asa

You need to log in before you can comment on or make changes to this bug.