Closed Bug 17934 Opened 26 years ago Closed 25 years ago

timing bug introduced in plevent in 3.5

Categories

(SeaMonkey :: General, defect, P3)

All
Linux

Tracking

(Not tracked)

RESOLVED WONTFIX
Future

People

(Reporter: colin, Assigned: dougt)

Details

Attachments

(1 file)

I know the PLEvent stuff is not part of NSPR anymore, but its still in the NSPR tree and so I'm taking the liberty of putting the bug report into NSPR. Please reassign to the appropriate Mozilla folk. I've run into a problem with NSPR 3.5 while trying to get M11 up and running on OpenVMS. This is a problem that has been introduced in 3.5. Background. I had some timing problems before on OpenVMS. Sometimes I would go to read the char from the notify pipe, only to find it wasn't there yet. Under this circumstance, it was important NOT to decrement notifyCount, otherwise it would get out of whack wrt the number of events outstanding and the number of bytes in the pipe. This fix made it into an earlier version of NSPR. Now I have a new problem which occurs under the same circumstances. In this situation, we now do NOT read the char, we do NOT decrement notifyCount, but we still return a successful status. GetEvent sees the good status, and pulls the next event off the queue. Now the number of pending events is less than notifyCount (and the number of bytes in the pipe). This inconsistency didn't matter in pre-3.5, but it does matter in 3.5 due to a change that has been made. Assume the above has just happened and so the number of queued events is 0 but notifyCount is 1 and there's one byte in the pipe. In pre-3.5, next time in to GetEvent we would call AcknowledgeNativeNotify. This would cause the byte to be read from the pipe and notifyCount to be decremented. Back in GetEvent we go to pull the event off the queue only to find there isn't one. No problem, we just continue on our merry way HAVING JUST CORRECTED the events/notifyCount mismatch problem. In 3.5, GetEvent has been changed and we only call AcknowledgeNativeNotify if the event queue has something in it. So in the above situation, we never call AcknowledgeNativeNotify and so notifyCount stays at 1 and the byte remains in the pipe. In the idle loop, the select keeps firing for the pipe fd because there's a byte in there, but GetEvent will never read the pipe because there's no events in the queue. Result is lots of CPU being used but we get nowhere. I see two ways to fix this. 1. Have AcknowledgeNativeNotify return false if it didn't read the byte from the pipe. This will cause GetEvent to stall until the byte really is ready to be read, and prevent the problem from occurring. 2. Put some code in GetEvent to detect when the queue is empty but notifyCount is non-zero, and then suck on the pipe until the situation is cleared. Option 2 sound like a lot more work and needs some very careful synchronization. Option 1 seems like the way to go. I have tried it here on OpenVMS and it does indeed fix the problem. Colin.
Assignee: srinivas → travis
Component: NSPR → Browser-General
Product: NSPR → Browser
Version: 3.5 → other
Reassigned the bug to travis.
Assignee: travis → troy
Troy, seeing as you made these changes, I am forwarding this over to you.
Assignee: troy → travis
No, I did not make changes to GetEvent()...
Attached patch patch fileSplinter Review
I just attached a patch file for option 1.
Target Milestone: M13
Travis, is this really our bug?
I own this in the future. However I don't know anything about the current PLEvent stuff nor do I have time to really try going into it with the current WebShell work.
Target Milestone: M13 → M15
Move to M15.
Keywords: patch
Move to M16 for now ...
Target Milestone: M15 → M16
Is this still a bug/ problem?
BTW, I don't own this code. I'm not sure who to give it to if it is still a problem.
I personally believe that this is still a problem for UNIX systems, but the timing is such that the bug is never revealed. It should still be fixed though! Its not a problem for VMS any more (which used to use the UNIX code path in plevent) because I wrote a new native notification method for VMS, and therefore I no longer use the buggy code.
OS: OpenVMS → Linux
Hardware: DEC → All
M16 has been out for a while now, these bugs target milestones need to be updated.
taking ownership for now. will look at later this week.
Assignee: travis → dougt
colin is this a blocker for you? I have changes that I am working on that will remove only post notification once instead of per event. It will make this diff mute. I am not sure that my patch will make nsbeta3.
This is no longer a blocker for me. See above, I wrote a new native notification method for VMS.
futured.
Target Milestone: M16 → Future
WONTFIX.
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → WONTFIX
Product: Browser → Seamonkey
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: