Closed
Bug 1133939
Opened 10 years ago
Closed 10 years ago
closing an nsPipeInputStream should delete unused segments from the nsPipe
Categories
(Core :: XPCOM, defect)
Core
XPCOM
Tracking
()
RESOLVED
FIXED
mozilla39
People
(Reporter: bkelly, Assigned: bkelly)
References
Details
Attachments
(2 files, 15 obsolete files)
30.06 KB,
patch
|
bkelly
:
review+
Sylvestre
:
approval-mozilla-aurora+
bkelly
:
checkin+
|
Details | Diff | Splinter Review |
8.56 KB,
patch
|
bkelly
:
review+
Sylvestre
:
approval-mozilla-aurora+
bkelly
:
checkin+
|
Details | Diff | Splinter Review |
After bug 1100398, the nsPipe must now check to see if any other input streams are still referencing a segment before deleting it. Once the slowest input stream finishes with a segment it can finally be deleted.
Currently the code only checks for this when advancing the read cursor from a ReadSegments(). There is, however, another way for a segment to become unused and ready for deletion. If the slowest input stream is closed, then it should check to see if any segments can be free'd.
If we don't check for segment deletion at close we could in theory deadlock a pipe. Consider a pipe were stream 1 has read to the end, but stream 2 has not read anything. The writer blocks because the pipe is full. Stream 2 is then closed, but its segments are not free'd. The writer will never be able to write again and stream 1 will be stuck at the end.
Assignee | ||
Comment 1•10 years ago
|
||
Initial patch. I need to add tests before submitting for review.
Assignee: nobody → bkelly
Status: NEW → ASSIGNED
Assignee | ||
Comment 2•10 years ago
|
||
Attachment #8565609 -
Attachment is obsolete: true
Attachment #8566053 -
Flags: review?(nfroyd)
Assignee | ||
Comment 3•10 years ago
|
||
Attachment #8566054 -
Flags: review?(nfroyd)
![]() |
||
Comment 4•10 years ago
|
||
Comment on attachment 8566053 [details] [diff] [review]
P1 Free buffer resources when an nsPipeInputStream is closed. (v1)
Review of attachment 8566053 [details] [diff] [review]:
-----------------------------------------------------------------
I think you have successfully moved code to AdvanceReadSegment, but I'm not sure about some of your other modifications.
::: xpcom/io/nsPipe3.cpp
@@ +291,5 @@
> nsresult GetReadSegment(const nsPipeReadState& aReadState,
> const char*& aSegment, uint32_t& aSegmentLen);
> void AdvanceReadCursor(nsPipeReadState& aReadState, uint32_t aCount,
> uint32_t* aAvailableOut);
> + bool AdvanceReadSegment(nsPipeReadState& aReadState);
I think this would be a prime candidate for returning a real enum.
@@ +610,5 @@
> +}
> +
> +void
> +nsPipe::DrainInputStream(nsPipeReadState& aReadState, nsPipeEvents& aEvents,
> + uint32_t* aAvailableOut)
aAvailableOut doesn't appear to be modified by this function.
@@ +621,5 @@
> + }
> +
> + // if we've free'd up a segment, notify output stream that pipe has
> + // room for a new segment.
> + if (true && mOutput.OnOutputWritable(aEvents)) {
Um, I assume this is supposed to be |modified && mOutput.OnOutputWritable(aEvents)|?
Attachment #8566053 -
Flags: review?(nfroyd) → feedback+
![]() |
||
Comment 5•10 years ago
|
||
Comment on attachment 8566054 [details] [diff] [review]
P2 Add tests validating nsPipeOutputStream AsyncWait behavior. (v0)
Review of attachment 8566054 [details] [diff] [review]:
-----------------------------------------------------------------
f+ partly because the implementation patch seemed to be incomplete, so I don't know whether the tests are actually good or not.
::: xpcom/tests/gtest/TestPipes.cpp
@@ +666,5 @@
> + nsCOMPtr<nsIAsyncInputStream> reader;
> + nsCOMPtr<nsIAsyncOutputStream> writer;
> +
> + uint32_t segmentSize = 1024;
> + uint32_t numSegments = 1;
Nit: |const| on these, please.
@@ +669,5 @@
> + uint32_t segmentSize = 1024;
> + uint32_t numSegments = 1;
> +
> + // Use async input streams so we can NS_ConsumeStream() the current data
> + // while the pipe is still being written to.
Um. I don't see tests where we intersperse NS_ConsumeStream calls with writing the data. Is this comment still relevant, or does it need to be rewritten?
@@ +672,5 @@
> + // Use async input streams so we can NS_ConsumeStream() the current data
> + // while the pipe is still being written to.
> + nsresult rv = NS_NewPipe2(getter_AddRefs(reader), getter_AddRefs(writer),
> + true, true, // non-blocking - reader, writer
> + segmentSize, numSegments);
Nit: indentation of these continuation lines looks funny.
@@ +704,5 @@
> + nsCOMPtr<nsIAsyncInputStream> reader;
> + nsCOMPtr<nsIAsyncOutputStream> writer;
> +
> + uint32_t segmentSize = 1024;
> + uint32_t numSegments = 1;
Same comments from previous test apply here and below. I'd like to figure out some way to refactor this, but I'm not sure it's worth it.
Attachment #8566054 -
Flags: review?(nfroyd) → feedback+
Assignee | ||
Comment 6•10 years ago
|
||
Thanks. I missed some things from when I was debugging another problem with this patch applied. Also, I found a logic bug that needed fixing after removing the 'true' in that conditional.
Attachment #8566682 -
Flags: review?(nfroyd)
Assignee | ||
Comment 7•10 years ago
|
||
Sorry, I don't have time to factor out the common parts of the test (unless you really require it). I'm trying to get this and a few other patches in before merge next week.
Attachment #8566689 -
Flags: review?(nfroyd)
Assignee | ||
Comment 8•10 years ago
|
||
Assignee | ||
Comment 9•10 years ago
|
||
I was so caught up in the fact I posted a patch with a hardcoded 'true', that I forgot to address the enum return value comment. I'm not convinced this really improves code readability, but here it is.
Attachment #8566925 -
Flags: review?(nfroyd)
Assignee | ||
Comment 10•10 years ago
|
||
This is another issue I noticed while converting the bool returns to enums.
Basically, if we only do Notify() in the cases where we want to wake up the writer stream, we run the risk of waking up a peer reader stream that's blocked instead. This would leave the pipe deadlocked.
To avoid this we should be doing NotifyAll() in all cases.
This is only tangentially related to this bug and patch. I could open a new bug if you prefer. Its only a few lines changed, though, so it would be nice just to include it here.
Assignee | ||
Updated•10 years ago
|
Attachment #8566928 -
Flags: review?(nfroyd)
![]() |
||
Updated•10 years ago
|
Attachment #8566682 -
Flags: review?(nfroyd) → review+
![]() |
||
Updated•10 years ago
|
Attachment #8566689 -
Flags: review?(nfroyd) → review+
![]() |
||
Comment 11•10 years ago
|
||
Comment on attachment 8566925 [details] [diff] [review]
P1 interdiff 002 convert bool returns to enums
Review of attachment 8566925 [details] [diff] [review]:
-----------------------------------------------------------------
Hooray.
::: xpcom/io/nsPipe3.cpp
@@ +630,4 @@
>
> bool modified = false;
> while(mWriteSegment >= aReadState.mSegment) {
> modified = AdvanceReadSegment(aReadState) || modified;
This bit needs to be changed for the new return type of AdvanceReadSegment.
Attachment #8566925 -
Flags: review?(nfroyd) → feedback+
![]() |
||
Updated•10 years ago
|
Attachment #8566928 -
Flags: review?(nfroyd) → review+
Assignee | ||
Comment 12•10 years ago
|
||
Good catch! Would be nice if enums didn't coerce to boolean values in this case.
Attachment #8567136 -
Flags: review?(nfroyd)
Assignee | ||
Comment 13•10 years ago
|
||
![]() |
||
Comment 14•10 years ago
|
||
Comment on attachment 8567136 [details] [diff] [review]
P1 intediff 004 fix missed block from enum change
Review of attachment 8567136 [details] [diff] [review]:
-----------------------------------------------------------------
I approve of your restraint in not commenting how this error shows that bools are just as convenient. :)
I think you might be able to prohibit implicit coercion by defining them as enum classes.
Attachment #8567136 -
Flags: review?(nfroyd) → review+
Assignee | ||
Comment 15•10 years ago
|
||
Comment 16•10 years ago
|
||
Backed out for asserts across various platforms & suites.
https://hg.mozilla.org/integration/mozilla-inbound/rev/e02960e6ee7a
https://treeherder.mozilla.org/logviewer.html#?job_id=6833409&repo=mozilla-inbound
Assignee | ||
Comment 17•10 years ago
|
||
I see the problem. I will update patch and push to try shortly.
Assignee | ||
Comment 18•10 years ago
|
||
Somewhat shocked the previous try and gtests didn't trigger this. Basically we were deleting segments that were still being written to. I'll run a full try before re-landing.
Assignee | ||
Comment 19•10 years ago
|
||
Assignee | ||
Comment 20•10 years ago
|
||
Meant to do a full try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4b70fc15d87a
Assignee | ||
Comment 21•10 years ago
|
||
Pushed an old patch by accident. One more try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=47abdbf3a487
Assignee | ||
Comment 22•10 years ago
|
||
The full try looked good.
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/27de4b553912
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/212080d51fb7
Comment 23•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/27de4b553912
https://hg.mozilla.org/mozilla-central/rev/212080d51fb7
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
status-firefox38:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla38
Comment 24•10 years ago
|
||
Backed out because of bug 1136453:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b74259f9cc1a
https://hg.mozilla.org/releases/mozilla-aurora/rev/0a882df01fc3
Target Milestone: mozilla38 → ---
Updated•10 years ago
|
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Merge of the backout to m-c: https://hg.mozilla.org/mozilla-central/rev/b74259f9cc1a
Assignee | ||
Comment 26•10 years ago
|
||
So the main issue here is that the pipe input streams do the following in ReadSegments:
1) Lock the pipe
2) Get a segment buffer
3) Unlock the pipe
4) Copy from the segment buffer
5) Lock the pipe
6) Advance the read cursor
7) Unlock the pipe
Problems occur if the pipe is closed during step (4). The lock is not held in step (4), so the close is not blocked and can modify the pipe state out from under the read. Previously this was happening with the mAvailable member being zero'd and then underflowed when the read completed. (I added an assertion which catches this now.) My previous landing here also deleted the buffer which is of course much worse in this condition.
This patch solves the problem by setting an mActiveRead in the nsPipeReadState. If there is an active read, then the close knows not to drain the stream yet. In these cases the completion of the read will trigger the pending drain.
To try to make this safer and easier to understand I've introduced an RAII class to represent read segments. This is called AutoReadSegment. It ensures that the active flag is always removed after a segment is no longer used.
This passes local gtests. I will do a bunch of try runs now as well.
Attachment #8572852 -
Flags: review?(nfroyd)
Assignee | ||
Comment 27•10 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=089e8d0912a3
I'll focus on mac 10.6 mochitest-other retriggers since that is what triggered the assertion in bug 1136453.
Assignee | ||
Comment 28•10 years ago
|
||
Sorry. I had some unsaved comments in my editor that were supposed to be included in the patch.
Attachment #8572852 -
Attachment is obsolete: true
Attachment #8572852 -
Flags: review?(nfroyd)
Attachment #8572871 -
Flags: review?(nfroyd)
Assignee | ||
Comment 29•10 years ago
|
||
Actually, rather than try to verify with intermittents, I realized we can force the condition with a gtest. This simulates a close-while-reading condition by calling Close() from the ReadSegments() function handler.
If we do this with only a partial segment write/read, we trigger exactly the assertion hit in bug 1136453.
If we do this with a full segment write/read, we hit the assertion on the line right before.
Attachment #8572969 -
Flags: review?(nfroyd)
Assignee | ||
Comment 30•10 years ago
|
||
The gtest revealed there was still an issue in the close-while-reading case. The segment advancing code expected the current streams state structure to be in the mInputList. This is not the case if the stream has been closed, though.
Updating the interdiff directly since Nathan has not started reading yet. With this update the new gtests pass.
Attachment #8572871 -
Attachment is obsolete: true
Attachment #8572871 -
Flags: review?(nfroyd)
Attachment #8572973 -
Flags: review?(nfroyd)
Assignee | ||
Comment 31•10 years ago
|
||
Assignee | ||
Updated•10 years ago
|
Assignee | ||
Comment 32•10 years ago
|
||
Some minor fixes for non-debug and windows build issues revealed by try.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b8b194c117fe
![]() |
||
Updated•10 years ago
|
Attachment #8572969 -
Flags: review?(nfroyd) → review+
![]() |
||
Comment 33•10 years ago
|
||
Comment on attachment 8572973 [details] [diff] [review]
P1 interdiff 006 wait for in-flight reads to complete
Review of attachment 8572973 [details] [diff] [review]:
-----------------------------------------------------------------
OK, so. We now have public methods in nsPipe that AFAICT are really only supposed to be called by AutoReadSegment. But AutoReadSegment doesn't really do anything auto-ly, but it does properly trigger things in nsPipe to set flags in nsPipeState that are set in an auto-ly fashion, which nsPipe will then respect?
I feel like this is a wee bit too complicated. But I'm not sure whether that's a property of the problem being tackled or the particular implementation we have.
Assignee | ||
Comment 34•10 years ago
|
||
Attachment #8574772 -
Flags: review?(nfroyd)
Assignee | ||
Comment 35•10 years ago
|
||
![]() |
||
Updated•10 years ago
|
Attachment #8574772 -
Flags: review?(nfroyd) → review+
![]() |
||
Comment 36•10 years ago
|
||
Comment on attachment 8572973 [details] [diff] [review]
P1 interdiff 006 wait for in-flight reads to complete
Review of attachment 8572973 [details] [diff] [review]:
-----------------------------------------------------------------
::: xpcom/io/nsPipe3.cpp
@@ +610,5 @@
> return NS_FAILED(mStatus) ? mStatus : NS_BASE_STREAM_WOULD_BLOCK;
> }
>
> + // The input stream locks the pipe while getting the buffer to read from,
> + // but then unlocks while actually data copying is taking place. In
Nit: "...while actual data copying..."
Attachment #8572973 -
Flags: review?(nfroyd) → review+
Assignee | ||
Comment 37•10 years ago
|
||
Comment 38•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/516fb2d378b0
https://hg.mozilla.org/mozilla-central/rev/76e28af52416
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
status-firefox39:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
Assignee | ||
Comment 39•10 years ago
|
||
Approval Request Comment
[Feature/regressing bug #]: Caused by 1100398.
[User impact if declined]: Intermittent test failures as in bug 1136453.
[Describe test coverage new/current, TreeHerder]: New gtests in P2 patch that positively confirm the fix.
[Risks and why]: Low risk.
[String/UUID change made/needed]: None.
Attachment #8566053 -
Attachment is obsolete: true
Attachment #8566682 -
Attachment is obsolete: true
Attachment #8566925 -
Attachment is obsolete: true
Attachment #8566928 -
Attachment is obsolete: true
Attachment #8567136 -
Attachment is obsolete: true
Attachment #8567410 -
Attachment is obsolete: true
Attachment #8572973 -
Attachment is obsolete: true
Attachment #8574772 -
Attachment is obsolete: true
Attachment #8580735 -
Flags: review+
Attachment #8580735 -
Flags: checkin+
Attachment #8580735 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 40•10 years ago
|
||
Approval Request Comment: See P1 request.
Attachment #8566054 -
Attachment is obsolete: true
Attachment #8566689 -
Attachment is obsolete: true
Attachment #8572969 -
Attachment is obsolete: true
Attachment #8573023 -
Attachment is obsolete: true
Attachment #8580736 -
Flags: review+
Attachment #8580736 -
Flags: checkin+
Attachment #8580736 -
Flags: approval-mozilla-aurora?
Comment 41•10 years ago
|
||
Why do you think it is low risk? The patch is pretty big and touching xpcom.
Moreover, 38 is marked as disabled.
Assignee | ||
Comment 42•10 years ago
|
||
(In reply to Sylvestre Ledru [:sylvestre] from comment #41)
> Why do you think it is low risk? The patch is pretty big and touching xpcom.
> Moreover, 38 is marked as disabled.
The previous patch was backed out and relanded with a fix. I consider the new patches low risk because we now have tests that exercise the failure more consistently.
Assignee | ||
Comment 43•10 years ago
|
||
I would also note this has had a week of burn-in time on nightly before requesting uplift.
![]() |
||
Comment 44•10 years ago
|
||
Also worth noting that we do appear to have several crashes (across all Firefox/Thunderbird versions) in the last week in AdvanceReadPipe. All these crashes are version 36 or before, so they're prior to Ben's work in this area. Can't say for certain that these patches would fix those, but it seems plausible.
Updated•10 years ago
|
Attachment #8580735 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Updated•10 years ago
|
Attachment #8580736 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Updated•10 years ago
|
Comment 45•10 years ago
|
||
https://hg.mozilla.org/releases/mozilla-aurora/rev/eb89ef2c7878
https://hg.mozilla.org/releases/mozilla-aurora/rev/ac89e90a4393
Flags: in-testsuite+
You need to log in
before you can comment on or make changes to this bug.
Description
•