Closed
Bug 1240985
Opened 9 years ago
Closed 9 years ago
Fix bugs found by IPC fuzzer
Categories
(Core :: IPC, defect)
Core
IPC
Tracking
()
RESOLVED
FIXED
mozilla47
People
(Reporter: billm, Assigned: billm)
References
Details
(Whiteboard: [e10s-45-uplift])
Attachments
(17 files)
I've been working on a MessageChannel fuzzer for the last two weeks. It basically just randomly sends/receives messages of different priorities and sometimes cancels them or times them out. It has found a number of bugs, almost all of them having to do with CPOW cancellation. I'm going to post patches here.
Working on this, I'm also trying to eliminate some of the weird edge cases in the code. Besides bug fixes, I'm going to make a few policy changes:
1. Right now, if a CPOW from the parent races with a sync message from the child, we dispatch the CPOW in the child first. This causes all sorts of weird issues. Most importantly, if we then cancel the CPOW, it's very difficult to know what to do with the sync message from the child, so we crash. To avoid this, and to simplify the possible interleavings, I've decided that we should just cancel any CPOWs that race with sync messages from the child. It makes CPOWs a little less reliable, but they're already a kind of "best effort" mechanism, so I think that's okay. And it's better than crashing later on.
2. Currently if we time out a CPOW, we refuse to respond to any sync messages from the child until we get a response to the timed out message. The comment that justifies this no longer makes sense to me and I can't think of any reason why it's needed. I think the need for it may have been eliminated when we added cancellation. In any case, I think we should remove this and allow these messages to be dispatched.
I feel a lot more confident making these changes now that we have the fuzzer. It seems to be really good at exposing the kinds of races that were very hard to track down otherwise.
Assignee | ||
Comment 1•9 years ago
|
||
This is sort of an odd bug. The code uses a Unix pipe to signal to the glib event loop that it needs to wake up. There's a comment that claims:
// We should only ever have a single message on the wakeup pipe, since we
// are only signaled when the queue went from empty to non-empty.
As far as I can tell, this isn't true. We call MessagePumpForUI::ScheduleWork unconditionally from here:
https://dxr.mozilla.org/mozilla-central/source/ipc/chromium/src/base/message_loop.cc#335
This is a problem because if we write too many bytes to the pipe without reading them, the OS causes the write calls to block. That, in turn, can lead to a deadlock.
To solve this, I added a flag to ensure that we only write to the pipe if it's empty.
Attachment #8709769 -
Flags: review?(jld)
Assignee | ||
Comment 2•9 years ago
|
||
Fixing these bugs would have been very difficult without logging. The later patches also include more logging.
Attachment #8709772 -
Flags: review?(dvander)
Assignee | ||
Comment 3•9 years ago
|
||
This fixes a somewhat minor bug that happens if the child cancels a CPOW that the parent timed out. In that case we want to end the timeout rather than trying to cancel anything (since the transaction was already effectively cancelled by the parent).
Attachment #8709773 -
Flags: review?(dvander)
Assignee | ||
Comment 4•9 years ago
|
||
The way the cancellation code works, several state variables (mCurrentTransaction and mAwaitingSyncReply) are overwritten when we cancel. These variables are normally set/unset in a stacky way using the RAII classes AutoEnterTransaction and AutoSetValue. The sequence normally looks like this:
On entering Send:
set mCurrentTransaction, mAwaitingSyncReply
send and wait for reply
set mCurrentTransaction, mAwaitingSyncReply to their previous values
However, if we cancel while waiting for the reply, we *don't* want to restore them to their original values. The AutoEnterTransaction destructor was fixed a while ago to handle this, but somehow I didn't realize I also needed to fix AutoSetValue. This patch does that.
Attachment #8709774 -
Flags: review?(dvander)
Assignee | ||
Comment 5•9 years ago
|
||
This patch implements the first policy change I mentioned in comment 0: racy CPOWs are cancelled instead of being executed. It also removes the intentional crash that was needed if we cancel during a racy CPOW. And it fixes AutoEnterTransaction to have a more precise assertion about how transaction IDs nest now that we can't have these races.
Finally, it fixes an unrelated problem where we need to call WasTransactionCanceled during ProcessPendingRequests. (This should really be a separate patch but I had trouble splitting it out.) The comment hopefully explains.
Attachment #8709775 -
Flags: review?(dvander)
Assignee | ||
Comment 6•9 years ago
|
||
This is kind of another policy change. We're never supposed to send anything when dispatching urgent messages. Right now, we do a mix of cancelling (when dispatching sync urgent messages) and crashing (when dispatching async urgent messages) in this situation. However, I realized that we can just return false when doing the Send. This seems like a much better solution.
Attachment #8709776 -
Flags: review?(dvander)
Assignee | ||
Comment 7•9 years ago
|
||
Like I said in comment 0, I don't think we need to do this anymore.
Attachment #8709777 -
Flags: review?(dvander)
Assignee | ||
Comment 8•9 years ago
|
||
This fixes another cancellation bug. Let's say we're doing a sync Send and we receive a message. In some cases we want to process that message during the Send. But before the worker thread gets a chance to do that, we receive a cancel message for the transaction. In that case, the Send call returns immediately without processing the message that was received. In order to process it, we need to make sure to queue an OnMaybeDequeueOneTask. This patch does that.
Attachment #8709779 -
Flags: review?(dvander)
Assignee | ||
Comment 9•9 years ago
|
||
The fuzzer also found a problem if you cancel while dispatching an urgent message. I couldn't figure out how to handle that, so I added this crash. Hopefully it will never happen. (We only cancel when we run a nested event loop, so I hope that never happens.)
Attachment #8709781 -
Flags: review?(dvander)
Assignee | ||
Comment 10•9 years ago
|
||
Oh, one thing I forgot is that I started attaching a transaction ID to the cancel messages so that we can make assertions about it. That part is kind of mixed into several of these patches. Sorry.
Also, I'll get the actual fuzzer posted tomorrow.
Attachment #8709772 -
Flags: review?(dvander) → review+
Attachment #8709773 -
Flags: review?(dvander) → review+
Attachment #8709774 -
Flags: review?(dvander) → review+
Comment on attachment 8709775 [details] [diff] [review]
Cancel racy CPOW messages
Review of attachment 8709775 [details] [diff] [review]:
-----------------------------------------------------------------
I like this a lot more than crashing. Do we know if that condition occurred a lot in the wild?
::: ipc/glue/MessageChannel.cpp
@@ +806,5 @@
> +
> + if (mSide == ChildSide && msg.is_sync() && msg.transaction_id() != mCurrentTransaction) {
> + // A message has arrived that is racing with our own message. It
> + // must be a CPOW message since those are the only kind of sync
> + // messages that parent can send. We just cancel it. This avoids
"the parent"
@@ +820,5 @@
> + }
> +
> + bool defer = ShouldDeferMessage(msg);
> + if (msg.is_sync() || msg.priority() == IPC::Message::PRIORITY_URGENT) {
> + // Only log the interesting messages.
Could you move this comment out of the |if| and change it to "Log interesting messages."? (Every time I see an if-condition in IPDL I automatically expect some critical logic to follow :)
Attachment #8709775 -
Flags: review?(dvander) → review+
Attachment #8709776 -
Flags: review?(dvander) → review+
Comment on attachment 8709777 [details] [diff] [review]
don't return MsgNotAllowed
Review of attachment 8709777 [details] [diff] [review]:
-----------------------------------------------------------------
::: ipc/glue/MessageChannel.cpp
@@ -1398,5 @@
> - //
> - // We do this because want to avoid a situation where we process an
> - // incoming message from the child here while it simultaneously starts
> - // processing our timed-out CPOW. It's very bad for both sides to
> - // be processing sync messages concurrently.
I agree we can probably drop this. It looks like mTimedOutMessagePriority might be dead after this change (unless it's used again later).
Attachment #8709777 -
Flags: review?(dvander) → review+
Attachment #8709779 -
Flags: review?(dvander) → review+
(In reply to Bill McCloskey (:billm) from comment #9)
> Created attachment 8709781 [details] [diff] [review]
> crash if cancelling during urgent dispatch
>
> The fuzzer also found a problem if you cancel while dispatching an urgent
> message. I couldn't figure out how to handle that, so I added this crash.
> Hopefully it will never happen. (We only cancel when we run a nested event
> loop, so I hope that never happens.)
What makes this difficult to handle?
Comment 14•9 years ago
|
||
Comment on attachment 8709769 [details] [diff] [review]
fix for chromium ipc bug in glib message pump
Review of attachment 8709769 [details] [diff] [review]:
-----------------------------------------------------------------
I stared at this for a while trying to see if there were some kind of ordering of events that would do something unexpected, but I think the “worst” that can happen is for the pipe buffer to briefly hold 2 bytes instead of 0 or 1, but that should be fine.
::: ipc/chromium/src/base/message_pump_glib.cc
@@ +231,5 @@
> if (!state_) // state_ may be null during tests.
> return false;
>
> // We should only ever have a single message on the wakeup pipe, since we
> // are only signaled when the queue went from empty to non-empty. The glib
Does this comment need to be adjusted?
Attachment #8709769 -
Flags: review?(jld) → review+
Comment 15•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/fddcf8b3b088
https://hg.mozilla.org/integration/mozilla-inbound/rev/34f87cc8ac24
https://hg.mozilla.org/integration/mozilla-inbound/rev/12de76467424
https://hg.mozilla.org/integration/mozilla-inbound/rev/a7eecc68ca20
https://hg.mozilla.org/integration/mozilla-inbound/rev/488690ba4c8f
https://hg.mozilla.org/integration/mozilla-inbound/rev/3fcd50d83821
https://hg.mozilla.org/integration/mozilla-inbound/rev/b87f893bd6aa
https://hg.mozilla.org/integration/mozilla-inbound/rev/26e6fe3875a5
Assignee | ||
Comment 16•9 years ago
|
||
I wasn't able to land the policy change about CPOWs racing with sync messages because it breaks a bunch of tests that rely on CPOWs. I'll think some more about what to do about that.
I also made some changes to my fuzzer and now I'm seeing more assertions, so I think there's more work to do here.
> What makes this difficult to handle?
If we're dispatching an async urgent message, it's not clear what it means to cancel it. There might be another sync message that the other side is waiting on, and that should be cancelled. But there might be nothing going on, and in that case the cancel shouldn't do anything I guess.
Anyway, it would be pretty bad if we cancel during these messages. That would mean that we're spinning a nested event loop while dispatching an urgent message. We're not supposed to send at all while dispatching urgent messages, and we can't make any claims about what will happen during a nested event loop.
Keywords: leave-open
Comment 17•9 years ago
|
||
Comment 18•9 years ago
|
||
It looked like the way you were doing variadic macros broke on Visual Studio, so I pushed a followup, trying to match what NS_IMPL_CYCLE_COLLECTION_TRAVERSE does. Though looking at it, I think I did not quite get it right...
Flags: needinfo?(wmccloskey)
Comment 19•9 years ago
|
||
Ok, that made it worse, so Kwierso is just going to back all these patches out.
Comment 20•9 years ago
|
||
This might work, though I've lost my nerve to actually try it on inbound:
#define IPC_LOG(...) \
MOZ_STATIC_ASSERT_VALID_ARG_COUNT(__VA_ARGS__); \
MOZ_LOG(sLogModule, LogLevel::Debug, (__VA_ARGS__))
All backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/10d4e8736dbb
Probably wouldn't have been able to merge or cherry-pick tonight with all of the infra problems that popped up, anyway...
Comment 22•9 years ago
|
||
Mass B2G debug bustage too like the log below:
https://treeherder.mozilla.org/logviewer.html#?job_id=20163871&repo=mozilla-inbound
Assignee | ||
Updated•9 years ago
|
Whiteboard: [e10s-45-uplift]
Comment 24•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/19ab70a7137c
https://hg.mozilla.org/integration/mozilla-inbound/rev/6d7dd1de5dd5
https://hg.mozilla.org/integration/mozilla-inbound/rev/5a2a8262a368
https://hg.mozilla.org/integration/mozilla-inbound/rev/2020b3de24bc
https://hg.mozilla.org/integration/mozilla-inbound/rev/e7ce9369ddb6
https://hg.mozilla.org/integration/mozilla-inbound/rev/908c5be531ae
https://hg.mozilla.org/integration/mozilla-inbound/rev/cda5277636b6
https://hg.mozilla.org/integration/mozilla-inbound/rev/aac985c60052
Comment 25•9 years ago
|
||
Awesome work but please file found issues by a fuzzer in separate bugs. This makes it easier to track down and to reference to.
These patches don't make sense in separate bugs.
Comment 27•9 years ago
|
||
(In reply to Pulsebot from comment #24)
> https://hg.mozilla.org/integration/mozilla-inbound/rev/e7ce9369ddb6
This needs rebasing for Aurora. The others graft cleanly.
Flags: needinfo?(wmccloskey)
Comment 28•9 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/19ab70a7137c
https://hg.mozilla.org/mozilla-central/rev/6d7dd1de5dd5
https://hg.mozilla.org/mozilla-central/rev/5a2a8262a368
https://hg.mozilla.org/mozilla-central/rev/2020b3de24bc
https://hg.mozilla.org/mozilla-central/rev/e7ce9369ddb6
https://hg.mozilla.org/mozilla-central/rev/908c5be531ae
https://hg.mozilla.org/mozilla-central/rev/cda5277636b6
https://hg.mozilla.org/mozilla-central/rev/aac985c60052
Assignee | ||
Comment 29•9 years ago
|
||
Comment on attachment 8709769 [details] [diff] [review]
fix for chromium ipc bug in glib message pump
I'm requesting uplift for all the patches here. I'm hoping these patches will fix a bunch of e10s crashes like bug 1177013.
Approval Request Comment
[Feature/regressing bug #]: bug 1177013
[User impact if declined]: crashes like bug 1210821
[Describe test coverage new/current, TreeHerder]: we have a new fuzzer to discover bugs like this now
[Risks and why]: These changes could result in additional crashes for e10s users (although they're designed to *fix* crashes). However, they really only affect e10s, which won't ship in 45. So the risk of breaking the 45 release is quite low.
[String/UUID change made/needed]: none
Flags: needinfo?(wmccloskey)
Attachment #8709769 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 30•9 years ago
|
||
Also, it's probably easier if I handle the uplifts here. That way I can do the rebasing.
Comment 31•9 years ago
|
||
Comment on attachment 8709779 [details] [diff] [review]
Make sure to queue mDequeueOneTasks after cancellation
[Triage Comment]
Fix crashes, taking it
Attachment #8709779 -
Flags: approval-mozilla-aurora+
Comment 32•9 years ago
|
||
Comment on attachment 8709777 [details] [diff] [review]
don't return MsgNotAllowed
[Triage Comment]
Attachment #8709777 -
Flags: approval-mozilla-aurora+
Updated•9 years ago
|
Attachment #8709776 -
Flags: approval-mozilla-aurora+
Updated•9 years ago
|
Attachment #8709775 -
Flags: approval-mozilla-aurora+
Updated•9 years ago
|
Attachment #8709774 -
Flags: approval-mozilla-aurora+
Updated•9 years ago
|
Attachment #8709773 -
Flags: approval-mozilla-aurora+
Updated•9 years ago
|
Attachment #8709772 -
Flags: approval-mozilla-aurora+
Updated•9 years ago
|
Attachment #8709769 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee | ||
Comment 33•9 years ago
|
||
This patch causes us to record the reason why a Send call failed. It's important for the fuzzer to know whether a message actually made it to the other side or not so that it can make assertions about message ordering. This allows that to happen.
Attachment #8711277 -
Flags: review?(dvander)
Assignee | ||
Comment 34•9 years ago
|
||
Just to follow up on comment 16, the additional bugs I found are mostly caused by the message races that I had hoped to eliminate, but I couldn't because cancelling in this case broke our tests. Instead, I've put together some different fixes for them.
I also found some additional bugs when I added some artificial sleeps and timeouts.
Assignee | ||
Comment 35•9 years ago
|
||
This patch fixes some more cancellation issues. Namely:
1. I had fixed a problem where a cancel message for a transaction that was timed out would do the right thing. However, there's also a related problem where we send CPOW message M, time it out, then receive a message M' that's in the same transaction as M, and cancel M' (and hence M as well). In this case, the cancellation happens on *our* side rather than from a message. We need to end the timeout in that case, which we weren't doing.
2. Let's say that the child sends a sync message MC at the same time the parent sends a high prio sync message MP. MP will be dispatched first, in the child. If we cancel MP, then MC will be automatically be cancelled as well, in the child at least (since WasTransactionCancelled will return true). But, in the parent, MC will still be in mPending. When the parent eventually replies to MC, the child will crash. This patch removes MC from mPending in the parent if this happens. (Note that the "Cancel racy CPOW messages" would have forbidden this scenario, but I couldn't land it.)
Attachment #8711285 -
Flags: review?(dvander)
Assignee | ||
Comment 36•9 years ago
|
||
This is kind of a weird situation that mostly impacts fuzzing, but I'd like to fix it. Let's say that we have one of these race situations where the child sends a sync normal prio message at the same time the parent sends a CPOW message. The parent's message will dispatch in the child first. Suppose the child cancels it. This is the situation where we intentionally crash, since my "Cancel racy CPOW messages" patch didn't land. However, we don't intentionally crash until the stack has unwound to the original sync send message call in the child. At that point, we call WasTransactionCancelled and crash.
However, in between the cancellation and crashing, a lot of messages can be sent and received. The request to cancel happens inside a message handler, which could do a lot more stuff after it cancels. The fact that the sync message was never delivered means that message ordering has been violated, so the fuzzer will find "bugs", even though we're planning to crash soon after.
This patch moves up the crash so that it happens at the time of the cancel. That way we can't do any more messaging and so the fuzzer won't find any false positives like this.
Attachment #8711286 -
Flags: review?(dvander)
Assignee | ||
Comment 37•9 years ago
|
||
Remember that patch "don't return MsgNotAllowed"? It turns out it was only valid to remove MsgNotAllowed when we had the "Cancel racy CPOW messages" patch. The comment in the patch explains:
// If we've timed out a message and we're awaiting the reply to the timed
// out message, we have to be careful what messages we process. Here's what
// can go wrong:
// 1. child sends a normal priority sync message S
// 2. parent sends a high priority sync message H at the same time
// 3. parent times out H
// 4. child starts processing H and sends a high priority message H' nested
// within the same transaction
// 5. parent dispatches S and sends reply
// 6. child asserts because it instead expected a reply to H'.
Rather than using MsgNotAllowed as before, I thought of a better way:
// To solve this, we refuse to process S in the parent until we get a reply
// to H. More generally, let the timed out message be M. We don't process a
// message unless the child would need the response to that message in order
// to process M. Those messages are the ones that have a higher priority
// than M or that are part of the same transaction as M.
Attachment #8711287 -
Flags: review?(dvander)
Assignee | ||
Comment 38•9 years ago
|
||
Here's another problem I ran into:
Suppose that we're waiting for a reply to a sync message M. Eventually we time out, call ShouldContinueFromTimeout, and it returns false. However, ShouldContinueFromTimeout drops the lock. Suppose that, in that time, a nested message for the same transaction as M is received--call it M'. Once ShouldContinueFromTimeout returns, Send will return without having looked at M'.
At this point we have a problem. When the link thread added M' to mPending, it did not post an OnMaybeDequeueOne task because Send was still on the stack. But Send returned without looking at mPending, so now the message is essentially lost. The parent is not allowed to send anymore until it gets a response to the timed out message M. And the child is blocked waiting for a response to M'. So we're basically deadlocked.
We could solve this by calling ProcessPendingRequests after ShouldContinueFromTimeout. However, that causes some other weird issues. It seems easier to just unconditionally post an OnMaybeDequeueOne task every time a message is received. Even if we dispatch the message from ProcessPendingRequests, the worst that will happen is that OnMaybeDequeueOne will find an empty queue and do nothing. This should be fairly uncommon anyway since nested CPOW messages are uncommon. Overall, it's a nice way of eliminating weird bugs.
Attachment #8711289 -
Flags: review?(dvander)
Assignee | ||
Comment 39•9 years ago
|
||
This patch changes a function signature and also changes WasTransactionCanceled to null out mRecvd in a weird case. Here's the comment:
// Imagine this scenario:
// 1. Child sends high prio sync message H1.
// 2. Parent sends reply to H1.
// 3. Parent sends high prio sync message H2.
// 4. Child's link thread receives H1 reply and H2 before worker wakes up.
// 5. Child dispatches H2 while still waiting for H1 reply.
// 6. Child cancels H2.
//
// In this case H1 will also be considered cancelled. However, its
// reply is still sitting in mRecvd, which can trip up later Sends. So
// we null it out here.
I thought about changing things so that Send checks mRecvd before calling ProcessPendingRequests. I think that might work, but this seems like a more conservative change.
Attachment #8711290 -
Flags: review?(dvander)
Assignee | ||
Comment 40•9 years ago
|
||
Another problem caused by ShouldContinueFromTimeout dropping the lock. In this case, the transaction we're waiting on could be cancelled while ShouldContinueFromTimeout runs. In that case, we don't want to act as if a timeout occurred because we'll never get a reply to the timed out message.
Attachment #8711291 -
Flags: review?(dvander)
Assignee | ||
Comment 41•9 years ago
|
||
Here's the fuzzer. With the patches above, it runs cleanly for me.
Besides randomly sending messages of different types, it checks that normal and urgent priority messages are dispatched in the order they were sent. It doesn't check high priority messages (CPOWs) because cancellation and races basically means that they might not run in order.
It also has hooks in MessageChannel.cpp that allow it to simulate timeouts and also to inject sleep calls when the mMonitor is not held. Both of these found several bugs.
Finally, I added a hook so that an intentional crash just causes the fuzzer to exit normally. That way our intentional crashes don't look like bugs.
If you run the fuzzer, it will usually exit pretty quickly since it hits intentional crashes so often. If it doesn't exit, that means you hit a deadlock. I've been running it in a loop, checking the exit code.
Attachment #8711292 -
Flags: review?(dvander)
Assignee | ||
Comment 42•9 years ago
|
||
It's getting a bit late. Tomorrow I'm going to uplift everything that had approval except for the MsgNotAllowed patch, which was wrong. It's likely the remaining patches will miss beta 1, but I think they're probably less important.
Assignee | ||
Comment 43•9 years ago
|
||
Unfortunately, the patches that have been approved for Aurora cause test_cpows.xul to die with an assertion failure on Aurora. I guess I'll look into this tomorrow.
Assignee | ||
Comment 44•9 years ago
|
||
Fixed the Aurora thing. It turns out that a line from the patch David didn't review is needed to avoid an assert. It just sets the ID of the cancel message. I'm not sure why it's not failing on central. I'm going to fix that now, as well as backing out the MsgNotAllowed patch.
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/b7104abd6362
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/d1a3b8124b7a
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/0e4c292ead72
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/882c647d9103
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/20666a406bfa
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/4128f782d644
Comment 45•9 years ago
|
||
![]() |
||
Comment 46•9 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/4a51cd342caf
https://hg.mozilla.org/mozilla-central/rev/376dc3151e42
landed these morning, in time for Firefox 46
Attachment #8711277 -
Flags: review?(dvander) → review+
Attachment #8709781 -
Flags: review?(dvander) → review+
Attachment #8711285 -
Flags: review?(dvander) → review+
Attachment #8711286 -
Flags: review?(dvander) → review+
Attachment #8711287 -
Flags: review?(dvander) → review+
Comment on attachment 8711289 [details] [diff] [review]
Always enqueue dispatch task on message arrival
Review of attachment 8711289 [details] [diff] [review]:
-----------------------------------------------------------------
This sounds like a good idea.
Attachment #8711289 -
Flags: review?(dvander) → review+
Attachment #8711290 -
Flags: review?(dvander) → review+
Attachment #8711291 -
Flags: review?(dvander) → review+
Comment 48•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/b94c1a472d46
https://hg.mozilla.org/integration/mozilla-inbound/rev/f10c0445c450
https://hg.mozilla.org/integration/mozilla-inbound/rev/06b4cb890ab0
https://hg.mozilla.org/integration/mozilla-inbound/rev/85c8cb422bad
https://hg.mozilla.org/integration/mozilla-inbound/rev/2c0c0ed35656
https://hg.mozilla.org/integration/mozilla-inbound/rev/b7f2ce03b34e
https://hg.mozilla.org/integration/mozilla-inbound/rev/c3dfc2c674ff
https://hg.mozilla.org/integration/mozilla-inbound/rev/6d7f80a057f0
https://hg.mozilla.org/integration/mozilla-inbound/rev/30f1acd9387f
Comment 49•9 years ago
|
||
Comment 50•9 years ago
|
||
(In reply to Pulsebot from comment #49)
> https://hg.mozilla.org/integration/mozilla-inbound/rev/296fd9d50c25
(Pushed a follow-up to make CancelMessage |explicit|.)
All backed out between these two commmits:
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/f8be949a7397
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/928befa33041
Something from this or bug 1242097 caused near permafails like https://treeherder.mozilla.org/logviewer.html#?job_id=20470069&repo=mozilla-inbound
Flags: needinfo?(wmccloskey)
Comment 52•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/4ce037f2107d
https://hg.mozilla.org/integration/mozilla-inbound/rev/3274c88c07a2
https://hg.mozilla.org/integration/mozilla-inbound/rev/3e614d3776e0
https://hg.mozilla.org/integration/mozilla-inbound/rev/cbdcef6d6e01
https://hg.mozilla.org/integration/mozilla-inbound/rev/2a967aa0f327
https://hg.mozilla.org/integration/mozilla-inbound/rev/abc21e8ce10c
https://hg.mozilla.org/integration/mozilla-inbound/rev/8b1d57fc5b3c
https://hg.mozilla.org/integration/mozilla-inbound/rev/9e5480083084
https://hg.mozilla.org/integration/mozilla-inbound/rev/958a7057e1ac
Comment 53•9 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/4ce037f2107d
https://hg.mozilla.org/mozilla-central/rev/3274c88c07a2
https://hg.mozilla.org/mozilla-central/rev/3e614d3776e0
https://hg.mozilla.org/mozilla-central/rev/cbdcef6d6e01
https://hg.mozilla.org/mozilla-central/rev/2a967aa0f327
https://hg.mozilla.org/mozilla-central/rev/abc21e8ce10c
https://hg.mozilla.org/mozilla-central/rev/8b1d57fc5b3c
https://hg.mozilla.org/mozilla-central/rev/9e5480083084
https://hg.mozilla.org/mozilla-central/rev/958a7057e1ac
Comment 54•9 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/4ce037f2107d
https://hg.mozilla.org/mozilla-central/rev/3274c88c07a2
https://hg.mozilla.org/mozilla-central/rev/3e614d3776e0
https://hg.mozilla.org/mozilla-central/rev/cbdcef6d6e01
https://hg.mozilla.org/mozilla-central/rev/2a967aa0f327
https://hg.mozilla.org/mozilla-central/rev/abc21e8ce10c
https://hg.mozilla.org/mozilla-central/rev/8b1d57fc5b3c
https://hg.mozilla.org/mozilla-central/rev/9e5480083084
https://hg.mozilla.org/mozilla-central/rev/958a7057e1ac
Assignee | ||
Updated•9 years ago
|
Flags: needinfo?(wmccloskey)
Assignee | ||
Comment 55•9 years ago
|
||
Comment on attachment 8709781 [details] [diff] [review]
crash if cancelling during urgent dispatch
Asking for approval for the remainder of these patches. They're a continuation of the same work that wasn't finished in time for the merge. Reasons are the same as comment 29.
Attachment #8709781 -
Flags: approval-mozilla-aurora?
Assignee | ||
Updated•9 years ago
|
Attachment #8711292 -
Attachment description: patch → fuzzer code (for testing only)
Assignee | ||
Updated•9 years ago
|
Attachment #8709781 -
Flags: approval-mozilla-beta?
Comment 56•9 years ago
|
||
Comment on attachment 8709781 [details] [diff] [review]
crash if cancelling during urgent dispatch
Let's take it too
Should be in 45 beta 2.
Attachment #8709781 -
Flags: approval-mozilla-beta?
Attachment #8709781 -
Flags: approval-mozilla-beta+
Attachment #8709781 -
Flags: approval-mozilla-aurora?
Attachment #8709781 -
Flags: approval-mozilla-aurora+
Assignee | ||
Comment 57•9 years ago
|
||
Landing on beta first in the hopes it will make it into beta2:
remote: https://hg.mozilla.org/releases/mozilla-beta/rev/e26322e1dbb0
remote: https://hg.mozilla.org/releases/mozilla-beta/rev/2e2b6ee1256d
remote: https://hg.mozilla.org/releases/mozilla-beta/rev/1f14d50e3ee6
remote: https://hg.mozilla.org/releases/mozilla-beta/rev/3bd120e0ce2c
remote: https://hg.mozilla.org/releases/mozilla-beta/rev/37bdb0e79e27
remote: https://hg.mozilla.org/releases/mozilla-beta/rev/a78b92088738
remote: https://hg.mozilla.org/releases/mozilla-beta/rev/c04e48db3167
remote: https://hg.mozilla.org/releases/mozilla-beta/rev/f1d61f8c26d3
remote: https://hg.mozilla.org/releases/mozilla-beta/rev/55272f23ad95
Assignee | ||
Comment 58•9 years ago
|
||
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/25abb1fd2b07
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/d7943f066935
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/344dc7f95bf7
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/f4688b5593f3
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/f2b01eab1bfe
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/172b78c572c2
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/e2eeee0ceb5f
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/3860af14030c
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/980fea2f7011
Assignee | ||
Comment 59•9 years ago
|
||
Comment on attachment 8711292 [details] [diff] [review]
fuzzer code (for testing only)
Review of attachment 8711292 [details] [diff] [review]:
-----------------------------------------------------------------
Gabor asked to review some IPC code. This seems like it would be a good start.
Attachment #8711292 -
Flags: review?(dvander) → review?(gkrizsanits)
Updated•9 years ago
|
status-firefox45:
--- → affected
status-firefox47:
--- → affected
Comment 60•9 years ago
|
||
It was too late for beta2, should be in beta 3.
Please open a new bug for further patches. It is going to be too complex otherwise.
Thanks
Comment 61•9 years ago
|
||
(In reply to Sylvestre Ledru [:sylvestre] from comment #60)
> It was too late for beta2, should be in beta 3.
>
> Please open a new bug for further patches. It is going to be too complex
> otherwise.
> Thanks
seems this was not to late for beta 2 (at least in https://treeherder.mozilla.org/#/jobs?repo=mozilla-beta&revision=55272f23ad95) was before the release tags
Comment 62•9 years ago
|
||
(In reply to Carsten Book [:Tomcat] from comment #61)
> seems this was not to late for beta 2 (at least in
> https://treeherder.mozilla.org/#/jobs?repo=mozilla-
> beta&revision=55272f23ad95) was before the release tags
You need to look at the revision that was actually tagged for release, not just the order they showed up on Treeherder/pushlog. Mercurial history isn't guaranteed to be linear. This did in fact miss beta 2.
Comment 63•9 years ago
|
||
Comment on attachment 8711292 [details] [diff] [review]
fuzzer code (for testing only)
Review of attachment 8711292 [details] [diff] [review]:
-----------------------------------------------------------------
Sorry for the lag, I wanted to play with it a bit, but could not figure out how to run the test (relevant docs are outdated, and it seems like I'm missing one more flag to get the test compiled...). Anyway I don't want to hold up landing this any longer I can do the experimenting part later. I think I understand more or less everything what the test does, my only remark is:
::: ipc/ipdl/test/cxx/TestDemon.h
@@ +49,5 @@
> +
> +private:
> + bool mDone;
> + int mIncoming[3];
> + int mOutgoing[3];
Both for mIncoming and mOutgoing you only use 0 and 2. It would be easier to follow if you just used mIncomingUrgent/mIncomingNormal pairs or something... Or do you plan to expend it to high prio too somehow and that's why you need 3 pairs?
Attachment #8711292 -
Flags: review?(gkrizsanits) → review+
Depends on: 1248750
Comment 64•9 years ago
|
||
Assignee | ||
Updated•9 years ago
|
Keywords: leave-open
Comment 65•9 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
You need to log in
before you can comment on or make changes to this bug.
Description
•