Closed Bug 1724048 Opened 4 years ago Closed 2 years ago

Intermittent Assertion failure: neckoTarget, at /builds/worker/checkouts/gecko/netwerk/protocol/http/HttpChannelChild.cpp:2823

Categories

(Core :: Networking: HTTP, defect, P2)

Tracking

RESOLVED FIXED
118 Branch
Tracking Status
firefox-esr102 --- wontfix
firefox-esr115 --- wontfix
firefox116 --- wontfix
firefox117 --- wontfix
firefox118 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: valentin)

References

(Blocks 1 open bug)

Details

(Keywords: assertion, intermittent-failure, Whiteboard: [necko-triaged][necko-priority-review][no-nag])

Attachments

(1 file)

Filed by: smolnar [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=347394766&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/XIHE4lS5TSKptZVzmbTqOw/runs/0/artifacts/public/logs/live_backing.log


INFO - TEST-OK | dom/tests/mochitest/general/test_storagePermissionsAccept.html | took 761ms
[task 2021-08-04T17:58:36.047Z] 17:58:36     INFO - GECKO(804) | [Parent 8948, Main Thread] WARNING: WebProgress Ignored: no longer current window global: file /builds/worker/checkouts/gecko/dom/ipc/BrowserParent.cpp:2956
[task 2021-08-04T17:58:36.057Z] 17:58:36     INFO - GECKO(804) | Assertion failure: neckoTarget, at /builds/worker/checkouts/gecko/netwerk/protocol/http/HttpChannelChild.cpp:2823
[task 2021-08-04T17:58:36.068Z] 17:58:36     INFO - GECKO(804) | [Parent 8948, Main Thread] WARNING: Not resolving response because actor is dead.: file /builds/worker/checkouts/gecko/ipc/glue/ProtocolUtils.cpp:931
[task 2021-08-04T17:58:36.069Z] 17:58:36     INFO - GECKO(804) | [Parent 8948, Main Thread] WARNING: IPDL resolver dropped without being called!: file /builds/worker/checkouts/gecko/ipc/glue/ProtocolUtils.cpp:959
[task 2021-08-04T17:58:36.069Z] 17:58:36     INFO - GECKO(804) | [Parent 8948, Main Thread] WARNING: Not resolving response because actor is dead.: file /builds/worker/checkouts/gecko/ipc/glue/ProtocolUtils.cpp:931
[task 2021-08-04T17:58:36.072Z] 17:58:36     INFO - TEST-START | dom/tests/mochitest/general/test_storagePermissionsLimitForeign.html
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → INCOMPLETE
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---
Status: REOPENED → RESOLVED
Closed: 4 years ago → 3 years ago
Resolution: --- → INCOMPLETE
Summary: Intermittent [Tier 2] Assertion failure: neckoTarget, at /builds/worker/checkouts/gecko/netwerk/protocol/http/HttpChannelChild.cpp:2823 → Intermittent Assertion failure: neckoTarget, at /builds/worker/checkouts/gecko/netwerk/protocol/http/HttpChannelChild.cpp:2823
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---

I just hit this locally and caught it in rr (though not pernosco-friendly-configuration rr). (For my own reference: the recording is .local/share/rr/mach-17 on my ThinkPad.)

In my rr recording:
for the HttpChannelChild instance in question, we just never get a call to HttpChannelChild::SetEventTarget() (or at least: we receive a call to AsyncCallImpl first, and we fail this assertion there due to mNeckoTarget being null.)

valentin, is there anything that would be useful for me to try to capture here from my rr recording, while I've got it handy? (And/or, is this assertion actually valid & important? It's not clear what the consequences/implications are of it failing, if any.)

Flags: needinfo?(valentin.gosu)

Thank you, Daniel. I think I know what's going on here:
When we AsyncOpen the HttpChannelChild, we exit early, possibly because we're already shutting down. But we've already added the channel to the loadgroup here, so when the loadgroup gets cleaned up, we try to cancel the channel. Because we never reached the point where we call SetEventTarget, mNeckoTarget is still null and we trigger the assertion.

I think we should actually move the mLoadGroup->AddRequest call to just before we call gNeckoChild->SendPHttpChannelConstructor, and after we set the event target. That would avoid this issue, and would also make sure we don't add already cancelled requests to the loadgroup.
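
To illustrate the ordering problem described above, here is a minimal standalone C++ sketch. It is not Gecko code: `Channel`, `LoadGroup`, `AsyncOpenOld`, `AsyncOpenNew`, `TeardownHitsNullTarget` and the `shuttingDown` flag are hypothetical stand-ins, and the early exit is modeled as a plain boolean check. Instead of asserting, the model records when a cancel reaches a channel whose target was never set.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not Gecko's classes.
struct EventTarget {};

struct Channel {
  std::shared_ptr<EventTarget> neckoTarget;  // models mNeckoTarget
  bool canceled = false;
  bool hitNullTarget = false;  // set where the real code would assert

  void SetEventTarget() { neckoTarget = std::make_shared<EventTarget>(); }

  void Cancel() {
    if (!neckoTarget) {
      // The real code fails "Assertion failure: neckoTarget" at this point.
      hitNullTarget = true;
      return;
    }
    canceled = true;
  }
};

struct LoadGroup {
  std::vector<Channel*> requests;

  void AddRequest(Channel* aChannel) { requests.push_back(aChannel); }

  // Models loadgroup cleanup: every request still in the group is canceled.
  void CancelAll() {
    for (Channel* channel : requests) {
      channel->Cancel();
    }
    requests.clear();
  }
};

// Old ordering: the channel joins the loadgroup before the early exit.
bool AsyncOpenOld(Channel& aChannel, LoadGroup& aGroup, bool shuttingDown) {
  aGroup.AddRequest(&aChannel);
  if (shuttingDown) {
    return false;  // early exit: SetEventTarget is never reached
  }
  aChannel.SetEventTarget();
  return true;
}

// Proposed ordering: join the loadgroup only once the target is set.
bool AsyncOpenNew(Channel& aChannel, LoadGroup& aGroup, bool shuttingDown) {
  if (shuttingDown) {
    return false;  // early exit: the channel never joins the loadgroup
  }
  aChannel.SetEventTarget();
  aGroup.AddRequest(&aChannel);
  return true;
}

// Opens a channel during "shutdown", then cleans up the loadgroup, and
// reports whether the cancel reached a channel with no event target.
bool TeardownHitsNullTarget(bool useNewOrdering) {
  Channel channel;
  LoadGroup group;
  if (useNewOrdering) {
    AsyncOpenNew(channel, group, /* shuttingDown = */ true);
  } else {
    AsyncOpenOld(channel, group, /* shuttingDown = */ true);
  }
  group.CancelAll();
  return channel.hitNullTarget;
}
```

In this model, `TeardownHitsNullTarget(false)` reports the null-target cancel and `TeardownHitsNullTarget(true)` does not, because the channel that exited early was never added to the group.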

Flags: needinfo?(valentin.gosu)
Whiteboard: [necko-triaged][necko-priority-review]
Status: REOPENED → RESOLVED
Closed: 3 years ago → 2 years ago
Resolution: --- → INCOMPLETE

Reopening, since this still happens (as of a month ago at least when I hit it locally). We have some analysis/suggestions in comment 13.

Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---
Status: REOPENED → RESOLVED
Closed: 2 years ago → 2 years ago
Resolution: --- → INCOMPLETE

Reopening; presumably still an issue, even though it's infrequent. See analysis/discussion above.

Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---

(adding [no-nag] to whiteboard to stop the bot from automatically closing it every month.)

Whiteboard: [necko-triaged][necko-priority-review] → [necko-triaged][necko-priority-review][no-nag]

This was also reported as bug 1848421.
I'll try the method in comment 13. That will probably fix it.

Assignee: nobody → valentin.gosu
Priority: P5 → P2
Duplicate of this bug: 1848421

When we AsyncOpen the HttpChannelChild, we might exit early, possibly because
we're already shutting down. But if we've already added the channel to the
loadgroup, then when the loadgroup gets cleaned up we try to cancel the channel.
Because we never reached the point where we call SetEventTarget,
we would trigger the neckoTarget assertion in HttpChannelChild::AsyncCallImpl.

This patch makes sure we add the channel to the loadgroup when we are certain
it will actually be opened.

Pushed by valentin.gosu@gmail.com: https://hg.mozilla.org/integration/autoland/rev/0e5fd894164e Only add the HttpChannelChild to the loadGroup when guaranteed we will actually open the channel. r=necko-reviewers,kershaw
Status: REOPENED → RESOLVED
Closed: 2 years ago → 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 118 Branch