Closed Bug 1493724 Opened Last year Closed 11 months ago

AltSvcTransaction::MaybeValidate() doesn’t fail at “Failed due to precondition"

Categories

(Core :: Networking: HTTP, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla64
Tracking Status
firefox64 --- fixed

People

(Reporter: timhuang, Assigned: u408661)

References

(Blocks 1 open bug)

Details

(Whiteboard: [necko-triaged])

Attachments

(1 file)

When the logs show AltSvcTransaction::Close
-> ~AltSvcTransaction, things work. AltSvcTransaction::MaybeValidate()
(which only does something on its first invocation – it no-ops after
that) doesn’t fail at “Failed due to precondition” since mConnection
is not null. However, when ~AltSvcTransaction is called without
AltSvcTransaction::Close having been called, MaybeValidate is first
invoked from the dtor where mConnection is null; the precondition
check fails.


To repro this:

Set proxy to point to Tor (eg. With Tor browser or in Firefox with
SOCKS5 hostname proxying)
Visit https://www.facebook.com/si/proxy where FB will serve an Alt-Svc
header pointing to a Tor Onion Service - this page will show “tor”
when not using Alt-Svc and “onion” when Alt-Svc is being used
Refresh the page a bunch
Nick, can you have a look at this?
Flags: needinfo?(hurley)
Priority: -- → P3
Whiteboard: [necko-triaged]
Tim - it's unclear to me what your expected behaviour is here. Everything you describe in the first paragraph is to be expected. For the record, I suspect, but have not proven, that the reason you see the precondition check failing when reloading the page a bunch is because you run past the parallel speculative connection limit, which would kill the AltSvcTransaction without a connection.

I guess my real question here is... what is the problem with the precondition check failing? AltSvc is advisory, we are not required to use it. However, it's possible that we are failing to use it when we perhaps should (there could be some sort of race between the failing check and the reloads?). I'm not sure that the problem would be fixable if it is, in fact, a race (AltSvc is racy by definition), but I can't say for sure either way until I know what the actual problem is.
Flags: needinfo?(hurley) → needinfo?(tihuang)
We found that the issue is lying on the alt-svc cache. So, the alt-svc cache using the same HashKey for the same original scheme, host, and port. This would not be a problem if the Alt-Svc field only serves one value. However, there could be multiple values in the Alt-Svc field. Say, the first Alt-Svc value is verified and the second value fails due to a certain reason, the alt-svc cache entry will be overridden by the result of the second one. Which means, Firefox won't use the first value in this case even it has been verified.

Do you think that this is a regression, Nick?
Flags: needinfo?(tihuang) → needinfo?(hurley)
Ah, I didn't catch the part about multiple alt-svc options in the original report, thanks for clarifying.

I wouldn't call this a regression (since alt-svc has always behaved this way), but I would say it's a bit buggy. Like I mentioned before, nothing spec-breaking or otherwise illegal (as it is on the client to determine whether or not to use alt-svc, and the client can add extra conditions), but I don't imagine this is what we had in mind when we implemented alt-svc.

I'll take a look at a fix this week.
Assignee: nobody → hurley
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(hurley)
OK, this is going to be a little more involved to get the proper fix. The root cause here is... we don't support having multiple altsvc mappings at all. Right now, an altsvc is a simple 1-to-1 mapping of origin->altsvc. The simple fix (while keeping the strict 1-1 mapping) is to just not clear out an existing, valid mapping until the new one is validated. That would be pretty straightforward. For the record, there aren't multiple alt-svc options coming in at once with the facebook URL above, but each time you request (or nearly each time), it sends an alt-svc header with a new .onion address. So that's why the old one gets cleared out.

However, in light of things like bug 1497263 and some discussions via email about racing connections to different altsvc entries, it seems like we really should just support having multiple, valid altsvc entries for the same origin.

So, while I've looked at a fix this week (as mentioned in comment 4), don't go expecting a final fix this week - this will be a little more involved :)
Blocks: 1498663
So, after looking at our telemetry around altsvc, I'm not totally convinced that keeping more than one altsvc response per origin is actually going to buy us anything. I'll need to gather some more/different telemetry to be sure, though (our current telemetry is limited in what it tells us, and the httparchive data in google bigquery is too large for me to get anything useful out of - one query has used up my entire monthly free quota!).

I'll hang bugs to get the telemetry off of bug 1498663. For this one, I'll go ahead and get the single-altsvc fix reviewed and landed.
Right now, as soon as we receive an alt-svc header or frame for an origin, we will overwrite any mapping we already have for that origin, even if it's still valid. This means that, should validation fail for the new mapping, we will have blown away a perfectly usable alt-svc mapping for no good reason. This patch prevents us from overwriting until we know the new mapping is good.
Pushed by hurley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/32e5d5e6c980
Don't overwrite still-valid altsvc mappings until a new one is validated. r=valentin
https://hg.mozilla.org/mozilla-central/rev/32e5d5e6c980
Status: NEW → RESOLVED
Closed: 11 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
You need to log in before you can comment on or make changes to this bug.