neqo update breaks OpenBSD sandboxing because access to the route socket isnt allowed
Categories
(Core :: Networking, defect, P2)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox-esr128 | --- | unaffected |
| firefox136 | --- | unaffected |
| firefox137 | --- | wontfix |
| firefox138 | --- | wontfix |
| firefox139 | --- | fixed |
People
(Reporter: gaston, Assigned: leggert)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: regression, Whiteboard: [necko-triaged])
testing 137.0b2 on OpenBSD, the main process blows early at startup on a socket() syscall:
85610 firefox CALL socket(AF_ROUTE,0x3<SOCK_RAW>,0)
85610 firefox PLDG socket, "inet", errno 22 Invalid argument
firefox is sandboxed on OpenBSD with https://man.openbsd.org/pledge and the main process already has a large pledge class list, but the route socket isnt allowed by any sandboxing config.
i've been pointed at https://github.com/mozilla/mtu/pull/29/ which added that feature to the mtu crate, which was added in bug #1942325
not fully sure what could call those codepaths to be called, maybe https://searchfox.org/mozilla-central/source/third_party/rust/mtu/src/bsd.rs#293 ?
| Reporter | ||
Updated•1 year ago
|
Comment 1•1 year ago
|
||
Set release status flags based on info from the regressing bug 1942325
:mail, since you are the author of the regressor, bug 1942325, could you take a look?
For more information, please visit BugBot documentation.
| Assignee | ||
Comment 2•1 year ago
|
||
The Firefox QUIC stack (neqo) now implements path MTU discovery for UDP. In order to not overshoot the MTU of the local interface towards the server and incur avoidable packet loss, it needs to determine that MTU via a PF_ROUTE query.
Ideally, you would allow opening the PF_ROUTE socket and RTM_GET messages.
You could also deny opening the PF_ROUTE socket; the code should handle that error but will then run into packet loss while probing for the path MTU, which will lower user experience some.
| Reporter | ||
Comment 3•1 year ago
|
||
(In reply to Lars Eggert [:lars] from comment #2)
The Firefox QUIC stack (neqo) now implements path MTU discovery for UDP. In order to not overshoot the MTU of the local interface towards the server and incur avoidable packet loss, it needs to determine that MTU via a
PF_ROUTEquery.Ideally, you would allow opening the
PF_ROUTEsocket andRTM_GETmessages.
i think that's .. not going to happen. the sandboxing model is strict here. unless that happened in a separate process, that wouldnt be sandboxed..
You could also deny opening the
PF_ROUTEsocket; the code should handle that error
process gets a segfault because it tries to do a syscall with forbidden args. that's how the sandboxing is designed.
so what's the alternative ? disable quic altogether ? disable pmtud if possible ? throw sandboxing over the wall ?
| Assignee | ||
Comment 4•1 year ago
•
|
||
You could disable QUIC, but that would leave OpenBSD users with a decidedly second-class web experience.
I can't recall it we have the option to disable PMTUD for QUIC. @mxinden? It would also affect the web UX, since throughputs would be lower.
I'd suggest allowlisting this call, but YMMV.
| Assignee | ||
Comment 5•1 year ago
|
||
@landry is there a way on OpenBSD to obtain an interface MTU without going through the route socket?
| Reporter | ||
Comment 6•1 year ago
|
||
(In reply to Lars Eggert [:lars] from comment #5)
@landry is there a way on OpenBSD to obtain an interface MTU without going through the route socket?
i've been told there's http://man.openbsd.org/getifaddrs.3 (which .. is already used in the code ?), for AF_LINK sockets the mtu is in the if_data struct member.
But since the method takes an ip as parameter, a route lookup is still needed ?
(note that i'm not a network/libc internals specialist, and i dont grok rust at all)
| Reporter | ||
Comment 7•1 year ago
|
||
a (maybe gross) patch that seems to work on top of 137.0b2, hardcoding 1492 instead of doing any interface lookup:
Index: third_party/rust/neqo-transport/src/path.rs
--- third_party/rust/neqo-transport/src/path.rs.orig
+++ third_party/rust/neqo-transport/src/path.rs
@@ -522,18 +522,7 @@ impl Path {
now: Instant,
stats: &mut Stats,
) -> Self {
- let iface_mtu = match mtu::interface_and_mtu(remote.ip()) {
- Ok((name, mtu)) => {
- qdebug!("Outbound interface {name} has MTU {mtu}");
- stats.pmtud_iface_mtu = mtu;
- Some(mtu)
- }
- Err(e) => {
- qwarn!("Failed to determine outbound interface: {e}");
- None
- }
- };
- let mut sender = PacketSender::new(cc, pacing, Pmtud::new(remote.ip(), iface_mtu), now);
+ let mut sender = PacketSender::new(cc, pacing, Pmtud::new(remote.ip(), Some(1492)), now);
sender.set_qlog(qlog.clone());
Self {
local,
| Assignee | ||
Comment 8•1 year ago
|
||
(In reply to Landry Breuil (:gaston) from comment #6)
i've been told there's http://man.openbsd.org/getifaddrs.3 (which .. is already used in the code ?), for AF_LINK sockets the mtu is in the
if_datastruct member.But since the method takes an ip as parameter, a route lookup is still needed ?
Yes, we need to do a route lookup to determine which interface to get the information from.
Your patch means that MTUs larger than 1492 will never be utilized. It's your call whether you want to limit OpenBSD users in this way. If this is what OpenBSD wants, we could do something similar on that platform, but I'd want to refer to some sort of official statement from OpenBSD for this. (Also, why 1492 and not 1500?)
| Reporter | ||
Comment 9•1 year ago
|
||
i don't know what OpenBSD wants, but what i would want is going back to the previous neqo version where i didnt have to figure out a way out of this nightmare :) 136 didnt have those issues..
pretty sure the 'official statement' you'd get from the OpenBSD community would be something along the lines of "QUIC is just a bad idea", which doesn't help my issue here :)
i used 1492 when testing because that's what works for pppoe, that's all. from what i've been told that wont work for mobile-constrained networks with mss fixups or ipsec/wg tunnels either.
again, not a specialist of this mtu/pmtud thing at all, i'm just trying to keep firefox building (and running!) on openbsd over time.
| Assignee | ||
Comment 10•1 year ago
|
||
PMTUD in 137 will increase throughputs by ~10% due to being able to utilize full-sized packets. I'd think that OpenBSD users would want that.
Who controls that pledge list for Firefox on OpenBSD? Sounds like a case needs to be made to enable the ability to access the route socket.
| Reporter | ||
Comment 11•1 year ago
|
||
(In reply to Lars Eggert [:lars] from comment #10)
PMTUD in 137 will increase throughputs by ~10% due to being able to utilize full-sized packets. I'd think that OpenBSD users would want that.
Who controls that
pledgelist for Firefox on OpenBSD? Sounds like a case needs to be made to enable the ability to access the route socket.
its' not a question of "who controls that pledge list" (that's me, and the list is already more or less "everything"), but there's actually no class for "ability to access the route socket", and i doubt there will be one "just to please firefox".
other sandboxed softwares in openbsd using the route socket open it before enabling the sandboxing (a coding technique called "hoisting" where you move as much initialization as possible before sandboxing, so that you have to enable less capabilities possible in the mainloop), not "anytime during the process lifetime".
but, back to the original issue. some people smarter than me started looking into it, The kernel sandboxing code in https://github.com/openbsd/src/blob/master/sys/kern/kern_pledge.c#L1503 doesn't seem to have a case for AF_ROUTE.
[15:45] daytona:~/ $cat r.c
#include <sys/types.h>
#include <sys/socket.h>
#include <net/if.h>
#include <net/route.h>
#include <unistd.h>
#include <err.h>
int
main(void)
{
int s;
if (pledge("stdio inet route", NULL) == -1)
err(1, "pledge");
s = socket(AF_ROUTE, SOCK_RAW, AF_INET);
return 0;
}
$make r && ./r
r[21942]: pledge "inet", syscall 97
Abort trap (core dumped)
| Reporter | ||
Comment 12•1 year ago
|
||
from what i understand, pmtud is about finding the 'best' MTU from our host to a remote IP. And the 'local interface' MTU seems to be a starting point from the pmtud algorithm, trying to increase/decrease it over subsequent hops to find the best one for the complete path. Couldnt we use a fixed 'safe value' (eg 1350 or 9000, depending of which direction the algorithm works) instead of absolutely wanting to get the MTU from the local interface ?
i see that before https://github.com/mozilla/mtu/pull/29, this was achieved without accessing the route socket, but "didnt work in firefox" ? so there was no pmtud done in quic before ?
another option which i dont know is doable: open the route socket before sandboxing, and only once, reusing the FD for each subsequent use. I dunno from where all this rust code is called at process startup, but the sandboxing is started here. Before that point, everything is allowed.
and finally, another data point: you can't 'know' if you're sandboxed, eg no 'if sandboxed do X otherwise do Y'.
Comment 13•1 year ago
|
||
I am not familiar with OpenBSD's sandboxing, thus removing my need-info.
Couldnt we use a fixed 'safe value' (eg 1350 or 9000, depending of which direction the algorithm works) instead of absolutely wanting to get the MTU from the local interface ?
Yes, a 'safe value' is already in place, see first value in each of the two arrays below. That said, how do we detect whether querying the routing socket is safe or not?
For the record, related mozilla/mtu pull request "Support rtable on OpenBSD": https://github.com/mozilla/mtu/pull/100
| Reporter | ||
Comment 14•1 year ago
|
||
(In reply to Max Inden from comment #13)
I am not familiar with OpenBSD's sandboxing, thus removing my need-info.
Couldnt we use a fixed 'safe value' (eg 1350 or 9000, depending of which direction the algorithm works) instead of absolutely wanting to get the MTU from the local interface ?
Yes, a 'safe value' is already in place, see first value in each of the two arrays below. That said, how do we detect whether querying the routing socket is safe or not?
you don't, since as i said you cant know if you're sandboxed or not. and i dunno if the mtu crate is used outside firefox, but i know that firefox (mainline and esr) are sandboxed on OpenBSD by default.
so i think it's safe to assume that the mtu crate within firefox source tree shouldn't try to get the mtu from the interface, but start with the first value of those arrays. that sounds like an acceptable tradeoff to me, what do you think ?
| Assignee | ||
Comment 15•1 year ago
|
||
(In reply to Landry Breuil (:gaston) from comment #14)
so i think it's safe to assume that the mtu crate within firefox source tree shouldn't try to get the mtu from the interface, but start with the first value of those arrays. that sounds like an acceptable tradeoff to me, what do you think ?
That's what PMTUD already does. The routing table lookup is needed to determine at which value PMTUD should stop trying to increase. Without the routing table lookup, we'll try and send packets that are too large, so they get dropped and count as congestion.
The only way forward here that I see would be a Firefox pref (controlling a neqo connection parameter) that disables the routing table lookup.
Updated•1 year ago
|
| Reporter | ||
Comment 16•1 year ago
|
||
(In reply to Lars Eggert [:lars] from comment #15)
(In reply to Landry Breuil (:gaston) from comment #14)
The only way forward here that I see would be a Firefox pref (controlling a neqo connection parameter) that disables the routing table lookup.
can that happen in a reasonable timeframe so that it gets backported to 137 ? otherwise i'll have to ship with a local patch disabling the corresponding code.
Comment 17•1 year ago
|
||
(In reply to Landry Breuil (:gaston) from comment #16)
(In reply to Lars Eggert [:lars] from comment #15)
(In reply to Landry Breuil (:gaston) from comment #14)
The only way forward here that I see would be a Firefox pref (controlling a neqo connection parameter) that disables the routing table lookup.
can that happen in a reasonable timeframe so that it gets backported to 137 ? otherwise i'll have to ship with a local patch disabling the corresponding code.
I am not sure if we have enough time to release a new neqo version and uplift it.
Max, what do you think?
| Assignee | ||
Comment 18•1 year ago
|
||
I don't think so either. Plus, OpenBSD is a tier-3 platform. Let's make this part of the next neqo release.
| Assignee | ||
Comment 19•1 year ago
|
||
Assigning to Lars for now.
Comment 21•1 year ago
|
||
I am not sure if we have enough time to release a new neqo version and uplift it.
Max, what do you think?
Clear the need info, since we've decided to address this in the next release.
| Reporter | ||
Comment 22•1 year ago
|
||
thanks :lars !
Comment 23•1 year ago
|
||
For the record, the above patch in Neqo will land in mozilla-central as part of Bug 1959128.
| Reporter | ||
Comment 24•11 months ago
|
||
now that bug 1959128 has landed, i see that pmtud is disabled by default on OpenBSD (via https://github.com/mozilla/gecko-dev/blob/7d16bef227c6bb997b62ae46bf8f0b116edb1a17/netwerk/socket/neqo_glue/src/lib.rs#L265) - i'll try to test this in the coming days.
Updated•11 months ago
|
Updated•11 months ago
|
Updated•11 months ago
|
| Reporter | ||
Comment 25•10 months ago
|
||
completely forgot, but i can confirm that 139 with the latest neqo doesn't blow up on OpenBSD with our sandboxing.
Description
•