<a class="header-button" href="https://bugzilla.mozilla.org/home" title="Go to home page"> Bugzilla

BugBot [:suhaib / :marco/ :calixte]

Reporter

Updated

•

1 year ago

Summary: neqo update breaks OpenBSD sandboxing → neqo update breaks OpenBSD sandboxing because access to the route socket isnt allowed

Comment 1

•

1 year ago

Set release status flags based on info from the regressing bug 1942325

:mail, since you are the author of the regressor, bug 1942325, could you take a look?

For more information, please visit BugBot documentation.

status-firefox136: --- → unaffected

status-firefox-esr128: --- → unaffected

Flags: needinfo?(mail)

Assignee

Comment 2

•

1 year ago

The Firefox QUIC stack (neqo) now implements path MTU discovery for UDP. In order to not overshoot the MTU of the local interface towards the server and incur avoidable packet loss, it needs to determine that MTU via a PF_ROUTE query.

Ideally, you would allow opening the PF_ROUTE socket and RTM_GET messages.

You could also deny opening the PF_ROUTE socket; the code should handle that error but will then run into packet loss while probing for the path MTU, which will lower user experience some.

Reporter

Comment 3

•

1 year ago

(In reply to Lars Eggert [:lars] from comment #2)

The Firefox QUIC stack (neqo) now implements path MTU discovery for UDP. In order to not overshoot the MTU of the local interface towards the server and incur avoidable packet loss, it needs to determine that MTU via a PF_ROUTE query.

Ideally, you would allow opening the PF_ROUTE socket and RTM_GET messages.

i think that's .. not going to happen. the sandboxing model is strict here. unless that happened in a separate process, that wouldnt be sandboxed..

You could also deny opening the PF_ROUTE socket; the code should handle that error

process gets a segfault because it tries to do a syscall with forbidden args. that's how the sandboxing is designed.

so what's the alternative ? disable quic altogether ? disable pmtud if possible ? throw sandboxing over the wall ?

Assignee

Comment 4

•

1 year ago

•

Edited

You could disable QUIC, but that would leave OpenBSD users with a decidedly second-class web experience.

I can't recall it we have the option to disable PMTUD for QUIC. @mxinden? It would also affect the web UX, since throughputs would be lower.

I'd suggest allowlisting this call, but YMMV.

Assignee

Comment 5

•

1 year ago

@landry is there a way on OpenBSD to obtain an interface MTU without going through the route socket?

Flags: needinfo?(landry)

Reporter

Comment 6

•

1 year ago

(In reply to Lars Eggert [:lars] from comment #5)

@landry is there a way on OpenBSD to obtain an interface MTU without going through the route socket?

i've been told there's http://man.openbsd.org/getifaddrs.3 (which .. is already used in the code ?), for AF_LINK sockets the mtu is in the if_data struct member.

But since the method takes an ip as parameter, a route lookup is still needed ?

(note that i'm not a network/libc internals specialist, and i dont grok rust at all)

Flags: needinfo?(landry)

Reporter

Comment 7

•

1 year ago

a (maybe gross) patch that seems to work on top of 137.0b2, hardcoding 1492 instead of doing any interface lookup:

Index: third_party/rust/neqo-transport/src/path.rs
--- third_party/rust/neqo-transport/src/path.rs.orig
+++ third_party/rust/neqo-transport/src/path.rs
@@ -522,18 +522,7 @@ impl Path {
         now: Instant,
         stats: &mut Stats,
     ) -> Self {
-        let iface_mtu = match mtu::interface_and_mtu(remote.ip()) {
-            Ok((name, mtu)) => {
-                qdebug!("Outbound interface {name} has MTU {mtu}");
-                stats.pmtud_iface_mtu = mtu;
-                Some(mtu)
-            }
-            Err(e) => {
-                qwarn!("Failed to determine outbound interface: {e}");
-                None
-            }
-        };
-        let mut sender = PacketSender::new(cc, pacing, Pmtud::new(remote.ip(), iface_mtu), now);
+        let mut sender = PacketSender::new(cc, pacing, Pmtud::new(remote.ip(), Some(1492)), now);
         sender.set_qlog(qlog.clone());
         Self {
             local,

Assignee

Comment 8

•

1 year ago

(In reply to Landry Breuil (:gaston) from comment #6)

i've been told there's http://man.openbsd.org/getifaddrs.3 (which .. is already used in the code ?), for AF_LINK sockets the mtu is in the if_data struct member.

But since the method takes an ip as parameter, a route lookup is still needed ?

Yes, we need to do a route lookup to determine which interface to get the information from.

Your patch means that MTUs larger than 1492 will never be utilized. It's your call whether you want to limit OpenBSD users in this way. If this is what OpenBSD wants, we could do something similar on that platform, but I'd want to refer to some sort of official statement from OpenBSD for this. (Also, why 1492 and not 1500?)

Reporter

Comment 9

•

1 year ago

i don't know what OpenBSD wants, but what i would want is going back to the previous neqo version where i didnt have to figure out a way out of this nightmare :) 136 didnt have those issues..

pretty sure the 'official statement' you'd get from the OpenBSD community would be something along the lines of "QUIC is just a bad idea", which doesn't help my issue here :)

i used 1492 when testing because that's what works for pppoe, that's all. from what i've been told that wont work for mobile-constrained networks with mss fixups or ipsec/wg tunnels either.

again, not a specialist of this mtu/pmtud thing at all, i'm just trying to keep firefox building (and running!) on openbsd over time.

Assignee

Comment 10

•

1 year ago

PMTUD in 137 will increase throughputs by ~10% due to being able to utilize full-sized packets. I'd think that OpenBSD users would want that.

Who controls that pledge list for Firefox on OpenBSD? Sounds like a case needs to be made to enable the ability to access the route socket.

Reporter

Comment 11

•

1 year ago

(In reply to Lars Eggert [:lars] from comment #10)

PMTUD in 137 will increase throughputs by ~10% due to being able to utilize full-sized packets. I'd think that OpenBSD users would want that.

Who controls that pledge list for Firefox on OpenBSD? Sounds like a case needs to be made to enable the ability to access the route socket.

its' not a question of "who controls that pledge list" (that's me, and the list is already more or less "everything"), but there's actually no class for "ability to access the route socket", and i doubt there will be one "just to please firefox".

other sandboxed softwares in openbsd using the route socket open it before enabling the sandboxing (a coding technique called "hoisting" where you move as much initialization as possible before sandboxing, so that you have to enable less capabilities possible in the mainloop), not "anytime during the process lifetime".

but, back to the original issue. some people smarter than me started looking into it, The kernel sandboxing code in https://github.com/openbsd/src/blob/master/sys/kern/kern_pledge.c#L1503 doesn't seem to have a case for AF_ROUTE.

[15:45] daytona:~/ $cat r.c 
#include <sys/types.h>
#include <sys/socket.h>
#include <net/if.h>
#include <net/route.h>
#include <unistd.h>
#include <err.h>

int
main(void)
{
int s;

if (pledge("stdio inet route", NULL) == -1)
err(1, "pledge");
s = socket(AF_ROUTE, SOCK_RAW, AF_INET);

return 0;
}
$make r && ./r
r[21942]: pledge "inet", syscall 97
Abort trap (core dumped)

Reporter

Comment 12

•

1 year ago

from what i understand, pmtud is about finding the 'best' MTU from our host to a remote IP. And the 'local interface' MTU seems to be a starting point from the pmtud algorithm, trying to increase/decrease it over subsequent hops to find the best one for the complete path. Couldnt we use a fixed 'safe value' (eg 1350 or 9000, depending of which direction the algorithm works) instead of absolutely wanting to get the MTU from the local interface ?

i see that before https://github.com/mozilla/mtu/pull/29, this was achieved without accessing the route socket, but "didnt work in firefox" ? so there was no pmtud done in quic before ?

another option which i dont know is doable: open the route socket before sandboxing, and only once, reusing the FD for each subsequent use. I dunno from where all this rust code is called at process startup, but the sandboxing is started here. Before that point, everything is allowed.

and finally, another data point: you can't 'know' if you're sandboxed, eg no 'if sandboxed do X otherwise do Y'.

https://github.com/mozilla/neqo/blob/318b611512cb40235fc251a4870b43ecfd4a71fa/neqo-transport/src/pmtud.rs#L22-L29

Comment 13

•

1 year ago

I am not familiar with OpenBSD's sandboxing, thus removing my need-info.

Couldnt we use a fixed 'safe value' (eg 1350 or 9000, depending of which direction the algorithm works) instead of absolutely wanting to get the MTU from the local interface ?

Yes, a 'safe value' is already in place, see first value in each of the two arrays below. That said, how do we detect whether querying the routing socket is safe or not?

For the record, related mozilla/mtu pull request "Support rtable on OpenBSD": https://github.com/mozilla/mtu/pull/100

Flags: needinfo?(mail)

edgul

Updated

•

1 year ago

Severity: -- → S3

Priority: -- → P2

Whiteboard: [necko-triaged]

edgul

Updated

•

1 year ago

Blocks: QUIC

Reporter

Comment 14

•

1 year ago

(In reply to Max Inden from comment #13)

I am not familiar with OpenBSD's sandboxing, thus removing my need-info.

Couldnt we use a fixed 'safe value' (eg 1350 or 9000, depending of which direction the algorithm works) instead of absolutely wanting to get the MTU from the local interface ?

Yes, a 'safe value' is already in place, see first value in each of the two arrays below. That said, how do we detect whether querying the routing socket is safe or not?

you don't, since as i said you cant know if you're sandboxed or not. and i dunno if the mtu crate is used outside firefox, but i know that firefox (mainline and esr) are sandboxed on OpenBSD by default.

so i think it's safe to assume that the mtu crate within firefox source tree shouldn't try to get the mtu from the interface, but start with the first value of those arrays. that sounds like an acceptable tradeoff to me, what do you think ?

Assignee

Comment 15

•

1 year ago

(In reply to Landry Breuil (:gaston) from comment #14)

so i think it's safe to assume that the mtu crate within firefox source tree shouldn't try to get the mtu from the interface, but start with the first value of those arrays. that sounds like an acceptable tradeoff to me, what do you think ?

That's what PMTUD already does. The routing table lookup is needed to determine at which value PMTUD should stop trying to increase. Without the routing table lookup, we'll try and send packets that are too large, so they get dropped and count as congestion.

The only way forward here that I see would be a Firefox pref (controlling a neqo connection parameter) that disables the routing table lookup.

Pascal Chevrel:pascalc

Updated

•

1 year ago

status-firefox137: affected → fix-optional

Reporter

Comment 16

•

1 year ago

(In reply to Lars Eggert [:lars] from comment #15)

(In reply to Landry Breuil (:gaston) from comment #14)

The only way forward here that I see would be a Firefox pref (controlling a neqo connection parameter) that disables the routing table lookup.

can that happen in a reasonable timeframe so that it gets backported to 137 ? otherwise i'll have to ship with a local patch disabling the corresponding code.

Kershaw Chang [:kershaw]

Comment 17

•

1 year ago

(In reply to Landry Breuil (:gaston) from comment #16)

(In reply to Lars Eggert [:lars] from comment #15)

(In reply to Landry Breuil (:gaston) from comment #14)

The only way forward here that I see would be a Firefox pref (controlling a neqo connection parameter) that disables the routing table lookup.

can that happen in a reasonable timeframe so that it gets backported to 137 ? otherwise i'll have to ship with a local patch disabling the corresponding code.

I am not sure if we have enough time to release a new neqo version and uplift it.
Max, what do you think?

Flags: needinfo?(mail)

Assignee

Comment 18

•

1 year ago

I don't think so either. Plus, OpenBSD is a tier-3 platform. Let's make this part of the next neqo release.

https://github.com/mozilla/neqo/pull/2496

Assignee

Comment 19

•

1 year ago

Valentin Gosu [:valentin] {{ PTO 2 - 12 April }}

Comment 20

•

1 year ago

Assigning to Lars for now.

Assignee: nobody → leggert

Kershaw Chang [:kershaw]

Comment 21

•

1 year ago

I am not sure if we have enough time to release a new neqo version and uplift it.
Max, what do you think?

Clear the need info, since we've decided to address this in the next release.

Flags: needinfo?(mail)

Reporter

Comment 22

•

1 year ago

thanks :lars !

Comment 23

•

1 year ago

For the record, the above patch in Neqo will land in mozilla-central as part of Bug 1959128.

Updated

•

1 year ago

Depends on: 1959128

Reporter

Comment 24

•

11 months ago

now that bug 1959128 has landed, i see that pmtud is disabled by default on OpenBSD (via https://github.com/mozilla/gecko-dev/blob/7d16bef227c6bb997b62ae46bf8f0b116edb1a17/netwerk/socket/neqo_glue/src/lib.rs#L265) - i'll try to test this in the coming days.

Updated

•

11 months ago

Depends on: 1961340

Ryan VanderMeulen (PTO, back 6-April)

Updated

•

11 months ago

No longer depends on: 1961340

Dianna Smith [:diannaS]

Updated

•

11 months ago

Status: NEW → RESOLVED

Closed: 11 months ago

status-firefox137: fix-optional → wontfix

status-firefox138: affected → wontfix

status-firefox139: --- → fixed

Resolution: --- → FIXED

Updated

•

11 months ago

Target Milestone: --- → 139 Branch

Mihai Boldan, Desktop QA [:mboldan]

Updated

•

11 months ago

QA Whiteboard: [qa-triage-done-c140/b139]