ACK+Cmnt: [PATCH 1/1] net: Avoid address overwrite in kernel_connect
Thadeu Lima de Souza Cascardo
cascardo at canonical.com
Tue Sep 12 12:36:41 UTC 2023
On Mon, Sep 11, 2023 at 11:21:31PM -0400, Khalid Elmously wrote:
> From: Jordan Rife <jrife at google.com>
>
> BugLink: https://bugs.launchpad.net/bugs/2035163
>
> BPF programs that run on connect can rewrite the connect address. For
> the connect system call this isn't a problem, because a copy of the address
> is made when it is moved into kernel space. However, kernel_connect
> simply passes through the address it is given, so the caller may observe
> its address value unexpectedly change.
>
> A practical example where this is problematic is where NFS is combined
> with a system such as Cilium which implements BPF-based load balancing.
> A common pattern in software-defined storage systems is to have an NFS
> mount that connects to a persistent virtual IP which in turn maps to an
> ephemeral server IP. This is usually done to achieve high availability:
> if your server goes down you can quickly spin up a replacement and remap
> the virtual IP to that endpoint. With BPF-based load balancing, mounts
> will forget the virtual IP address when the address rewrite occurs
> because a pointer to the only copy of that address is passed down the
> stack. Server failover then breaks, because clients have forgotten the
> virtual IP address. Reconnects fail and mounts remain broken. This patch
> was tested by setting up a scenario like this and ensuring that NFS
> reconnects worked after applying the patch.
>
> Signed-off-by: Jordan Rife <jrife at google.com>
> Signed-off-by: David S. Miller <davem at davemloft.net>
> (backported from commit 0bdf399342c5acbd817c9098b6c7ed21f1974312)
> [ kmously: adjusted for lack of READ_ONCE() ]
> Signed-off-by: Khalid Elmously <khalid.elmously at canonical.com>
This should really be sent to generic, such that is propagates to all kernels.
The fact that this has been sent for gcp/gke only now is likely due to their
request to get it merged sooner. This will require a respin. What is the respin
here? What cycle is it? What kernels are being requested to be respun sooner
rather than wait until the next cycle?
Otherwise,
Acked-by: Thadeu Lima de Souza Cascardo <cascardo at canonical.com>
> ---
> net/socket.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/net/socket.c b/net/socket.c
> index 5c49074ef7f2ae..7344dcc7cb1ccb 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -3453,7 +3453,12 @@ EXPORT_SYMBOL(kernel_accept);
> int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen,
> int flags)
> {
> - return sock->ops->connect(sock, addr, addrlen, flags);
> + struct sockaddr_storage address;
> +
> + memcpy(&address, addr, addrlen);
> +
> + return sock->ops->connect(sock, (struct sockaddr *)&address,
> + addrlen, flags);
> }
> EXPORT_SYMBOL(kernel_connect);
>
> --
> 2.34.1
>
>
> --
> kernel-team mailing list
> kernel-team at lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team
More information about the kernel-team
mailing list