[PATCH 0/3][SRU Focal] UBUNTU: SAUCE: Add IB peer memory interface
Stefan Bader
stefan.bader at canonical.com
Thu Nov 4 14:52:32 UTC 2021
On 15.10.21 01:04, dann frazier wrote:
> BugLink: https://launchpad.net/bugs/1923104
> BugLink: https://launchpad.net/bugs/1947206
>
> This patchset was last discussed here as an RFC:
> https://lists.ubuntu.com/archives/kernel-team/2021-August/123747.html
> This is the first formal submission for focal.
>
> = What is this? =
> Nvidia's GPUDirect RDMA feature requires the nvidia_peermem module, which
> is bundled along with the nvidia driver in the >= 460-server branches.
> nvidia_peermem itself depends on a non-upstream infiniband interface called
> "IB peer memory". We ship a SAUCE patch for IB peer memory in hirsute
> and impish, but not in the focal LTS kernel. This means that the
> nvidia_peermem module we ship in l-r-m will not load in the focal LTS
> kernel.
>
I have been thinking this over for a while. One thing I am not really happy
with is changing the API of a public function (one never knows which external
users are around). I am also wondering why having this in the HWE kernels is
not enough. It just seems like a lot of change in an LTS kernel to be "safe".
-Stefan
> This is a backport of the SAUCE patch for IB peer memory that we're
> carrying in hirsute/impish to focal.
>
> = Upstream status =
> This feature is not expected to ever land upstream in this form. There
> is work going on upstream (dma-buf/p2pdma I believe) that is expected
> to provide equivalent functionality in the future, but it's not clear
> when all the pieces will be in place for it. And even when they are,
> I suspect it will be far too invasive to integrate into our 5.4.
>
> = Porting Assistance =
> We have a commitment to assist with porting this patch set forward to new
> kernel versions, including security/bug fixes we pull in from upstream stable.
>
> = Testing =
> Nvidia have tested this internally, both prior to sending us the patch,
> and once again with a PPA kernel we provided. We have an automated
> functional smoke test that we plan to integrate into SRU regression
> testing.
>
> = New patch dependency =
> Going back to 5.4 requires backporting an additional upstream patch,
> which changes the API of an exported symbol (ib_umem_get). The only out
> of tree modules of which I'm aware that use this symbol are the Mellanox
> OFED drivers, but they also bundle their own ib_core module that overrides
> the ib_umem_get interface we provide, so they aren't directly impacted.
> Of course, we can't rule out other users.
>
> History:
>
> v1:
> - Added additional patch from Nvidia that provides a few updates. The
> equivalent changes are also currently pending review for both hirsute
> and impish:
> https://lists.ubuntu.com/archives/kernel-team/2021-October/124937.html
> RFC v2:
> - Add some paragraphs of context into this cover letter
> - Describe backport process for upstream patch
> - Tag non-upstream patch as SAUCE and clarify provenance and testing
>
> Feras Daoud (1):
> [SRU Focal] UBUNTU: SAUCE: RDMA/core: Introduce peer memory interface
>
> Jack Morgenstein (1):
> [SRU Focal] UBUNTU: SAUCE: RDMA/core: Updated ib_peer_memory
>
> Moni Shoua (1):
> [SRU Focal] IB: Allow calls to ib_umem_get from kernel ULPs
>
> dann frazier (1):
> UBUNTU: Ubuntu-5.4.0-89.100+peerdirect.1
>
> debian.master/changelog | 15 +-
> drivers/infiniband/core/Makefile | 2 +-
> drivers/infiniband/core/ib_peer_mem.h | 58 ++
> drivers/infiniband/core/peer_mem.c | 559 ++++++++++++++++++
> drivers/infiniband/core/umem.c | 69 ++-
> drivers/infiniband/core/umem_odp.c | 33 +-
> drivers/infiniband/hw/bnxt_re/ib_verbs.c | 12 +-
> drivers/infiniband/hw/cxgb3/iwch_provider.c | 2 +-
> drivers/infiniband/hw/cxgb4/mem.c | 2 +-
> drivers/infiniband/hw/efa/efa_verbs.c | 2 +-
> drivers/infiniband/hw/hns/hns_roce_cq.c | 2 +-
> drivers/infiniband/hw/hns/hns_roce_db.c | 3 +-
> drivers/infiniband/hw/hns/hns_roce_mr.c | 4 +-
> drivers/infiniband/hw/hns/hns_roce_qp.c | 2 +-
> drivers/infiniband/hw/hns/hns_roce_srq.c | 5 +-
> drivers/infiniband/hw/i40iw/i40iw_verbs.c | 2 +-
> drivers/infiniband/hw/mlx4/cq.c | 2 +-
> drivers/infiniband/hw/mlx4/doorbell.c | 3 +-
> drivers/infiniband/hw/mlx4/mr.c | 8 +-
> drivers/infiniband/hw/mlx4/qp.c | 5 +-
> drivers/infiniband/hw/mlx4/srq.c | 3 +-
> drivers/infiniband/hw/mlx5/cq.c | 11 +-
> drivers/infiniband/hw/mlx5/devx.c | 4 +-
> drivers/infiniband/hw/mlx5/doorbell.c | 3 +-
> drivers/infiniband/hw/mlx5/mem.c | 11 +-
> drivers/infiniband/hw/mlx5/mr.c | 80 ++-
> drivers/infiniband/hw/mlx5/odp.c | 2 +-
> drivers/infiniband/hw/mlx5/qp.c | 4 +-
> drivers/infiniband/hw/mlx5/srq.c | 2 +-
> drivers/infiniband/hw/mthca/mthca_provider.c | 2 +-
> drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 2 +-
> drivers/infiniband/hw/qedr/verbs.c | 9 +-
> drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c | 2 +-
> drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c | 2 +-
> drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c | 7 +-
> drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c | 2 +-
> drivers/infiniband/sw/rdmavt/mr.c | 2 +-
> drivers/infiniband/sw/rxe/rxe_mr.c | 2 +-
> include/linux/mlx5/mlx5_ifc.h | 11 +-
> include/rdma/ib_umem.h | 38 +-
> include/rdma/ib_umem_odp.h | 9 +-
> include/rdma/peer_mem.h | 175 ++++++
> 42 files changed, 1043 insertions(+), 130 deletions(-)
> create mode 100644 drivers/infiniband/core/ib_peer_mem.h
> create mode 100644 drivers/infiniband/core/peer_mem.c
> create mode 100644 include/rdma/peer_mem.h
>