[PATCH 0/3][SRU Focal] UBUNTU: SAUCE: Add IB peer memory interface

Kamal Mostafa kamal at canonical.com
Thu Nov 4 14:56:16 UTC 2021


Hi Dann-

We discussed this in the Kernel Team meeting today -- we've got some
significant concerns about applying this to Focal.

It's very large, and appears to change a lot of API-level code, which
could be a problem for other users of the subsystem (as you noted -- "Of
course, we can't rule out other users").  We are also very reluctant to
apply such a massive change to our Focal kernel source, because it is
highly likely to cause merge conflicts with future upstream v5.4-stable
updates.

The team would also like to know: How large and invasive is the
Hirsute/Impish SAUCE delta for this, compared to this Focal backport?

 -Kamal

On Thu, Oct 14, 2021 at 4:04 PM dann frazier <dann.frazier at canonical.com>
wrote:

> BugLink: https://launchpad.net/bugs/1923104
> BugLink: https://launchpad.net/bugs/1947206
>
> This patchset was last discussed here as an RFC:
>   https://lists.ubuntu.com/archives/kernel-team/2021-August/123747.html
> This is the first formal submission for focal.
>
> = What is this? =
> Nvidia's GPUDirect RDMA feature requires the nvidia_peermem module, which
> is bundled along with the nvidia driver in the >= 460-server branches.
> nvidia_peermem itself depends on a non-upstream infiniband interface called
> "IB peer memory". We ship a SAUCE patch for IB peer memory in hirsute
> and impish, but not in the focal LTS kernel. This means that the
> nvidia_peermem module we ship in l-r-m will not load in the focal LTS
> kernel.
>
> This is a backport of the SAUCE patch for IB peer memory that we're
> carrying in hirsute/impish to focal.
>
> = Upstream status =
> This feature is not expected to ever land upstream in this form. There
> is work going on upstream (dma-buf/p2pdma I believe) that is expected
> to provide equivalent functionality in the future, but it's not clear
> when all the pieces will be in place for it. And even when they are,
> I suspect it will be far too invasive to integrate into our 5.4.
>
> = Porting Assistance =
> We have a commitment to assist with porting this patch set forward to new
> kernel versions, including security/bug fixes we pull in from upstream
> stable.
>
> = Testing =
> Nvidia have tested this internally, both prior to sending us the patch,
> and once again with a PPA kernel we provided. We have an automated
> functional smoke test that we plan to integrate into SRU regression
> testing.
>
> = New patch dependency =
> Going back to 5.4 requires backporting an additional upstream patch,
> which changes the API of an exported symbol (ib_umem_get). The only
> out-of-tree modules of which I'm aware that use this symbol are the
> Mellanox OFED drivers, but they also bundle their own ib_core module that
> overrides the ib_umem_get interface we provide, so they aren't directly
> impacted. Of course, we can't rule out other users.
>
> History:
>
> v1:
>  - Added additional patch from Nvidia that provides a few updates. The
>    equivalent changes are also currently pending review for both hirsute
>    and impish:
>    https://lists.ubuntu.com/archives/kernel-team/2021-October/124937.html
> RFC v2:
>  - Add some paragraphs of context into this cover letter
>  - Describe backport process for upstream patch
>  - Tag non-upstream patch as SAUCE and clarify provenance and testing
>
> Feras Daoud (1):
>   [SRU Focal] UBUNTU: SAUCE: RDMA/core: Introduce peer memory interface
>
> Jack Morgenstein (1):
>   [SRU Focal] UBUNTU: SAUCE: RDMA/core: Updated ib_peer_memory
>
> Moni Shoua (1):
>   [SRU Focal] IB: Allow calls to ib_umem_get from kernel ULPs
>
> dann frazier (1):
>   UBUNTU: Ubuntu-5.4.0-89.100+peerdirect.1
>
>  debian.master/changelog                       |  15 +-
>  drivers/infiniband/core/Makefile              |   2 +-
>  drivers/infiniband/core/ib_peer_mem.h         |  58 ++
>  drivers/infiniband/core/peer_mem.c            | 559 ++++++++++++++++++
>  drivers/infiniband/core/umem.c                |  69 ++-
>  drivers/infiniband/core/umem_odp.c            |  33 +-
>  drivers/infiniband/hw/bnxt_re/ib_verbs.c      |  12 +-
>  drivers/infiniband/hw/cxgb3/iwch_provider.c   |   2 +-
>  drivers/infiniband/hw/cxgb4/mem.c             |   2 +-
>  drivers/infiniband/hw/efa/efa_verbs.c         |   2 +-
>  drivers/infiniband/hw/hns/hns_roce_cq.c       |   2 +-
>  drivers/infiniband/hw/hns/hns_roce_db.c       |   3 +-
>  drivers/infiniband/hw/hns/hns_roce_mr.c       |   4 +-
>  drivers/infiniband/hw/hns/hns_roce_qp.c       |   2 +-
>  drivers/infiniband/hw/hns/hns_roce_srq.c      |   5 +-
>  drivers/infiniband/hw/i40iw/i40iw_verbs.c     |   2 +-
>  drivers/infiniband/hw/mlx4/cq.c               |   2 +-
>  drivers/infiniband/hw/mlx4/doorbell.c         |   3 +-
>  drivers/infiniband/hw/mlx4/mr.c               |   8 +-
>  drivers/infiniband/hw/mlx4/qp.c               |   5 +-
>  drivers/infiniband/hw/mlx4/srq.c              |   3 +-
>  drivers/infiniband/hw/mlx5/cq.c               |  11 +-
>  drivers/infiniband/hw/mlx5/devx.c             |   4 +-
>  drivers/infiniband/hw/mlx5/doorbell.c         |   3 +-
>  drivers/infiniband/hw/mlx5/mem.c              |  11 +-
>  drivers/infiniband/hw/mlx5/mr.c               |  80 ++-
>  drivers/infiniband/hw/mlx5/odp.c              |   2 +-
>  drivers/infiniband/hw/mlx5/qp.c               |   4 +-
>  drivers/infiniband/hw/mlx5/srq.c              |   2 +-
>  drivers/infiniband/hw/mthca/mthca_provider.c  |   2 +-
>  drivers/infiniband/hw/ocrdma/ocrdma_verbs.c   |   2 +-
>  drivers/infiniband/hw/qedr/verbs.c            |   9 +-
>  drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c  |   2 +-
>  drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c  |   2 +-
>  drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c  |   7 +-
>  drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c |   2 +-
>  drivers/infiniband/sw/rdmavt/mr.c             |   2 +-
>  drivers/infiniband/sw/rxe/rxe_mr.c            |   2 +-
>  include/linux/mlx5/mlx5_ifc.h                 |  11 +-
>  include/rdma/ib_umem.h                        |  38 +-
>  include/rdma/ib_umem_odp.h                    |   9 +-
>  include/rdma/peer_mem.h                       | 175 ++++++
>  42 files changed, 1043 insertions(+), 130 deletions(-)
>  create mode 100644 drivers/infiniband/core/ib_peer_mem.h
>  create mode 100644 drivers/infiniband/core/peer_mem.c
>  create mode 100644 include/rdma/peer_mem.h
>
> --
> 2.33.0