Trusty SRU - Mellanox refresh

Ming Lei ming.lei at canonical.com
Tue Jul 29 09:04:43 UTC 2014


Hi Tim,

HP has confirmed these patches do fix the bonding issue:

https://bugs.launchpad.net/lomond/+bug/1341856/comments/6

Thanks,

On Mon, Jul 28, 2014 at 8:39 PM, Tim Gardner <tim.gardner at canonical.com> wrote:
> Some positive test results in the bug report would be nice. I'll apply these
> patches to Trusty when I see that.
>
> rtg
>
>
> On 07/27/2014 07:02 AM, Narinder Gupta wrote:
>>
>> Thanks Eyal,
>> For McDivitt, HP tested the patches from the lomond PPA that Dann Frazier
>> built, and according to them it fixes the bonding issue. HP successfully
>> tested bond modes 5 and 6, and they work, which unblocks the HP CSI test
>> team. The following patches were requested from Ming Lei for McDivitt.
>>
>> 4 days ago    Amir Vadai
>>     net/mlx4_en: Disable blueflame using ethtool private...
>>     <http://kernel.ubuntu.com/git?p=ming/ubuntu-trusty.git;a=commit;h=eec2dcd96e383c7af9281fa778fb4aa16cec4e65>
>>
>> 4 days ago    Eyal Perry
>>     net/mlx4_en: current_mac isn't updated in port up
>>     <http://kernel.ubuntu.com/git?p=ming/ubuntu-trusty.git;a=commit;h=d26eeba14210b5f8c8fdc0a4d1bac01033a482c7>
>>
>> 4 days ago    Noa Osherovich
>>     net/mlx4_en: Fix mac_hash database inconsistency
>>     <http://kernel.ubuntu.com/git?p=ming/ubuntu-trusty.git;a=commit;h=a0013c427270c6045be4438f73d23fe8349d1f69>
>>
>> 4 days ago    Shani Michaelli
>>     net/mlx4_en: Protect MAC address modification with...
>>     <http://kernel.ubuntu.com/git?p=ming/ubuntu-trusty.git;a=commit;h=ee9a0e89192fe5d6c66f855a114d27f102af9d07>
>>
>> 4 days ago    Shani Michaelli
>>     net/mlx4_en: Fix errors in MAC address changing when...
>>     <http://kernel.ubuntu.com/git?p=ming/ubuntu-trusty.git;a=commit;h=fb3aa02d8dd043d5ea27c8c4edb69b8e1b10fcc9>
>>
>>
>>
>> Please let me know if you need any comment from the tester. I can ask him
>> to post an update by email or in the bug report.
>>
>> On Jul 27, 2014, at 7:51 AM, Eyal Perry wrote:
>>
>>> Hi all,
>>> Regarding the list of patches I’ve sent, it’s mostly bug fixes pulled
>>> from upstream and 4 fixes that were made on demand for urgent HP issues.
>>> More specific details on the purpose and, mainly, the testing of these
>>> patches:
>>> ·2 patches that are required for HP certification – these are relevant
>>> for x86 Red Hat, so it is less urgent to integrate them into Ubuntu at
>>> the moment, sorry for the bother.
>>> UBUNTU: SAUCE: (no-up) net/mlx4_core: Use low memory profile on kdump
>>> kernel
>>> UBUNTU: SAUCE: (no-up) net/mlx4_en: Reduce memory consumption on kdump
>>> kernel
>>> ·This patch exposes the ability to disable blueflame. It helps
>>> improve bi-directional IP forwarding test performance on McDivitt –
>>> HP request.
>>>       UBUNTU: SAUCE: (no-up) net/mlx4_en: Disable blueflame using
>>> ethtool private flags
>>> ·4 patches that fix MAC modification handling issues – they fix an issue
>>> with bonding alb/tlb modes – HP request.
>>> A simple test would be to set up a tlb bond over 2 ports of a Mellanox
>>> NIC and “make” the bonding driver switch the role of active interface
>>> between the 2 interfaces by setting the ports up/down.
>>>       UBUNTU: SAUCE: (no-up) net/mlx4_en: current_mac isn't updated
>>> in port up
>>>       UBUNTU: SAUCE: (linux-next) net/mlx4_en: Fix mac_hash database
>>> inconsistency
>>>       net/mlx4_en: Protect MAC address modification with the
>>> state_lock mutex
>>>       net/mlx4_en: Fix errors in MAC address changing when port is down
>>> ·Without this patch, if probe_vf (mlx4_core module parameter) is being
>>> used (usually with a big number >= 8), you’ll see messages such as:
>>> “localhost systemd-udevd: worker [14567]
>>> /devices/pci0000:00/0000:00:02.0/0000:03:00.1 timeout; kill it”
>>>       net/mlx4_core: Defer VF initialization till PF is fully initialized
>>> ·Without this patch, when loading mlx4_core with a probed VF you’ll
>>> see these messages: “PCIe link width is x0, device supports x8”, and on
>>> the VM “Unable to determine PCI device chain minimum BW”
>>>       net/mlx4_core: Don't issue PCIe speed/width checks for VFs
>>> ·Without this patch – when you attach a single-port VF of “port 2” to
>>> a VM, and unload/load the Mellanox driver, you’ll observe the following
>>> error messages:
>>> “mlx4_core 0000:04:00.0: vhcr command QP_ATTACH (0xf0b) slave:7
>>> in_param 0x66b5a000 in_mod=0x1000086a, op_mod=0x0 failed with error:0,
>>> status -22”
>>>       net/mlx4_core: Adjust port number in qp_attach wrapper when
>>> detaching
>>> ·Way to reproduce:
>>> 1) Start VM and assign it VFx
>>> 2) Configure bonding between 2 ports of the VF
>>> 3) Assign IP to the bond
>>> 4) Shut down this VM
>>> 5) Start a new VM and assign it VFy
>>> 6) Go over steps 2-3 for this VM
>>> 7) Try running rping between this VM and hypervisor (at this point
>>> rping does not work)
>>>       net/mlx4_core: Reset RoCE VF gids when guest driver goes down
>>> ·Configure: load mlx4_core with “options mlx4_core port_type_array=2,2
>>> debug_level=1 num_vfs=1,1,2 probe_vf=0,1,1 log_num_mgm_entry_size=-1”.
>>> Run the rping client on the probed VF of port 2 (rping -dvca
>>> 192.168.30.1 -C 1) – without this patch it would fail with the following
>>> error: “cma event RDMA_CM_EVENT_UNREACHABLE, error -110”.
>>>       net/mlx4_core: Fix slave id computation for single port VF
>>> ·1) Run VPI on the hypervisor, with opensm and ipoib running on the IB
>>> port
>>> 2) Bring up the guest/VF
>>> 3) Configure the guest
>>> 4) Rping from guest as client – works.
>>> 5) On the guest, unload ONLY the low level driver (mlx4_ib/mlx4_en)
>>> and bring it back up – bringing up mlx4_ib FIRST, then mlx4_en.
>>> 6) Re-configure the guest interfaces
>>> 7) Rping from guest as client. DOES NOT WORK
>>>       net/mlx4_core: Add UPDATE_QP SRIOV wrapper support
>>> ·Set up a bonding interface over a VXLAN encapsulating device with
>>> ConnectX3-Pro HW, send traffic, and check with tcpdump that GSO is
>>> functioning.
>>>       bonding: Advertize vxlan offload features when supported
>>> ·Single-ported VFs are currently supported only when all HCA ports are
>>> set to Ethernet – otherwise such an operation should fail, but without
>>> this patch it will return success (0).
>>>       net/mlx4_core: Fix the error flow when probing with invalid VF
>>> configuration
>>> ·Load the mlx4_core with the following module parameter
>>> “log_num_mgm_entry_size=-1” to enable vxlan offloads – no traffic will
>>> get to the RX side (i.e. tcpdump).
>>>       net/mlx4_en: Don't configure the HW vxlan parser when vxlan
>>> offloading isn't set
>>> ·Without this patch, network interface names for port 2 of 2-port
>>> Mellanox devices are inconsistent – HP requested this patch but I
>>> don’t think it’s their unique need.
>>> Can be easily checked with a command as follows: $ grep .
>>> /sys/bus/pci/drivers/mlx4_core/0000\:24\:00.0/net/*/dev_id
>>> Should return:
>>> /sys/bus/pci/drivers/mlx4_core/0000:24:00.0/net/eth8/dev_id:0x0
>>> /sys/bus/pci/drivers/mlx4_core/0000:24:00.0/net/eth9/dev_id:0x1
>>> Instead of this buggy output without the patch:
>>> /sys/bus/pci/drivers/mlx4_core/0000:24:00.0/net/p5p1/dev_id:0x0
>>> /sys/bus/pci/drivers/mlx4_core/0000:24:00.0/net/rename13/dev_id:0x0
>>>       Revert "net/mlx4_en: Fix bad use of dev_id"
>>> ·Not sure about testing these three:
>>>       net/mlx4_core: Load the Eth driver first
>>>       net/mlx4_core: Keep only one driver entry release mlx4_priv
>>>       net/mlx4_core: Preserve pci_dev_data after __mlx4_remove_one()
>>> /Best Regards,/
>>> /Eyal./
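
The dev_id check described above can also be scripted. The following is only a minimal sketch, assuming Python is available on the test machine: `parse_dev_ids` and `dev_ids_consistent` are hypothetical helper names, not part of any Mellanox or Ubuntu tool, and the input is assumed to be the text printed by the `grep . .../net/*/dev_id` command quoted above.

```python
def parse_dev_ids(grep_output):
    """Map interface name -> dev_id (as int) from `grep . .../net/*/dev_id` output."""
    result = {}
    for line in grep_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line looks like: <sysfs path>/net/<iface>/dev_id:<hex value>
        # The PCI address also contains colons, so split on the LAST one.
        path, _, value = line.rpartition(":")
        iface = path.split("/")[-2]
        result[iface] = int(value, 16)
    return result


def dev_ids_consistent(dev_ids):
    """True when every port of the device reports a distinct dev_id."""
    values = list(dev_ids.values())
    return len(values) == len(set(values))
```

With the patch applied, the two ports should report different dev_id values (0x0 and 0x1 in the expected output above), so `dev_ids_consistent` would return True; for the buggy output, where both interfaces report 0x0, it would return False.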
>>> From: Brian Fromme [mailto:brian.fromme at canonical.com]
>>> Sent: Saturday, July 26, 2014 12:38 AM
>>> To: Narinder Gupta; Rafael Tinoco
>>> Cc: Michael Miller; Dann Frazier; Raghuram Kota; Tim Gardner; Ming
>>> Lei; Eyal Perry; kernel-team
>>> Subject: Re: Trusty SRU - Mellanox refresh
>>>
>>> That's an excellent question, Narinder.  Eyal, Tim, Rafael, etc.  Can
>>> you help us to understand how to test these patches?  We can request
>>> that HP gets involved in the testing, but only if we can explain what
>>> these changes are and how to test them.
>>>  thanks,
>>>  Brian
>>>
>>> On Fri, Jul 25, 2014 at 3:06 PM, Narinder Gupta
>>> <narinder.gupta at canonical.com> wrote:
>>> Brian,
>>> Will you please brief me on the changes we are supposed to test? I can
>>> ask HP to test and submit the results.
>>>
>>> Thanks and Regards,
>>> Narinder Gupta (PMP)               narinder.gupta at canonical.com
>>> Canonical, Ltd.                    narindergupta [irc.freenode.net]
>>> +1.281.736.5150                    narindergupta2007 [skype]
>>>
>>> Ubuntu - Linux for human beings | www.ubuntu.com | www.canonical.com
>>>
>>>
>>> On Fri, Jul 25, 2014 at 3:56 PM, Brian Fromme
>>> <brian.fromme at canonical.com> wrote:
>>> Oops, Narinder is the PM for McDivitt.  Adding him to this thread.
>>>  cheers,
>>>  Brian
>>>
>>> On Fri, Jul 25, 2014 at 2:35 PM, Michael Miller
>>> <michael.miller at canonical.com> wrote:
>>> I'm thinking it would be Perry Hoffman and Scott Hinchley. I hope I
>>> spelled their names correctly.
>>>
>>> On Fri, Jul 25, 2014 at 3:31 PM, Brian Fromme
>>> <brian.fromme at canonical.com> wrote:
>>> Yup.  Adding Dann Frazier and Raghu.  Can you guys help us to figure
>>> out who can integrate and test these on our McDivitt cartridge?
>>>  thanks,
>>>  Brian
>>>
>>> On Fri, Jul 25, 2014 at 1:10 PM, Michael Miller
>>> <michael.miller at canonical.com> wrote:
>>> Brian,
>>> Shouldn't this also go to the HP folks working the McDivitt issues? I
>>> don't have access to a McDivitt.
>>> -- mikem
>>>
>>> On Fri, Jul 25, 2014 at 1:50 PM, Tim Gardner
>>> <tim.gardner at canonical.com> wrote:
>>> Gents - I'd like some positive testing confirmation before I apply
>>> this to Trusty.
>>>
>>> rtg
>>> --
>>> Tim Gardner tim.gardner at canonical.com
>>>
>>
>
>
> --
> Tim Gardner tim.gardner at canonical.com
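
For anyone repeating the tlb bond test quoted above (switching the active interface by setting the ports up/down), the result can be checked from the bonding driver's status file. This is only a rough sketch under stated assumptions: the bond is assumed to be named bond0, so its status would be read from /proc/net/bonding/bond0, and `active_slave` and `failover_happened` are hypothetical helper names introduced here for illustration.

```python
def active_slave(bonding_status):
    """Return the interface named on the 'Currently Active Slave' line, or None."""
    for line in bonding_status.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Currently Active Slave":
            return value.strip()
    return None


def failover_happened(before, after):
    """True when both snapshots name an active slave and the two names differ."""
    first, second = active_slave(before), active_slave(after)
    return first is not None and second is not None and first != second
```

One possible usage: read /proc/net/bonding/bond0 once before and once after taking the currently active port down, and pass the two snapshots to `failover_happened`. A True result only shows that the bonding driver switched roles; traffic over the bond still needs to be verified separately.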



