NACK: [SRU][L, K, J, F][M, U][PATCH 0/1] UBUNTU: [Packaging] Expose base (parent) abi for derivatives

Roxana Nicolescu roxana.nicolescu at
Wed Jul 5 12:02:18 UTC 2023

On 05-07-2023 12:23, Thadeu Lima de Souza Cascardo wrote:
> On Wed, Jul 05, 2023 at 11:49:35AM +0200, Roxana Nicolescu wrote:
>> On 05-07-2023 11:17, Thadeu Lima de Souza Cascardo wrote:
>>> On Wed, Jul 05, 2023 at 11:01:35AM +0200, Roxana Nicolescu wrote:
>>>> On 06-06-2023 12:38, Thadeu Lima de Souza Cascardo wrote:
>>>>> On Tue, Jun 06, 2023 at 09:43:36AM +0200, Roxana Nicolescu wrote:
>>>>>> BugLink:
>>>>>> SRU Justification
>>>>>> [ Impact ]
>>>>>> Upstream stable added a change in the format of jbd2 in 5.15.87 that
>>>>>> was cherry-picked to kinetic:linux.
>>>>>> This is incompatible with the current changes in the lttng-module
>>>>>> kinetic.
>>>>>> We previously encountered this issue for focal #lp2004644
>>>>>> and we proactively released a new version for kinetic, but the fix did
>>>>>> not work as expected.
>>>>>> As shown in the buglink, the build error is triggered because some
>>>>>> function declaration does not match the kernel's.
>>>>>> In `include/instrumentation/events/jbd2.h` in lttng-modules, there is
>>>>>> a conditional declaration based on the kernel version.
>>>>>> But for 5.19 kernels, the upstream kernel version has remained the
>>>>>> same.
>>>>>> [ Fix ]
>>>>>> Versioning based on linux upstream version does not work in this case.
>>>>>> We expose the abi version of the kernel, but it alone will not solve
>>>>>> this issue because we deliver many derivatives, each with its own
>>>>>> version. Hence a base main abi version needs to be exposed in the
>>>>>> headers.
>>>>> I don't like this approach. lttng-modules and every other out-of-tree module
>>>>> out there are doing the wrong thing. I worked with openvswitch in the past and
>>>>> they kept out-of-tree modules even though the module was already in the tree,
>>>>> and they kept it updated with the changing kernel ABI. What they did is that
>>>>> they had their build look into the kernel headers for the function signatures
>>>>> that were relevant and created a config file stating which signature each
>>>>> relevant function had and then checking for that in their source code.
>>>>> A couple of years ago, we had the same issue with wireguard and we attempted a
>>>>> similar solution. There was a bug where it was not looking into the right
>>>>> headers, but aside from that, it worked just fine. I believe the same solution
>>>>> should be looked after here, and be considered for upstream too.
>>>>> Cascardo.
>>>> Hi Thadeau,
>>>> I went ahead and suggested this change to lttng-modules upstream.
>>>> They think maintaining a configure script will require a lot of effort and
>>>> it's not adding any value
>>>> to their existing way of keeping up with kernel changes. And I agree with
>>>> them.
>>> Their module cannot build on some kernels because they are testing for version
>>> numbers. And here we are introducing another version number to actually mean
>>> (this other function signature is actually the one from this other major
>>> version number here).
>>> And are they the ones who are going to maintain that, or are we going to add
>>> that to our copy of lttng? Then, of course, the burden of maintainership is not
>>> falling into them.
>> We have an in-house versioning system so probably no, the change
>> will be added to upstream, but the testing is actually done by us, so
>> we will be the ones adding the version there.
>>> They don't want to migrate to a model that is so much better to maintain
>>> because the technical debt is too much at this point.
>>>> Therefore I am kindly asking to approve my initial suggestion. I need this
>>>> at least for kinetic only because
>>>> kinetic is gonna reach eol very soon and it would be nice to ship a working
>>>> module before that.
>>> One more reason not to add technical debt on our side just to fix something
>>> that has been broken for this long on a dead-end series. I would go even
>>> further and ask that we revisit what has been done in Jammy.
>> Not sure what you mean here, nothing has been done in Jammy.
>> Jammy was automatically solved by their fix in upstream because in jammy
>> the kernel version(5.15.x) keeps changing.
>> So what do you suggest, leave this broken anyway for kinetic?
> Sorry, you said that you found this previously in focal, my mind recorded
> it as jammy.
> As for kinetic, yes, it should be left as broken. If I understand
> correctly, in order to fix it, you would have to release a new kernel
> exposing this macro in its headers, and then release a new lttng using that
> macro.
> That means that this new lttng will still fail on already released kernels
> that do not expose the macro. Users on kinetic will only benefit if they
> upgrade and reboot their kernels. And all of that must happen 15 days
> before that Ubuntu release stops getting support. So by the time users
> reboot their kernels to be able to use lttng (after how many weeks of not
> being able to use it?), they need to upgrade to Lunar.
> What is the story on Lunar here? Does lttng work there with 6.2 kernels?
> What about Jammy? Does the lttng on Jammy work with both 5.15 and 6.2
> kernels?
Yes, it works fine with lunar and jammy (6.2+ and 5.15).
>>> Honestly, if we really care about lttng, we should work with upstream to
>>> migrate everything to the openvswitch model.
>> I personally think both sides have good reasons and we're kinda passing the
>> ball from one to the other.
> That is not my suggestion here. In order to stop pushing work to one
> another, if we want to see this work done, we must do it ourselves in
> collaboration with upstream. It is one thing to ask upstream to do work, a
> second thing to introduce another bandaid on our kernels, a third thing to
> do the upstream work that we want to get done. Of course, if there is
> opposition that cannot be conciliated, it might not be worth it. But I
> haven't seen evidence that we are at this stage.
My suggestion to them came together with the actual solution.
And I also told them I would be happy to improve it if needed.
I realized I haven't link the conversation, so here it is:
I still need to reply to their last email. I will add some of the 
arguments you suggested.
Maybe they will be more open afterwards.
>> I am happy to try to convince them again, but it's not gonna be easy.
>> And I do believe no matter our differences in thinking, we need to deliver
>> functioning
>> modules to our users and I think this solution is not gonna add too much
>> technical depth
>> and it's a good compromise to deliver a quick fix very soon.
> Quick fixes are not always the best. In your cover letter, you mentioned
> this could be used by other third-party modules. And there lies the
> problem. Once we ship this, and third-party modules start relying on it, we
> cannot remove it anymore. And since I don't think it is the right solution,
> I see the difficult in removing it as a big problem.
> I already mentioned one of the issues with this solution: it doesn't work
> for older kernels that don't have the macro. And if you think of that,
> since you are not going to make lttng start building correctly for the
> older kernels, why not just say: if 5.19 and Ubuntu, just use this new
> thing here. That will make it work with the most recent kernels and might
> only break for the old 5.19 kernels that did not have it, and only for
> users who are going to install lttng anew while running an old kernel when
> the new ones have been available for a while.
> You want to still be able to build for all those kernels we released? This
> solution doesn't help with it.
Not necessarily. I just wanted to have a working module for the latest 
kinetic version.
But I agree with you it won't add too much value if they have to upgrade 
to the latest
kinetic kernel, they might as well upgrade to lunar.
> If we won't do the entire work for upstream right now, and we still want to
> fix this, put such a configure, header and ifdef in our downstream code.
> Use it to demonstrate the power of that construct in comparison with the
> other non-solution. That build the foundation, demonstrates that it works,
> shows our goodwill in trying to get the work done and making it work with
> all of our kernels.
I think I will leave it broken for kinetic and work on convincing them to
approve it.

I must admit when i first submitted it I was not fully convinced
this is a way better solution, but with all these arguments brought by you,
I agree. The script may seem like a bodge but it brings more
advantages to the table such us no maintenance required from our side and
solving the package for all kernels, not just newer versions.

> Cascardo.
>> That's why I am asking to accept this for kinetic.
>> I will upload lttng-module for kinetic only because the change is minimal,
>> see it in the buglink
>> and there are higher changes to get it approved.
>> Roxana
>>> Cascardo.
>>>> Thanks,
>>>> Roxana
>>>>>> The actual fix for lttng-modules requires 2 steps:
>>>>>> 1. Expose UTS_UBUNTU_BASE_RELEASE_ABI variable equal with the parent
>>>>>> abi in <generated/utsrelease.h>. For backports with multiple
>>>>>> inheritance (like jammy:linux-aws-5.19) the version from debian.master
>>>>>> (kinetic:linux) will be used. In case there is no inheritance, the
>>>>>> actual kernel abi will be used. This is needed for jammy, kinetic and
>>>>>> lunar but it may be needed in the future for similar use cases,
>>>>>> so it makes sense to push this to all releases.
>>>>>> This patch will cover the first part.
>>>>>> 2. Use this variable in the lttng-module to construct
>>>>>> LTTNG_UBUNTU_BASE_VERSION_CODE and then use it for versioning:
>>>>>> The lttng-module fix will be done only for lunar, kinetic, and jammy.
>>>>>> [ Where problems would occur ]
>>>>>> This patch alone should not have a big impact on our kernels.
>>>>>> It will only expose a variable in the headers.
>>>>>> In terms of the lttng-modules solution:
>>>>>> We may notice new failures in ubuntu_lttng_smoke_test once this
>>>>>> compiles and run. But it was tested locally and the results were good,
>>>>>> so the probability is very very low.
>>>>>> [ Test Plan ]
>>>>>> Since we hit backward issues in the past, multiple kernel versions
>>>>>> were tested out to make sure this is the proper fix.
>>>>>> General test instructions:
>>>>>> 1. Install kinetic (desired version)
>>>>>> 2. sudo apt install lttng-modules-dkms(or dpkg -i)
>>>>>> 3. Depending on the version it may work or not
>>>>>> 4. Additionally lttng-modules regression test should be triggered to
>>>>>> confirm everything works fine.
>>>>>> My testing:
>>>>>> I build fixed lttng-modules with the fixes described in launchpad.
>>>>>> A. Kinetic kernel where I expose UTS_UBUNTU_BASE_RELEASE_ABI.
>>>>>> The module built and rts passed:
>>>>>> 09:32:05 INFO | PASSED (simple system call tracing with babeltrace)
>>>>>> 09:32:05 INFO | Summary: 7 passed, 0 failed
>>>>>> B. The break commit was introduced in Versions between 42
>>>>>> and 44 (depending on where this fix will land) will break.
>>>>>> C. Tested with a kernel version that does not have the break commit:
>>>>>> 5.19.0-41 to see if it's backwards compatible. Results were good.
>>>>>> D. Tested with jammy-aws-5.19 that has this patch. It works fine.
>>>>>> E. Tested with jammy-aws-5.19.0-1024 that does not have the break
>>>>>> commit to see backwards compatibility. It succeeded.
>>>>>> Roxana Nicolescu (1):
>>>>>>      UBUNTU: [Packaging] Expose base (parent) abi for derivatives
>>>>>>     debian/rules.d/ | 9 +++++++++
>>>>>>     debian/rules.d/ | 4 ++++
>>>>>>     2 files changed, 13 insertions(+)
>>>>>> -- 
>>>>>> 2.34.1
>>>>>> -- 
>>>>>> kernel-team mailing list
>>>>>> kernel-team at

More information about the kernel-team mailing list