[Bug 2065856] Re: SEGFAULTs in v4.3 with GlusterFS
Christian Ehrhardt
2065856 at bugs.launchpad.net
Fri Jun 21 08:15:33 UTC 2024
Patch review
First of all it is not as complex as it seems, it does the very same multiple times.
But I still found some things I need to ask.
Overall, once you had a look at those 4 points, addressed or explained
them the next sponsor coming by (or myself if I do not miss the ping due
to your reply) could sponsor that to Oracular and for the Noble SRU.
#1
I saw the context changed around
PTHREAD_RWLOCK_destroy(&my_fd->fdlock);
which formerly was in many *_free_state functions in the version that is in noble.
That is part of the removing of all custom free functions used as function pointer in the ops struct.
What is left is that instead of through the function pointer like
op_ctx->fsal_export->exp_ops.free_state(
it now calls directly
free_state(pfid->state);
And in these calls the right state is selected that needs to be processed.
The new free_state now checks for bad state.
Then only if a custom state free function is registered calls that or to gsh_free as final fallback.
Those are all set to NULL as seen in `grep -Hrn init_state src/* -A1`.
So gsh_free it is which is in src/include/abstract_mem.h
But that is just:
254 static inline void
255 gsh_free(void *p)
256 {
257 »···free(p);
258 }
Maybe I just do not see it yet, but could you explain why the proposed changes would not break the locking which seems to be no more done?
I'm sure the SRU team would want to know as well before accepting.
#2
Further there is a comment lost in the struct of *_state_fd
Upstream added:
/** state MUST be first to use default free_state */
The comment is not functional and therefore not a huge problem.
But I also see no reason why it is not added, it would only apply other
backports harder right?
If there is a reason why that isn't added let explain it.
And if there is none the next revision could have that change to be more similar to the upstream patch right?
#3 the aforementioned versioning and changelog
I think this should be
3.1 oracular
nfs-ganesha (4.3-8ubuntu2) oracular; urgency=medium
* Fix null pointer ref in free_state (LP: #2065856)
- d/p/20-fix-null-ptr-ref-in-free-state.diff
3.2 noble
nfs-ganesha (4.3-8ubuntu1.1) noble; urgency=medium
* Fix null pointer ref in free_state (LP: #2065856)
- d/p/20-fix-null-ptr-ref-in-free-state.diff
#4 dep-3 headers
As mentioned I'm happy you have them.
But since I came to the point to ask you to refresh your proposal anyway.
On one hand since it is adapted we'd change origin->backport
And on the other I see no reason why we should not use exactly the text from the upstream patch.
Was there a reason to reformat it?
I'd suggest starting it with
Description: free_state may not always be able to be called with a valid export
Instead of trying to use the export attached to a state to call
the FSAL free_state, use a function pointer which if non-NULL
is used, otherwise we just gsh_free the state (which works for all
in-tree FSALs).
Origin: backport, https://github.com/nfs-ganesha/nfs-ganesha/commit/336abcba0b9e0dae0aadb4657c311d04862f2028
Bug: https://github.com/nfs-ganesha/nfs-ganesha/issues/904
Bug-Ubuntu: https://launchpad.net/bugs/2065856
Author: Frank S. Filz <ffilzlnx at mindspring.com>
Reviewed-By: Frank S. Filz <ffilzlnx at mindspring.com>
Last-Update: 2024-05-16
---
** Bug watch added: github.com/nfs-ganesha/nfs-ganesha/issues #904
https://github.com/nfs-ganesha/nfs-ganesha/issues/904
** Changed in: nfs-ganesha (Ubuntu Mantic)
Status: New => Won't Fix
** Changed in: nfs-ganesha (Ubuntu Noble)
Status: New => Confirmed
** Changed in: nfs-ganesha (Ubuntu Oracular)
Status: In Progress => Confirmed
** Changed in: nfs-ganesha (Ubuntu Noble)
Status: Confirmed => Incomplete
** Changed in: nfs-ganesha (Ubuntu Oracular)
Status: Confirmed => Incomplete
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to nfs-ganesha in Ubuntu.
https://bugs.launchpad.net/bugs/2065856
Title:
SEGFAULTs in v4.3 with GlusterFS
Status in nfs-ganesha package in Ubuntu:
Incomplete
Status in nfs-ganesha source package in Mantic:
Won't Fix
Status in nfs-ganesha source package in Noble:
Incomplete
Status in nfs-ganesha source package in Oracular:
Incomplete
Bug description:
[ Impact ]
* Users experience SEGFAULTs when Ganesha fails to communicate with
Gluster, causing the NFS share to disappear.
* The patch refactors how free_space() is called to avoid the null
pointer deference.
[ Test Plan ]
* I experienced crashes about once per day on volumes with heavy
large file random access. I've been running the proposed patch since
2024-05-16 and haven't experienced a single crash.
* The test plan is to run the packages from proposed for at least 5
days, under the same load as when the bug happened, and confirm that
the crashes reported in this bug no longer happen.
[ Other Info ]
* Commit which added this patch: https://github.com/nfs-ganesha/nfs-ganesha/commit/336abcba0b9e0dae0aadb4657c311d04862f2028
* Issue discussion: https://github.com/nfs-ganesha/nfs-ganesha/issues/904
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nfs-ganesha/+bug/2065856/+subscriptions
More information about the Ubuntu-openstack-bugs
mailing list