[Bug 1558967] Re: libfuse2: race in fuse_daemonize() causes ' Transport endpoint is not connected' (found with cmsfs-fuse)
Martin Pitt
martin.pitt at ubuntu.com
Tue Jul 12 07:02:12 UTC 2016
Hello bugproxy, or anyone else affected,
Accepted fuse into xenial-proposed. The package will build now and be
available at https://launchpad.net/ubuntu/+source/fuse/2.9.4-1ubuntu3.1
in a few hours, and then in the -proposed repository.
Please help us by testing this new package. See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to
enable and use -proposed. Your feedback will aid us in getting this update
out to other Ubuntu users.
If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, and change the tag
from verification-needed to verification-done. If it does not fix the
bug for you, please add a comment stating that, and change the tag to
verification-failed. In either case, details of your testing will help
us make a better decision.
Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in
advance!
** Changed in: fuse (Ubuntu Xenial)
Status: Triaged => Fix Committed
** Tags added: verification-needed
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to fuse in Ubuntu.
https://bugs.launchpad.net/bugs/1558967
Title:
libfuse2: race in fuse_daemonize() causes ' Transport endpoint is not
connected' (found with cmsfs-fuse)
Status in Ubuntu on IBM z Systems:
Triaged
Status in fuse package in Ubuntu:
Fix Released
Status in fuse source package in Xenial:
Fix Committed
Bug description:
== Comment: #21 - Hendrik Brueckner - 2016-03-16 06:44:09 ==
Package: libfuse2
Version: 2.9.4-1ubuntu2
The cmsfs-fuse program is used to transfer files from a CMSFS dasd (on
z/VM) to Linux. The procedure is to mount, copy files, umount. All
commands are issued from within an application over an SSH connection.
The problem is that the copy intermittently fails with "Transport
endpoint is not connected". The procedure is as follows:
# mount cmsfs
sudo /usr/bin/cmsfs-fuse /dev/dasdb /usr/wave/wavedisk
# copy file
/bin/cp -f /usr/wave/wavedisk/WAVEDATA.SCRIPT /usr/wave/wavedata
/bin/cp: cannot stat '/usr/wave/wavedisk/WAVEDATA.SCRIPT': Transport endpoint is not connected
# umount
umount /usr/wave/wavedisk
Because the application uses JSCH to issue the commands, I worked on a
non-Java reproducer using SSH.
The problem can be easily re-created with ssh as follows:
root at r3559004:~# ssh -t root at localhost "cmsfs-fuse /dev/disk/by-path/ccw-0.0.0190 /CMSFS"
Connection to localhost closed.
root at r3559004:~# ls /CMSFS
ls: cannot access '/CMSFS': Transport endpoint is not connected
Problem analysis will follow, but note that this is not specific to cmsfs-fuse; the problem might also occur with other fuse file systems that are mounted through an SSH connection.
== Comment: #23 - Hendrik Brueckner - 2016-03-16 07:07:30 ==
After debugging and some code review on the libfuse library, I think that
we identified the root cause. As suggested, the problem is not related
to cmsfs-fuse directly.
The cmsfs-fuse main program calls into the libfuse library using the
fuse_main() function. The fuse_main() function later calls
fuse_daemonize() to fork the daemon process that handles the fuse file
system I/O.
The fuse_daemonize() function looks as follows:
int fuse_daemonize(int foreground)
{
    if (!foreground) {
        int nullfd;

        /*
         * demonize current process by forking it and killing the
         * parent. This makes current process as a child of 'init'.
         */
        switch(fork()) {
        case -1:
            perror("fuse_daemonize: fork");
            return -1;
        case 0:
            break;
        default:
            _exit(0);
        }

        if (setsid() == -1) {
            perror("fuse_daemonize: setsid");
            return -1;
        }

        (void) chdir("/");

        nullfd = open("/dev/null", O_RDWR, 0);
        if (nullfd != -1) {
            (void) dup2(nullfd, 0);
            (void) dup2(nullfd, 1);
            (void) dup2(nullfd, 2);
            if (nullfd > 2)
                close(nullfd);
        }
    }
    return 0;
}
The fuse_daemonize() function calls fork() as usual. The child proceeds with setsid() and then redirects its file descriptors to /dev/null. The parent process simply exits.
The child's remaining work and the parent's exit create a subtle race.
This is seen with an SSH connection. The SSH command
ssh -t root at localhost "cmsfs-fuse /dev/disk/by-path/ccw-0.0.0190 /CMSFS"
runs cmsfs-fuse on an allocated pseudo-terminal device (-t option).
When the parent exits, the SSH command is notified that its command has
been executed and closes the connection, that is, it closes the
master side of the pseudo-terminal. This causes a SIGHUP signal to be
sent to the process group on the pseudo-terminal. The child might not
have completed the setsid() call yet and is therefore terminated. Note
that fuse sets up its signal handlers only later, after fuse_daemonize()
has completed.
Even if the child has the chance to disassociate from its parent's
process group and form its own process group with setsid(), the
child still has the pseudo-terminal open as stdin, stdout, and
stderr. So the pseudo-terminal still behaves as a controlling terminal
and might cause a SIGHUP to be issued when the master side is closed.
To solve the problem, the parent has to wait until the child (the fuse
daemon process) has completed its processing, that is, until it has
formed its own process group with setsid() and closed any file
descriptors pointing to the pseudo-terminal.
For example, using a pipe as follows could solve the problem:
The parent waits on the pipe, then exits:
read(waiter[0], &completed, sizeof(completed));
_exit(0);
The child signals its completion (after redirecting its file descriptors) with:
completed = 1;
write(waiter[1], &completed, sizeof(completed));
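Putting the two fragments above together, a complete function might look roughly like the following. This is only a sketch of the idea, not the patch that went into the package: the name daemonize_with_pipe() is made up for illustration, and the error handling (the parent exiting with status 1 if the child goes away before signalling, and the parent closing its copy of the write end so that read() sees EOF in that case) is an assumption added here.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical variant of fuse_daemonize() (not the libfuse function):
 * the parent blocks on a pipe until the child has called setsid() and
 * redirected stdin/stdout/stderr away from the terminal.
 */
int daemonize_with_pipe(int foreground)
{
    if (!foreground) {
        int nullfd;
        int waiter[2];
        char completed = 0;

        if (pipe(waiter) == -1) {
            perror("daemonize_with_pipe: pipe");
            return -1;
        }

        switch (fork()) {
        case -1:
            perror("daemonize_with_pipe: fork");
            close(waiter[0]);
            close(waiter[1]);
            return -1;
        case 0:
            break;  /* child continues below */
        default:
            /*
             * Parent: drop the write end so that read() returns EOF
             * if the child dies before signalling, then wait.
             */
            close(waiter[1]);
            if (read(waiter[0], &completed, sizeof(completed))
                != (ssize_t) sizeof(completed))
                _exit(1);
            _exit(0);
        }

        /* Child: only the write end is needed from here on. */
        close(waiter[0]);

        if (setsid() == -1) {
            perror("daemonize_with_pipe: setsid");
            return -1;
        }

        if (chdir("/") == -1)
            perror("daemonize_with_pipe: chdir");

        nullfd = open("/dev/null", O_RDWR, 0);
        if (nullfd != -1) {
            (void) dup2(nullfd, 0);
            (void) dup2(nullfd, 1);
            (void) dup2(nullfd, 2);
            if (nullfd > 2)
                close(nullfd);
        }

        /* Tell the parent that the terminal is no longer in use. */
        completed = 1;
        if (write(waiter[1], &completed, sizeof(completed))
            != (ssize_t) sizeof(completed)) {
            close(waiter[1]);
            return -1;
        }
        close(waiter[1]);
    }
    return 0;
}
```

A caller would use it in place of fuse_daemonize(): with foreground set to 0 the function returns only in the child, and the original process does not exit until the child has written to the pipe. With foreground set to 1 it returns 0 without forking.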
== Comment: #24 - Gerald Schaefer - 2016-03-16 08:18:20 ==
The race can also be triggered w/o ssh, by using "setsid -c", and I can also reproduce it w/o cmsfs-fuse but with sshfs:
root at s3545003:~# setsid -c sshfs geraldsc at tuxmaker: sshfs/
root at s3545003:~# ls sshfs
ls: cannot access 'sshfs': Transport endpoint is not connected
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/1558967/+subscriptions