[Bug 537133] Re: mountall issues with NFS root filesystem
Ryan Tandy
537133 at bugs.launchpad.net
Tue Sep 10 23:56:53 UTC 2013
I'm joining this discussion late (I'm not really affected as my diskless
clients are all ro+aufs), so please let me know what I'm doing wrong in
my testing and which other information I can provide.
The improvement in quantal and raring compared to precise is limited. It
doesn't hang any more, but it still doesn't successfully remount /.
On all three releases, with 'rw' in the kernel command line, mountall
spawns 'mount /', it returns immediately, and the system boots
successfully.
On precise, with 'ro' in the kernel command line, I don't know for sure
what happens as mountall hangs and never offers to drop to a shell. (Is
there some way to get a rescue shell?)
On quantal and raring, with 'ro' in the kernel command line, mountall
never spawns 'mount /' (although it does send the mounting event; I
suppose that's the part where it's waiting for statd?) and /tmp waits
forever for /. I do, however, get the offer to skip mounting or drop to
a shell, so I'm able to recover the output from mountall. If I drop to a
shell and run mountall, booting finishes successfully.
I don't see any difference between 'local-filesystems' and
'virtual-filesystems' in /etc/init/statd.conf, on any release.
I also don't see any difference from adding 'nolock' to the options for
/ in fstab, which I thought was supposed to make it work. Is there an
extra step I'm missing? The option is definitely visible in mountall's
info about /.
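For anyone wanting to double-check the same thing on their own image,
here is a minimal sketch of confirming that the / entry in fstab really
carries 'nolock'. The helper name and the sample fstab line are
assumptions made up for illustration (only the export path is borrowed
from the log in the bug description); it is not part of mountall.

```python
def root_mount_options(fstab_text):
    """Return the option list of the '/' entry in fstab text, or None if absent."""
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # fstab fields: device, mountpoint, type, options, dump, pass
        if len(fields) >= 4 and fields[1] == "/":
            return fields[3].split(",")
    return None


# Hypothetical fstab content, for illustration only.
sample_fstab = (
    "# <device> <mountpoint> <type> <options> <dump> <pass>\n"
    "16.1.1.2:/export/romano / nfs defaults,nolock 0 0\n"
)

options = root_mount_options(sample_fstab)
print("nolock" in options)
```

On a real system one would pass the contents of /etc/fstab instead of
the sample string.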
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to mountall in Ubuntu.
https://bugs.launchpad.net/bugs/537133
Title:
  mountall issues with NFS root filesystem
Status in “mountall” package in Ubuntu:
  Incomplete
Status in “portmap” package in Ubuntu:
  Fix Released
Status in “mountall” source package in Lucid:
  Confirmed
Status in “portmap” source package in Lucid:
  Fix Released
Status in “mountall” source package in Maverick:
  Won't Fix
Status in “portmap” source package in Maverick:
  Fix Released
Status in “mountall” source package in Precise:
  Incomplete
Status in “portmap” source package in Precise:
  Fix Released
Status in “mountall” source package in Quantal:
  Incomplete
Status in “portmap” source package in Quantal:
  Fix Released
Status in “mountall” source package in Raring:
  Incomplete
Status in “portmap” source package in Raring:
  Fix Released
Bug description:
Binary package hint: mountall
I think I've found two bugs in mountall-2.7 related to nfsroots. This
report describes both, since working around one exposes the other.
The first bug is a dependency issue, circular and otherwise. Unless
the nfsroot is mounted with "nolock", NFS locking depends on
rpc.statd, which depends on portmap, which depends on the
"local-filesystems" event (in /etc/init/portmap.conf). mountall will never
provide this event because it treats the rootfs as "local" even if
it's networked, for the sake of daemons that need to wait for the
rootfs to be remounted rw.
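The circular part of this dependency chain can be sketched as a small
graph search. This is only an illustration of the reasoning above, not
how Upstart or mountall actually resolve events; the node names are
taken from this report, and the dictionary layout and the find_cycle
helper are assumptions.

```python
def find_cycle(deps, start):
    """Depth-first search: return the first dependency cycle reachable
    from start as a list of nodes, or None if there is none."""
    path = []        # nodes on the current search path, in order
    on_path = set()  # same nodes, for fast membership tests
    done = set()     # nodes fully explored without finding a cycle

    def visit(node):
        if node in on_path:
            # Found a back edge: report the loop, closed on itself.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        on_path.add(node)
        path.append(node)
        for dep in deps.get(node, ()):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        done.add(node)
        return None

    return visit(start)


# "X: [Y]" means X cannot complete until Y is available.
deps = {
    "remount /": ["rpc.statd"],            # NFS locking needs statd
    "rpc.statd": ["portmap"],
    "portmap": ["local-filesystems"],      # per /etc/init/portmap.conf
    "local-filesystems": ["remount /"],    # rootfs is treated as local
}

# With "nolock" the remount no longer needs rpc.statd.
deps_nolock = dict(deps)
deps_nolock["remount /"] = []

cycle = find_cycle(deps, "remount /")
print(" -> ".join(cycle))
print(find_cycle(deps_nolock, "remount /"))
```

The first search reports the loop through rpc.statd and portmap back to
the root remount; the second finds none, which matches the expectation
that "nolock" sidesteps this first bug.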
The problem is that portmap.conf needs access to /etc (ro), /var/run
(rw) and /lib/init/rw (rw). A dependency on "local-filesystems"
essentially means / and /tmp. Changing the dependency to
"virtual-filesystems" would be more correct, but I'm not entirely
certain that remounting / should depend on any general *-filesystems
events.
It gets messier in statd.conf, which doesn't call out any filesystem
dependencies, yet requires portmap to be running. It tries to directly
"start portmap" which fails because the mentioned filesystems aren't
writable yet. Portmap and statd will start successfully later, but not
in time to satisfy the rootfs dependency.
The second bug is when one tries to work around the above problems by
specifying "nolock" in /etc/fstab for the nfsroot. In this case we
land in mountall.c at the bottom of run_mount() where the is_remote()
test causes spawn() to be called with wait=FALSE. spawn() then calls
nih_child_add_watch() which is supposed to eventually call back to
spawn_child_handler(), but it appears to fail to connect:
    spawn: mount -n -a -t nfs -o remount,nolock 16.1.1.2:/export/romano /
    spawn: mount / [272]
    spawn: calling nih_child_add_watch for /
    init: job_process_handler: Ignored event 1 (0) for process 272
The third line is debugging I added. If spawn_child_handler() had been
called, we would have seen an additional line:
    mount / [272] exited normally
I didn't dig into libnih to figure out why this isn't working. Rather
I changed the test on which wait=FALSE depends, since it seems like
mountall should be waiting for the rootfs. This works, see attached
patch, though it only fixes the non-ideal "nolock" case.
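The attached patch is not reproduced in this message, so the following
is only a guess at the shape of the change, written as a sketch rather
than as the real C code. Both function names are hypothetical; the
"before" rule is inferred from the description of is_remote() above,
and the "after" rule is one possible reading of "mountall should be
waiting for the rootfs".

```python
def should_wait_before(mountpoint, remote):
    """Behaviour as described in the report: remote mounts are spawned
    with wait=FALSE, everything else is waited for."""
    return not remote


def should_wait_after(mountpoint, remote):
    """Hypothetical workaround: always wait for the root filesystem,
    even when it is remote; other remote mounts stay asynchronous."""
    return mountpoint == "/" or not remote


# (mountpoint, is_remote) pairs: an NFS root, another NFS mount, a local mount.
for mountpoint, remote in [("/", True), ("/home", True), ("/tmp", False)]:
    print(mountpoint, should_wait_before(mountpoint, remote),
          should_wait_after(mountpoint, remote))
```

Only the NFS root case changes between the two rules, which is why a
change of this kind would leave other network mounts unaffected.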
ProblemType: Bug
Architecture: amd64
Date: Thu Mar 11 01:18:06 2010
DistroRelease: Ubuntu 10.04
Package: mountall 2.7
ProcEnviron:
  LC_COLLATE=C
  PATH=(custom, user)
  LANG=en_US.utf8
  SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.32-15.22-generic
SourcePackage: mountall
Uname: Linux 2.6.32-15-generic x86_64
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/mountall/+bug/537133/+subscriptions