root filesystem full and having trouble clearing it!!!
Gavin McCullagh
gmccullagh at gmail.com
Fri Mar 16 13:55:51 UTC 2007
Hi,
[ I've found a fix, but I'd like to send this anyway, to ask what the
correct solution is ]
We're running Edgy for a number of thin clients (about 20 just now).
I'm having a nasty problem and I can't spot the cause. The root
filesystem is full, so users can't log in. However, I can't pinpoint where
the data is in order to free it up.
gavinmc@medlycott:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/md0        9.2G  8.8G  416K 100% /
varrun          2.0G  168K  2.0G   1% /var/run
varlock         2.0G     0  2.0G   0% /var/lock
procbususb       10M  108K  9.9M   2% /proc/bus/usb
udev             10M  108K  9.9M   2% /dev
devshm          2.0G     0  2.0G   0% /dev/shm
/dev/md2        115G  5.0G  104G   5% /backups
/dev/md1        9.2G  335M  8.4G   4% /var
brooks:/home    130G  5.5G  118G   5% /home
brooks:/shared  130G  5.5G  118G   5% /shared
ltspfs          125M   16K  125M   1% /tmp/.jmcclean-ltspfs/floppy0
ltspfs           38G  820M   37G   3% /tmp/.jmcclean-ltspfs/atadisk-hda1
Looking at the disk usage (I've snipped out /dev and /proc), I can't see
where all the space has gone.
gavinmc@medlycott:~$ sudo du -hs /*
4.8G /backups
3.5M /bin
36M /boot
0 /cdrom
13M /etc
2.3G /home
4.0K /initrd
0 /initrd.img
0 /initrd.img.old
259M /lib
48K /lost+found
44K /media
4.0K /mnt
350M /opt
905M /proc
180K /root
5.6M /sbin
2.4M /shared
4.0K /srv
0 /sys
du: cannot access `/tmp/.jmcclean-ltspfs/floppy0': Permission denied
du: cannot access `/tmp/.jmcclean-ltspfs/atadisk-hda1': Permission denied
86M /tmp
2.3G /usr
207M /var
0 /vmlinuz
0 /vmlinuz.old
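Note that the df output above lists /backups, /home, /shared and /var as separate mounts, so their sizes in this listing do not count against /. A minimal sketch of a per-filesystem summary follows, assuming GNU du (for -x and --max-depth). The function name usage_on_fs is purely illustrative.

```shell
#!/bin/sh
# Summarise disk usage one level deep WITHOUT crossing mount points.
# Assumes GNU du: -x stays on one filesystem, --max-depth=1 limits the depth.
# Sizes are in kilobytes so that a plain numeric sort works.
usage_on_fs() {
    du -xk --max-depth=1 "$1" 2>/dev/null | sort -n
}

# Example: largest top-level directories on the root filesystem only.
# Run as root (e.g. via sudo) to include directories other users own.
usage_on_fs / | tail -n 10
```

If the total reported this way is still far below the "Used" figure from df, the missing space is probably not held by any visible file.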
/tmp was very full, so I removed a large number of files named /tmp/fileXXXXXXX,
each 32MB in size (I gather they're nbd swap files?). However, removing them
has not freed up any space. I presume this is because some process
(nbd-server) still has them open?
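One way to check that presumption is to look for file descriptors that still point at unlinked files. The sketch below is Linux-specific and assumes /proc is mounted: it reads the /proc/<pid>/fd symlinks, whose targets end in " (deleted)" once the file has been removed. The function name list_deleted_open is made up for illustration.

```shell
#!/bin/sh
# List open file descriptors whose underlying file has been deleted.
# The space used by such files is only released when the last process
# holding them open closes the descriptor or exits.
list_deleted_open() {
    for fd in /proc/[0-9]*/fd/*; do
        # Unreadable or vanished entries make readlink fail; skip them.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *" (deleted)") printf '%s -> %s\n' "$fd" "$target" ;;
        esac
    done
}

# Run as root to see descriptors belonging to other users' processes.
list_deleted_open
```

The PID is the number after /proc/ in each output line. Where lsof is installed, `lsof +L1` should report similar information.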
[My Solution]
There were tonnes of old processes running under the nbdserver user each
of which looked like:
nobody 6776 0.0 0.0 1656 468 ? S Mar15 0:00 /bin/sh /usr/sbin/nbdswapd
nobody 6779 0.0 0.0 3248 740 ? S Mar15 0:00 /bin/nbd-server 0 /tmp/fileHiJv50
Many of them were still there from February. I can't see why that would be, but it
looks like they never stopped when the thin client went down. So, I used
this to kill the February ones:
ps aux |grep nbd |grep Feb | awk '{ print $2}' | xargs sudo kill
I have now freed up 2.6GB space and all is going back to normal for now.
However, it appears this is going to bite us again in a week or two. There
are currently 103 such processes, which is about five times our total number
of thin clients.
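The grep-based pipeline above matches "Feb" anywhere in the line, so it could in principle also catch a process whose swap file name happens to contain "Feb". A slightly stricter sketch follows, assuming the `ps aux` layout shown earlier (START in field 9, command from field 11 onwards). The function name old_nbd_pids is hypothetical.

```shell
#!/bin/sh
# Print PIDs of nbd-server/nbdswapd processes whose START column begins
# with the given month abbreviation. Reads `ps aux`-style lines on stdin.
old_nbd_pids() {
    awk -v month="$1" '
        index($9, month) == 1 {
            # Rebuild the command line from field 11 onwards.
            cmd = ""
            for (i = 11; i <= NF; i++) cmd = cmd " " $i
            if (cmd ~ /nbd-server|nbdswapd/) print $2
        }'
}

# Dry run: only print the PIDs. Pipe the output to `xargs sudo kill`
# once the list looks right.
ps aux | old_nbd_pids Feb
```

Printing the PIDs first makes it possible to review the list before anything is killed.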
Can someone explain why these processes are hanging around and using up so much
disk space? Is it a bug or something we've done wrong?
Can I and should I put the network swap files on a different partition?
Should we just turn off network swap?
Gavin