Big problem!

don Paolo Benvenuto donpaolo at gsi.it
Wed Jan 31 10:29:49 GMT 2007


On Wed, 31-01-2007 at 09:28 +0000, Gavin McCullagh wrote:

> > For the past two weeks I have had a big problem with my Edubuntu LTSP
> > installation.
> > 
> > Everything was working without problems, but the hard disk failed, and I
> > had to reinstall everything.
> 
> Just to be clear what has changed:
> 
>  - What version of Edubuntu were you running before and what version are
>    you running now after reinstall?  Was your previous version upgraded or
>    did you install that version directly.

I was running edgy, upgraded from dapper, which was upgraded from hoary; now
I am again running edgy, upgraded from dapper installed from the CD.

>  - This is still the same physical server, right?

yes

>  - You replaced a hard disk.  Was this with an identical new hard disk or
>    did something change?  Are you still using the same disk controller?  Is
>    it IDE, SATA or SCSI?  Are you using RAID of any sort?

no, all the same: SATA

> > Normally I keep the server switched on all the time. When I boot the
> > clients in the morning, almost all of them boot; one or two fail and
> > stop with an error while mounting NFS, so that the busybox shell
> > appears.
> >
> > But if I switch the clients off and boot them again, almost none of them
> > boot; they stop with the same NFS mounting problem.
> 
> Did all thin clients always boot before you reinstalled or did you see this
> problem sometimes and now it's got much worse?

no, before the reinstall all the clients booted without problems

> If you boot each client one at a time, do you get fewer failures than if you
> boot many together?  Can you get back to most clients working by rebooting
> the server or restarting the nfs server?
> 	sudo /etc/init.d/nfs-kernel-server restart
> 
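Besides restarting the NFS server, it may help to confirm that its RPC services are actually registered with the portmapper. The sketch below is only an illustration: `nfs_registered` is a hypothetical helper name, and it assumes the usual `rpcinfo -p` column layout, with the service name in the fifth column.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if both "nfs" and "mountd" appear
# in the service column of rpcinfo -p style output read from stdin.
# Assumes the service name is the fifth whitespace-separated field.
nfs_registered() {
    awk '$5 == "nfs"    { nfs = 1 }
         $5 == "mountd" { mountd = 1 }
         END { exit !(nfs && mountd) }'
}

# Usage on the server:
#   rpcinfo -p localhost | nfs_registered && echo "nfs and mountd registered"
```

If the check fails right after a client fails to boot, that points at the server side rather than at the wiring.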
> > I'm quite sure it's not a software problem, and I was wondering if it
> > could be a hardware one. So I ask you: how do I find out what is
> > failing? What should I check?
> 
> On the server, you could 
> 
> 1. check the output of the command dmesg for errors.

The only thing I see is that for eth0 (the NIC facing the clients) there is
a line saying

 eth0: Setting full-duplex based on MII#1 link partner capability of 45e1.

but there is no line like
" eth1: ADMtek Comet rev 17 at Port 0xb800, 00:08:A1:A0:6C:1D, IRQ 74.":

$ dmesg | grep eth
[17179593.904000] eth1: ADMtek Comet rev 17 at Port 0xb800, 00:08:A1:A0:6C:1D, IRQ 74.
[17179593.984000] eth1:  setting full-duplex.
[17179596.932000] eth0: Setting full-duplex based on MII#1 link partner capability of 45e1.
[17179624.224000] eth1: no IPv6 routers present
[17179624.744000] eth0: no IPv6 routers present


> 2. check the logfiles, particularly /var/log/syslog and /var/log/messages
>    for errors

I can't see anything significant

> 3. look at the output of "/sbin/ifconfig" and look at the error count.  It
>    should generally be zero.

there are 2 errors on the interface of the clients
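
To watch whether those error counts grow while clients are booting, the counters can also be read from /proc/net/dev. This is a rough sketch, not a definitive tool: `net_errors` is a hypothetical helper name, and it assumes the standard Linux /proc/net/dev layout (two header lines, then one line per interface with receive errors in the third counter and transmit errors in the eleventh).

```shell
#!/bin/sh
# Hypothetical helper: print "<iface> rx_errs=<n> tx_errs=<n>" for each
# interface, reading /proc/net/dev formatted text from stdin.
net_errors() {
    awk 'NR > 2 {
        sub(/:/, " ")   # some kernels print "eth0:1234" with no space
        print $1, "rx_errs=" $4, "tx_errs=" $12
    }'
}

# Usage on the server:
#   net_errors < /proc/net/dev
```

Running it before and after a batch of clients boots shows whether the errors are a one-off or keep increasing.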

> 4. look at the output of 
> 	sudo /usr/sbin/ethtool eth0
>    to see what speed the NIC has negotiated, etc.

$ sudo /usr/sbin/ethtool eth0
Settings for eth0:
No data available

So is it a problem with the NIC?

By contrast, eth1 reports its settings normally:

$ sudo /usr/sbin/ethtool eth1
Settings for eth1:
        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 100Mb/s
        Duplex: Full
        Port: MII
        PHYAD: 24
        Transceiver: internal
        Auto-negotiation: on
        Current message level: 0x00000001 (1)
        Link detected: yes
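
For comparing interfaces at a glance, the ethtool output can be reduced to the three fields that matter here. A minimal sketch, assuming the field labels shown in the output above ("Speed:", "Duplex:", "Link detected:"); `link_summary` is a hypothetical helper name, and any field the driver does not report is printed as "unknown".

```shell
#!/bin/sh
# Hypothetical helper: summarise ethtool output read from stdin as
# "speed=... duplex=... link=...". Missing fields become "unknown".
link_summary() {
    awk -F': *' '
        BEGIN            { speed = duplex = link = "unknown" }
        /Speed:/         { speed = $2 }
        /Duplex:/        { duplex = $2 }
        /Link detected:/ { link = $2 }
        END { printf "speed=%s duplex=%s link=%s\n", speed, duplex, link }
    '
}

# Usage on the server:
#   sudo /usr/sbin/ethtool eth1 | link_summary
```

An interface that only returns "No data available" would show all three fields as unknown, which by itself does not prove the card is faulty; some drivers simply do not support these queries.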


> > My network wiring: the server's NIC is connected to two 24-port switches
> > (D-Link), and each switch is connected to 20 clients, 40 clients in
> > all.
> >
> > When the problem appears, it affects the clients of both switches.
> 
> Is everything 100Mbit/sec or have you some Gigabit (or 10Mbit/s) equipment?
> I presume one switch plugs into the other which does create a bottleneck
> for the switch further away from the server.  This would not have been new
> after a reinstall though.
> 
> > When a client fails to boot, it is usually not one of the clients nearest
> > to the server.
> 
> You mean the clients nearer the server boot correctly more reliably than
> those further away?  Do those machines have something in common, eg type of
> machine, what switch they're on? 

The nearest clients are on the same switch, but all the clients have the
same hardware.


-- 
don Paolo Benvenuto

http://guaricano.diocesi.genova.it
is the diary written mainly by me, but also by others:
there you can find the life of the mission, day by day

Contribute to Wikipedia, the encyclopedia of which you are the author and
the reviewer: http://it.wikipedia.org



