[Bug 129723] sata_nv port resets on 2.6.20-16.29

Sean Dague sean at dague.net
Wed Aug 1 15:20:24 UTC 2007


Public bug reported:

Binary package hint: linux-image-generic

Last night, while pushing some data via NFS, and some data via FTP over
GigE to my data server (running fiesty, all upgrades).  Something causes
a port reset on sata_nv controller.  This offlined 2 of 6 Seagate 320 GB
drives that make up my RAID5 software array, putting the array into a
broken state.  I force assembled the array to get 5 drives online, which
passes e2fsck, and am now in a 2 day rebuild cycle for the 6th drive
after reading it.

Attached are the logs of the event.  I had previously been running a
custom 2.6.18.3 kernel, under all kinds of load, with no issues.

Jul 31 20:02:01 gallifrey -- MARK --
Jul 31 20:10:41 gallifrey kernel: [179384.992000]          res 40/00:00:6f:b0:9d/00:00:00:00:00/00 Emask 0x4 (
timeout)
Jul 31 20:10:41 gallifrey kernel: [179384.992000]          res 40/00:00:ff:b3:9d/00:00:00:00:00/00 Emask 0x4 (
timeout)
Jul 31 20:10:42 gallifrey kernel: [179385.304000] ata1: soft resetting port
Jul 31 20:10:42 gallifrey kernel: [179385.304000] ata2: soft resetting port
Jul 31 20:10:42 gallifrey kernel: [179385.460000] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Jul 31 20:10:42 gallifrey kernel: [179385.460000] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Jul 31 20:11:12 gallifrey kernel: [179415.468000] ata1.00: qc timeout (cmd 0x27)
Jul 31 20:11:12 gallifrey kernel: [179415.468000] ata1.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors 
= 0
Jul 31 20:11:12 gallifrey kernel: [179415.468000] ata1: failed to recover some devices, retrying in 5 secs
Jul 31 20:11:12 gallifrey kernel: [179415.468000] ata2.00: qc timeout (cmd 0x27)
Jul 31 20:11:12 gallifrey kernel: [179415.468000] ata2.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors 
= 0
Jul 31 20:11:12 gallifrey kernel: [179415.468000] ata2: failed to recover some devices, retrying in 5 secs
Jul 31 20:11:17 gallifrey kernel: [179420.472000] ata1: hard resetting port
Jul 31 20:11:17 gallifrey kernel: [179420.472000] ata2: hard resetting port
Jul 31 20:11:17 gallifrey kernel: [179421.104000] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Jul 31 20:11:17 gallifrey kernel: [179421.104000] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Jul 31 20:11:47 gallifrey kernel: [179451.112000] ata1.00: qc timeout (cmd 0x27)
Jul 31 20:11:47 gallifrey kernel: [179451.112000] ata1.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors 
= 0
Jul 31 20:11:47 gallifrey kernel: [179451.112000] ata1.00: limiting speed to UDMA/133:PIO3
Jul 31 20:11:47 gallifrey kernel: [179451.112000] ata1: failed to recover some devices, retrying in 5 secs
Jul 31 20:11:47 gallifrey kernel: [179451.112000] ata2.00: qc timeout (cmd 0x27)
Jul 31 20:11:47 gallifrey kernel: [179451.112000] ata2.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors 
= 0
Jul 31 20:11:47 gallifrey kernel: [179451.112000] ata2.00: limiting speed to UDMA/133:PIO3
Jul 31 20:11:47 gallifrey kernel: [179451.112000] ata2: failed to recover some devices, retrying in 5 secs
Jul 31 20:11:52 gallifrey kernel: [179456.116000] ata1: hard resetting port
Jul 31 20:11:52 gallifrey kernel: [179456.116000] ata2: hard resetting port
Jul 31 20:11:53 gallifrey kernel: [179456.748000] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Jul 31 20:11:53 gallifrey kernel: [179456.748000] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Jul 31 20:12:14 gallifrey gconfd (sdague-16640): starting (version 2.18.0.1), pid 16640 user 'sdague'
Jul 31 20:12:14 gallifrey gconfd (sdague-16640): Resolved address "xml:readonly:/etc/gconf/gconf.xml.mandatory
" to a read-only configuration source at position 0
Jul 31 20:12:14 gallifrey gconfd (sdague-16640): Resolved address "xml:readwrite:/home/sdague/.gconf" to a wri
table configuration source at position 1
Jul 31 20:12:14 gallifrey gconfd (sdague-16640): Resolved address "xml:readonly:/etc/gconf/gconf.xml.defaults"
 to a read-only configuration source at position 2
Jul 31 20:12:14 gallifrey gconfd (sdague-16640): Resolved address "xml:readonly:/var/lib/gconf/debian.defaults
" to a read-only configuration source at position 3
Jul 31 20:12:14 gallifrey gconfd (sdague-16640): Resolved address "xml:readonly:/var/lib/gconf/defaults" to a 
read-only configuration source at position 4
Jul 31 20:12:20 gallifrey gconfd (sdague-16640): Resolved address "xml:readwrite:/home/sdague/.gconf" to a wri
table configuration source at position 0
Jul 31 20:12:23 gallifrey kernel: [179486.756000] ata1.00: qc timeout (cmd 0x27)
Jul 31 20:12:23 gallifrey kernel: [179486.756000] ata1.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors 
= 0
Jul 31 20:12:23 gallifrey kernel: [179486.756000] ata1.00: disabled
Jul 31 20:12:23 gallifrey kernel: [179486.756000] ata2.00: qc timeout (cmd 0x27)
Jul 31 20:12:23 gallifrey kernel: [179486.756000] ata2.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors 
= 0
Jul 31 20:12:23 gallifrey kernel: [179486.756000] ata2.00: disabled
Jul 31 20:12:24 gallifrey kernel: [179487.260000] ata1: EH complete
Jul 31 20:12:24 gallifrey kernel: [179487.260000] ata2: EH complete
Jul 31 20:12:24 gallifrey kernel: [179487.260000] sd 0:0:0:0: SCSI error: return code = 0x00040000
Jul 31 20:12:24 gallifrey kernel: [179487.260000] end_request: I/O error, dev sda, sector 249584368
Jul 31 20:12:24 gallifrey kernel: [179487.260000] sd 1:0:0:0: SCSI error: return code = 0x00040000
Jul 31 20:12:24 gallifrey kernel: [179487.260000] end_request: I/O error, dev sdb, sector 249583712
Jul 31 20:12:24 gallifrey kernel: [179487.260000] sd 1:0:0:0: SCSI error: return code = 0x00040000
Jul 31 20:12:24 gallifrey kernel: [179487.260000] end_request: I/O error, dev sdb, sector 249584280
Jul 31 20:12:24 gallifrey kernel: [179487.264000] sd 0:0:0:0: SCSI error: return code = 0x00040000
Jul 31 20:12:24 gallifrey kernel: [179487.264000] end_request: I/O error, dev sda, sector 249584632
Jul 31 20:12:24 gallifrey kernel: [179487.264000] sd 1:0:0:0: SCSI error: return code = 0x00040000
Jul 31 20:12:24 gallifrey kernel: [179487.264000] end_request: I/O error, dev sdb, sector 249583104
Jul 31 20:12:24 gallifrey kernel: [179487.264000] sd 0:0:0:0: SCSI error: return code = 0x00040000
Jul 31 20:12:24 gallifrey kernel: [179487.264000] end_request: I/O error, dev sda, sector 249584640
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584640 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584648 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584656 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584664 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584672 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584680 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584688 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584696 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584704 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584712 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584720 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584728 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584736 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584744 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584752 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584760 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584768 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584776 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584784 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584792 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584800 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584808 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584816 on s
da).
Jul 31 20:12:24 gallifrey kernel: [179487.264000] raid5:md0: read error not correctable (sector 249584824 on s
da).
...... (a lot of those)
Jul 31 20:12:24 gallifrey kernel: [179487.264000] sd 1:0:0:0: SCSI error: return code = 0x00040000
Jul 31 20:12:24 gallifrey kernel: [179487.264000] end_request: I/O error, dev sdb, sector 249583704
Jul 31 20:12:24 gallifrey kernel: [179487.264000] sd 0:0:0:0: SCSI error: return code = 0x00040000
Jul 31 20:12:24 gallifrey kernel: [179487.264000] end_request: I/O error, dev sda, sector 249583104
Jul 31 20:12:24 gallifrey kernel: [179487.264000] sd 1:0:0:0: SCSI error: return code = 0x00040000
Jul 31 20:12:24 gallifrey kernel: [179487.264000] end_request: I/O error, dev sdb, sector 249584504
Jul 31 20:12:24 gallifrey kernel: [179487.264000] sd 0:0:0:0: SCSI error: return code = 0x00040000
Jul 31 20:12:24 gallifrey kernel: [179487.264000] end_request: I/O error, dev sda, sector 249583704
Jul 31 20:12:24 gallifrey kernel: [179487.264000] sda : READ CAPACITY failed.
Jul 31 20:12:24 gallifrey kernel: [179487.264000] sda : status=0, message=00, host=4, driver=00 
Jul 31 20:12:24 gallifrey kernel: [179487.264000] sda : sense not available. 
Jul 31 20:12:24 gallifrey kernel: [179487.264000] sda: Write Protect is off
Jul 31 20:12:24 gallifrey kernel: [179487.300000] RAID5 conf printout:
Jul 31 20:12:24 gallifrey kernel: [179487.300000]  --- rd:6 wd:4
Jul 31 20:12:24 gallifrey kernel: [179487.300000]  disk 0, o:0, dev:sda
Jul 31 20:12:24 gallifrey kernel: [179487.300000]  disk 1, o:0, dev:sdb
Jul 31 20:12:24 gallifrey kernel: [179487.300000]  disk 2, o:1, dev:sdc
Jul 31 20:12:24 gallifrey kernel: [179487.300000]  disk 3, o:1, dev:sdd
Jul 31 20:12:24 gallifrey kernel: [179487.300000]  disk 4, o:1, dev:sde
Jul 31 20:12:24 gallifrey kernel: [179487.300000]  disk 5, o:1, dev:sdf
Jul 31 20:12:24 gallifrey kernel: [179487.308000] RAID5 conf printout:
Jul 31 20:12:24 gallifrey kernel: [179487.308000]  --- rd:6 wd:4
Jul 31 20:12:24 gallifrey kernel: [179487.308000]  disk 1, o:0, dev:sdb
Jul 31 20:12:24 gallifrey kernel: [179487.308000]  disk 2, o:1, dev:sdc
Jul 31 20:12:24 gallifrey kernel: [179487.308000]  disk 3, o:1, dev:sdd
Jul 31 20:12:24 gallifrey kernel: [179487.308000]  disk 4, o:1, dev:sde
Jul 31 20:12:24 gallifrey kernel: [179487.308000]  disk 5, o:1, dev:sdf
Jul 31 20:12:24 gallifrey kernel: [179487.308000] RAID5 conf printout:
Jul 31 20:12:24 gallifrey kernel: [179487.308000]  --- rd:6 wd:4
Jul 31 20:12:24 gallifrey kernel: [179487.308000]  disk 1, o:0, dev:sdb
Jul 31 20:12:24 gallifrey kernel: [179487.308000]  disk 2, o:1, dev:sdc
Jul 31 20:12:24 gallifrey kernel: [179487.308000]  disk 3, o:1, dev:sdd
Jul 31 20:12:24 gallifrey kernel: [179487.308000]  disk 4, o:1, dev:sde
Jul 31 20:12:24 gallifrey kernel: [179487.308000]  disk 5, o:1, dev:sdf
Jul 31 20:12:24 gallifrey kernel: [179487.324000] RAID5 conf printout:
Jul 31 20:12:24 gallifrey kernel: [179487.324000]  --- rd:6 wd:4
Jul 31 20:12:24 gallifrey kernel: [179487.324000]  disk 2, o:1, dev:sdc
Jul 31 20:12:24 gallifrey kernel: [179487.324000]  disk 3, o:1, dev:sdd
Jul 31 20:12:24 gallifrey kernel: [179487.324000]  disk 4, o:1, dev:sde
Jul 31 20:12:24 gallifrey kernel: [179487.324000]  disk 5, o:1, dev:sdf
Jul 31 20:12:24 gallifrey kernel: [179487.324000] lost page write due to I/O error on md0
Jul 31 20:12:24 gallifrey last message repeated 9 times
Jul 31 20:12:55 gallifrey kernel: [179518.424000] printk: 5191 messages suppressed.
Jul 31 20:12:55 gallifrey kernel: [179518.424000] lost page write due to I/O error on md0
Jul 31 20:12:55 gallifrey last message repeated 5 times
Jul 31 20:13:00 gallifrey kernel: [179523.408000] printk: 14 messages suppressed.
Jul 31 20:13:00 gallifrey kernel: [179523.408000] lost page write due to I/O error on md0

** Affects: linux-meta (Ubuntu)
     Importance: Undecided
         Status: New

-- 
sata_nv port resets on 2.6.20-16.29
https://bugs.launchpad.net/bugs/129723
You received this bug notification because you are a member of Kernel
Bugs, which is a bug contact for linux-meta in ubuntu.




More information about the kernel-bugs mailing list