Bug #105

Disk problem on h6

Added by Laurent GUERBY more than 12 years ago. Updated more than 12 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: -
Start date: 2011-11-01
Due date:
% Done: 100%
Estimated time:

Description

Since 15 October, /dev/sdb on h6 has been having problems.

History

#1 Updated by Laurent GUERBY more than 12 years ago

[11640694.395199] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[11640694.395221] ata6.00: failed command: WRITE DMA EXT
[11640694.395238] ata6.00: cmd 35/00:10:b7:27:63/00:00:44:00:00/e0 tag 0 dma 8192 out
[11640694.395240] res 40/00:00:06:42:be/00:00:1f:00:00/e0 Emask 0x4 (timeout)
[11640694.395275] ata6.00: status: { DRDY }
[11640694.395285] ata6: hard resetting link
[11640695.327577] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[11640695.465658] ata6.00: configured for UDMA/33
[11640695.465670] ata6: EH complete

root@h6:~# hdparm -i /dev/sdb

/dev/sdb:

Model=ST2000DL003-9VT166, FwRev=CC32, SerialNo=5YD36RHC
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
BuffType=unknown, BuffSize=unknown, MaxMultSect=16, MultSect=16
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=3907029168
IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 *udma2 udma3 udma4 udma5 udma6
AdvancedPM=no WriteCache=enabled
Drive conforms to: unknown: ATA/ATAPI-4,5,6,7
 * signifies the current active mode

root@h6:~# hdparm -t /dev/sdb

/dev/sdb:
Timing buffered disk reads: 2 MB in 6.29 seconds = 325.69 kB/sec

root@h6:~# smartctl -l selftest /dev/sdb
smartctl 5.40 2010-07-12 r3124 [x86_64-unknown-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                         Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Self-test routine in progress  90%        3627             -
# 2  Short offline       Interrupted (host reset)       90%        3624             -
# 3  Short offline       Interrupted (host reset)       00%        3623             -
# 4  Short offline       Interrupted (host reset)       00%        3623             -
# 5  Short offline       Interrupted (host reset)       00%        3623             -
# 6  Short offline       Aborted by host                90%        3623             -
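
The kernel log excerpt at the start of this update can be summarized per ATA port. The following is a minimal Python sketch, not something used in this ticket: the function name `summarize_ata_errors` and the regular expressions are assumptions written against the few lines quoted above, and other kernel versions may word these messages differently.

```python
import re

# Lines copied from the kernel log quoted in this update.
KERNEL_LOG = """\
[11640694.395199] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[11640694.395221] ata6.00: failed command: WRITE DMA EXT
[11640694.395240] res 40/00:00:06:42:be/00:00:1f:00:00/e0 Emask 0x4 (timeout)
[11640694.395285] ata6: hard resetting link
[11640695.465658] ata6.00: configured for UDMA/33
"""

# Patterns are assumptions based only on the message wording shown above.
FAILED_RE = re.compile(r"^\[\s*\d+\.\d+\]\s+(ata\d+)\.\d+: failed command: (.+)$")
TIMEOUT_RE = re.compile(r"Emask 0x4 \(timeout\)")
RESET_RE = re.compile(r"^\[\s*\d+\.\d+\]\s+(ata\d+): hard resetting link$")


def _new_entry():
    return {"failed": [], "timeouts": 0, "resets": 0}


def summarize_ata_errors(log_text):
    """Count failed commands, timeouts and link resets per ATA port."""
    summary = {}
    last_port = None  # port named by the most recent "failed command" line
    for line in log_text.splitlines():
        failed = FAILED_RE.match(line)
        reset = RESET_RE.match(line)
        if failed:
            last_port = failed.group(1)
            summary.setdefault(last_port, _new_entry())["failed"].append(failed.group(2))
        elif TIMEOUT_RE.search(line) and last_port is not None:
            # The "res ..." line carries no port name, so attribute the
            # timeout to the port of the preceding failed command.
            summary[last_port]["timeouts"] += 1
        elif reset:
            summary.setdefault(reset.group(1), _new_entry())["resets"] += 1
    return summary


if __name__ == "__main__":
    print(summarize_ata_errors(KERNEL_LOG))
```

On the excerpt above this reports one failed `WRITE DMA EXT`, one timeout and one hard link reset, all on `ata6`.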

#2 Updated by Laurent GUERBY more than 12 years ago

To do:
- gnt-node migrate h6
- shut down h6
- unplug the network cable
- replace the SATA cable
- power h6 back on
- check the disk state
- plug the network cable back in
- hbal -L --no-disk-moves -X

#3 Updated by Laurent GUERBY more than 12 years ago

root@h1:~# gnt-instance migrate nagios
Instance nagios will be migrated. Note that migration might impact the
instance if anything goes wrong (e.g. due to bugs in the hypervisor).
Continue?
y/[n]/?: y
Tue Nov 1 11:42:00 2011 Migrating instance nagios.tetaneutral.net
Tue Nov 1 11:42:00 2011 * checking disk consistency between source and target
Tue Nov 1 11:42:01 2011 * switching node h4.tetaneutral.net to secondary mode
Tue Nov 1 11:42:02 2011 * changing into standalone mode
Tue Nov 1 11:42:12 2011 * changing disks into dual-master mode
Tue Nov 1 11:42:37 2011 * wait until resync is done
Tue Nov 1 11:42:41 2011 - progress: 0.0%
Failure: command execution error:
Cannot resync disks on node h6.tetaneutral.net: DRBD device <<class 'ganeti.bdev.DRBD8'>: unique_id: ('91.224.149.156', 12589, '91.224.149.154', 12589, 27, '96f921d41d9db0cf91bf2ae5fac60145b9ea801d'), children: [<<class 'ganeti.bdev.LogicalVolume'>: unique_id: ('kvmvg', 'c6a58fad-2c71-4830-a810-d6c87f9b6a67.disk0_data'), children: [], 254:55, /dev/kvmvg/c6a58fad-2c71-4830-a810-d6c87f9b6a67.disk0_data>, <<class 'ganeti.bdev.LogicalVolume'>: unique_id: ('kvmvg', 'c6a58fad-2c71-4830-a810-d6c87f9b6a67.disk0_meta'), children: [], 254:56, /dev/kvmvg/c6a58fad-2c71-4830-a810-d6c87f9b6a67.disk0_meta>], 147:27, /dev/drbd27> is not in sync: stats=<ganeti.bdev.DRBD8Status object at 0x2612990>
root@h1:~# gnt-instance migrate nagios
Instance nagios will be migrated. Note that migration might impact the
instance if anything goes wrong (e.g. due to bugs in the hypervisor).
Continue?
y/[n]/?: y
Tue Nov 1 11:43:28 2011 Migrating instance nagios.tetaneutral.net
Tue Nov 1 11:43:28 2011 * checking disk consistency between source and target
Failure: command execution error:
Disk disk/0 is degraded or not fully synchronized on target node, aborting migrate.

=> Solution: run gnt-instance migrate --cleanup nagios

#4 Updated by Laurent GUERBY more than 12 years ago

First attempt: changed the SATA port and cable => the h6 disk still performed very poorly compared with h5, h4 and h1 (measured with hdparm -t).

Second attempt: replaced the disk itself => disk performance back to normal.

Conclusion: bad disk.
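
The comparison between nodes via `hdparm -t` could be automated along the following lines. This is only a sketch under stated assumptions: the helper names `throughput_kb_per_sec` and `slow_nodes`, and the threshold of half the median, are hypothetical choices and not part of this ticket. The only measurement quoted in the ticket is the h6 line; the outputs for the other nodes would have to be collected separately.

```python
import re
from statistics import median

# Matches the rate at the end of an `hdparm -t` result line, e.g.
# "... = 325.69 kB/sec". Only the kB and MB units are handled here.
HDPARM_RE = re.compile(r"=\s*([\d.]+)\s*(kB|MB)/sec")
UNIT_TO_KB = {"kB": 1.0, "MB": 1024.0}


def throughput_kb_per_sec(hdparm_output):
    """Extract the buffered read rate from `hdparm -t` output, in kB/sec."""
    match = HDPARM_RE.search(hdparm_output)
    if match is None:
        raise ValueError("no throughput figure found in hdparm output")
    return float(match.group(1)) * UNIT_TO_KB[match.group(2)]


def slow_nodes(outputs_by_node, ratio=0.5):
    """Return the nodes whose rate is below `ratio` times the median rate.

    `ratio` is an arbitrary illustrative threshold, not a recommended value.
    """
    rates = {node: throughput_kb_per_sec(out) for node, out in outputs_by_node.items()}
    reference = median(rates.values())
    return sorted(node for node, rate in rates.items() if rate < ratio * reference)


if __name__ == "__main__":
    # The h6 measurement quoted earlier in this ticket.
    h6_line = "Timing buffered disk reads: 2 MB in 6.29 seconds = 325.69 kB/sec"
    print(throughput_kb_per_sec(h6_line))
```

Passing `slow_nodes` a dictionary that maps each node name to its `hdparm -t` output would list the nodes that lag well behind the others, which is the check that was done by hand here.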
