Discussion:
[arch-general] btrfs raid 10 fileserver with ata errors
niya levi via arch-general
2017-01-13 02:22:05 UTC
Hi everyone,
I have a fileserver with six 1 TB disks.
I get the following errors, repeated constantly:

------------------------------------------------------------------------------------------------------
ata17.00 exception Emask 0x0 Sact 0x0 SErr 0x0 action 0x6 frozen
failed command: READ DMA
cmd c8/00:30: tag 14 dma 65536 in
res 40/00:ff:00 Emask 0x4 (timeout)
status {DRDY}

----------------------------------------------------------------------------------------------------------

After some googling, it's been suggested that the cause is either a hard drive,
the SATA controller or the SATA cables.
How do I go about diagnosing and fixing the problem?
Any suggestions or guidance would be appreciated.
shadrock
Kyle McNally via arch-general
2017-01-13 15:51:21 UTC
I've had this problem before. IIRC, you can work out which drive ata17.00 refers to by looking at your kernel boot messages. The first thing I would do is swap out the SATA cable and see if the problem persists. If that doesn't help, run a scan of the drive using the manufacturer's scan program.
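
As a rough sketch of the port-to-drive matching described above: on Linux systems where libata exposes the port in sysfs, the resolved path of each /sys/block entry usually contains an "ataN" component. The function names here (ata_port_of, map_ports_to_disks) are made up for illustration, and the sysfs layout is an assumption that may not hold on every kernel or controller.

```python
import os
import re
from typing import Dict, Optional

# Matches the "ataN" component that libata typically puts in a disk's
# resolved sysfs path (assumption: the running kernel exposes it this way).
ATA_COMPONENT = re.compile(r"/ata(\d+)/")


def ata_port_of(sysfs_path: str) -> Optional[int]:
    """Return the ATA port number found in a resolved sysfs path, or None."""
    match = ATA_COMPONENT.search(sysfs_path)
    return int(match.group(1)) if match else None


def map_ports_to_disks(sys_block: str = "/sys/block") -> Dict[int, str]:
    """Map ATA port number -> block device name, e.g. {17: 'sdf'}.

    If several disks share one port (port multiplier), only the last
    one seen is kept; this sketch does not handle that case.
    """
    mapping: Dict[int, str] = {}
    if not os.path.isdir(sys_block):
        return mapping
    for name in sorted(os.listdir(sys_block)):
        # /sys/block/<name> is a symlink into the device tree.
        resolved = os.path.realpath(os.path.join(sys_block, name))
        port = ata_port_of(resolved)
        if port is not None:
            mapping[port] = name
    return mapping


if __name__ == "__main__":
    for port, disk in sorted(map_ports_to_disks().items()):
        print(f"ata{port} -> /dev/{disk}")
```

If the script prints a line for ata17, that is the disk to look at first; the ".00" in ata17.00 is the device number on that port.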
Dennis Anderson via arch-general
2017-01-13 19:26:55 UTC
Hi,
You can also run self-diagnostics with smartd; that may tell you more.
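
A minimal sketch of checking SMART data from a script, assuming smartmontools is installed and the script runs as root. It uses the smartctl command-line tool (`-A` prints the attribute table) rather than the smartd daemon itself. The attributes watched are a common choice, not something taken from this thread, and the function names are made up for illustration.

```python
import subprocess
from typing import Dict

# Attributes often checked when media or link problems are suspected.
# This selection is an assumption, not a complete health check.
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "UDMA_CRC_Error_Count",
)


def parse_smart_attributes(report: str) -> Dict[str, int]:
    """Extract raw values of the watched attributes from `smartctl -A` text."""
    values: Dict[str, int] = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED
        # WHEN_FAILED RAW_VALUE (the raw value is the tenth column).
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            raw = fields[9]
            if raw.isdigit():
                values[fields[1]] = int(raw)
    return values


def read_attributes(device: str) -> Dict[str, int]:
    """Run `smartctl -A <device>` and parse the attribute table."""
    # smartctl uses its exit status as a bitmask, so a non-zero status
    # is not treated as a failure here.
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_smart_attributes(result.stdout)
```

Calling read_attributes("/dev/sdX") on each of the six disks and comparing the numbers over a few days is one way to use it. A growing UDMA_CRC_Error_Count is usually read as a hint towards cabling or the link, while reallocated or pending sectors point more towards the disk itself. A short self-test can be started separately with `smartctl -t short /dev/sdX`.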
niya levi via arch-general
2017-01-14 09:39:01 UTC
Karol Babioch
2017-01-19 07:02:52 UTC
Hi,
Post by niya levi via arch-general
how do i go about diagnosing and fixing the problem,
any suggestions or guidance would be appreciated.
shadrock
Check all cables and re-plug them. To narrow down the problem you can
also swap the disks around, so you know whether it is the disk itself or the
controller and/or cabling. But problems like these are sometimes a
PITA to fix.
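
When swapping disks or cables, it helps to know which port the errors follow. Here is a small sketch that counts libata exception lines per port in kernel log text, so the counts can be compared before and after a swap. The regular expression is modelled on the error format quoted at the top of this thread, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as "ata17.00: exception Emask 0x0 ..." and also the
# colon-less form pasted at the top of this thread.
EXCEPTION_LINE = re.compile(r"\b(ata\d+)(?:\.\d+)?:? exception Emask")


def count_exceptions(log_text: str) -> Counter:
    """Count exception lines per ATA port (e.g. 'ata17') in kernel log text."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        match = EXCEPTION_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

The log text can come from saving the output of `journalctl -k` or `dmesg` to a file and reading it in. If the errors stay on the same port after a disk has been moved elsewhere, the cable or controller port is the more likely suspect; if they move with the disk, the disk is.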

Best regards,
Karol Babioch
niya levi via arch-general
2017-01-19 17:46:10 UTC
hi Karol
I have replaced one SATA cable, which seems to have reduced the error
warnings.
I still get the odd failed command periodically; I am getting some more
SATA cables before further testing.
thanks
shadrock
