  • Various disk errors just showed up in dmesg :(


    Well this is fantastic! All my hardware seems to be dying on me as of
    late! It's just not funny any more!
    Jun 27 22:25:48 bone /netbsd: wd1a: DMA error writing fsbn 118105728 of
    118105728-118105747 (wd1 bn 118105791; cn 117168 tn 7 sn 6), retrying
    Jun 27 22:25:48 bone /netbsd: wd1: soft error (corrected)
    Jun 27 22:26:05 bone /netbsd: wd1a: device fault writing fsbn 110309504
    of 110309504-110309535 (wd1 bn 110309567; cn 109434 tn 1 sn 32), retrying
    Jun 27 22:26:05 bone /netbsd: wd1: soft error (corrected)
    Jun 28 01:33:23 bone /netbsd: wd1a: device fault writing fsbn 119347360
    of 119347360-119347363 (wd1 bn 119347423; cn 118400 tn 3 sn 34), retrying
    Jun 28 01:33:23 bone /netbsd: wd1: soft error (corrected)
    Jun 28 01:33:34 bone /netbsd: wd1a: device fault writing fsbn 119412992
    of 119412992-119413023 (wd1 bn 119413055; cn 118465 tn 5 sn 20), retrying
    Jun 28 01:33:35 bone /netbsd: wd1: soft error (corrected)
    Jun 28 01:33:53 bone /netbsd: wd1a: device fault writing fsbn 758016 of
    758016-758047 (wd1 bn 758079; cn 752 tn 1 sn 0), retrying
    Jun 28 01:33:53 bone /netbsd: wd1: soft error (corrected)
    Jun 28 01:49:05 bone /netbsd: wd1a: device fault writing fsbn 119372416
    of 119372416-119372447 (wd1 bn 119372479; cn 118425 tn 1 sn 16), retrying
    Jun 28 01:49:05 bone /netbsd: wd1: soft error (corrected)
    Jun 28 01:56:36 bone /netbsd: wd1a: device fault reading fsbn 4992064 of
    4992064-4992159 (wd1 bn 4992127; cn 4952 tn 8 sn 7), retrying
    Jun 28 01:56:38 bone /netbsd: wd1: soft error (corrected)
    Jun 28 07:25:47 bone /netbsd: wd1a: device fault writing fsbn 13261056
    of 13261056-13261087 (wd1 bn 13261119; cn 13155 tn 13 sn 60), retrying
    Jun 28 07:25:47 bone /netbsd: wd1: soft error (corrected)
    Jun 28 07:25:59 bone /netbsd: wd1a: device fault writing fsbn 113115672
    of 113115672-113115675 (wd1 bn 113115735; cn 112217 tn 15 sn 54), retrying
    Jun 28 07:26:00 bone /netbsd: wd1: soft error (corrected)
    Jun 28 07:26:38 bone /netbsd: wd1a: DMA error writing fsbn 101559196 of
    101559196-101559199 (wd1 bn 101559259; cn 100753 tn 3 sn 46), retrying
    Jun 28 07:26:38 bone /netbsd: wd1: soft error (corrected)
    Jun 28 07:26:49 bone /netbsd: wd1a: device fault writing fsbn 110312256
    of 110312256-110312287 (wd1 bn 110312319; cn 109436 tn 13 sn 12), retrying
    Jun 28 07:26:50 bone /netbsd: wd1: soft error (corrected)
    Jun 28 07:27:07 bone /netbsd: wd1a: DMA error writing fsbn 113115672 of
    113115672-113115675 (wd1 bn 113115735; cn 112217 tn 15 sn 54), retrying
    Jun 28 07:27:08 bone /netbsd: wd1: soft error (corrected)
    Jun 28 09:18:04 bone /netbsd: wd1a: DMA error writing fsbn 110255456 of
    110255456-110255487 (wd1 bn 110255519; cn 109380 tn 7 sn 38), retrying
    Jun 28 09:18:04 bone /netbsd: wd1: soft error (corrected)
    Jun 28 09:30:22 bone /netbsd: wd1a: DMA error writing fsbn 1068672 of
    1068672-1068675 (wd1 bn 1068735; cn 1060 tn 4 sn 3), retrying
    Jun 28 09:30:23 bone /netbsd: wd1: soft error (corrected)
    Jun 28 13:14:55 bone /netbsd: wd1a: device fault writing fsbn 758016 of
    758016-758047 (wd1 bn 758079; cn 752 tn 1 sn 0), retrying
    Jun 28 13:14:55 bone /netbsd: wd1: soft error (corrected)
    Jun 28 17:32:33 bone /netbsd: wd1a: device fault writing fsbn 110255584
    of 110255584-110255615 (wd1 bn 110255647; cn 109380 tn 9 sn 40), retrying
    Jun 28 17:32:34 bone /netbsd: wd1: soft error (corrected)
    Jun 28 23:04:10 bone /netbsd: wd1a: error reading fsbn 2539072 of
    2539072-2539199 (wd1 bn 2539135; cn 2518 tn 15 sn 46), retrying
    Jun 28 23:04:10 bone /netbsd: wd1: (obsolete (address mark not found),
    no media/write protected, id not found, uncorrectable data error)
    Jun 28 23:04:12 bone /netbsd: wd1: soft error (corrected)
    This disk is probably not even a month old either! Brand new Seagate
    7200.9. I've checked the SMART status and it's showing no remapped bad
    blocks, which is reassuring:
    SMART supported, SMART enabled
     id value thresh crit collect reliability description               raw
      1   116      6 yes  online  positive    Raw read error rate       110385059
      3   100      0 yes  online  positive    Spin-up time              0
      4   100     20 no   online  positive    Start/stop count          17
      5   100     36 yes  online  positive    Reallocated sector count  0
      7    70     30 yes  online  positive    Seek error rate           10919452
      9   100      0 no   online  positive    Power-on hours count      509
     10   100     97 yes  online  positive    Spin retry count          0
     12   100     20 no   online  positive    Device power cycle count  66
    187   100      0 no   online  positive    Unknown                   0
    189   100      0 no   online  positive    Unknown                   0
    190    58     45 no   online  positive    Unknown                   707330090
    194    42      0 no   online  positive    Temperature               42 Lifetime max/min 0/31
    195    78      0 no   online  positive    Hardware ECC Recovered    98608344
    197     1      0 no   online  positive    Current pending sector    4294967295
    198     1      0 no   offline positive    uncorrectable             4294967295
    199   200      0 no   online  positive    Ultra DMA CRC error count 0
    200   100      0 no   offline positive    Write error rate          0
    202   100      0 no   online  positive    Data address mark errors  0
    Is this most likely just the cable? That last error looks quite
    worrying. I do believe that I have another disk attached on the same
    cable as that drive though, so I would have thought I would also be
    seeing errors on wd0. Hmm.
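
Each error report in the log names the same block three ways: as an offset into the partition (fsbn), as an absolute block on the disk (bn), and as cylinder/track/sector numbers. The following is a minimal Python sketch, not anything the kernel provides, that decodes three of the lines and checks that the three forms agree. The regular expression and function names are made up for illustration, and the geometry of 16 heads and 63 sectors per track is an assumption inferred from the numbers themselves.

```python
import re
from collections import Counter

# Three reports copied from the log above, with the wrapped lines rejoined.
LOG_LINES = [
    "wd1a: DMA error writing fsbn 118105728 of 118105728-118105747 "
    "(wd1 bn 118105791; cn 117168 tn 7 sn 6), retrying",
    "wd1a: device fault writing fsbn 758016 of 758016-758047 "
    "(wd1 bn 758079; cn 752 tn 1 sn 0), retrying",
    "wd1a: error reading fsbn 2539072 of 2539072-2539199 "
    "(wd1 bn 2539135; cn 2518 tn 15 sn 46), retrying",
]

# Hypothetical pattern written for the message format seen in this thread.
PATTERN = re.compile(
    r"(?P<part>wd\d+[a-p]): (?P<kind>.+?) (?P<op>reading|writing) "
    r"fsbn (?P<fsbn>\d+) of (?P<first>\d+)-(?P<last>\d+) "
    r"\((?P<disk>wd\d+) bn (?P<bn>\d+); "
    r"cn (?P<cn>\d+) tn (?P<tn>\d+) sn (?P<sn>\d+)\)"
)

HEADS = 16               # assumption, inferred from the logged numbers
SECTORS_PER_TRACK = 63   # assumption, inferred from the logged numbers


def parse(line):
    """Return the fields of one error report, or None if it does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    record = match.groupdict()
    for key in ("fsbn", "first", "last", "bn", "cn", "tn", "sn"):
        record[key] = int(record[key])
    return record


def chs_to_block(cn, tn, sn):
    """Convert cylinder/track/sector to an absolute block number."""
    return (cn * HEADS + tn) * SECTORS_PER_TRACK + sn


records = [r for r in map(parse, LOG_LINES) if r is not None]
for r in records:
    offset = r["bn"] - r["fsbn"]
    from_chs = chs_to_block(r["cn"], r["tn"], r["sn"])
    print(r["kind"], r["op"], "offset:", offset, "chs agrees:", from_chs == r["bn"])

# Tally of (error kind, direction) pairs.
kinds = Counter((r["kind"], r["op"]) for r in records)
print(kinds)
```

For all three lines bn is fsbn plus 63, which would fit a partition that starts at sector 63, and the cylinder/track/sector numbers convert back to the same bn. The numbers in the log are therefore self-consistent, so they say where the transfers failed but not why.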
  • No.1 | | 619 bytes | |

    On 6/28/06, Mark Cullen <mark.r.cullen (AT) gmail (DOT) com> wrote:
    > Is this most likely just the cable? That last error looks quite
    > worrying. I do believe that I have another disk attached on the same
    > cable as that drive though, so I would have thought I would also be
    > seeing errors on wd0. Hmm.

    Cable, wonky controller bios, memory, any number of things. Cold boot
    it and see if it continues.

    I had this problem when I added a network card. I assume this is
    likely PC hardware, which is often junk. No offense, it's just the
    case. I have to deal with it too.

    Andy
  • No.2 | | 1652 bytes | |

    Andy Ruhl wrote:
    > On 6/28/06, Mark Cullen <mark.r.cullen (AT) gmail (DOT) com> wrote:
    >> Is this most likely just the cable? That last error looks quite
    >> worrying. I do believe that I have another disk attached on the same
    >> cable as that drive though, so I would have thought I would also be
    >> seeing errors on wd0. Hmm.
    >
    > Cable, wonky controller bios, memory, any number of things. Cold boot
    > it and see if it continues.
    >
    > I had this problem when I added a network card. I assume this is
    > likely PC hardware, which is often junk. No offense, it's just the
    > case. I have to deal with it too.
    >
    > Andy

    Speedy reply!

    Don't think it would be the controller's BIOS as it worked without any
    of these types of errors on FreeBSD (with the same cables too, but hey).
    Maybe the controller is on its way out, but then it's really odd it's
    only affecting that one drive :-)

    I'm waiting on a hard disk to arrive, quite ironically. Should be
    tomorrow with a bit of luck. I'd really like to hold off rebooting and
    messing about with things (again!) until then. Are these errors actually
    harmful (for now) if they're being "corrected"? The disk is on a RAID-1
    so

    It is indeed PC hardware. I know, a lot of it is rubbish :-)

    Doing a `dd if=/dev/rwd1d of=/dev/null bs=16k` is bringing up a fair
    amount of these errors, and I don't think any of the numbers relate to
    the numbers seen previously in the logs. I really hope this is just a
    mysterious cable issue.

    Thanks!!
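
Whether the block numbers relate to each other can be counted rather than eyeballed. A small Python sketch, using the 17 fsbn values from the log in the first post; the function names are made up for illustration, and the numbers from the dd run are not in the thread, so none are shown here.

```python
from collections import Counter

# fsbn values of the 17 error reports in the first post, in log order.
LOGGED_FSBN = [
    118105728, 110309504, 119347360, 119412992, 758016, 119372416,
    4992064, 13261056, 113115672, 101559196, 110312256, 113115672,
    110255456, 1068672, 758016, 110255584, 2539072,
]


def repeated_blocks(blocks):
    """Return {block: count} for every block reported more than once."""
    counts = Counter(blocks)
    return {block: n for block, n in counts.items() if n > 1}


def unseen(new_blocks, seen_blocks):
    """Return the blocks from a later run that are absent from the earlier log."""
    seen = set(seen_blocks)
    return [block for block in new_blocks if block not in seen]


repeats = repeated_blocks(LOGGED_FSBN)
distinct = len(set(LOGGED_FSBN))
print(len(LOGGED_FSBN), "reports,", distinct, "distinct blocks")
print("reported more than once:", repeats)
```

Of the 17 reports, 15 blocks are distinct and only 758016 and 113115672 recur, twice each. Passing the fsbn values from the dd run to `unseen` together with `LOGGED_FSBN` would list the ones that are new. As a rule of thumb, widely scattered blocks point away from a handful of bad sectors and towards something shared, such as cabling or power, but that is a hint and not a diagnosis.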
  • No.3 | | 1131 bytes | |

    Andy Ruhl wrote:
    > On 6/28/06, Mark Cullen <mark.r.cullen (AT) gmail (DOT) com> wrote:
    >> Is this most likely just the cable? That last error looks quite
    >> worrying. I do believe that I have another disk attached on the same
    >> cable as that drive though, so I would have thought I would also be
    >> seeing errors on wd0. Hmm.
    >
    > Cable, wonky controller bios, memory, any number of things. Cold boot
    > it and see if it continues.
    >
    > I had this problem when I added a network card. I assume this is
    > likely PC hardware, which is often junk. No offense, it's just the
    > case. I have to deal with it too.
    >
    > Andy

    I've just seen something quite disturbing :)

    Recall the SMART status from before:

    12 100 20 no online positive Device power cycle count 66

    this is now

    12 100 20 no online positive Device power cycle count 83

    NetBSD 'resetting the device' (I have no idea if it is or not), wouldn't
    cause this would it? I think I am off to check the power plug isn't
    loose or something!!
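
Comparing two SMART snapshots by hand works for one attribute, but a script can compare every row. A minimal Python sketch, assuming each attribute sits on one line in the column order of the table in the first post (id, value, thresh, crit, collect, reliability, description, raw); the function names are made up for illustration, and rows whose raw value wrapped onto the next line would need rejoining first.

```python
def parse_smart_row(line):
    """Split one attribute row into id, value, thresh, description and raw."""
    fields = line.split()
    return {
        "id": int(fields[0]),
        "value": int(fields[1]),
        "thresh": int(fields[2]),
        # Fields 3-5 are crit, collect and reliability; the description is
        # everything between them and the final raw value.
        "description": " ".join(fields[6:-1]),
        "raw": int(fields[-1]),
    }


def changed_raw(before, after):
    """List (id, description, old_raw, new_raw) for attributes whose raw value changed."""
    old = {row["id"]: row for row in map(parse_smart_row, before)}
    changes = []
    for row in map(parse_smart_row, after):
        previous = old.get(row["id"])
        if previous is not None and previous["raw"] != row["raw"]:
            changes.append((row["id"], row["description"], previous["raw"], row["raw"]))
    return changes


# The two rows quoted in the post above.
BEFORE = ["12 100 20 no online positive Device power cycle count 66"]
AFTER = ["12 100 20 no online positive Device power cycle count 83"]

changes = changed_raw(BEFORE, AFTER)
for attr_id, description, old_raw, new_raw in changes:
    print(attr_id, description, old_raw, "->", new_raw, "(+%d)" % (new_raw - old_raw))
```

With the two rows from the post this reports attribute 12 going from 66 to 83, i.e. 17 more power cycles recorded by the drive between the two readings.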
  • No.4 | | 1965 bytes | |

    Mark Cullen wrote:
    > Andy Ruhl wrote:
    >> On 6/28/06, Mark Cullen <mark.r.cullen (AT) gmail (DOT) com> wrote:
    >>> Is this most likely just the cable? That last error looks quite
    >>> worrying. I do believe that I have another disk attached on the same
    >>> cable as that drive though, so I would have thought I would also be
    >>> seeing errors on wd0. Hmm.
    >>
    >> Cable, wonky controller bios, memory, any number of things. Cold boot
    >> it and see if it continues.
    >>
    >> I had this problem when I added a network card. I assume this is
    >> likely PC hardware, which is often junk. No offense, it's just the
    >> case. I have to deal with it too.
    >>
    >> Andy
    >
    > I've just seen something quite disturbing :)
    >
    > Recall the SMART status from before:
    >
    > 12 100 20 no online positive Device power cycle count 66
    >
    > this is now
    >
    > 12 100 20 no online positive Device power cycle count 83
    >
    > NetBSD 'resetting the device' (I have no idea if it is or not), wouldn't
    > cause this would it? I think I am off to check the power plug isn't
    > loose or something!!

    Well that appears to be it. I literally just touched the plug (I know I
    know, it was still on but I can't really tell if it's the power plug
    while it's off!) and I heard the hard disk make a funny high pitched
    noise. The "Device power cycle count" went up to 84, and I got one of
    those odd errors pop up in dmesg (without the dd running).

    So glad NetBSD has SMART support! Sorry for the noise, and at the same
    time thanks for the support (in general, this list has been a very
    helpful place for me) :-)

    Cheers!
  • No.5 | | 1733 bytes | |

    On 6/28/06, Mark Cullen <mark.r.cullen (AT) gmail (DOT) com> wrote:
    > Andy Ruhl wrote:
    >> On 6/28/06, Mark Cullen <mark.r.cullen (AT) gmail (DOT) com> wrote:
    >>> Is this most likely just the cable? That last error looks quite
    >>> worrying. I do believe that I have another disk attached on the same
    >>> cable as that drive though, so I would have thought I would also be
    >>> seeing errors on wd0. Hmm.
    >>
    >> Cable, wonky controller bios, memory, any number of things. Cold boot
    >> it and see if it continues.
    >>
    >> I had this problem when I added a network card. I assume this is
    >> likely PC hardware, which is often junk. No offense, it's just the
    >> case. I have to deal with it too.
    >>
    >> Andy
    >
    > I've just seen something quite disturbing :)
    >
    > Recall the SMART status from before:
    >
    > 12 100 20 no online positive Device power cycle count 66
    >
    > this is now
    >
    > 12 100 20 no online positive Device power cycle count 83
    >
    > NetBSD 'resetting the device' (I have no idea if it is or not), wouldn't
    > cause this would it? I think I am off to check the power plug isn't
    > loose or something!!

    Yeah, those old style (non newer SATA style) power plugs are horrible.
    I've had one of those go bad too. I think you'd see different errors
    though. But I don't know how to interpret that SMART output.

    I literally got errors just like this when I added a card to my box,
    on 1 of the 4 drives on the controller. Took out the card, everything
    is fine.

    PC hardware, I'm telling ya. I know that's a "dust bin" type answer,
    but it's the truth.

    Andy
  • No.6 | | 2876 bytes | |

    Andy Ruhl wrote:
    > On 6/28/06, Mark Cullen <mark.r.cullen (AT) gmail (DOT) com> wrote:
    >> Andy Ruhl wrote:
    >>> On 6/28/06, Mark Cullen <mark.r.cullen (AT) gmail (DOT) com> wrote:
    >>>> Is this most likely just the cable? That last error looks quite
    >>>> worrying. I do believe that I have another disk attached on the same
    >>>> cable as that drive though, so I would have thought I would also be
    >>>> seeing errors on wd0. Hmm.
    >>>
    >>> Cable, wonky controller bios, memory, any number of things. Cold boot
    >>> it and see if it continues.
    >>>
    >>> I had this problem when I added a network card. I assume this is
    >>> likely PC hardware, which is often junk. No offense, it's just the
    >>> case. I have to deal with it too.
    >>>
    >>> Andy
    >>
    >> I've just seen something quite disturbing :)
    >>
    >> Recall the SMART status from before:
    >>
    >> 12 100 20 no online positive Device power cycle count 66
    >>
    >> this is now
    >>
    >> 12 100 20 no online positive Device power cycle count 83
    >>
    >> NetBSD 'resetting the device' (I have no idea if it is or not), wouldn't
    >> cause this would it? I think I am off to check the power plug isn't
    >> loose or something!!
    >
    > Yeah, those old style (non newer SATA style) power plugs are horrible.
    > I've had one of those go bad too. I think you'd see different errors
    > though. But I don't know how to interpret that SMART output.

    Yeah, it's one of those ugly 4-pin molex jobbies. I've actually never
    had such a thing happen to me before.

    > I literally got errors just like this when I added a card to my box,
    > on 1 of the 4 drives on the controller. Took out the card, everything
    > is fine.

    That's really quite odd! That was with a network card you say? The only
    PCI cards I have in that box are two Intel NICs as it goes. If the
    errors don't go away I will try moving the cards about.

    > PC hardware, I'm telling ya. I know that's a "dust bin" type answer,
    > but it's the truth.

    You're right there :) It seems quite hard to find decent hardware. Even
    this DFI board I have, which people tend to make a fuss about, seems to
    have a few BIOS issues. I switched the IDE cables for primary and
    secondary IDE around and it refused to POST until I cleared the CMOS.
    It's all just oh-so-very-annoying!

    > Andy
  • No.7 | | 702 bytes | |

    I've just replaced the PSU with a spare I had laying about (don't
    particularly trust it mind) and I'm no longer seeing any odd messages,
    so it *may* well have just been a dodgy molex connector. I suppose it
    could have been heat related, or something else that happens over a
    period of time, so I shan't get my hopes up just yet. I'll have to
    give it another week or two before I am happy. I find it far too odd
    that it'd run perfectly fine for 7 days, and then these errors would
    just suddenly appear!

    Something tells me it would be quite a smart move for me to rebuild that
    disk from the other disk in the array, just to make sure :)
  • No.8 | | 319 bytes | |

    It occurred to me that Andy Ruhl wrote in gmane.os.netbsd.general:

    > PC hardware, I'm telling ya. I know that's a "dust bin" type answer,
    > but it's the truth.

    Sadly, it doesn't help much, since there's no real alternative for consumers.
    And yes, that includes Apple.
