Databases

  • SCSI disk: still the way to go?

    14 answers - 437 bytes

    Hi guys,
    I have to update a Linux box with PostgreSQL on it, essentially for data
    warehousing purposes. I had set it up about 3 years ago and at that time the
    best solution I had been recommended was to use SCSI disks with hardware
    RAID controllers.
    Is this still the way to go or things have recently changed? Any other
    suggestion/advice? What about SAN?
    Thanks.
    Cheers,
    Riccardo
  • No.1 | | 815 bytes | |

    On Tue, 2006-05-30 at 16:28, Riccardo Inverni wrote:
    >Hi guys,
    >
    >I have to update a Linux box with PostgreSQL on it, essentially for
    >data warehousing purposes. I had set it up about 3 years ago and at
    >that time the best solution I had been recommended was to use SCSI
    >disks with hardware RAID controllers.
    >
    >Is this still the way to go or things have recently changed? Any
    >other suggestion/advice? What about SAN?

    Actually, modern SATA server drives are now considered competitive with
    SCSI when paired with the proper RAID controller.

    Nowadays most people seem to recommend the Areca controllers. I haven't
    used them myself, but would be happy to test them some day.

    (end of broadcast)
    TIP 4: Have you searched our list archives?

    http://archives.postgresql.org
  • No.2 | | 1021 bytes | |

    SAS and SATA will give you the best total throughput for your array. U320
    SCSI is limited to 320MB/sec per channel.

    Alex

    On 5/30/06, Scott Marlowe <smarlowe (AT) g2switchworks (DOT) com> wrote:

    >On Tue, 2006-05-30 at 16:28, Riccardo Inverni wrote:
    >>Hi guys,
    >>
    >>I have to update a Linux box with PostgreSQL on it, essentially for
    >>data warehousing purposes. I had set it up about 3 years ago and at
    >>that time the best solution I had been recommended was to use SCSI
    >>disks with hardware RAID controllers.
    >>
    >>Is this still the way to go or things have recently changed? Any
    >>other suggestion/advice? What about SAN?
    >
    >Actually, modern SATA server drives are now considered competitive with
    >the proper RAID controller.
    >
    >Nowadays most people seem to recommend the Areca controllers. I haven't
    >used them myself, but would be happy to test them some day.

  • No.3 | | 979 bytes | |

    How much money do you want to spend? If you don't care, SAN is probably the way
    to go.

    How much data do you have to store? If you can afford to fit it onto scsi, scsi
    probably is still the way to go.

    SATA arrays have come a long way in 3 years, and they are by FAR the
    cheapest solution out there. Do some research and see if they're good enough for
    you.

    On Tue, 30 May 2006, Riccardo Inverni wrote:

    >Hi guys,
    >
    >I have to update a Linux box with PostgreSQL on it, essentially for data
    >warehousing purposes. I had set it up about 3 years ago and at that time the
    >best solution I had been recommended was to use SCSI disks with hardware
    >RAID controllers.
    >
    >Is this still the way to go or things have recently changed? Any other
    >suggestion/advice? What about SAN?
    >
    >Thanks.
    >
    >Cheers,
    >Riccardo

    (end of broadcast)
    TIP 5: don't forget to increase your free space map settings
  • No.4 | | 1214 bytes | |

    Scott Marlowe wrote:
    >On Tue, 2006-05-30 at 16:28, Riccardo Inverni wrote:
    >>Hi guys,
    >>
    >>I have to update a Linux box with PostgreSQL on it, essentially for
    >>data warehousing purposes. I had set it up about 3 years ago and at
    >>that time the best solution I had been recommended was to use SCSI
    >>disks with hardware RAID controllers.
    >>
    >>Is this still the way to go or things have recently changed? Any
    >>other suggestion/advice? What about SAN?
    >
    >Actually, modern SATA server drives are now considered competitive with
    >the proper RAID controller.

    And for a DW application they are the most megabytes per dollar you can buy.

    >Nowadays most people seem to recommend the Areca controllers. I haven't
    >used them myself, but would be happy to test them some day.

    I have heard good things about the Areca, but I have never used them. I
    have had excellent luck with the LSI controllers, however.

    Sincerely,

    Joshua D. Drake

  • No.5 | | 2962 bytes | |

    Compare these two drives:

    Prices:

    - SAS: ~$950
    - SATA: ~$320

    For a third of the price you can have 90% of the throughput performance,
    which is probably where you will be most stressing your drives in a data
    warehouse.
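
As a rough check on the price/performance claim above, the short Python sketch below compares throughput per dollar using only the two prices and the 90% figure quoted in the post. The function name and the normalisation of SAS throughput to 1.0 are illustrative assumptions, not measurements.

```python
def throughput_per_dollar(relative_throughput, price_usd):
    """Relative streaming throughput bought per dollar spent on one drive."""
    return relative_throughput / price_usd


# Figures quoted in the post: SAS ~$950, SATA ~$320 at roughly 90% of the throughput.
# SAS throughput is normalised to 1.0 (an assumption made purely for comparison).
sas = throughput_per_dollar(1.0, 950)
sata = throughput_per_dollar(0.9, 320)

# How many times more throughput per dollar SATA delivers under these assumptions.
advantage = sata / sas
print(f"SATA delivers about {advantage:.1f}x the throughput per dollar of SAS")
```

Under these assumptions the ratio comes out a little under 3x, which lines up with the "a third of the price" framing. It says nothing about reliability or random I/O.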

    I have only seen good benchmarks from LSI's MegaRAID controllers for SCSI in
    Linux. I have seen good results from LSI, 3Ware (now AMCC) and Areca in
    Linux for their SATA products (in RAID 10). There are plenty of chassis
    with large numbers of SATA hot-swap bays out there if you want them. Tyan
    makes a great dual-CPU board with two independent PCI-X buses that will
    give 1066MB/sec total throughput each, and I have great benchmark numbers
    from it.

    It's possible to reach these numbers with a SAN, but it will cost major, major
    $$s. Each FC line in a SAN is typically 2Gb, last time I checked, so you
    need multiple channels to achieve a max of 1066MB/sec throughput per PCI-X
    bus. If you run the numbers, you theoretically need 24 drives in a RAID 10
    to get max throughput (Areca makes a 24-channel SATA card, although I
    couldn't find one with multilane support). I have seen chassis that can
    hold 40 drives. If you go for the 74GB cousin that has similar throughput,
    which you can get 'em for $160 each, you are talking just about $6400 in
    drives, plus about $4k for the chassis, plus about $5k for
    other components (depending on RAM/CPU), so a massively kick-ass whitebox
    can be had for about $16k that will achieve close to the maximum theoretical
    throughput achievable in a single server for MB/sec.
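
The back-of-the-envelope numbers in this post can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 80MB/sec per-drive rate is the figure given later in the thread, the roughly 200MB/sec usable rate for a 2Gb FC link is an assumption, and the helper names are hypothetical. The drive count it prints depends entirely on those inputs and on how reads are spread across mirrors, so it is not meant to reproduce the 24-drive figure above.

```python
import math

PCI_X_BUS_MB_S = 1066   # per-bus throughput quoted in the post
DRIVE_MB_S = 80         # per-drive streaming rate quoted later in the thread
FC_2GB_LINK_MB_S = 200  # assumed usable payload rate of one 2Gb FC link


def units_to_saturate(target_mb_s, unit_mb_s):
    """Smallest number of drives or links whose combined rate reaches the target."""
    return math.ceil(target_mb_s / unit_mb_s)


def whitebox_cost(drives, drive_price, chassis, other):
    """Total hardware cost in dollars for the build described in the post."""
    return drives * drive_price + chassis + other


drives_per_bus = units_to_saturate(PCI_X_BUS_MB_S, DRIVE_MB_S)
fc_links_per_bus = units_to_saturate(PCI_X_BUS_MB_S, FC_2GB_LINK_MB_S)
total = whitebox_cost(drives=40, drive_price=160, chassis=4000, other=5000)

print(drives_per_bus, fc_links_per_bus, total)
```

The cost line simply adds up the three figures from the post (40 drives at $160, $4k chassis, $5k other components), which lands slightly under the "about $16k" quoted.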

    Now there are arguments to be had about splitting up tablespaces etc., but I
    present this as a concrete example of components that can be had for not
    a lot of money to build a majorly kick-ass server using SATA technology.

    Alex

    On 5/30/06, Ben <bench (AT) silentmedia (DOT) com> wrote:

    >How much money do you want to spend? If you don't care, SAN is probably
    >the way to go.
    >
    >How much data do you have to store? If you can afford to fit it onto scsi,
    >scsi probably is still the way to go.
    >
    >SATA arrays have come a long way in 3 years, and they are by FAR the
    >cheapest solution out there. Do some research and see if they're good
    >enough for you.
    >
    >On Tue, 30 May 2006, Riccardo Inverni wrote:
    >
    >>Hi guys,
    >>
    >>I have to update a Linux box with PostgreSQL on it, essentially for data
    >>warehousing purposes. I had set it up about 3 years ago and at that time
    >>the best solution I had been recommended was to use SCSI disks with hardware
    >>RAID controllers.
    >>
    >>Is this still the way to go or things have recently changed? Any other
    >>suggestion/advice? What about SAN?
    >>
    >>Thanks.
    >>
    >>Cheers,
    >>Riccardo
  • No.6 | | 1014 bytes | |

    Hello,

    3 weeks ago I installed a PostgreSQL 8.1.3 server on Windows 2003 Server Standard Edition.
    The box is a NEC express5800 TM800 with 4 SATA/300 250 GB 7200 rpm drives in RAID 10 (0+1).
    It works fine, faster than the old server ALTS with SCSI-3 disks Ultra160 10000 rpm.
    I upgraded to 8.1.4.
    The embedded RAID controller is based on an Intel chipset.
    SATA seems convenient.

    Luc
    ----- Original Message -----
    From: Riccardo Inverni
    To: pgsql-general (AT) postgresql (DOT) org
    Sent: Tuesday, May 30, 2006 11:28 PM
    Subject: [GENERAL] SCSI disk: still the way to go?

    Hi guys,

    I have to update a Linux box with PostgreSQL on it, essentially for data warehousing purposes. I had set it up about 3 years ago and at that time the best solution I had been recommended was to use SCSI disks with hardware RAID controllers.

    Is this still the way to go or things have recently changed? Any other suggestion/advice? What about SAN?

    Thanks.

    Cheers,
    Riccardo
  • No.7 | | 1432 bytes | |

    riccardo.inverni (AT) gmail (DOT) com ("Riccardo Inverni") writes:
    >I have to update a Linux box with PostgreSQL on it, essentially
    >for data warehousing purposes. I had set it up about 3 years ago and
    >at that time the best solution I had been recommended was to use
    >SCSI disks with hardware RAID controllers.
    >
    >Is this still the way to go or things have recently changed? Any
    >other suggestion/advice? What about SAN?

    You're probably better off with SATA, now.

    SCSI disks may individually be faster and more reliable than SATA
    disks, but you can probably get 3x as many SATA disks for the price of
    the SCSI disks, and 3x more *probably* makes up for the deficiencies,
    given a good SATA host adapter. (Areca and 3Ware are both well regarded.)

    SAN doesn't change the question; you'll still hold much the same
    debate, whether to compose the SAN of SCSI or SATA disk, and the
    answers will be similar.

    The challenge you'll see on Linux is that Very Large Filesystems are
    somewhat novel.

    When we were trying to do DW stuff on Linux + + FibreChannel +
    EMC DiskArray, we too frequently found filesystems keeling over. It
    was neither cheap nor reliable.

    At some point, I want to try FreeBSD++Areca+SATA Array, and see
    if that gives a better answer for this. I'm afraid I don't trust
    Linux for this sort of thing anymore :-(.
  • No.8 | | 480 bytes | |


    >When we were trying to do DW stuff on Linux + + FibreChannel +
    >EMC DiskArray, we too frequently found filesystems keeling over. It
    >was neither cheap nor reliable.

    When is When? Not trying to start a flame war but I am curious as to
    your specifications to make this stuff work. Was it kernel 2.4 or 2.6?
    If 2.6, which? What filesystem are we talking about?

    Are we talking the last 12 months? Earlier than that?

    Sincerely,

    Joshua D. Drake
  • No.9 | | 2140 bytes | |

    jd (AT) commandprompt (DOT) com ("Joshua D. Drake") writes:
    >>When we were trying to do DW stuff on Linux + + FibreChannel +
    >>EMC DiskArray, we too frequently found filesystems keeling over. It
    >>was neither cheap nor reliable.
    >
    >When is When? Not trying to start a flame war but I am curious as to
    >your specifications to make this stuff work. Was it kernel 2.4 or 2.6?
    >If 2.6, which? What filesystem are we talking about?
    >
    >Are we talking the last 12 months? Earlier than that?

    The project ended last fall, roughly speaking.

    And that was with kernel 2.6; 2.4 was a complete non-starter as far as
    was concerned.

    If memory serves, 2.6.13 was about the best option, but it turned out
    to be pretty easy to toast filesystems.

    When Josh presented last year at SCN, he had
    a "sidebar" where he discussed the contortions of kernel versioning he
    had to go through in order to get PostgreSQL to play reasonably well
    with + Disk Array; it seemed quite similar to our experience,
    particularly in that he had to pick very specific kernel versions in
    order to get a modicum of stability.

    The trouble seems to be that what with the vast amounts of hacking
    Gitting into the Linux kernel, somewhere in between [FibreChannel
    Drivers | SCSI processing layer | VFS | FileSystems], things aren't
    anywhere near completely stable on AMD64.

    There's not one place to pin down: it's somewhere in the interfacing
    between all of these "layers."

    If you take out any of the "exotic" parts, things get better:
    - introduces 64 bittedness, and changes memory addressing over
    "plain old Intel."
    - "Everyone" runs ATA, so funky FibreChannel is exotic enough that it
    doesn't get used enough to get easily debugged.

    But for real high performance, you *want* 64 bits, and FibreChannel
    interfaces. And Linux just isn't ready for that. Nor is *BSD, I
    expect, for that matter, but they're more straightforward about
    documenting what *isn't* expected to work.
  • No.10 | | 1389 bytes | |

    On 5/31/06, Joshua D. Drake <jd (AT) commandprompt (DOT) com> wrote:
    >When we were trying to do DW stuff on Linux + + FibreChannel +
    >EMC DiskArray, we too frequently found filesystems keeling over. It
    >was neither cheap nor reliable.

    I completely agree with the above statements, with a small objection.
    At our place we tried out a $70k SAN from a major vendor and hooked up
    all the 2Gb fibre cables only to find out the box could only do around
    50MB/sec in real-world bonnie++/dd tests. It was a huge mess: the
    performance team from the vendor couldn't do anything about it except
    to try and upsell us to the $200k product. All the time we could
    never get hard numbers about what the box was supposed to do, etc.
    Meanwhile the sales reps were lecturing us about 'enterprise this,
    enterprise that'... barf.

    Now for the objection:
    I think you guys need to take a look at Xyratex, specifically their FC
    attached SAS enclosure. It is dual 4Gb FC and can hook up SAS or SATA
    drives. Best of all, it's cost competitive with attached SCSI for
    total system cost. The flexibility of being able to hook up SATA or SAS
    is great.

    merlin

    (end of broadcast)
    TIP 9: In versions below 8.0, the planner will ignore your desire to
    choose an index scan if your joining column's datatypes do not
    match
  • No.11 | | 863 bytes | |

    mmoncure (AT) gmail (DOT) com ("Merlin Moncure") writes:
    >Xyratex

    From their web site, they sound like they'll be as challenging to get
    straight answers from as any of the other disk array vendors :-(.

    And there's nothing about what I see there that seems to address
    anything at all about the "instability hiding in there somewhere"
    problem. I see no reason at all for a Xyratex FC array to be the
    slightest bit more stable than any other vendor's product.

    The only reason I'd be interested is if I knew that their products
    were priced at some fraction substantially lower than 1/1 of the
    competing products from EMC, IBM, and such. And it seems pretty clear
    that that would involve the usual irritating sets of vendor visits and
    negotiations, which amount to, "No, it's not gonna be cheap."
  • No.12 | | 222 bytes | |

    Hi Alex,
    thanks for the answer (thanks to the other guys too!).
    >SATA - ~$320
    Is there a particular reason why you chose a SATA-150 drive? What about
    SATA-300?
    Cheers,
    Riccardo
  • No.13 | | 927 bytes | |

    On 5/31/06, Chris Browne <cbbrowne (AT) acm (DOT) org> wrote:
    >mmoncure (AT) gmail (DOT) com ("Merlin Moncure") writes:
    >>Xyratex
    >
    >From their web site, they sound like they'll be as challenging to get
    >straight answers from as any of the other disk array vendors :-(.

    Valid concerns. I don't have an answer yet except to say that it is
    price competitive with attached SCSI and much (much) cheaper than the
    major SAN vendors. Let's put it this way: we were quoted a price
    about half what a major SAN vendor charges for their 2Gbit FC product
    with less cache. Also, at 16 drives in 3U of space it's about as dense
    as storage gets.

    They were willing (through their retailer) to set us up with a 30 day
    trial on the box. results to follow.

    merlin

  • No.14 | | 436 bytes | |

    Maximum throughput of a single drive is around 80MB/second; a 300MB/sec
    interface won't change that.
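
The point can be put as a one-line model: sustained throughput is capped by the slower of the drive and its link. A minimal Python sketch, assuming the 80MB/sec drive figure above and nominal 150MB/sec and 300MB/sec link rates (the function name is hypothetical, and burst transfers from the drive's cache are ignored):

```python
def effective_throughput(drive_mb_s, interface_mb_s):
    """Sustained rate is capped by the slower of the drive and its interface."""
    return min(drive_mb_s, interface_mb_s)


DRIVE_MB_S = 80  # single-drive figure quoted in the post

sata_150 = effective_throughput(DRIVE_MB_S, 150)
sata_300 = effective_throughput(DRIVE_MB_S, 300)
print(sata_150, sata_300)
```

Both cases come out at the drive's own rate, so under this model the faster interface only matters once a single device can outrun the slower link.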

    Alex

    On 6/1/06, Riccardo Inverni <riccardo.inverni (AT) gmail (DOT) com> wrote:

    >Hi Alex,
    >
    >thanks for the answer (thanks to the other guys too!).
    >
    >>SATA - ~$320
    >
    >Is there a particular reason why you chose a SATA-150 drive? What about
    >SATA-300?
    >
    >Cheers,
    >Riccardo
