[TriLUG] how to set up a cluster

Mon Sep 25 00:25:52 EDT 2006

Hi,

I am interested in setting up a Linux cluster.  Actually, VERY interested,
since my job depends on it.  :)

I have the following hardware:

Dell 2950 servers x 2, each with a PERC4 RAID controller and the Dell Remote
Access Controller (DRAC)
    8 GB RAM
    (2) x 74 GB for the OS
    (10) x 300 GB for storage
    Core 2 Duo processors (I think)
    LPE 11000 HBAs x 2 per machine

Switch: unknown

Fiber connections

OS:  Red Hat Enterprise Linux AS 4, EM64T - latest update (3 I think)

SAN:  StorageTek D280 (volumes/LUNs provided for test, not the whole
SAN)

enterprise app:  Perforce

The goal is to cluster the application so that there is no possibility of
downtime.  This is to be a dev environment that will have quite a lot of
users.  The organization has already used Perforce and likes that, just
wants to migrate to a Linux/SAN/GFS environment.

I have worked with clustering and also a lot with Linux, but not together,
so that is my challenge.  I am wondering how things like LVM and NFS come
into play with GFS once it is all up and running.  It is a given that
Samba will probably have to be running on there at some point, and I am not
sure how that plays into the mix.

Also, I am worried about block size.  Perforce is a CVS-type database that
stores code as flat (tiny) text files and, for text, stores only the updates.
Great for text storage.  However, this is a multimedia company, and much of
the data may be full multimedia files (JPEG, video, game stuff, music, you
name it).  So if a developer submits 10 edits to a C++ source file, Perforce
stores only the changes and not the whole text file each time.  However
(how's this for contrast?), if a developer edits a video clip and stores it
ten times, Perforce saves ten separate copies, each at least as big as the
first.  So although I hear that I should avoid the 64k block size, I don't
know what to use instead, realistically.  If anyone has specifically
grappled with this, I would love to know more about how and why you decided
whatever you did for your situation, and whether GFS handled it alone or
whether you also had to tell the application to use a certain block size.
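
To put rough numbers on the block-size worry, here is a minimal Python
sketch of how much space is lost to partially filled blocks for a mix of
tiny text files and large media files.  Everything in it is an assumption
for illustration only: the file sizes, the file counts, and the two
candidate block sizes are made up, and it ignores filesystem metadata,
compression, and anything GFS or Perforce may do internally.  It is not a
statement about which block sizes GFS accepts.

```python
def allocated_bytes(file_size: int, block_size: int) -> int:
    """Bytes consumed when a file is stored in whole blocks (round up)."""
    blocks = (file_size + block_size - 1) // block_size
    return blocks * block_size


def slack_fraction(file_sizes, block_size: int) -> float:
    """Fraction of allocated space that holds no file data."""
    used = sum(file_sizes)
    allocated = sum(allocated_bytes(size, block_size) for size in file_sizes)
    return (allocated - used) / allocated if allocated else 0.0


if __name__ == "__main__":
    # Hypothetical workload, NOT measured from a real Perforce depot.
    small_text = [2_000] * 1000       # 1000 text files of about 2 KB each
    big_media = [50_000_000] * 10     # 10 stored copies of a 50 MB clip

    for block_size in (4096, 65536):  # illustrative candidates only
        print(
            f"block size {block_size:>6}: "
            f"text slack {slack_fraction(small_text, block_size):.1%}, "
            f"media slack {slack_fraction(big_media, block_size):.4%}"
        )
```

Under these assumptions the large files waste almost nothing at either
block size, while the small files waste most of a large block, so the
answer depends mainly on how many tiny files the depot will hold relative
to the media data.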

Does anyone know of a good starting point, best practices, HOWTOs, etc.?  I
am reviewing Karl Kopper's book 'The Linux Enterprise Cluster'.  I have to
get this cranked out NOW.  I really need some sort of guidelines or outline,
since I need to set this up as a project.  Any information is GREATLY
appreciated.

Thanks
Marc