[TriLUG] SAN file locking

Joseph Mack NA3T jmack at wm7d.net
Tue Dec 20 12:57:41 EST 2011


On Mon, 19 Dec 2011, bak wrote:

> hosts still pretend that SANs give them a bunch of 
> contiguous blocks on physical disks instead of a 
> complicated metadisk abstraction of 
> maybe-allocated-or-deduplicated blocks which is too 
> complicated to explain just now. :)
>
> So as far as I know, nobody has taken the step of putting 
> metadata on another disk,

I think Google keeps their metadata on separate disks so it 
can be searched faster. I'm not sure about the details.

> because it would require rethinking the way the underlying 
> stuff works and it's not entirely clear what problem would 
> be solved.
>
> What might be more likely is that a SAN will say "this set 
> of blocks is getting asked for an awful lot, I'm going to 
> keep it in cache / on an SSD until further notice."

yes.


>>> Some operating systems are OK with having a read-only 
>>> filesystem attached. But solutions like this
>>
>> you mean (ro) solutions?
>
> Yup. As opposed to clustered filesystems I suppose :)
>
>>> for the SAN space are not there, because the problem to 
>>> be solved would have to be
>>>
>>> -- Useful even with a read-only filesystem
>>> -- Requiring the sort of low-latency performance SAN provides
>>> -- Not more cheaply and easily deployed with a r/o NFS export
>>
>> I'm sorry. I don't know what you're trying to say here. I 
>> don't even get enough to ask you a question about it. Can 
>> you try again?
>
> Sure. Let me go back a step. Nobody buys SAN equipment as 
> general storage. It's just too expensive to be a hammer 
> for every nail. But it works well for the things mentioned 
> earlier -- VMWare, OLTP, etc.

OK

> So a SAN presenting blocks read-only to a bunch of hosts in order to
> solve a problem like 'let me share homedirs or /usr for a big stack of
> servers' would only be likely if the homedirs needed to be extremely
> fast and low latency, so fast that just deploying a (much cheaper) NAS
> exporting via NFS wouldn't do the job, and if you had a situation where
> not being able to write to homedirs or /usr was acceptable to the
> servers' OS.

I was thinking of only /usr being (ro) while /home stays (rw) 
(see the scenario at the end of this posting).
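Something like this over NFS, just as a sketch (the server 
name "nasbox", the export paths and the subnet are invented):

   # /etc/exports on the NAS: /usr shared read-only, /home writable
   /export/usr    192.168.1.0/24(ro,no_subtree_check)
   /export/home   192.168.1.0/24(rw,no_subtree_check)

   # /etc/fstab on each client
   nasbox:/export/usr    /usr    nfs   ro   0 0
   nasbox:/export/home   /home   nfs   rw   0 0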

> The SAN way to solve the 'only pay once for /usr, get it for all the
> servers' would be to have the servers as VMWare guests, and have all of
> their root disks on the same storage container on the SAN, then use
> dedupe to squeeze the data down to just one on-disk instance of those
> blocks in /usr.
>
> --bak

At my work (beowulfs, large compute clusters) the machines 
and the disks are all one unit (DAS). However I've always 
thought that the storage and the computing should be 
separate. You upgrade racks on a staggered 3 yr schedule, 
eventually replacing the whole machine every 3 yrs. Disks 
get added when you need more storage, and old small disks 
get replaced with bigger disks. Users' jobs run anywhere on 
the cluster, and applications and home directories get 
mounted as needed (sketched below). Speed to/from storage 
isn't a particular requirement, as most of the time is spent 
computing.
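The "mounted as needed" part could be done with autofs. A 
minimal sketch, again assuming a NAS called "nasbox" 
exporting the home directories (names invented):

   # /etc/auto.master on each compute node
   /home   /etc/auto.home

   # /etc/auto.home: mount each user's homedir on first access
   *   -fstype=nfs,rw   nasbox:/export/home/&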

I assume that a SAN is out in this situation because of the 
cost and because the speed isn't needed. In this case a big 
NAS box would be fine.

Joe

-- 
Joseph Mack NA3T EME(B,D), FM05lw North Carolina
jmack (at) wm7d (dot) net - azimuthal equidistant map
generator at http://www.wm7d.net/azproj.shtml
Homepage http://www.austintek.com/ It's GNU/Linux!
