[TriLUG] Time-outs on a journalized filesystem

Jon Carnes jonc at nc.rr.com
Thu Feb 12 11:03:49 EST 2004


On Thu, 2004-02-12 at 01:37, Ian Meyer wrote:
> Alright, I have an on-topic question here.
> 
> On some forums I am on, there are a number of irrational people that 
> enjoy little more than to belittle Linux. One of them posted what you 
> see below in an attempt to attack it.
> 
> I am just curious (a) if this is a widespread problem (which I doubt, 
> since I haven't heard about it) and (b) if it is, what's the ETA for a 
> patch?
> 
> Thanks
> Ian
> 
> The quote:
> "Here's how this works: if you have a disk with many writes and reads,
> the operating system has to queue these. If you have a big queue, or
> heavy load, the problem appears when you have writes trying to happen
> before reads. The programs needing to read will be stopped until the
> queue flushes. Or, if you don't necessarily have a big queue, the
> problem can appear from deadlock resulting from a process called
> "kjournald" waiting for a system call when there is a particularly long
> write request in the queue. So, all programs depending on the kjournald
> process begin to wait for kjournald to commit its writes, which never
> really happens, and all programs begin to cascade into an
> "uninterruptible sleep" state, and begin to stop. In theory, if you
> wait, your system can return, but this can be a long wait with
> significant downtime. None of us can wait, and the only solution is to
> reboot."
> 
What he's talking about can happen on a journaled file system running on
a single IDE drive. But the conditions to bring it about would drive any
such system into the dirt regardless of the OS.
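
If you want to check whether a box is actually in the state he
describes, one rough way is to look for processes stuck in
uninterruptible sleep (state "D" in /proc/<pid>/stat). Here is a minimal
Python sketch of that check. The function names are made up for
illustration, and it assumes a Linux /proc layout; it only reports what
is blocked at that instant, it does not prove kjournald is the cause.

```python
import os


def parse_stat_line(line):
    """Return (pid, command, state) from one /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split on the first "(" and the last ")".
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx].strip())
    comm = line[open_idx + 1:close_idx]
    state = line[close_idx + 1:].split()[0]
    return pid, comm, state


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, command) for every process currently in state D."""
    found = []
    if not os.path.isdir(proc_root):
        return found
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or its stat file was unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

A few processes showing up briefly in state D is normal on a busy disk.
It is a growing list that never drains that would match the cascade
described in the quote.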

The cure here would be to either use a SCSI disk subsystem across lots
of small disks (speeding up your write speeds by a log of 2), or to drop
back to a non-journaled file system like ext2 (though even the
non-journaled filesystem won't help that much - it just won't lock
up).

When I set up a heavy access server like a Web filter for a county school
system or a Mail router/scanner for a large organization, I still use
ext2 for the /var filesystems and use LVM to spread out the reads/writes
across multiple SCSI disks. As a consequence the servers I build are
rock solid and handle the huge loads with lots of breathing room for
growth.
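
To show why spreading the I/O across several disks helps, here is a
small Python sketch of how a striped volume maps a logical byte offset
onto its member disks. This assumes a striped layout (round-robin
chunks of a fixed stripe size), which is only one way LVM can be set
up; the function names, stripe size and disk count are hypothetical and
not taken from any real configuration.

```python
def locate(offset, stripe_size, n_disks):
    """Map a logical byte offset to (disk index, byte offset on that disk)."""
    stripe_index, within = divmod(offset, stripe_size)
    disk = stripe_index % n_disks  # stripes are dealt out round-robin
    disk_offset = (stripe_index // n_disks) * stripe_size + within
    return disk, disk_offset


def disks_touched(start, length, stripe_size, n_disks):
    """Return the sorted disk indexes that a single request lands on."""
    first = start // stripe_size
    last = (start + length - 1) // stripe_size
    return sorted({s % n_disks for s in range(first, last + 1)})


if __name__ == "__main__":
    stripe = 64 * 1024  # hypothetical 64 KiB stripe size
    # A 256 KiB write on a 4-disk stripe is split across all four disks.
    print(disks_touched(0, 256 * 1024, stripe, 4))
    # A small write stays on one disk, leaving the others free for reads.
    print(disks_touched(0, 100, stripe, 4))
```

The point is that a long write gets chopped into pieces that different
spindles service in parallel, so no single disk queue backs up the way
it does on a lone IDE drive.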

Special situations like servers under tremendous load require
specialized setups.  A generic - out of the box - setup will not work as
well.  That is true regardless of the OS.

BTW: I would be surprised if the journaled filesystem problem was
still a concern in the 2.6 kernel.

Jon Carnes



