[TriLUG] Disk space calculations in Linux
Scott G. Hall
ScottGHall at BellSouth.Net
Fri Jan 25 15:25:48 EST 2008
<div class="moz-text-flowed" style="font-family: -moz-fixed">
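Don't forget the '-s' option to ls -- it lists each file's allocated
size in blocks, similar to the way du does (use "ls -lsk" for 1K blocks).
With current GNU ls, '-k' on its own only sets the unit for '-s' and for
the per-directory "total" line; the size column of "ls -lk" stays in bytes.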
However, I noticed that for directories, 'ls -lk' lists them as 0 size,
but du correctly recognizes that directories are just special files
containing lists of other files, and take up space too -- you'll notice
that large directories need several blocks to hold a large list of files.
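
A rough way to see this for yourself, assuming GNU coreutils (stat -c,
mktemp) and a scratch directory you can write to. The file names and the
count of 2000 are arbitrary choices for illustration, and the numbers
depend on the filesystem: some report 0 blocks for a directory no matter
how many entries it holds.

```shell
# Create a scratch directory with many entries, then compare the
# directory's own apparent size with the space allocated to it.
tmp=$(mktemp -d)
mkdir "$tmp/big"
i=0
while [ "$i" -lt 2000 ]; do
    : > "$tmp/big/file_with_a_fairly_long_name_$i"
    i=$((i + 1))
done

# -d lists the directory itself rather than its contents;
# -s adds the allocated size, counted in 1K units because of -k.
ls -ldsk "$tmp/big"

# GNU stat: apparent size in bytes, number of allocated blocks, and the
# size in bytes of the unit those blocks are counted in.
stat -c 'size=%s blocks=%b unit=%B' "$tmp/big"
dir_blocks=$(stat -c '%b' "$tmp/big")

rm -rf "$tmp"
```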
But since space is allocated in disk allocation units (i.e. clusters for
some filesystem types), you can tweak the output of both ls and du by
giving them a block size: "-B nnnn" for du and "--block-size=nnnn" for
ls (where '-B' means --ignore-backups instead), where nnnn is the size
of a disk allocation unit for that filesystem type.
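
Here is a minimal sketch of that, assuming GNU coreutils. It asks the
filesystem for its block size with "stat -f" rather than hard-coding
nnnn; the one-byte test file is just a made-up example.

```shell
tmp=$(mktemp -d)
printf 'x' > "$tmp/one_byte.txt"

# Ask the filesystem holding the scratch directory for its block size.
# %S is the fundamental block size in GNU stat's file-system (-f) mode.
bs=$(stat -f -c '%S' "$tmp")
echo "block size: $bs"

# Report allocated space counted in units of that block size.
du_out=$(du -B "$bs" "$tmp/one_byte.txt")
ls_out=$(ls -s --block-size="$bs" "$tmp/one_byte.txt")
echo "du: $du_out"
echo "ls: $ls_out"

# Keep just the leading counts so the two tools can be compared.
du_units=$(printf '%s\n' "$du_out" | cut -f1)
ls_units=$(printf '%s\n' "$ls_out" | awk '{print $1}')

rm -rf "$tmp"
```

Both counts come from the same allocated-blocks figure, so they should
agree; whether a one-byte file shows up as 1 unit or 0 depends on the
filesystem.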
Note that even with the du command and -B option, you still won't match
the output of the df command. Remember that the overhead on a disk
includes swap space partitions, space set aside for alternate sectors
to replace failing sectors, and the partition table and boot loader.
And also remember that some filesystem types support "sparse" files,
which contain no data in big chunks of the file; space is conserved by
not allocating blocks for those stretches of the file that have no data.
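
The sparse-file effect is easy to reproduce. This is only a sketch,
assuming GNU coreutils (truncate, du --apparent-size); the 10M size and
the file name are arbitrary. On a filesystem without sparse-file
support the two numbers will simply come out the same.

```shell
tmp=$(mktemp -d)

# truncate extends the file to 10 MiB without writing any data, which
# leaves one big hole on filesystems that support sparse files.
truncate -s 10M "$tmp/sparse.img"

# Apparent size is what ls -l reports; plain du reports allocated space.
apparent=$(du -B1 --apparent-size "$tmp/sparse.img" | cut -f1)
allocated=$(du -B1 "$tmp/sparse.img" | cut -f1)
echo "apparent bytes:  $apparent"
echo "allocated bytes: $allocated"

rm -rf "$tmp"
```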
Now mix that with persistent file-sharing data and transaction logging
of filesystem writes, and disk usage starts to become a little
convoluted, to say the least.
Jeremy Portzer wrote:
> William Sutton wrote:
>> While we're discussing... how much space gets wasted in overhead of
>> files that allocate a particular block size but don't use all of the
>> blocks?
>
> I think you mean, files that don't use all the bytes in a block.
>
> That is an important difference - du - disk usage - will list the
> actual disk usage. The output of du will always be in increments of
> the file system block size (I'm not quite sure exactly how this is
> determined, but in most of my ext3 filesystems, this unit seems to be
> 4096 bytes, determined by running "dumpe2fs" - there may be a simpler way
> to show this).
>
> For example, the following five files have these sizes shown by ls:
>
> $ ls -l dump*
> -rw-r--r-- 1 root root 0 Jan 24 20:40 dump0.txt
> -rw-r--r-- 1 root root 1 Jan 24 20:40 dump1.txt
> -rw-r--r-- 1 root root 120176 Jan 24 20:34 dump-hda6.txt
> -rw-r--r-- 1 root root 77492 Jan 24 20:34 dump-hdc1.txt
> -rw-r--r-- 1 root root 40910 Jan 24 20:33 dump.txt
>
> But du shows this:
>
> $ du -b dump*
> 0 dump0.txt
> 4096 dump1.txt
> 126976 dump-hda6.txt
> 81920 dump-hdc1.txt
> 40960 dump.txt
>
> Notice that a zero-byte file takes zero space on disk, but a 1-byte
> file takes 4096 bytes on disk, and all other files always use
> increments of 4096.
>
> For this reason, when you care about the actual space on disk, you
> should use "du" and not "ls". This difference normally doesn't amount
> to much, but it can if you have lots of very small files.
>
> Not sure if this answers a question anyone asked. :-)
>
> --Jeremy
>
>
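
To put a number on William's question, the rounding can be worked out
with plain shell arithmetic. This sketch takes the 4096-byte block size
and the file sizes from Jeremy's listing; round_up is a made-up helper
name. Note that with current GNU du, '-b' implies --apparent-size, so
"du -B1" is the form that prints allocated bytes like the output above.

```shell
block=4096   # block size reported in the quoted message

# Round a byte count up to the next multiple of the block size.
round_up() {
    echo $(( ($1 + block - 1) / block * block ))
}

# File sizes taken from the "ls -l dump*" listing above.
for size in 0 1 40910 77492 120176; do
    alloc=$(round_up "$size")
    echo "$size bytes -> $alloc allocated, $((alloc - size)) unused"
done
```

For the three smallest files this matches du exactly. The two larger
ones come out one block lower than du reported (77824 vs. 81920, and
122880 vs. 126976). That would be consistent with ext3 charging an
indirect block to files that span more than 12 blocks, though nothing
in the thread confirms it.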
</div>
--
Scott G. Hall
Raleigh, NC, USA
ScottGHall at BellSouth.Net