[TriLUG] copying files
Joseph Mack NA3T
jmack at wm7d.net
Tue Jun 19 22:15:42 EDT 2012
On Tue, 19 Jun 2012, Jeff Schornick wrote:
> On Tue, Jun 19, 2012 at 9:39 PM, Joseph Mack NA3T <jmack at wm7d.net> wrote:
>> I haven't used rsync. So after the initial phase, both ends know the files
>> at each end and when I add a new file at one end, rsync will notice and just
>> handle it?
>
> Not quite.
>
> On each synchronization run, rsync creates a local list from the
> source directory, while simultaneously creating the analogous list on
> the remote end. This means if you have 1000 files, you may be looking
> at 1000 fstats on each end. However, these checks are both done
> locally on the corresponding machines. As long as the target system's
> local file I/O isn't significantly slower than the source machine's,
> you shouldn't be introducing any additional delay.
>
> After both lists have been generated, rsync uses a minimal amount of
> network traffic to compare the lists and generate a final list of
> which files need to be updated. As expected, only those files are
> sent over the network.
>
> After the synchronization is complete, the generated lists get tossed
> out as dirty laundry. There is no long running daemon which attempts
> to keep them up-to-date in realtime. However, I imagine someone has
> created a slick piece of code using inotify to do just that.
OK, so I'd have to invoke rsync every 5 mins. Assembling the
list of files at each end has to be done anyhow (e.g. with `find`).
Presumably 1000 fstats take the same time no matter whether
find or rsync then processes the list. The problem then is
comparing the lists at each end.
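As a rough illustration of the comparison step, here is a sketch using
sort and comm. It is not how rsync does it: it compares path names only
and ignores size and mtime. The function name missing_on_dest and the
two list files are made up for the example; each list is assumed to hold
one relative path per line, e.g. the output of `find . -type f` run at
the top of each tree.

```shell
# missing_on_dest SRC_LIST DEST_LIST
# Print the paths that are in SRC_LIST but not in DEST_LIST.
# Assumes one path per line, so names containing newlines will break it.
missing_on_dest() (
    tmp=$(mktemp -d) || exit 1
    # comm needs both inputs sorted in the same collation order.
    LC_ALL=C sort "$1" > "$tmp/src" &&
    LC_ALL=C sort "$2" > "$tmp/dest" &&
    LC_ALL=C comm -23 "$tmp/src" "$tmp/dest"   # -23: keep lines unique to the first file
    status=$?
    rm -rf "$tmp"
    exit "$status"
)
```

The destination list still has to cross the network once per run, so
this only helps if the lists are small compared with the files.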
cp -auv is really slow
rsync you say is fast (and I believe you).
but I already have my list from `find`, so there's no extra
cost if I use find.
The copy of the files takes the same time no matter which
way I assembled the list of files to be copied.
So `find` followed by `cp --parents` or `cpio` seems to be
it.
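A minimal sketch of the `find` plus `cp --parents` approach, assuming
GNU find, xargs and cp, and assuming the destination is reachable as a
local path (e.g. a network mount). The function name copy_new and the
stamp file are hypothetical. Instead of comparing lists, it copies
whatever is newer than a marker file touched on the previous run.

```shell
# copy_new SRC DEST STAMP
# Copy files under SRC modified since STAMP was last touched into DEST,
# keeping relative paths. All three arguments must be absolute paths,
# and STAMP should live outside SRC.
copy_new() (
    src=$1 dest=$2 stamp=$3
    mkdir -p "$dest" || exit 1
    cd "$src" || exit 1
    # Record the scan start time first, so files that arrive mid-scan are
    # picked up on the next run (at worst copied twice) rather than missed.
    touch "$stamp.next" || exit 1
    if [ -e "$stamp" ]; then
        find . -type f -newer "$stamp" -print0
    else
        find . -type f -print0      # first run: no stamp yet, copy everything
    fi | xargs -0 -r cp --parents -p -t "$dest" || exit 1
    mv "$stamp.next" "$stamp"
)
```

It could be called as, say, `copy_new /data/outgoing /mnt/backup/host1
/var/tmp/host1.stamp` (made-up paths). Note that `-newer` is a strict
comparison, so on a filesystem with coarse timestamps a file written in
the same tick as the stamp can be skipped.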
Alan points out the resilience of rsync. This is a good
feature, but as it turns out (and I didn't say this earlier),
I don't mind losing an occasional file; throughput is the
high priority. The backup machine is writing files from many
sources and it only has a few seconds to service each source
machine, or it will fall over under the load.
Joe
--
Joseph Mack NA3T EME(B,D), FM05lw North Carolina
jmack (at) wm7d (dot) net - azimuthal equidistant map
generator at http://www.wm7d.net/azproj.shtml
Homepage http://www.austintek.com/ It's GNU/Linux!