[TriLUG] rsync question

Smith, Brett bsmith at bloodhoundinc.com
Tue May 4 14:37:03 EDT 2004


CUT&PASTED for your Knowledge...
I take no credit for this one.
remember to use the ssh option. 



TITLE: Using rsync to mirror data between servers

Introduction
This LinuxAnswer describes how to mirror 2 systems using rsync over ssh.
I will only talk about a live server and a backup server where the backup
server will connect to the live server to pull the data that is to be backed
up.


Assumptions
1) You know how to open up a terminal and type a few basic commands.
2) You have a working ssh server and client installed. If not then see:
ftp://ftp.ca.openbsd.org/pub/OpenBS...ortable/INSTALL
3) You have private/public keys generated to allow passwordless logins to
the live server from the backup server. If not, see the LinuxAnswer "Public
key authentication with ssh". In relation to that howto, the backup server is
the client and the live server is the server.


Why would you want to?
There are many reasons so I'll just list a few:
1) Data transfer is fast as rsync only copies modified files
2) Running it over ssh encrypts the data transfer so it is more secure than
other methods


The real howto
1) Decide on the directories you need to back up on the live server. Assuming
it is a webserver, this may be "/home/httpd".
2) Decide on the options you want. The most common I would use are:
-a Archive mode. This is a combination of "-rlptgoD": it works recursively
and preserves file information such as modification times, permissions,
ownership and symlinks. See the man page for detailed info.
-v Increase the verbosity. This will let you see what is transferred
-z Compress data during the transfer, which helps on slow links
--delete-after Delete files on the backup server that no longer exist on the
live server, once the transfer has finished
-e ssh Most importantly, run the transfer over an ssh connection
A full list can be obtained from "man rsync".
3) Try a dry run on the backup server with "-n" to make sure any typos don't
totally screw your system. This will just show what would be done:
rsync -e ssh -avzn --delete-after user@liveserver:/home/httpd /home
4) If everything went as expected you can give it a go without -n
rsync -e ssh -avz --delete-after user@liveserver:/home/httpd /home
You should get the info about the files being transferred. Running it again
should be quicker as very little has probably changed.
5) That should be it, just try creating and deleting a few files and run
rsync to make sure the changes occur
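
Steps 3 and 4 differ only in the -n flag, so one way to avoid retyping the
command is a small helper that adds -n on request. This is only a sketch: the
function name build_mirror_cmd is made up for illustration, the paths are the
example ones from the steps above, and it prints the command instead of
running it.

```shell
#!/bin/bash
# Hypothetical helper: print the rsync command for a dry run or a real run.
# SRC and DEST are the example paths from the howto, not real hosts.
SRC='user@liveserver:/home/httpd'
DEST='/home'

build_mirror_cmd() {
    # $1 = "yes" for a dry run (adds -n), anything else for a real run
    local opts='-avz'
    if [ "$1" = "yes" ]; then
        opts='-avzn'
    fi
    echo "rsync -e ssh $opts --delete-after $SRC $DEST"
}

build_mirror_cmd yes   # the command from step 3
build_mirror_cmd no    # the command from step 4
```

Printing the command first makes it easy to eyeball the source and
destination before anything is deleted on the backup server.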


Automating the process
The obvious answer is to run the rsync commands on the backup server via cron.
A basic example is to mirror every hour on the hour (each cron entry must be
on a single line):
0 * * * * rsync -e ssh -avz user@liveserver:/home/httpd /home > /var/log/hourly_backup.log 2>&1
Then remove deleted files every night:
30 0 * * * rsync -e ssh --delete-after -avz user@liveserver:/home/httpd /home > /var/log/nightly_backup.log 2>&1
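
The order of the redirections in cron lines like these matters. The shell
processes them left to right, so "> file 2>&1" sends both normal output and
errors to the file, while "2>&1 > file" leaves the errors on the original
stdout (which cron normally mails to the crontab owner). A small
self-contained demonstration, using a stand-in function in place of rsync and
a temporary directory (both are assumptions for illustration):

```shell
#!/bin/sh
# Stand-in for rsync: one line on stdout, one on stderr.
noisy() {
    echo "file list"
    echo "connection error" >&2
}

tmp=$(mktemp -d)

# stdout goes to the file first, then stderr follows it: both are logged.
noisy > "$tmp/both.log" 2>&1

# stderr is pointed at the current stdout (the pipe) first, then stdout is
# moved to the file: only "file list" is logged, the error escapes the log.
noisy 2>&1 > "$tmp/stdout_only.log" | sed 's/^/not logged: /'

cat "$tmp/both.log"
cat "$tmp/stdout_only.log"
```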

Additional Stuff::
In my case, I used rsync to mirror the webserver tree /www/web_x to a second
server's /www/ using ssh.
The options for rsync are simple, but have to be adapted to the situation.

rsync -vrougt --stats --rsh='ssh' --exclude-from=./rsync.excludes \
  user@192.168.xxx.xxx:/www/web_x/* /www/web_x_mirror \
  > /var/log/rsync/rsync_log_$(date +%Y%m%d_%H%M)

In detail:

You run rsync from a cron job on the spare server.


-v means verbose; use it to get a detailed report in the log file
-r means recursive: descend into the subtrees of website_x
-u means update: skip any file that is newer on the receiving side. rsync
already transfers only files whose size or timestamp differ, which avoids
retransmitting everything on each sync and saves time and money (nerves,
too). For these comparisons to work properly, timestamps must be preserved
with -t; -o and -g keep the rest of the metadata identical.
-o preserve owner: keep the original owner of the files
-g preserve group: keep the original group of the files
-t preserve times: keep the modification timestamp
--stats adds a summary of transfer statistics to the log file
--rsh='ssh' means to use ssh as the transport (read extra configuration
here)
--exclude-from=./rsync.excludes names the file that lists which files should
not be synced. Important for config files, which differ from system to
system.
user@192.168.xxx.xxx is the user on the remote system and the remote
system's IP
: separates the remote IP from the remote path
/www/web_x/* are the files to sync
/www/web_x_mirror is the local path to sync to
> /var/log/rsync/rsync_log_$(date +%Y%m%d_%H%M) is the log file for the sync.
The $(date +%Y%m%d_%H%M) makes every log file unique by appending the year
(%Y), the month (%m), the day (%d), the hour (%H) and the minute (%M)
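
The timestamped log name can be checked on its own before it goes into a cron
job. One caveat: inside a crontab line an unescaped % is treated as a
newline, so each % has to be written as \% there (or the command moved into a
script). A minimal sketch; the variable names are only for illustration and
the directory is the one from the command above:

```shell
#!/bin/sh
LOGDIR=/var/log/rsync                  # directory from the example command
STAMP=$(date +%Y%m%d_%H%M)             # year, month, day, hour, minute
LOGFILE="$LOGDIR/rsync_log_$STAMP"

# Show the name a run started right now would log to.
echo "$LOGFILE"

# The same redirection written directly in a crontab entry needs escapes:
#   0 */6 * * * rsync ... > /var/log/rsync/rsync_log_$(date +\%Y\%m\%d_\%H\%M)
```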

This cronjob is run every 6 hours, to make a possible server-death not as
bad as it could be.
If you have a running sendmail-configuration on your server, you can use a
bash script to run rsync with above options, and, in case of emergency,
you'll get an email.

A bash-script doing the job could look like this:

======================================================
#!/bin/bash
#
# RSYNC proto-script to mirror filesystem from main-server to spare-server
# needs file(rsync.excludes) in same directory
# written by Jan Schenk 20020613
#
# Send one ping (5 second deadline) to check that the main server is reachable.
if ping -c 1 -w 5 192.168.0.1 > /dev/null ;
then
    # Server answered: mirror the site and write a timestamped log.
    rsync -vrougt --stats --rsh='ssh' --exclude-from=./rsync.excludes \
        johndoe@192.168.0.1:/wwwroot/site1/* /wwwroot/site1_mirror \
        > /var/log/rsync/rsync_log_$(date +%Y%m%d_%H%M)
    echo 'main server is up'
else
    # No answer: mail a warning. Double quotes let $(date ...) expand and
    # keep the apostrophe in "didn't" from ending the string.
    echo "rsync for $(date +%Y%m%d_%H%M) didn't work properly, please check back" \
        | sendmail -F rsync.cronjob@ -- johndoe@kc26.net
fi
======================================================
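
Note that the script above only sends mail when the ping fails; an rsync that
starts and then fails would go unnoticed. One possible variation, sketched
here under assumptions, is to test the exit status of the transfer itself.
The function name run_and_report is made up, and "true" and "false" stand in
for the real rsync command so the sketch can be tried without a server:

```shell
#!/bin/bash
# Hypothetical helper: run a command and report a nonzero exit status.
run_and_report() {
    if "$@"; then
        echo "ok: $1 finished"
    else
        status=$?   # exit status of the command that just failed
        # In a real script this line could be piped to sendmail instead.
        echo "FAILED: $1 exited with status $status at $(date +%Y%m%d_%H%M)"
        return "$status"
    fi
}

run_and_report true                                  # stands in for a good rsync
run_and_report false || echo "would send mail here"  # stands in for a failed one
```

With the real command this would read "run_and_report rsync ..." followed by
the same options as in the script.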



-----Original Message-----
From: Lee [mailto:elfick at trilug.org]
Sent: Tuesday, May 04, 2004 2:22 PM
To: Triangle Linux Users Group discussion list
Subject: Re: [TriLUG] rsync question


Roy Vestal wrote:

>Before we get started, yes I'm a *nix admin who has *never* used rsync. Now
>with that out of the way:
>
>I'd like to setup an rsync to copy a mirror for our internal site. I have
>already used wget to download the mirror to my internal server but I'd like
>to rsync it to stay up to date. I'd like to run it as a cron job.
>  
>
Roy,
A mirror of what? From where? For instance, let's say it is the gentoo 
source mirror on ibiblio. As I understand it the command to keep it up 
to date would be:
"rsync www.ibiblio.org::gentoo /path/to/your/mirror"
Note the double colon that indicates you are connecting to an rsync server.