[Trilug-ontopic] [bash] copying `find`ed files across network

Thomas Gardner tmg at pobox.com
Thu May 24 08:32:47 EDT 2012

On 5/23/12, Tom Roche <Tom_Roche at pobox.com> wrote:
> [...]
> Worth checking:
> me at local:~$ ssh t
> ...
> me at remote:~ $ pushd /work/MOD3EVAL/nsu/boundary/BLD_ddm_saprc07tc
> me at remote:/work/MOD3EVAL/nsu/boundary/BLD_ddm_saprc07tc $ find . -type f |
> grep -ve 'CCTM\|CVS\|~$\|\.o$' | wc -l
>> 266
> me at remote:/work/MOD3EVAL/nsu/boundary/BLD_ddm_saprc07tc $ tar cfvz - $(find
> . -type f | grep -ve 'CCTM\|CVS\|~$\|\.o$') > /tmp/tar.gz
> ...
>> ./s_emis_defn.mod
> me at remote:/work/MOD3EVAL/nsu/boundary/BLD_ddm_saprc07tc $ tar tfz
> /tmp/tar.gz | wc -l
>> 266
> So it appears the problem is not with the `find`, or with `tar` args,
> but with `ssh` or the network pipe ... or am I missing something?

No, you've changed the problem by breaking it into steps that all run
on the remote machine.  The problem is not with find, tar, or ssh, but
with how you quoted the original command.  Doing:

$ ssh remote
$ cmd1 args $(cmd2 args)

is not the same as:

$ ssh remote "cmd1 args $(cmd2 args)"

In the second case, that $(cmd2 args) thing will run before the ssh
command is run, and therefore still on your local machine.  To prove
this to yourself, try running this command (don't change the quoting):

$ ssh remote "echo The name of this machine is $(hostname)."

and look at what you get.  Do you get the remote machine's name or
the name of the machine you issued the command on?  You should get
the name of the machine you issued the command on.  That happens for
the same reason that doing:

$ ssh remote "cd whatever ; tar cfvz - $(find yadda)"

doesn't work right.  The $(find yadda) thing gets expanded by your
shell as part of the pre-execution evaluation of the command line.
In other words, it is expanded (in the case of the $(...) construct,
the command is run) before a process is even forked off to exec ssh
in.  If that's true, then where does that command inside the $(...)
have to run?  Since ssh hasn't even started up yet at this point,
it can only be running on your local machine.
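
If you want to see exactly what ssh would be handed, here is a minimal
sketch that needs no network.  show_argv is a hypothetical helper I made
up to stand in for ssh; it just prints each argument it receives.  The
inner "echo file1 file2" is likewise only a stand-in for the find.

```shell
#!/bin/sh
# show_argv is a hypothetical stand-in for ssh: it prints every argument
# it receives, one per line, wrapped in brackets.
show_argv() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# "echo file1 file2" stands in for the find.  Because of the double
# quotes, this shell runs it before show_argv is even started.
received=$(show_argv remote "cd whatever ; tar cfvz - $(echo file1 file2)")
printf '%s\n' "$received"
```

The second argument arrives with the $(...) already replaced by its
output, which is all the remote shell would ever get to see.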

Notice how if you just change the hostname sample above to:

$ ssh remote 'echo The name of this machine is $(hostname).'

it does work right (you now get the name of the remote machine).
That's because single quotes do protect the command line from
local expansion, so the $(...) thing gets passed to the remote
machine untouched and the shell over there is the one that
expands it.
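
The same double-versus-single-quote difference can be sketched on one
machine.  Here a child "sh -c" plays the part of the remote shell (that
substitution is my assumption, purely so the example runs without ssh),
and $$ is used because it names the shell that expands it.

```shell
#!/bin/sh
# A child shell stands in for the remote machine.  $$ expands to the
# process ID of whichever shell performs the expansion.
parent_pid=$$

double=$(sh -c "echo $$")   # expanded by this shell, before sh starts
single=$(sh -c 'echo $$')   # passed through literally; the child expands it

echo "this shell:    $parent_pid"
echo "double quotes: $double"
echo "single quotes: $single"
```

The double-quoted line reports this shell's own process ID; the
single-quoted line reports a different one, because the child did the
expanding.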

One more side note:  I think in the original problem, you were doing
something akin to:

$ ssh remote "cd somewhere ; tar cf - ." | tar xf -

If you keep doing that, one of these days it will almost certainly
cause you a great deal of grief.  The problem is:  What happens
if the directory you specified doesn't exist on the remote machine
(say you had a typo in it)?  The answer is that the cd will fail,
but the remote tar will still run, sending your home directory back
through the pipe to be expanded into your current directory by the
local tar.  If the directory you were sitting in when you ran that
command was already populated (i.e. you were trying to merge the two
directories), now it's going to be a bit of a pain to remove the stuff
you just plunked down into it from your $HOME without removing what
was there before.  Instead, substitute for that ; an && like so:

$ ssh remote "cd somewhere && tar cf - ." | tar xf -

This way, if the cd fails, the tar won't run on the remote machine,
and you'll be very grateful it didn't.
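
You can try the difference safely on your own machine.  This is only a
sketch: /no/such/dir-for-demo is a hypothetical path assumed not to
exist, and a harmless echo takes the place of the tar.

```shell
#!/bin/sh
# /no/such/dir-for-demo is a hypothetical path assumed not to exist.
# The echo stands in for the tar, so nothing is actually archived.

# With ';' the second command runs even though the cd failed.
with_semicolon=$(sh -c 'cd /no/such/dir-for-demo 2>/dev/null ; echo tar-would-run')

# With '&&' the second command is skipped and the failure status of
# the cd is what comes back.
status=0
with_and=$(sh -c 'cd /no/such/dir-for-demo 2>/dev/null && echo tar-would-run') || status=$?

echo "with ;  : '$with_semicolon'"
echo "with && : '$with_and' (exit status $status)"
```

The ; version still prints tar-would-run; the && version prints nothing
and hands back a non-zero exit status.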

I think you mentioned something about "school" in an earlier note,
so I'm going to guess you're just starting out.  If so, I'll
give you one more piece of advice:  If you're planning on using Unix
to any great extent in your career, some day, when you're on break
from school, sit down with a couple pots of coffee and spend a few
days really studying how the command line is interpreted (evaluated)
before the command is run, and also study quoting.  Looking back
at decades in this business, I'm just absolutely amazed at how few
people in the business have taken the time to come to even a surface
understanding of these very simple (not to be confused with easy)
concepts.  If you take that advice, you'll probably be able to build
a career on it.  Ask me how I know that.  :-)  It's really not all
that hard.  I don't have all the steps memorized, much less remember
exactly what order each step is done in (both of which are actually
very important), but even with my surface knowledge of it, I've
practically built a career around it.  I guess I should go back and
relearn it myself.  Where's that guy who was supposed to bring
me more coffee?

