[TriLUG] Mirror remote website

Tanner Lovelace clubjuggler at gmail.com
Wed Nov 16 22:55:23 EST 2005


On 11/16/05, Douglas Ward <binaryflow at gmail.com> wrote:
> Wget returns an error that says something to the effect of "You cannot get a
> directory listing from the server." I suppose that means the server I am
> trying to mirror is set up to prevent mirroring?

No, it's just that by default the server doesn't generate directory
indexes.  Wget and similar tools, however, are really web crawlers, so
what you need to do is give wget the site's main page and let it crawl
from there.  The option for this in wget is "-m" (for mirror).  Add "-np"
(no parent) if the pages link to anything above the starting URL that
you don't want to mirror.
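
As a rough sketch of that, assuming a made-up starting page (the
example.com URL below is only a placeholder, not the site from this
thread, and the "mirror" function name is just for illustration):

```shell
# Placeholder start page; substitute the real main page of the site.
url="http://www.example.com/docs/index.html"

# -m  is short for --mirror    (recursive download with timestamping)
# -np is short for --no-parent (never climb above the starting directory)
mirror() {
    # Dry run: this only prints the command.
    # Remove "echo" to actually start the download.
    echo wget -m -np "$1"
}

mirror "$url"
```

With "echo" removed, wget should save what it crawls under a directory
named after the host, here www.example.com/.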

Realize, of course, that if the website is built with something like PHP
or CGI scripts, all you will get is the resulting HTML output, not the
source of the files.  If you need the original source, there really
isn't a good way to get it over HTTP.

Cheers,
Tanner
--
Tanner Lovelace
clubjuggler at gmail dot com
http://wtl.wayfarer.org/
(fieldless) In fess two roundels in pale, a billet fesswise and an
increscent, all sable.


