[TriLUG] Web Site Indexing

erik at underhanded.org erik at underhanded.org
Tue Jan 4 20:20:12 EST 2005


On Wed, Jan 05, 2005 at 01:17:46AM +0000, erik at underhanded.org wrote:
> Well, I'm not sure this is exactly what you want, but it may make for a
> quicker job if you have to write it yourself (and know Perl).  I recently
> had to write a script to pull all the various style tags out of a large
> website, and it could probably be modified to pull out the contents of
> <title> tags and such.  I'll attach it here; feel free to mangle it and
> do whatever.
> 
> If you don't have direct access to the file structure, the following may
> help with getting a local copy:
> 
> wget -A htm,html -r -l 20 http://www.website.tld/
> 
> Hope it helps. ;)

For some reason GPG got screwed up and complained about a bad sig.  Blah.
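
The Perl script mentioned above is not reproduced in this message, so as a
rough illustration of the idea (walk the local copy that wget produced and
pull out each page's <title>), here is a minimal sketch in Python rather
than Perl.  It is not the original script.  The names TitleParser,
extract_title and index_titles are made up for this example, and it assumes
the mirrored pages can be read as UTF-8 (undecodable bytes are replaced).

```python
import os
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Only the first <title> is of interest.
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def extract_title(html_text):
    """Return the page title with whitespace collapsed, or None if absent."""
    parser = TitleParser()
    parser.feed(html_text)
    parser.close()
    title = " ".join("".join(parser.parts).split())
    return title or None


def index_titles(root):
    """Map each .htm/.html file under root to its title (hypothetical helper)."""
    titles = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith((".htm", ".html")):
                path = os.path.join(dirpath, name)
                # Assumption: treat files as UTF-8, replacing bad bytes.
                with open(path, encoding="utf-8", errors="replace") as fh:
                    titles[path] = extract_title(fh.read())
    return titles
```

By default a recursive wget run saves pages under a directory named after
the host, so after the command above something like
index_titles("www.website.tld") should return a dictionary of file paths and
titles that can be printed or written out as a simple index.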

