[TriLUG] Web Site Indexing
Lance A. Brown
lance at bearcircle.net
Tue Jan 4 17:04:28 EST 2005
Greetings,
A non-profit organization I volunteer time to is working towards migrating
their website to some kind of CMS platform to hopefully make it easier to
manage, etc. I've been asked if I can provide an inventory of the
material on their existing site to help them get a grip on the scale of the
task. They have several thousand pages currently.
From the requester: "The idea is that we would look at all the htm and
html files and grab the filename, title, keywords, and all the links" and
"... it would print out in export from something, looking like an excel
spread sheet."
I could write a tool to do this, but I don't really have the time. There
must be tools available to crawl a website and generate these kinds of
reports, but I'm not finding them. F/OSS is preferred, but I'm willing to
recommend a commercial solution if it'll do the job.
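To make the request concrete, here is a rough sketch of the per-page
report they describe, using only the Python standard library. It is not
a finished tool: it assumes the pages are already available as text (for
example from a local mirror of the site), and the names PageInventory,
inventory_row, and write_inventory are made up for illustration. The
output is CSV, which Excel can open.

```python
import csv
import io
from html.parser import HTMLParser


class PageInventory(HTMLParser):
    """Collects the title, meta keywords, and link targets of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.keywords = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            self.keywords = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def inventory_row(filename, html_text):
    """Return [filename, title, keywords, links] for one page."""
    parser = PageInventory()
    parser.feed(html_text)
    parser.close()
    return [filename, parser.title.strip(), parser.keywords,
            " ".join(parser.links)]


def write_inventory(pages, out):
    """Write one CSV row per (filename, html_text) pair to the file object out."""
    writer = csv.writer(out)
    writer.writerow(["filename", "title", "keywords", "links"])
    for filename, html_text in pages:
        writer.writerow(inventory_row(filename, html_text))
```

Feeding it every .htm and .html file found under a mirrored copy of the
site (say, via os.walk) would give one spreadsheet row per page. It does
not crawl over HTTP or resolve relative links, which is the part I'd
rather get from an existing tool.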
Can anyone offer a pointer?
Thanks,
--[Lance]
--
Celebrate The Circle: http://www.celebratethecircle.org/
Carolina Spirit Quest: http://www.carolinaspiritquest.org/
My LiveJournal: http://www.livejournal.com/users/labrown/
GPG Fingerprint: 409B A409 A38D 92BF 15D9 6EEE 9A82 F2AC 69AC 07B9