[TriLUG] Tracking File Downloads
Ken MacKenzie
ken at mack-z.com
Sun Jan 25 10:43:25 EST 2015
My site architecture is:
CentOS 6.4 (I think)
nginx
mysql
Drupal 7 (installed from repos)
This site serves a podcast, which is kind of a new endeavor for me. In the
process I have discovered that tracking podcast users is rather difficult.
Podcatchers, of course, make Google Analytics rather useless, since they
fetch the files directly and never run its JavaScript.
So what I have set up is goaccess and a script that parses the nginx logs
and runs them through goaccess to build HTML reports that I can access
through Drupal.
But here is the other thing: I want to filter out failed downloads and
crawlers, as well as podcatchers that are just scanning for new episodes
but not downloading them. In theory this is as close as I can get to
determining "listens".
So my grep string:
grep '\.mp3' combined.log | grep '" 200 ' | grep -v '"HEAD ' | goaccess -a > report.html
OK, that is a pseudo version.
Does that seem sensible? I eliminated HEAD requests from the report because
I noticed the podcatchers use them to confirm the file is present when
parsing the feed.
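One way to make that filter less prone to false matches is to test the
individual log fields rather than grepping the whole line. A minimal sketch,
assuming nginx's default "combined" log format and no spaces in the
remote-user field; the function name filter_mp3_downloads is made up for
illustration:

```shell
# Hypothetical helper: keep only successful GET requests for .mp3 files.
# Assumes the "combined" format, where splitting on whitespace gives:
#   $6 = "METHOD (with its leading quote), $7 = request path, $9 = status
filter_mp3_downloads() {
    awk '$6 == "\"GET" && $7 ~ /\.mp3(\?|$)/ && $9 == 200'
}
```

It would slot in ahead of goaccess the same way the greps do, for example
filter_mp3_downloads < combined.log | goaccess -a > report.html. Note that
it only keeps status 200, so ranged downloads answered with 206 are left
out here too.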
Thoughts?