[TriLUG] Awk question

Steve Litt slitt at troubleshooters.com
Tue Aug 7 23:39:00 EDT 2007


On Tuesday 07 August 2007 17:00, Mark Freeze wrote:
> >When you talk about fast, I assume you mean "fast to
> >implement". You don't really care how fast something like
> >this runs do you?
>
> Actually I do.  We run this file every day for one of our clients.  We
> originally wrote the script in Perl that imported the records into a
> MySQL database, then executed stored procedures to eliminate
> duplicates and summarize the packeted information.

My impression is that properly written awk runs very fast -- maybe a little 
faster than Perl.

Awk also develops very fast for problems of the type you're solving. See my 
awk content here:

http://www.troubleshooters.com/codecorn/awk/index.htm

and here

http://www.troubleshooters.com/lpm/200704/200704.htm#_An_awk_Primer

Back to your specific problem: if your fields are *always* separated by 
SpaceDashSpace (" - "), I'd pipe the file through sort into an awk program 
that summarizes using simple control-break logic (emit a summary line each 
time the key changes). I'd imagine you're talking about a ten or twenty line 
program at the most.
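
A minimal sketch of the sort-then-break idea, not a definitive 
implementation. The record layout is an assumption, since the thread doesn't 
show the real data: each line is taken to be "key - amount", and the summary 
wanted is a record count and a total per key. The function name summarize is 
hypothetical.

```shell
# Assumed input: one record per line, "key - amount" (hypothetical layout).
# Output: "key - count - total", one line per distinct key.
summarize() {
    # LC_ALL=C makes sort compare raw bytes, so all records sharing a key
    # are guaranteed to end up adjacent (and the sort is usually faster).
    LC_ALL=C sort | awk -F' - ' '
        # Control break: the key changed, so emit the group just finished.
        NR > 1 && $1 != prev {
            print prev " - " count " - " total
            count = 0
            total = 0
        }
        # Accumulate the current record into the running group.
        { prev = $1; count++; total += $2 }
        # Flush the last group, unless the input was empty.
        END { if (NR > 0) print prev " - " count " - " total }
    '
}
```

Because the input arrives sorted, awk only holds one group's running totals 
at a time, so memory use stays flat however large the daily file gets.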

If the SpaceDashSpace isn't reliable, I'd first run the file through an 
initial awk program that accepts the variants (no space around the dash and 
the like) and outputs a reliable SpaceDashSpace, and then pipe that through 
sort and the summarization awk.
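
A rough sketch of such a cleanup pass, under a loud assumption: a dash never 
appears inside the field data itself (no hyphenated words, no negative 
numbers), so every dash can be treated as a separator. The function name 
normalize is hypothetical.

```shell
# Rewrite any dash, with or without surrounding spaces or tabs, as " - ".
# Assumption: dashes occur only as field separators, never inside a field.
normalize() {
    awk '{ gsub(/[ \t]*-[ \t]*/, " - "); print }'
}
```

It would sit at the front of the pipeline, ahead of sort and the summarizing 
awk program. If the real data can contain dashes inside fields, the pattern 
would need to be narrowed to whatever separator variants actually occur.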

HTH

SteveT


