[TriLUG] Was: Awk question Now: Awk, Perl, SQL

Don Jerman djerman at pobox.com
Wed Aug 8 11:34:32 EDT 2007


On 8/8/07, Mark Freeze <mfreeze at gmail.com> wrote:
> Hi Jeremy,
> Thanks for the SQL code.  We are going to give it a go this afternoon
> to see how it improves performance.
>
> Deleting duplicates within the database is a separate step that we
> take before we summarize the packeted data. We delete the duplicate
> records and then summarize the remaining data.  The reason that we
> import the dups in the first place is because we also do some
> calculations with them before we delete them. (We are now doing these
> calculations in Perl before the SQL import.)
>
> Thanks to everyone for the ideas.  (Especially Robert who just seems
> interested in pointing out that we've written some bad code. Very
> helpful Robert, very helpful...)
>
> Also, I'd still be interested in seeing if anyone would like to throw
> us a bone with an awk script to test.
>
> Regards,
> Mark.
>
It's frequently faster to create a new table with the appropriate
constraints than to process deletes against an existing table. (YMMV;
I don't run MySQL.)

Consider benchmarking CREATE TABLE ... AS SELECT ... against your
delete routines. If it's faster, you can follow it by dropping the old
table, renaming the new one into place, and rebuilding the indexes.
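A minimal sketch of the rebuild-instead-of-delete idea, using SQLite as
a stand-in (the thread mentions MySQL, and the table/column names here
are invented for illustration, not from Mark's actual schema):

```python
# Demonstrate deduplicating by building a new table and swapping it in,
# rather than running DELETEs against the original table.
# "packets"/"acct"/"amount" are hypothetical names.
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE packets (acct TEXT, amount REAL)")
cur.executemany("INSERT INTO packets VALUES (?, ?)",
                [("A", 1.0), ("A", 1.0), ("B", 2.0), ("B", 3.0)])

# Build a new table that keeps one row per duplicate group...
cur.execute("""
    CREATE TABLE packets_dedup AS
    SELECT acct, amount
    FROM packets
    GROUP BY acct, amount
""")
# ...then drop the old table, rename the new one into place,
# and rebuild any indexes the old table carried.
cur.execute("DROP TABLE packets")
cur.execute("ALTER TABLE packets_dedup RENAME TO packets")
cur.execute("CREATE INDEX idx_packets_acct ON packets (acct)")

print(sorted(cur.execute("SELECT acct, amount FROM packets")))
# -> [('A', 1.0), ('B', 2.0), ('B', 3.0)]
```

Whether this beats an in-place DELETE depends on the engine, the
duplicate ratio, and the indexes involved, which is why benchmarking
both on your real data is the point.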
