[TriLUG] Any SQL experts out there?

Sat Feb 26 14:41:54 EST 2005

Thanks - I'll work on it. As it happens for thsi case I only need to run
the query a few times, so performance isn't a problem, but the point is
well taken.

ap

----------------------------------------------------------------------
Andrew J Perrin - http://www.unc.edu/~aperrin
Assistant Professor of Sociology, U of North Carolina, Chapel Hill
clists at perrin.socsci.unc.edu * andrew_perrin (at) unc.edu

On Fri, 25 Feb 2005, Tom Bryan wrote:

> On Friday 25 February 2005 07:46 pm, Andrew Perrin wrote:
> > Actually it does work slightly modified - here's the query that works for
> > my application:
> >
> > leted=# Select s.partid, s.zip,  c.coder, c.ca_date, c.tone_angry
> > leted-# from participants s,code_assignments c
> > leted-# where c.ca_partid  = s.partid
> > leted-# and c.ca_date = (select min(c2.ca_date) from code_assignments c2
> > where c.ca_partid=c2.ca_partid)
> > leted-# ;
>
> I just thought that I'd point out that your query has a nested subquery (the
> select on the right of that equals in your where clause).  I'm not sure how
> postgreSQL actually handles this query, but I know that Oracle executes this
> type of query by running the correlated subquery for each row in the
> code_assignments (c) table.  For large tables (millions of rows), performance
> of this type of query can really suck.
>
> I would normally try to replace the correlated subquery with a view or an
> inline view.  So, for example (untested code follows)
>
> select
>     s.partid, s.zip,  c.coder, c.ca_date, c.tone_angry
> from
>     participants s,
>     code_assignments c,
>     (select min(c2.ca_date) min_date, c2.ca_partid
>         from
>             code_assignments c2
>         group by
>             c2.ca_partid) d
> where d.ca_partid = c.ca_partid
> and c.ca_partid = s.partid
> and c.ca_date = d.min_date
>
> Since the nested select appears here in the from clause, it can be executed
> once instead of once per row in the other table.  I'm pretty sure that
> postgreSQL supports this type of subquery.  If not, you can create a view and
> use that view for the "d" table in this query.  In either case, a query
> structured in this way is much easier for the RDBMS to optimize.
>
> ---Tom
>