Web lists-archives.com

Re: help with query to count rows while excluding certain rows




On Fri, Jan 1, 2016 at 2:12 PM, Peter Brawley
<peter.brawley@xxxxxxxxxxxxx> wrote:
> On 12/31/2015 0:51, Larry Martell wrote:
>>
>> I need to count the number of rows in a table that are grouped by a
>> list of columns, but I also need to exclude rows that have more then
>> some count when grouped by a different set of columns. Conceptually,
>> this is not hard, but I am having trouble doing this efficiently.
>>
>> My first counting query would be this:
>>
>> SELECT count(*)
>> FROM cst_rollup
>> GROUP BY target_name_id, ep, roiname, recipe_process,
>> recipe_product, recipe_layer, f_tag_bottom,
>> measname, recipe_id
>>
>> But from this count I need to subtract the count of rows that have
>> more then 50 rows with a different grouping:
>>
>> SELECT count(*)
>> FROM cst_rollup
>> GROUP BY target_name_id, ep, wafer_id
>> HAVING count(*) >= 50
>>
>> As you can see, the second query has wafer_id, but the first query does
>> not.
>>
>> Currently I am doing this in python, and it's slow. In my current
>> implementation I have one query, and it selects the columns (i.e.
>> doesn't just count), and I have added wafer_id:
>>
>> SELECT target_name_id, ep, roiname, recipe_process,
>> recipe_product, recipe_layer, f_tag_bottom,
>> measname, recipe_id, wafer_id
>> FROM cst_rollup
>>
>> Then I go through the result set (which can be over 200k rows) and I
>> count the number of rows with matching (target_name_id, ep, wafer_id).
>> Then I go through the rows again and regroup them without wafer_id,
>> but skipping the rows that have more then 50 rows for that row's
>> (target_name_id, ep, wafer_id).
>>
>> Is this clear to everyone what I am trying to do?
>
>
> If I've understand this correctly, the resultset you wish to aggregate on is
> ...
>
> select target_name_id, ep, wafer_id
> from cst_rollup a
> left join (               -- exclude rows for which wafer_id count >= 50
>   select name_id, ep, wafer, count(*) n
>   from cst_rollup
>   group by target_name_id, ep, wafer_id
>   having n >= 50
> ) b using ( target_name_id, ep, wafer_id )
> where b.target_name is null ;
>
> If that's so, you could assemble that resultset in a temp table then run the
> desired aggregate query on it, or you could aggregate on it directly as a
> subquery.

That query gives:

ERROR 1137 (HY000): Can't reopen table: 'a'

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql