Share via

Access query to return top 3 values within groups

Anonymous
2013-09-25T21:23:26+00:00

I'm using MS Access 2007. I'm trying to create a select query using the qbe to return the top 3 distributors for a list of customers based on the aggregate sales of each distributor (TP_HQ_CUST_NAME).  The below code brings back all distributors and a total of the sales but I only want to see the top 3. A customer may have anywhere from 1 to 6 distributores. I've looked at Allen Browne's explanation but it isn't making sense to me....sorry I'm new at this. I'm only using one table. 

I'm unable to post a jpg of my qbe but this is the sql that is generated:

SELECT

tbl_Local_Final_Reporting.CUST_ADDR_NUM,

Sum(tbl_Local_Final_Reporting.BASE_SLS_GP_DOL_CYTD)

AS

SumOfBASE_SLS_GP_DOL_CYTD,

tbl_Local_Final_Reporting.TP_HQ_CUST_NAME

FROM tbl_Local_Final_Reporting

GROUP BY

tbl_Local_Final_Reporting.CUST_ADDR_NUM,

tbl_Local_Final_Reporting.TP_HQ_CUST_NAME

ORDER BY

tbl_Local_Final_Reporting.CUST_ADDR_NUM,

Sum(tbl_Local_Final_Reporting.BASE_SLS_GP_DOL_CYTD) DESC;

so for instance, cust_addr 34135027 returns 5 distributors.  I only want to see the top 3.

cust_addr34133747 returns only 2 distributors and I want to see both.

Not sure what I'm doing wrong.  Thanks for the assistance!

Microsoft 365 and Office | Access | For home | Windows

Locked Question. This question was migrated from the Microsoft Support Community. You can vote on whether it's helpful, but you can't add comments or replies or follow the question.

0 comments No comments

Answer accepted by question author

Anonymous
2013-09-25T22:06:06+00:00

Rather than trying to amend the current query I'd suggest you create another query based on it.  If we assume your query is named qryTotalSales the second query would be like this:

SELECT Q1.*

FROM qryTotalSales AS Q1

WHERE

    (SELECT COUNT(*)+1

     FROM QryTotalSales AS Q2

     WHERE Q2.cust_addr_num = Q1.cust_addr_num

     AND Q2.sumofbase_sls_gp_dol_cytD > Q1.sumofbase_sls_gp_dol_cytd) <= 3

ORDER BY Q1.cust_addr_num, Q1.sumofbase_sls_gp_dol_cytd DESC;

Realistically you'll have to do this in SQL view, not query design view.

Was this answer helpful?

1 person found this answer helpful.
0 comments No comments

Answer accepted by question author

Anonymous
2013-09-26T17:41:58+00:00

Correlated subqueries are by nature rather slow, but 8000 rows is not a lot, so I'm surprised performance is so poor.  Proper indexing is important, particularly of columns used in joins or on which a query is restricted.  Also the ORDER BY clause in the first query is redundant and should be removed, and if the query is to be used as the basis for a report, the second query's ORDER BY clause should also be omitted.  A report ignores it and should be ordered by means of its internal sorting and grouping mechanism.

If a report is the end result, you could in fact revert to your original query, group the report by cust_addr_num and add a hidden unbound text box, txtCounter say, to the detail section, with a ControlSource property of  =1 and a Running Sum property of 'over group'.  Then in the detail section's Format event procedure put Cancel = Me.txtCounter  > 3.  The disadvantage of this, however, is that it will only include one of any ties for third place.  This will work in Print Preview or when the report is sent to a printer, but not in Report View.

Was this answer helpful?

0 comments No comments

4 additional answers

Sort by: Most helpful
  1. Anonymous
    2013-09-26T13:19:35+00:00

    Thanks for the quick response. Can you please explain how the second SELECT query works (after the WHERE statement)?  I'm not understanding what Q2 is.  Q1 is this new query, correct?  Thanks.

    Was this answer helpful?

    1 person found this answer helpful.
    0 comments No comments
  2. Anonymous
    2013-09-26T17:06:48+00:00

    Thank you for the explanation.  Makes total sense.  I tested it out with a couple of sample customer numbers and it works great.  When I run this for the entire data sampling (close to 8000 records for my first query), it runs and runs and after 3 1/2 hours, I still don't have any results.  Is there any way of speeding this up?

    Thanks again!

    Was this answer helpful?

    0 comments No comments
  3. Anonymous
    2013-09-26T16:36:40+00:00

    The SELECT statement within parentheses is a sub-query within the outer query's WHERE clause.  This type of subquery is known as a correlated subquery because it is restricted by the expression in its WHERE clause 'Q2.cust_addr_num = Q1.cust_addr_num'.  In this Q1 and Q2 are aliases for the two instances of your original query.  By giving each instance an alias like this they are differentiated, which allows the correlation.

    In effect the subquery runs once for each row returned by the outer query and restricts which of those rows are returned in the result table.   It does this by counting the rows where the cust_addr_num value is that of the outer query's current row, and the value of the sumofbase_sls_gp_dol_cytD column is greater than that of the same column in the current row returned by the outer query.  1 is added to the count, so for the row with the highest value of sumofbase_sls_gp_dol_cytD the count will be zero.  Adding 1 gives 1.  For the next highest the count is 1.  Adding one gives 2.  And so on.  If the value returned by the subquery is less than or equal to 3, then the row is returned by the outer query, i.e the top 3 rows per cust_addr_num value. QED.

    The name of the new query can be whatever you wish.  The aliases Q1 and Q2 only exist within the query; they are not the names of saved queries.

    Was this answer helpful?

    0 comments No comments