Saturday, January 23, 2016

Oracle 12c enhanced syntax for row-limiting (A.K.A Top-N queries)

Introduction

Prior to Oracle 12c we had to use the ROWNUM pseudo column in order to implement Top-N reporting. Following are few simple examples of Top-N queries:
  • Query the top 3 employees with the highest salaries
  • Query the first 2 employees who have joined the company

Using the ROWNUM pseudo column had worked well, however it had major disadvantages:
  • It's not a straight-forward syntax, especially if you need to implement query paging such as query the second block of the top 3 employees with the highest salaries.
  • It allows us specifying the Top-N rows but it doesn't allows specifying the Top-N percent (for example, query the top 25% percent of employees with the highest salaries.

In Oracle 12c, SQL SELECT syntax has been enhanced in order to address the 2 disadvantages with the ROWNUM pseudo column listed above. You can specify the number of rows or percentage of rows to return with the FETCH NEXT keywords. You can use the OFFSET keyword to specify that the returned rows begin with a row after the first row of the full result set. In the case of a negative OFFSET, the OFFSET value is treated as 0.

Demonstration

For the demonstration I will create a table of employees and populate it with 10 records:
SQL> CREATE TABLE EMP
  2  (
  3     id          NUMBER,
  4     name        VARCHAR2 (20),
  5     salary      NUMBER,
  6     hire_date   DATE,
  7     CONSTRAINT id_pk PRIMARY KEY (id)
  8  );
Table created.

SQL> insert into EMP value values (1, 'DAVID', 8500, '07-APR-15');
1 row created.

SQL> insert into EMP value values (2, 'JOHN', 10000, '17-MAR-15');
1 row created.

SQL> insert into EMP value values (3, 'JANE', 13500, '23-DEC-15');
1 row created.

SQL> insert into EMP value values (4, 'DAN', 15000, '02-JAN-15');
1 row created.

SQL> insert into EMP value values (5, 'RACHEL', 19000, '20-FEB-15');
1 row created.

SQL> insert into EMP value values (6, 'BRAD', 20000, '25-JUN-15');
1 row created.

SQL> insert into EMP value values (7, 'TIM', 15000, '16-MAR-15');
1 row created.

SQL> insert into EMP value values (8, 'KELLY', 9000, '28-APR-15');
1 row created.

SQL> insert into EMP value values (9, 'NICK', 7500, '04-FEB-15');
1 row created.

SQL> insert into EMP value values (10, 'ERIK', 6000, '09-JUL-15');
1 row created.

SQL> commit;
Commit complete.

SQL> select * from EMP;
        ID NAME                     SALARY HIRE_DATE
---------- -------------------- ---------- ---------
         1 DAVID                      8500 07-APR-15
         2 JOHN                      10000 17-MAR-15
         3 JANE                      13500 23-DEC-15
         4 DAN                       15000 02-JAN-15
         5 RACHEL                    19000 20-FEB-15
         6 BRAD                      20000 25-JUN-15
         7 TIM                       15000 16-MAR-15
         8 KELLY                      9000 28-APR-15
         9 NICK                       7500 04-FEB-15
        10 ERIK                       6000 09-JUL-15
10 rows selected.

In my first example I will query the first 2 employees who have joined the company:
SQL>   select *
  2    from EMP
  3    ORDER BY hire_date
  4    FETCH FIRST 2 ROWS ONLY;
        ID NAME                     SALARY HIRE_DATE
---------- -------------------- ---------- ---------
         4 DAN                       15000 02-JAN-15
         9 NICK                       7500 04-FEB-15
Now, let's demonstrate a query paging via the OFFSET clause. In this example, I will query the second block of the first 2 employees who have joined the company:
SQL>   select *
  2    from EMP
  3    order by hire_date
  4    offset 2 rows fetch next 2 ROWS ONLY;
        ID NAME                     SALARY HIRE_DATE
---------- -------------------- ---------- ---------
         5 RACHEL                    19000 20-FEB-15
         7 TIM                       15000 16-MAR-15
Now, I will query the top 3 employees with the with the highest salaries:
SQL>   select *
  2    from EMP
  3    order by salary DESC
  4    fetch first 3 rows only;
        ID NAME                     SALARY HIRE_DATE
---------- -------------------- ---------- ---------
         6 BRAD                      20000 25-JUN-15
         5 RACHEL                    19000 20-FEB-15
         4 DAN                       15000 02-JAN-15
In my sample table there are 2 employees with the salary of 15000 : DAN and TIM, but because I used the "ONLY" clause it limited the output to be exactly 3 rows. By replacing the "ONLY" clause with the "WITH TIES" clause, it will diplay all the top-3 employees even if there is more than one employee with the 3rd highest salary , as follows:
SQL>   select *
  2    from EMP
  3    order by salary DESC
  4    fetch first 3 rows with ties;
        ID NAME                     SALARY HIRE_DATE
---------- -------------------- ---------- ---------
         6 BRAD                      20000 25-JUN-15
         5 RACHEL                    19000 20-FEB-15
         4 DAN                       15000 02-JAN-15
         7 TIM                       15000 16-MAR-15

In my last example I will demonstrate how to query the Top-N percent using the PERCENT clause. The following query retrieves the top 40% of employees with the lowest salary:
SQL>   select *
  2    from EMP
  3    order by salary
  4    fetch first 40 percent rows only;
        ID NAME                     SALARY HIRE_DATE
---------- -------------------- ---------- ---------
        10 ERIK                       6000 09-JUL-15
         9 NICK                       7500 04-FEB-15
         1 DAVID                      8500 07-APR-15
         8 KELLY                      9000 28-APR-15


Additional Notes

  • If you specify a negative OFFSET number, OFFSET is treated as 0. If you specify NULL or a number greater than or equal to the number of rows that are returned by the query, 0 rows are returned.​
  • The FIRST or NEXT keywords can be used interchangeably. They are provided for semantic clarity.
  • The ROW keyword and the ROWS keyword can be used interchangeably. They are provided for semantic clarity.

Summary

The new enhanced syntax for row-limiting is one of my favorite new features of Oracle 12c. It provides an extremely straight-forward syntax for implementing a Top-N and query paging instead of using less convenient syntax via the ROWNUM pseudo column as we used to do prior to Oracle 12c.

    2 comments: