Find the Nth highest salary from the “employees” table.
It’s the most frequently searched question on Google and one of the most commonly asked interview questions. There are several methods to tackle this query. Let me walk you through a few of these methods.
By the end of this blog, I’ll reveal which method is the most efficient and which one is the least efficient. So, stick around.
I’ll be using Oracle Database 21c, and for the demonstration, I’ll employ a dummy employees table with the following structure:
CREATE TABLE employees ( employee_id NUMBER PRIMARY KEY, employee_name VARCHAR2(100), salary NUMBER );
I’ve even inserted 10 rows of data into this table, which will resemble something like this:
You can download the script from my GitHub page. Simply click here – [GitHub link].
Find the nth highest salary from the “employees” table.
In this demonstration, we’ll focus on finding the 2nd highest salary. Here are a few approaches to accomplish this:
Method 1: Using DENSE_RANK() Function with CTE
WITH ranked_salaries AS ( SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS rank FROM employees ) SELECT salary FROM ranked_salaries WHERE rank = 2;
In this method, we first create a Common Table Expression (CTE) and name it “ranked_salaries.” The core of this CTE is the SELECT statement within it. In this SELECT statement, we initially retrieve data from the “salary” column. We then employ the DENSE_RANK() window function, which assigns ranks to each entry in the “salary” column in descending order.
One advantage of the DENSE_RANK() function is that it assigns the same rank to duplicate entries, which is useful when multiple employees have the same salary. We use the OVER clause to sort the rows of the “salary” column in descending order. Upon execution, the Oracle Engine treats the result returned by the CTE as a temporary table named “ranked_salaries.” This result set contains two columns: “salary” and “rank.” Since we have a CTE, we require a subsequent part, which in our case is another SELECT statement following our CTE. In that SELECT statement, we simply retrieve the salary from our CTE and use the WHERE clause to filter the results based on the rank, which in this case is 2. As a result, this will display the 2nd highest salary.
Method 2: Using SELF JOIN
SELECT MAX(e1.salary) AS second_highest_salary FROM employees e1 WHERE e1.salary < (SELECT MAX(salary) FROM employees e2);
In this approach, we employ a self-join to find the second highest salary. The first SELECT statement retrieves the maximum salary using the MAX() aggregate function, aliasing it as “second_highest_salary,” from the “employees e1” table, where “e1” serves as an alias for the “employees” table.
The WHERE clause filters the results, ensuring that the result set only contains employees whose salaries are less than the maximum salary in the “employees” table.
Next, there’s a subquery. This subquery calculates the maximum salary from the same “employees” table using the alias “e2.” Essentially, this subquery identifies the highest salary in the employees table.
During execution, the subquery runs first and returns the highest salary from the “employees” table. Following that, the outer query executes, and the WHERE clause of the outer query employs the result returned by the subquery to filter the results. As a result, you obtain the 2nd highest salary from the employee table.
Method 3: Using a Subquery with LIMIT and OFFSET
Please note that this process is compatible only with Oracle Database version 12c or later.
SELECT salaryFROM employees ORDER BY salary DESC OFFSET 1 ROWS FETCH NEXT 1 ROWS ONLY;
In this method, the SELECT statement used is fairly straightforward, returning the salary value from the “employees” table and sorting it in descending order, thanks to the “ORDER BY salary DESC” clause. The key components introduced here are the OFFSET and FETCH clauses used in the final line of the query:
OFFSET clause: This clause instructs the database to skip a specified number of rows from the result set. For instance, using “OFFSET 1” instructs the database to skip the top row, which is the 1st highest salary. We skip this row because we’re interested in finding the 2nd highest salary. If you were looking for the 3rd highest salary, you’d use “OFFSET 2.” This way, the top 2 rows from the result set would be skipped.
However, the problem with OFFSET alone is that even after skipping rows, the result set would still contain all the preceding rows except for the top 1 that we skipped using “OFFSET 1.” For instance, if your table has 100 rows, after skipping the top 1 row, all the remaining 99 rows would still be displayed.
To address this issue, we utilize the FETCH clause:
FETCH clause: The “FETCH NEXT 1 ROWS ONLY” clause fetches only the next 1 row from the result set after skipping the first one. This ensures that we see only the desired 2nd highest salary from the “employees” table.
Method 4: Using a Correlated Subquery
SELECT DISTINCT salary FROM employees e1 WHERE 1 = (SELECT COUNT(DISTINCT salary)FROM employees e2 WHERE e2.salary >= e1.salary);
In this method, we continue to use the basic SELECT statement to retrieve distinct values from the “Salary” column. However, to find the 2nd highest salary, we apply a filter using the WHERE clause. The WHERE clause condition incorporates a correlated subquery. This correlated subquery counts distinct salaries in the “employees” table (aliased as “e2”) that are greater than or equal to the salary of the current employee in the outer query (aliased as “e1”). The result of this subquery is used to filter the outer query. Keep in mind that correlated subqueries execute first. Therefore, this inner subquery runs for each row in the “employees” table (aliased as “e1”). The outer query selects distinct salaries from the “employees” table where the condition in the subquery is true.
Method 5: Using a NTH_VALUE Windows Function
SELECT DISTINCT NTH_VALUE(salary, 2) OVER (ORDER BY salary DESC) AS second_highest_salary FROM employees OFFSET 1 ROWS FETCH NEXT 1 ROWS ONLY;
In this approach, we utilize the same SELECT statement as before but introduce an in-built window function, NTH_VALUE(). As the name suggests, the NTH_VALUE() function returns the Nth value in an ordered set. It accepts two arguments: the first is the column name or expression you want to retrieve, and the second is an integer specifying which row within the specified column you want to obtain. Using the “OVER” keyword, we define a frame of rows within the result set over which the NTH_VALUE() window function operates. In our case, it’s a sorted list of values from the salary column.
Among all the queries discussed above, determining the most efficient one depends on various factors related to your database design. However, if we consider the criteria of minimizing operations and performing well on larger datasets, we can make an informed assessment. The most efficient approach among the options discussed is query number 3, which involves finding the Nth highest salary using a subquery with the LIMIT and OFFSET clauses. Here’s why:
• Efficiency: This method employs standard SQL OFFSET and FETCH NEXT clauses, allowing it to directly retrieve the second-highest salary without the need for additional subqueries or Common Table Expressions (CTEs).
• Optimized Execution: The OFFSET clause efficiently skips the first row, which is the highest salary, while the FETCH NEXT clause retrieves the subsequent row, which is the second-highest salary. This process is straightforward and optimized for obtaining the desired result.
In summary, while efficiency can vary depending on your specific database and dataset, query number 3 stands out as a practical and optimized solution for finding the Nth highest salary. It reduces unnecessary complexity and performs well in most scenarios, making it a reliable choice.