SQL ANSI: Implement Binary Search on a Sorted Array
Algorithm description:
This SQL function implements the binary search algorithm to find the index of a target value within a sorted array represented by a database table. It efficiently narrows down the search space by repeatedly dividing the array in half. This is crucial for optimizing search operations on large, ordered datasets where linear scans would be too slow.
Algorithm explanation:
The `binary_search_in_array` function takes the table name, value column, index column, target value, and array size as input. It initializes `v_low` to 0 and `v_high` to `p_array_size - 1`. The core of the algorithm is a `WHILE` loop that continues as long as `v_low <= v_high`. In each iteration, it calculates the middle index `v_mid`. Dynamic SQL is used to fetch the value at `v_mid` from the specified table and column. If the `v_current_value` matches `p_target_value`, the index `v_mid` is returned. If `v_current_value` is less than `p_target_value`, `v_low` is updated to `v_mid + 1`; otherwise, `v_high` is updated to `v_mid - 1`. If the loop completes without finding the target, -1 is returned. The time complexity is O(log N), where N is the size of the array, due to the halving of the search space in each step. The space complexity is O(1) for variables, plus the overhead of dynamic SQL execution. Edge cases handled include an empty or invalid-sized array, and the target value not being present in the array. The invariant maintained is that the target value, if present, always lies within the range `[v_low, v_high]` inclusive.
Pseudocode:
Function binary_search_in_array(array_table, value_col, index_col, target, array_size):
IF array_size <= 0 THEN
RETURN -1
END IF
low = 0
high = array_size - 1
WHILE low <= high:
mid = low + floor((high - low) / 2)
current_value = fetch value from array_table at index_col = mid, from value_col
IF current_value == target THEN
RETURN mid
ELSE IF current_value < target THEN
low = mid + 1
ELSE
high = mid - 1
END IF
END WHILE
RETURN -1 -- Target not found
Exercise library
Library
Playable exercises
Choose Exercise
Filter, sort, and preview algorithm before you choose one to play.
Total9published items
Pages2pages available
Browse published coding scenarios with filters, sorting, and pagination.
Loading scenarios…
⚠
Something went wrong. Please retry.
🗂
No scenarios to show yet. Adjust the filters or refresh.
goal: Find the Nth highest salary using CTEs.input: An 'employees' table with a 'salary' column and a parameter N.output: The Nth highest salary, or an empty set if N is invalid or no such salary exists.
Canonical Algorithm Text
WITH RankedSalaries AS ( -- Assign a rank to each distinct salary in descending order SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) as salary_rank FROM ( -- Select only distinct salaries to avoid ranking duplicates multiple times SELECT DISTINCT salary FROM employees )...
Description Algorithm
This SQL query finds the Nth highest salary using Common Table Expressions (CTEs) and the DENSE_RANK() window function. It's designed to be flexible, allowing you to specify any rank 'N'. This is useful for salary analys...
Explanation
The query first defines a CTE named 'RankedSalaries'. Inside the CTE, it selects distinct salaries from the 'employees' table. Then, it applies the DENSE_RANK() window function to these distinct salaries, ordering them in descending order. DENSE_RANK() assigns a rank to each unique salary, ensuring no gaps in ranking even if there are duplicate salaries. The main query then selects the salary from 'RankedSalaries' where the assigned rank matches the input parameter ':N'. This approach correctly handles duplicate salaries and ensures that if 'N' is larger than the number of distinct salaries, or if the table is empty, an empty result set is returned. The time complexity is typically O(N log N) or O(N) depending on the database's implementation of window functions and sorting, and space complexity is O(N) for storing distinct salaries and their ranks within the CTE.
Pseudocode
1. Define a CTE 'RankedSalaries'. 2. Inside the CTE: a. Select distinct salaries from the employees table. b. Assign a dense rank to each distinct salary in descending order. 3. Select the salary from 'RankedSalaries' wh...
Published
SQL ANSISQL (sql-ansi)
SQL ANSI: Generate Fibonacci Sequence up to N terms
goal: Generate the Fibonacci sequence up to N terms.input: An integer N representing the number of terms.output: A list of N Fibonacci numbers in ascending order.
Canonical Algorithm Text
-- This SQL script generates the Fibonacci sequence up to N terms. -- It uses a procedural approach with variables to iteratively calculate each term. -- Declare a variable to hold the number of terms (N) -- In a real stored procedure or function, this would be a parameter. -- Fo...
Description Algorithm
This SQL script generates the Fibonacci sequence up to a user-defined number of terms (N). It iteratively calculates each subsequent number by summing the two preceding ones, starting with 0 and 1. The sequence is stored...
Explanation
The script uses a procedural approach within a SQL block (often found in stored procedures or anonymous blocks). It initializes two variables, `a` and `b`, to the first two Fibonacci numbers (0 and 1). A `WHILE` loop iterates `N-2` times (after handling base cases for N=1 and N=2). In each iteration, it calculates `next_fib` as the sum of `a` and `b`, then updates `a` to `b` and `b` to `next_fib`. The newly calculated `next_fib` is inserted into a temporary table `FibonacciSequence` along with its term number. The loop continues until `N` terms are generated. The time complexity is O(N) because each term is calculated once. The space complexity is O(N) to store the sequence in the temporary table. Edge cases handled include N=0 (empty sequence), N=1 (sequence [0]), and N=2 (sequence [0, 1]). The invariant is that at the start of each loop iteration, `a` and `b` hold the two most recently generated Fibonacci numbers, and `i` represents the count of terms already processed or about to be inserted.
Pseudocode
Initialize a temporary table `FibonacciSequence` with columns `term_number` and `fib_value`. Set N = desired number of terms. Set a = 0. Set b = 1. Set i = 1. IF N >= 1 THEN Insert (1, a) into `FibonacciSequence`. END IF...
Published
SQL ANSISQL (sql-ansi)
SQL ANSI: Find the Longest Consecutive Sequence of Identical Values
goal: Find the longest consecutive sequence of identical values.input: A table with an ordering column (e.g., `id`) and a value column (e.g., `value_column`).output: The value and the length of its longest consecutive sequence.
Canonical Algorithm Text
-- This query finds the longest consecutive sequence of identical values in a column. -- It uses window functions to identify the start of each sequence and calculate its length. WITH -- Step 1: Assign a group identifier to consecutive identical values. -- We use LAG to check the...
Description Algorithm
This SQL query identifies the longest consecutive sequence of identical values within a specified column. It cleverly uses window functions like `LAG` and `SUM` to group consecutive identical values and then `COUNT` to d...
Explanation
The query employs a series of Common Table Expressions (CTEs) to break down the problem. `GroupStarts` uses `LAG` to compare each row's `value_column` with the previous row's value (ordered by `id`). If they differ, `is_new_group` is marked as 1, indicating the start of a new sequence. `GroupIDs` then uses a running `SUM` of `is_new_group` to assign a unique `group_id` to all rows belonging to the same consecutive sequence. `SequenceLengths` groups by `value_column` and `group_id` to count the number of rows in each distinct sequence, thus determining its `sequence_length`. Finally, `RankedSequences` ranks these sequences by `sequence_length` in descending order. The main query selects the `value_column` and `sequence_length` from the top-ranked sequence. The time complexity is O(N log N) due to the ordering required by window functions, where N is the number of rows. Space complexity is O(N) for storing intermediate results. Edge cases handled include NULL values (ignored), empty tables (no results), and tables with a single row (the sequence length will be 1). The invariant is that `group_id` uniquely identifies contiguous blocks of identical `value_column` values.
Pseudocode
1. Create CTE `GroupStarts`: - Select `id`, `value_column`. - For each row, compare `value_column` with the previous row's `value_column` (ordered by `id`). - If they are different, set `is_new_group` to 1, otherwise 0....
Published
SQL ANSISQL (sql-ansi)
SQL ANSI: Find Top N Customers by Total Purchase Amount
goal: Find the top N customers by total purchase amount.input: Customers table (customer_id, customer_name), Purchases table (purchase_id, customer_id, amount)output: List of customer names and their total purchase amounts for the top N customers.
Canonical Algorithm Text
WITH CustomerTotal AS ( -- Calculate the total purchase amount for each customer SELECT c.customer_id, c.customer_name, SUM(COALESCE(p.amount, 0)) AS total_purchase_amount FROM Customers c LEFT JOIN Purchases p ON c.customer_id = p.customer_id GROUP BY c.customer_id, c.customer_n...
Description Algorithm
This SQL query identifies the top N customers who have spent the most money. It first calculates the total purchase amount for each customer, then ranks them, and finally selects the top N. This is useful for marketing c...
Explanation
The query uses Common Table Expressions (CTEs) for clarity and modularity. The `CustomerTotal` CTE calculates the sum of purchase amounts for each customer, using `COALESCE` to treat NULL purchase amounts as 0, ensuring accurate aggregation. The `LEFT JOIN` ensures all customers are included, even those with no purchases. The `RankedCustomers` CTE then assigns a rank to each customer based on their `total_purchase_amount` in descending order using the `ROW_NUMBER()` window function. Finally, the main query selects customers whose rank is less than or equal to N. The time complexity is dominated by the aggregation and ranking, typically O(M log M) or O(M) depending on the database's internal sorting/hashing, where M is the number of customers. Space complexity is O(M) for storing intermediate results. Edge cases include customers with no purchases (handled by LEFT JOIN and COALESCE), and N being larger than the total number of customers (which will simply return all customers ranked).
Pseudocode
1. Create a temporary result set `CustomerTotal`: - Join `Customers` and `Purchases` tables on `customer_id`. - For each customer, calculate the sum of `amount`, treating NULLs as 0. - Group results by `customer_id` and...
goal: Calculate the median value of a numeric column.input: A table with a numeric column (e.g., `your_table_name` with `numeric_column`).output: The median value of the `numeric_column`.
Canonical Algorithm Text
WITH OrderedValues AS ( -- Assign a row number to each value after ordering SELECT numeric_column, ROW_NUMBER() OVER (ORDER BY numeric_column) as rn, COUNT(*) OVER () as total_count FROM your_table_name -- Replace with your actual table name WHERE numeric_column IS NOT NULL -- Ex...
Description Algorithm
This SQL query calculates the median value of a numeric column in a table. It first orders all non-NULL values and assigns them a rank. Then, it identifies the middle one or two values based on whether the total count of...
Explanation
The query uses Common Table Expressions (CTEs) for clarity. `OrderedValues` assigns a sequential row number (`rn`) to each non-NULL `numeric_column` value after ordering them. It also calculates the `total_count` of non-NULL values using a window function. `MedianCandidates` then filters these ordered values to select the row(s) that represent the middle element(s). The condition `rn IN (FLOOR((total_count + 1) / 2.0), CEIL((total_count + 1) / 2.0))` correctly identifies the middle row for odd counts (both `FLOOR` and `CEIL` yield the same middle index) and the two middle rows for even counts. The final `SELECT` statement calculates the `AVG()` of the `numeric_column` from `MedianCandidates`. If there's only one candidate (odd count), `AVG()` returns that value. If there are two candidates (even count), `AVG()` returns their average. The time complexity is dominated by the sorting step, which is O(N log N), where N is the number of non-NULL rows. Space complexity is O(N) for storing the ordered values and row numbers. Edge cases handled include NULL values in the `numeric_column` (they are excluded) and both odd and even numbers of data points.
Pseudocode
1. Create a CTE `OrderedValues`: - Select `numeric_column` from the table. - Assign `ROW_NUMBER()` ordered by `numeric_column` to `rn`. - Assign `COUNT(*)` over all rows to `total_count`. - Filter out rows where `numeric...
goal: Find the Nth highest salary.input: An `Employees` table with a `salary` column.output: The Nth highest salary.
Canonical Algorithm Text
WITH RankedSalaries AS ( -- Assign a dense rank to each distinct salary in descending order. -- DENSE_RANK is used to handle ties correctly, ensuring that if multiple employees -- have the same salary, they receive the same rank, and the next rank is consecutive. SELECT salary, D...
Description Algorithm
This SQL query efficiently finds the Nth highest salary from a table of employees. It uses the `DENSE_RANK` window function to assign a unique rank to each distinct salary, treating ties appropriately. The query then sel...
Explanation
The query utilizes a Common Table Expression (CTE) called `RankedSalaries`. Inside the CTE, `DENSE_RANK() OVER (ORDER BY salary DESC)` assigns a rank to each distinct salary value. `DENSE_RANK` is crucial here because it assigns consecutive ranks even if there are gaps in the salaries (e.g., if salaries are 10000, 9000, 7000, `DENSE_RANK` would assign ranks 1, 2, 3 respectively, whereas `RANK` would assign 1, 2, 4). The `ORDER BY salary DESC` ensures that higher salaries get lower ranks (e.g., rank 1 for the highest salary). The `WHERE salary IS NOT NULL` clause filters out any records without salary information. The main query then selects the `salary` from `RankedSalaries` where `salary_rank` equals the desired `N`. The `LIMIT 1` is a safeguard, though with `DENSE_RANK`, if `N` is valid, there should ideally be only one distinct salary for that rank. The time complexity is O(M log M) due to the sorting involved in the window function, where M is the number of employees with non-NULL salaries. Space complexity is O(M) for storing the ranked salaries. Edge cases include NULL salaries (handled), fewer than N distinct salaries (no result or an error depending on `N` and data), and `N` being invalid (e.g., 0 or negative, which would yield no results).
Pseudocode
1. Create a CTE `RankedSalaries`: - Select `salary` from the `Employees` table. - Assign `DENSE_RANK()` ordered by `salary` DESC to `salary_rank`. - Filter out rows where `salary` is NULL. 2. Select `salary` from `Ranked...
goal: Find customers who have not placed any orders.input: Customers table (customer_id, first_name, last_name), Orders table (order_id, customer_id, order_date).output: List of customer IDs, first names, and last names for customers with no orders.
Canonical Algorithm Text
-- This query identifies customers who have not placed any orders. -- It demonstrates a robust approach using LEFT JOIN and checking for NULLs in the joined table. -- Define the main Customers table and the Orders table. -- We assume Customers has customer_id, first_name, last_na...
Description Algorithm
This SQL query identifies customers who have not placed any orders. It achieves this by performing a `LEFT JOIN` from the `Customers` table to the `Orders` table and then filtering for customers where the join resulted i...
Explanation
The primary method uses a `LEFT JOIN` between `Customers` and `Orders` on `customer_id`. The `LEFT JOIN` ensures that all customers from the `Customers` table are included in the result set. For customers who have placed orders, their details will be joined with their corresponding order information. For customers who have not placed any orders, the columns from the `Orders` table (like `order_id`) will be `NULL`. The subsequent `WHERE coi.order_id IS NULL` clause effectively filters the result set to include only those customers for whom no matching order was found. The time complexity is typically O(N + M) or O(N log N + M log M) depending on indexing, where N is the number of customers and M is the number of orders. The `NOT EXISTS` alternative is often more efficient as it can short-circuit and doesn't require sorting the entire `Orders` table. Space complexity is O(N) for the intermediate result set. Edge cases handled include customers with no orders (the primary goal), and implicitly, customers with orders (they are excluded). The invariant is that a customer is included in the final output if and only if there is no corresponding record in the `Orders` table for their `customer_id`.
Pseudocode
1. Create a CTE `CustomerOrderInfo`: - Perform a `LEFT JOIN` between `Customers` (aliased as `c`) and `Orders` (aliased as `o`) on `c.customer_id = o.customer_id`. - Select `c.customer_id`, `c.first_name`, `c.last_name`,...
Published
SQL ANSISQL (sql-ansi)
SQL ANSI: Implement Binary Search on a Sorted Array
goal: Implement binary search in SQL.input: Table representing a sorted array (e.g., `sorted_numbers` with `idx` and `val` columns), target value, array size.output: The index of the target value if found, otherwise -1.
Canonical Algorithm Text
CREATE OR REPLACE FUNCTION binary_search_in_array( p_array_table_name TEXT, p_value_column TEXT, p_index_column TEXT, p_target_value ANYELEMENT, p_array_size INT ) RETURNS INT AS $$ DECLARE v_low INT := 0; v_high INT; v_mid INT; v_current_value ANYELEMENT; v_query TEXT; v_found_i...
Description Algorithm
This SQL function implements the binary search algorithm to find the index of a target value within a sorted array represented by a database table. It efficiently narrows down the search space by repeatedly dividing the...
Explanation
The `binary_search_in_array` function takes the table name, value column, index column, target value, and array size as input. It initializes `v_low` to 0 and `v_high` to `p_array_size - 1`. The core of the algorithm is a `WHILE` loop that continues as long as `v_low <= v_high`. In each iteration, it calculates the middle index `v_mid`. Dynamic SQL is used to fetch the value at `v_mid` from the specified table and column. If the `v_current_value` matches `p_target_value`, the index `v_mid` is returned. If `v_current_value` is less than `p_target_value`, `v_low` is updated to `v_mid + 1`; otherwise, `v_high` is updated to `v_mid - 1`. If the loop completes without finding the target, -1 is returned. The time complexity is O(log N), where N is the size of the array, due to the halving of the search space in each step. The space complexity is O(1) for variables, plus the overhead of dynamic SQL execution. Edge cases handled include an empty or invalid-sized array, and the target value not being present in the array. The invariant maintained is that the target value, if present, always lies within the range `[v_low, v_high]` inclusive.
Pseudocode
Function binary_search_in_array(array_table, value_col, index_col, target, array_size): IF array_size <= 0 THEN RETURN -1 END IF low = 0 high = array_size - 1 WHILE low <= high: mid = low + floor((high - low) / 2) curren...
My Favorite Exercise
You have not saved any favorite exercises yet.
Custom exercise
Paste your own algorithm/text
Max 20,000 characters. (Dangerous HTML/script fragments will be blocked)
Normalized preview:
Publish details
Shown only if you opt to submit for review. Required when consent is checked.