This SQL challenge comes from LeetCode. We need to find missing subtasks from the Executed table. We know which subtasks have executed and how many subtasks there should be.
Your first instinct might be to generate all subtask ids using a recursive CTE. Instead, we can use the generate_series function, i.e.
select
task_id
,generate_series(1, subtasks_count) as subtask_id
from Tasks;
For each row in the Tasks
table, a number of rows equal to subtasks_count
is generated. Each of these rows will have the same task_id
but a different subtask_id
(ranging from 1 to subtasks_count
).
The result of the query would be:
task_id | subtask_id |
---|
1 | 1 |
1 | 2 |
1 | 3 |
2 | 1 |
2 | 2 |
3 | 1 |
3 | 2 |
3 | 3 |
3 | 4 |
The anti join subtasks_expanded ▷ Executed will produce the desired result set.
with subtasks_expanded as (
select task_id, generate_series(1, subtasks_count) as subtask_id
from Tasks
)
select s.task_id, s.subtask_id
from subtasks_expanded s
left join Executed e on e.task_id = s.task_id and e.subtask_id = s.subtask_id
where e.subtask_id is null
Table: Tasks
Table: Executed