SQL: User Purchase Platform

SQL: User Purchase Platform

This SQL challenge comes from LeetCode. We need to find the total user counts and amount spent for each date in three mutually exclusive categories: 1. mobile; 2. desktop; 3. mobile and desktop.

Approach

We create a truth table of user and platform usage for each date to help segment results correctly.

This:

select 
     user_id
    ,spend_date
    ,max(case when platform = 'mobile' then true else false end) as mobile
    ,max(case when platform = 'desktop' then true else false end) as desktop
from Spending
group by user_id, spend_date

yields a truth table like this:

| user_id | spend_date | mobile | desktop |
| ------- | ---------- | ------ | ------- |
| 1       | 2019-07-01 | 1      | 1       |
| 2       | 2019-07-01 | 1      | 0       |
| 2       | 2019-07-02 | 1      | 0       |
| 3       | 2019-07-01 | 0      | 1       |
| 3       | 2019-07-02 | 0      | 1       |

user_platform acts as the primary dataset that determines the structure and scope of the final result set.

  • Its scope and granularity establishes the scope of analysis, i.e. daily platform usage. It contains one row for each user_id and spend_date combination, plus platform usage flags.
  • The left join in the query below ensures user_platform dictates what spend_dates and users appear in the result set. We need to guarantee every spend_date for each platform combination is represented in the result, even when no spending occurred.

Separating queries into one for each platform category (mobile-only, desktop-only, mobile and desktop) has several benefits:

  • Clear logical separation. Each query focuses on a single platform category, making it easy to follow and debug
  • Easier maintenance. Changes to one category does not effect the other queries
  • Each query can have tailored aggregation logic
  • Processing each category independently avoids unintended overlaps or logical conflicts
  • Use of union takes advantage of parallelism, depending on the rdbms.
  • Each individual query can be validated and tested independently.

Code

with user_platform as (
select 
     user_id
    ,spend_date
    ,max(case when platform = 'mobile' then true else false end) as mobile
    ,max(case when platform = 'desktop' then true else false end) as desktop
from Spending
group by user_id, spend_date
)
select 
    p.spend_date
    ,'both' as platform
    ,coalesce(sum(s.amount), 0) as total_amount
    ,coalesce(count(distinct s.user_id), 0) as total_users
from user_platform p
left join Spending s on p.user_id = s.user_id 
    and p.spend_date = s.spend_date
    and (p.mobile and p.desktop)
group by spend_date
union 
select 
    p.spend_date
    ,'mobile' as platform
    ,coalesce(sum(s.amount), 0) as total_amount
    ,coalesce(count(distinct s.user_id), 0) as total_users
from user_platform p
left join Spending s on p.user_id = s.user_id 
    and p.spend_date = s.spend_date
    and (p.mobile and not p.desktop)
group by spend_date
union 
select 
    p.spend_date
    ,'desktop' as platform
    ,coalesce(sum(s.amount), 0) as total_amount
    ,coalesce(count(distinct s.user_id), 0) as total_users
from user_platform p
left join Spending s on p.user_id = s.user_id 
    and p.spend_date = s.spend_date
    and (not p.mobile and p.desktop)
group by spend_date