SQL
The order SQL actually runs (not the order you write it):
| Step | Clause | What it does |
|---|
| 1 | FROM / JOIN | Picks tables, joins them |
| 2 | WHERE | Filters rows before grouping |
| 3 | GROUP BY | Groups rows |
| 4 | HAVING | Filters groups after aggregation |
| 5 | SELECT | Picks columns, computes expressions |
| 6 | DISTINCT | Removes duplicates |
| 7 | ORDER BY | Sorts results |
| 8 | LIMIT / OFFSET | Caps output rows |
This order is why you cannot use a column alias in WHERE but can in ORDER BY.
| Join | Keeps |
|---|
INNER JOIN | Only matching rows from both sides |
LEFT JOIN | All left rows + matching right (NULL if no match) |
RIGHT JOIN | All right rows + matching left (NULL if no match) |
FULL OUTER JOIN | All rows from both sides (NULLs where no match) |
CROSS JOIN | Every combination (cartesian product) |
LEFT JOIN ... WHERE right.id IS NULL | Anti-join: left rows with no match |
| Function | Notes |
|---|
COUNT(*) | Counts all rows including NULLs |
COUNT(col) | Counts non-NULL values in column |
COUNT(DISTINCT col) | Counts unique non-NULL values |
SUM() / AVG() | Ignores NULLs |
MIN() / MAX() | Works on numbers, strings, dates |
STRING_AGG(col, ',') | Concatenates values (Postgres) |
ARRAY_AGG(col) | Collects into array (Postgres) |
| Function | What it does | Gaps after ties? |
|---|
ROW_NUMBER() | Unique sequential number per row | N/A |
RANK() | Rank with gaps after ties | Yes (1,1,3,4) |
DENSE_RANK() | Rank without gaps after ties | No (1,1,2,3) |
LAG(col, n) | Value from n rows before | N/A |
LEAD(col, n) | Value from n rows after | N/A |
NTILE(n) | Divides rows into n buckets | N/A |
PERCENT_RANK() | Relative rank as 0-1 decimal | N/A |
SUM/AVG/COUNT() OVER() | Running or windowed aggregates | N/A |
| Pattern | Core technique |
|---|
| Find duplicates | GROUP BY col HAVING COUNT(*) > 1 |
| Dedup/latest per group | ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ... DESC) WHERE rn = 1 |
| Running total | SUM(col) OVER (ORDER BY date_col) |
| Previous/next value | LAG(col) OVER (ORDER BY ...) / LEAD(col) OVER (ORDER BY ...) |
| Nth highest value | DENSE_RANK() OVER (ORDER BY col DESC) WHERE rank = N |
| Pivot rows to columns | SUM(CASE WHEN category = 'X' THEN val ELSE 0 END) |
| Anti-join | LEFT JOIN ... WHERE right.id IS NULL or NOT EXISTS (...) |
| Safe division | x / NULLIF(y, 0) |
| Percent of total | val * 100.0 / SUM(val) OVER () |
| Consecutive day streaks | date - ROW_NUMBER() ... OVER (ORDER BY date) |