Skip to content

Commit eccb8fe

Browse files
authored
Update advanced_sql_class_notes.md
1 parent 643c010 commit eccb8fe

File tree

1 file changed

+91
-1
lines changed

1 file changed

+91
-1
lines changed

advanced_sql_class_notes.md

Lines changed: 91 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -567,7 +567,7 @@ GROUP BY 1,2,3,4,5,6,7,8
567567
```
568568

569569

570-
## 4.5 Self Joins
570+
## 4.5A Self Joins
571571

572572
We can join a table to itself by invoking it twice with two aliases. This can be useful, for example, to look up the previous day's order quantity (if any) for a given `CUSTOMER_ID` and `PRODUCT_ID`:
573573
@@ -610,6 +610,96 @@ QUANTITY,
610610
FROM CUSTOMER_ORDER c1
611611
```
612612
613+
614+
## 4.5B Recursive Self Joins
615+
616+
At some point of your career, you may encounter a table that is inherently designed to be self-joined. For instance, run this query:
617+
618+
```sql
619+
SELECT * FROM EMPLOYEE
620+
```
621+
622+
This is a table containing employee information, including their manager via a `MANAGER_ID` field. Here is a sample of the results below.
623+
624+
| ID | FIRST_NAME | LAST_NAME | TITLE | DEPARTMENT | MANAGER_ID |
625+
|----|------------|------------|---------------------|-------------|------------|
626+
| 13 | Pembroke | Killgus | Accountant I | Accounting | 10 |
627+
| 14 | Harper | Argontt | Director | Operations | 3 |
628+
| 15 | Fabio | Treversh | Manager | Operations | 14 |
629+
| 16 | Gerard | Morforth | Analyst | Operations | 15 |
630+
| 17 | Stephanus | Palatino | Senior Analyst | Operations | 15 |
631+
| 18 | Jennilee | Withers | Analyst | Operations | 15 |
632+
| 19 | Desdemona | Farmar | Business Consultant | Operations | 15 |
633+
| 20 | Ashlin | Creamen | Manager | Operations | 14 |
634+
| 21 | Daniel | Licquorish | Analyst | Operations | 20 |
635+
636+
This `MANAGER_ID` points to another `EMPLOYEE` record. If you want to bring in Daniel and his superior's information, this isn't hard to do with a self join.
637+
638+
```sql
639+
SELECT e1.FIRST_NAME,
640+
e1.LAST_NAME,
641+
e1.TITLE,
642+
e2.FIRST_NAME AS MANAGER_FIRST_NAME,
643+
e2.LAST_NAME AS MANAGER_LAST_NAME
644+
645+
FROM EMPLOYEE e1 INNER JOIN EMPLOYEE e2
646+
ON e1.MANAGER_ID = e2.ID
647+
648+
WHERE e1.FIRST_NAME = 'Daniel'
649+
```
650+
651+
| FIRST_NAME | LAST_NAME | TITLE | MANAGER_FIRST_NAME | MANAGER_LAST_NAME |
652+
|------------|------------|---------|--------------------|-------------------|
653+
| Daniel | Licquorish | Analyst | Ashlin | Creamen |
654+
655+
656+
But what if you wanted to display the entire hierarchy above Daniel? Well shoot, this is hard because now I have to do several self joins to daisy-chain my way to the top. What makes this even harder is I don't know how many self joins I will need to do. For cases like this, it can be helpful to leverage recursive queries.
657+
658+
A recursion is a special type of common table expression (CTE). Typically, you "seed" a starting value and then use `UNION` or `UNION ALL` to append the results of a query that uses each "seed", and the result becomes the next seed.
659+
660+
In this case, we will use a `RECURSIVE` common table expression to seed Daniel's ID, and then append each `MANAGER_ID` of each `EMPLOYEE_ID` that matches the seed. This will give a set of ID's for employees hierarchical to Daniel. We can then use these ID's to navigate Daniel's hierarchy via JOINS, IN, or other SQL operators.
661+
662+
```sql
663+
-- generates a list of employee ID's hierarchical to Ashlin
664+
665+
WITH RECURSIVE hierarchy_of_daniel(x) AS (
666+
SELECT 21 -- start with Daniel's ID
667+
UNION ALL -- append each manager ID recursively
668+
SELECT MANAGER_ID
669+
FROM hierarchy_of_daniel INNER JOIN EMPLOYEE
670+
WHERE EMPLOYEE.ID = hierarchy_of_daniel.x -- employee ID must equal previous recursion
671+
)
672+
673+
SELECT * FROM EMPLOYEE
674+
WHERE ID IN hierarchy_of_daniel;
675+
```
676+
677+
Recursive queries are a bit tricky to get right, but practice them if you have tables structured like this. Note they also can be used to improvise a set of consecutive values without creating a table. For instance, we can generate a set of consecutive integers. Here is how you create a set of integers from 1 to 1000.
678+
679+
```sql
680+
WITH RECURSIVE my_integers(x) AS (
681+
SELECT 1
682+
UNION ALL
683+
SELECT x + 1
684+
FROM my_integers
685+
WHERE x < 1000
686+
)
687+
SELECT * FROM my_integers
688+
```
689+
690+
You can apply the same concept to generate a set of chronological dates. This recursive query will generate all dates from today to '2030-12-31':
691+
692+
```sql
693+
WITH RECURSIVE my_dates(x) AS (
694+
SELECT date('now')
695+
UNION ALL
696+
SELECT date(x, '+1 day')
697+
FROM my_dates
698+
WHERE x < '2030-12-31'
699+
)
700+
SELECT * FROM my_dates
701+
```
702+
613703
## 4.6 Cross Joins
614704

615705
Sometimes it can be helpful to generate a "cartesian product", or every possible combination between two or more data sets using a CROSS JOIN. This is often done to generate a data set that fills in gaps for another query. Not every calendar date has orders, nor does every order date have an entry for every product, as shown in this query:

0 commit comments

Comments
 (0)