# Advent of Code 2020: "Report Repair" - Day 1

Posted by ads' corner on Tuesday, 2020-12-01
Posted in [Sql]

Started “Advent of Code” with the kid, the kid is polishing the Python (and English) skills. I thought I better do this in SQL.

The Task 1 for “Day 1” is: you get `200` numbers from the accounting department, find the two which in sum are 2020. This two numbers multiplied is the result of the task.

Task 2 is like task 1, except it’s using three numbers. All three in sum will be `2020`, and then multiply these three and that is the result.

## Preparation

It’s possible to resolve this on the commandline in the database, but given that I deal with `200` numbers I decided to feed them into a table:

 `````` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 `````` ``````DROP TABLE IF EXISTS day1; CREATE TABLE day1 ( value INT ); COPY day1 FROM STDIN; xxxx xxxx .... xxxx \, DROP TABLE CREATE TABLE COPY 200 ``````

The result is `200` lines with numbers in the `accounting` table.

 ``````1 2 3 4 5 `````` ``````SELECT COUNT(*) FROM day1; count ------- 200 (1 row) ``````

A CTE in PostgreSQL allows to build a sub-result which is independent from the other data sets. It’s like a table which is created on the fly, and can be used in the main query as a table. In order to find the two numbers which sum to `2020`, I join two result sets of the table data:

 ``````1 2 3 4 5 6 7 8 `````` ``````WITH l1 AS ( SELECT value FROM day1 ), l2 AS ( SELECT value FROM day1 ) SELECT ... FROM l1, l2 ``````

This will produce a cartesian (or cross) join of all rows from `l1` with `l2`, a total of `40.000` results. This result set needs to be limited to all rows where `l1.value` and `l2.value` together is `2020`:

 ``````1 2 3 4 5 6 7 8 9 `````` ``````WITH l1 AS ( SELECT value FROM day1 ), l2 AS ( SELECT value FROM day1 ) SELECT l1.value AS l1, l2.value AS l2, l1.value + l2.value AS sum FROM l1, l2 WHERE l1.value + l2.value = 2020; ``````

And finally I need the multiplication of `l1.value` and `l2.value`:

 ``````1 2 3 4 5 6 7 8 9 `````` ``````WITH l1 AS ( SELECT value FROM day1 ), l2 AS ( SELECT value FROM day1 ) SELECT l1.value AS l1, l2.value AS l2, l1.value + l2.value AS sum, l1.value * l2.value AS mul FROM l1, l2 WHERE l1.value + l2.value = 2020; ``````

The result is:

 ``````1 2 3 4 5 `````` `````` l1 | l2 | sum | mul ------+------+------+------- 28 | 1992 | 2020 | 55776 1992 | 28 | 2020 | 55776 (2 rows) ``````

`55776` is the result I’m looking for. Of course this result appears twice, because I run over the same data in both loops.

Basically task 2 is the same as task 1, except that I need to loop 3 times (3 CTE instead of 2), and compare three numbers instead of two:

 `````` 1 2 3 4 5 6 7 8 9 10 11 12 `````` ``````WITH l1 AS ( SELECT value FROM day1 ), l2 AS ( SELECT value FROM day1 ), l3 AS ( SELECT value FROM day1 ) SELECT l1.value AS l1, l2.value AS l2, l3.value AS l3, l1.value + l2.value + l3.value AS sum, l1.value * l2.value * l3.value AS mul FROM l1, l2, l3 WHERE l1.value + l2.value + l3.value = 2020; ``````

The result:

 ``````1 2 3 4 5 6 7 8 9 `````` `````` l1 | l2 | l3 | sum | mul -----+-----+-----+------+----------- 983 | 314 | 723 | 2020 | 223162626 983 | 723 | 314 | 2020 | 223162626 314 | 983 | 723 | 2020 | 223162626 314 | 723 | 983 | 2020 | 223162626 723 | 983 | 314 | 2020 | 223162626 723 | 314 | 983 | 2020 | 223162626 (6 rows) ``````

Given that we run three times over the data, I now get 6 times the same result. The query runs a little bit longer, producing a cross join with `8.000.000` rows this time. And `223162626` is the number I’m looking for.

My PostgreSQL 12 does not optimize the CTEs, but does materialize all two respective three of them. That is a bit slow.

Categories: [Sql]