Advent of Code 2020: "Report Repair" - Day 1
Started "Advent of Code" with the kid, the kid is polishing the Python (and English) skills. I thought I better do this in SQL.
The Task 1 for "Day 1" is: you get 200 numbers from the accounting department, find the two which in sum are 2020. This two numbers multiplied is the result of the task.
Task 2 is like task 1, except it's using three numbers. All three in sum will be 2020, and then multiply these three and that is the result.
It's possible to resolve this on the commandline in the database, but given that I deal with 200 numbers I decided to feed them into a table:
DROP TABLE IF EXISTS day1; CREATE TABLE day1 ( value INT ); COPY day1 FROM STDIN; xxxx xxxx .... xxxx \, DROP TABLE CREATE TABLE COPY 200
The result is 200 lines with numbers in the "accounting" table.
SELECT COUNT(*) FROM day1; count ------- 200 (1 row)
A CTE in PostgreSQL allows to build a sub-result which is independent from the other data sets. It's like a table which is created on the fly, and can be used in the main query as a table. In order to find the two numbers which sum to 2020, I join two result sets of the table data:
WITH l1 AS ( SELECT value FROM day1 ), l2 AS ( SELECT value FROM day1 ) SELECT ... FROM l1, l2
This will produce a cartesian (or cross) join of all rows from l1 with l2, a total of 40.000 results. This result set needs to be limited to all rows where l1.value and l2.value together is "2020":
WITH l1 AS ( SELECT value FROM day1 ), l2 AS ( SELECT value FROM day1 ) SELECT l1.value AS l1, l2.value AS l2, l1.value + l2.value AS sum FROM l1, l2 WHERE l1.value + l2.value = 2020;
And finally I need the multiplication of l1.value and l2.value:
WITH l1 AS ( SELECT value FROM day1 ), l2 AS ( SELECT value FROM day1 ) SELECT l1.value AS l1, l2.value AS l2, l1.value + l2.value AS sum, l1.value * l2.value AS mul FROM l1, l2 WHERE l1.value + l2.value = 2020;
The result is:
l1 | l2 | sum | mul ------+------+------+------- 28 | 1992 | 2020 | 55776 1992 | 28 | 2020 | 55776 (2 rows)
55776 is the result I'm looking for. Of course this result appears twice, because I run over the same data in both loops.
Basically task 2 is the same as task 1, except that I need to loop 3 times (3 CTE instead of 2), and compare three numbers instead of two:
WITH l1 AS ( SELECT value FROM day1 ), l2 AS ( SELECT value FROM day1 ), l3 AS ( SELECT value FROM day1 ) SELECT l1.value AS l1, l2.value AS l2, l3.value AS l3, l1.value + l2.value + l3.value AS sum, l1.value * l2.value * l3.value AS mul FROM l1, l2, l3 WHERE l1.value + l2.value + l3.value = 2020;
l1 | l2 | l3 | sum | mul -----+-----+-----+------+----------- 983 | 314 | 723 | 2020 | 223162626 983 | 723 | 314 | 2020 | 223162626 314 | 983 | 723 | 2020 | 223162626 314 | 723 | 983 | 2020 | 223162626 723 | 983 | 314 | 2020 | 223162626 723 | 314 | 983 | 2020 | 223162626 (6 rows)
Given that we run three times over the data, I now get 6 times the same result. The query runs a little bit longer, producing a cross join with 8.000.000 rows this time. And 223162626 is the number I'm looking for.
My PostgreSQL 12 does not optimize the CTEs, but does materialize all two respective three of them. That is a bit slow.
Display comments as Linear | Threaded